Downloadable Computational Toxicology Data
- High-Throughput Screening
- Rapid Exposure and Dose Information
- Animal Toxicity
- ACToR
- Ecotoxicology
- Chemicals and Chemistry Data
- CompTox Chemicals Dashboard
- Virtual Tissues
- Literature Mining
EPA’s computational toxicology research efforts evaluate the potential health effects of thousands of chemicals. Evaluating potential health effects involves generating data on a chemical’s potential harm (its hazard), the degree of exposure to the chemical, and its unique chemical characteristics.
As part of EPA’s commitment to share data, all of the computational toxicology data are publicly available for anyone to access and use. EPA’s computational toxicology data are considered "open data"; all of the data below are therefore free of copyright restrictions and fully and freely available for both non-commercial and commercial use.
High-Throughput Screening
EPA researchers use rapid chemical screening (called high-throughput screening assays) to limit the number of laboratory animal tests while quickly and efficiently testing thousands of chemicals for potential health effects.
- ToxCast Data Download Page: High-throughput screening data and resources on thousands of chemicals. This page offers assay descriptions and screening data.
- High Throughput Transcriptomics (HTTr): Metadata for the HTTr signature-level pathway information.
- High Throughput Phenotypic Profiling (HTPP): Metadata for the HTPP global-, category-, and feature-level mappings.
Rapid Exposure and Dose Information
EPA researchers develop and use rapid exposure and dose estimates to predict potential exposure for thousands of chemicals.
- SHEDS-HT
- Systematic Empirical Evaluation of Models (SEEM): To apply high-throughput exposure (HTE) models in a human health risk framework, it is necessary to quantify the uncertainty in the HTE predictions. One recent uncertainty quantification approach has been to treat chemicals for which biomonitoring data are available as representative of chemicals without such data. In this way, the uncertainty of HTE predictions for chemicals without biomonitoring data may be estimated from chemicals with biomonitoring data. The predictions can be compared with population exposure biomonitoring data via toxicokinetic modeling, for purposes of evaluation and calibration.
- CPDat Data Release Information: The Chemical and Products Database contains information mapping chemicals to a set of terms categorizing their usage or function in types of consumer products (e.g., shampoo, soap).
- Access CPDat Data: Current Data File
- Consumer and Product Categories (CPCat): CPCat was last updated in 2015. Archive of the file previously found in ACToR: Data File
- Dosimetry Data:
- High-throughput Toxicokinetics (HTTK): It is important to link the external dose of a chemical to an internal blood or tissue concentration; this process is called toxicokinetics. EPA researchers measure the critical factors that determine the distribution and metabolic clearance for hundreds of chemicals and incorporate these data into computer models. The high-throughput toxicokinetic data can be paired with the high-throughput screening data to estimate real-world exposures.
- Multimedia Monitoring Database (MMDB): MMDB contains data from 20 individual public monitoring data sources that have been extracted, curated for chemical and medium, and harmonized into a sustainable machine-readable data format for support of exposure assessments.
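The dose-to-concentration link described under HTTK above can be illustrated with a minimal sketch. It assumes a linear one-compartment model at steady state, where the internal concentration equals the dose rate divided by clearance. The function names, units, and numbers are hypothetical choices for illustration, not EPA's models or the interface of any EPA tool.

```python
def steady_state_concentration(dose_rate_mg_per_kg_day, clearance_l_per_kg_day):
    """Internal steady-state concentration (mg/L) for a constant dose rate.

    Assumes a linear one-compartment model: Css = dose rate / clearance.
    """
    if clearance_l_per_kg_day <= 0:
        raise ValueError("clearance must be positive")
    return dose_rate_mg_per_kg_day / clearance_l_per_kg_day


def equivalent_dose(bioactive_conc_mg_per_l, css_per_unit_dose_mg_per_l):
    """External dose (mg/kg/day) that would produce a given internal concentration.

    Relies on linearity: Css scales proportionally with dose, so the
    concentration produced by 1 mg/kg/day can be used as a conversion factor.
    """
    if css_per_unit_dose_mg_per_l <= 0:
        raise ValueError("steady-state concentration must be positive")
    return bioactive_conc_mg_per_l / css_per_unit_dose_mg_per_l


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
css_unit = steady_state_concentration(1.0, 2.0)  # Css for 1 mg/kg/day
dose = equivalent_dose(0.25, css_unit)           # dose matching 0.25 mg/L
print(css_unit, dose)
```

Pairing a concentration that is bioactive in a screening assay with the concentration produced per unit dose gives an external dose that can then be compared against exposure estimates.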
Animal Toxicity
- Toxicity Reference Database (ToxRefDB): The Toxicity Reference Database (ToxRefDB) contains in vivo study data from over 5900 guideline or guideline-like studies for over 1100 chemicals. By employing a controlled vocabulary for enhanced data quality, ToxRefDB (v2.1, released August 2022) serves as a resource for study design, quantitative dose response, and endpoint and effect controlled vocabulary linked to the required, recommended, or triggered measurements indicated by corresponding guideline specifications. The database can aid in the validation of in vitro high-throughput screening of chemicals and support retrospective and predictive toxicology applications.
- The Toxicity Value Database (ToxValDB) is a large compilation of human health-relevant in vivo toxicology data, including data on both in vivo toxicity experiments and derived toxicity and guideline values. ToxValDB was designed to provide high-level summary data in a standardized format to facilitate comparison and data use across many individual databases. The current version of the database (9.5) contains 231,485 records covering 39,434 unique chemicals from over 40 sources.
- Latest version: ToxValDB v9.5
- Previous versions accessible here: Download Previous Versions of Database Package
Aggregated Computational Toxicology Resource (ACToR)
- The Aggregated Computational Toxicology Resource (ACToR) is EPA’s online aggregator of more than 1,000 worldwide public sources of environmental chemical data. The data include information on chemical production, exposure, occurrence, hazard, and risk management. The publication can be referenced here.
Ecotoxicology
- ECOTOX: The Ecotoxicology Database (ECOTOX) is a comprehensive Knowledgebase that provides information on adverse effects of single chemical stressors to ecologically relevant aquatic and terrestrial species.
- ECOTOX may not always work properly in Chrome. If you experience issues, try clearing your browsing data, cache, or history.
Chemicals and Chemistry Data
EPA researchers use chemistry data such as chemical structures and physicochemical property information to evaluate thousands of chemicals for potential health effects. Cheminformatics is the backbone of CCTE’s ToxCast and the multi-agency Tox21 high-throughput screening (HTS) programs.
- DSSTox
- TEST Predictions
- Collaborative Estrogen Receptor Activity Prediction Project Data: Data and supplemental files from CERAPP, a large-scale modeling project. CERAPP combined multiple models developed in collaboration with 17 groups in the United States and Europe to predict estrogen receptor (ER) activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were employed to build a total of 40 categorical and 8 continuous models for binding, agonist, and antagonist ER activity.
CompTox Chemicals Dashboard
- CompTox Chemicals Dashboard Data: Data specific to the CompTox Chemicals Dashboard releases, including the mappings between the DTXSIDs and the InChI strings and InChIKeys, SDF files containing all chemical structures and relevant information, and a file containing CAS Number, Preferred Chemical Name, and DTXSID. Note that some data associated with the CompTox Chemicals Dashboard may be found in other areas of the Downloadable Computational Toxicology Data webpage. Data sources associated with updates to the CompTox Chemicals Dashboard are listed in the release notes (where available).
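As a rough illustration of how an identifier mapping file like the one described above might be used, the sketch below indexes rows by CAS number and looks up the matching DTXSID. The column names (CASRN, PREFERRED_NAME, DTXSID), the comma-separated layout, and the sample rows are assumptions made for this example; the identifiers are placeholders, not real chemicals, and the actual download files may be structured differently.

```python
import csv
import io

# Placeholder rows standing in for a downloaded mapping file.
# Column names and identifiers are hypothetical, not taken from a real release.
SAMPLE = """CASRN,PREFERRED_NAME,DTXSID
0000-00-0,Example chemical A,DTXSID0000000001
0000-00-1,Example chemical B,DTXSID0000000002
"""


def build_cas_index(handle):
    """Read a CSV mapping file and return a dict keyed by CAS number."""
    reader = csv.DictReader(handle)
    return {row["CASRN"]: row for row in reader}


index = build_cas_index(io.StringIO(SAMPLE))
record = index["0000-00-0"]
print(record["DTXSID"], record["PREFERRED_NAME"])
```

For a real file, the `io.StringIO(SAMPLE)` handle would be replaced by `open(path, newline="", encoding="utf-8")`, with the column names adjusted to match the file's actual header row.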
Virtual Tissues
EPA researchers develop virtual tissue computer models to simulate how chemicals may affect human development. Virtual tissue models are some of the most advanced methods being developed today. The models will help reduce dependence on animal study data and provide much faster chemical risk assessments.
- Tipping Point Data: EPA researchers develop mathematical models to predict perturbation of biological systems and determine when cellular systems are no longer able to recover. EPA researchers use these models to determine the “Tipping Point”, the point when biological systems are unable to recover from or adapt to chemical exposure. When cellular systems are unable to recover, chemical exposures could lead to adverse outcomes such as cancer.
Literature Mining
- Abstract Sifter: Contains the Abstract Sifter tool, database, and user guide. The Abstract Sifter is a Microsoft Excel-based tool that greatly enhances literature searching in PubMed. The tool implements a novel “sifter” functionality for relevance ranking, giving the researcher a way to find articles of interest quickly. The Sifter helps researchers triage results and keep track of articles of interest. The tool also gives researchers a view of the literature landscape for a set of entities such as chemicals or genes and makes it easy to dive deeper into areas of interest.