GenRA: Release Notes
Latest Version
RELEASE NOTES FOR GENRA 3.2 - March 2023
GenRA Version 3.2 fixes a number of bugs, addresses code quality issues, and offers several new features, specifically:
- Over 30 tweaks and minor bug fixes closed.
- Data updates: synced with DSSTox 2022-02-18, invitroDB 3.5 and ToxRefDB 2.1
- Major speed up from improved use of indexing
- A new chemical fingerprint has been added, the AIM CSRML. This is a re-implementation of EPA’s Analog Identification Methodology fragment set, captured in a Chemical Structure Markup Language format. The AIM CSRML is described in more detail in the accompanying manuscript published in Computational Toxicology (https://doi.org/10.1016/j.comtox.2022.100256).
- New download options from Panel 1. Download from Panel 1 has been enriched to provide not only the Top 100 source analogs and their pairwise similarities but also the associated chemical or biological fingerprint matrix. The fingerprint matrix is provided both as a bitstring field in one column and as additional individual columns.
- New sorting in Panel 4: Data matrix. Observed data and read-across results can now be sorted by number of positives/negatives and by confidence in predictions (i.e. AUC and p values).
- New download options from Panel 4. The Panel 4 download now supports filtering and sorting by confidence in predictions and by observation richness, which facilitates post-processing by end users.
- Neighborhood explorer graph visualization tool. Network visualization has been extended to enable neighborhoods to be viewed without filtering on the basis of ToxCast or ToxRefDB data.
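As an illustration of how the Panel 1 download might be post-processed, the sketch below expands a bitstring fingerprint into individual bits and recomputes a pairwise similarity. It assumes the similarity is a Jaccard score (as mentioned for the Panel 1 download in the 3.1 notes); the field names `dtxsid` and `fingerprint`, the identifiers, and the six-bit fingerprints are hypothetical placeholders, not the actual export schema.

```python
def bitstring_to_bits(bitstring):
    """Convert a bitstring such as '1010' into a list of ints [1, 0, 1, 0]."""
    return [int(ch) for ch in bitstring]


def jaccard(bits_a, bits_b):
    """Jaccard similarity between two equal-length binary vectors."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(bits_a, bits_b) if a and b)
    either = sum(1 for a, b in zip(bits_a, bits_b) if a or b)
    # Two all-zero fingerprints share no features; treat as similarity 0.
    return both / either if either else 0.0


# Hypothetical rows standing in for a downloaded file (not real data).
rows = [
    {"dtxsid": "TARGET", "fingerprint": "110100"},
    {"dtxsid": "ANALOG_1", "fingerprint": "110000"},
    {"dtxsid": "ANALOG_2", "fingerprint": "001011"},
]

target_bits = bitstring_to_bits(rows[0]["fingerprint"])
similarities = {
    row["dtxsid"]: jaccard(target_bits, bitstring_to_bits(row["fingerprint"]))
    for row in rows[1:]
}
```

Here `similarities` maps each analog identifier to its Jaccard score against the target; the same per-bit lists correspond to the "individual columns" form of the fingerprint matrix.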
Previous Versions
RELEASE NOTES FOR GENRA 3.1
GenRA Version 3.1 fixes a number of bugs, addresses code quality issues, and offers several significant new features, specifically:
- 38 bug/tweak tickets closed.
- 40 code quality/maintenance/technical debt reduction tickets closed
- Ability to download the radial plot view and top 100 most similar analogs from Panel 1 in the application. The latter is particularly useful if the use case is simply to return the top 100 substances (DTXSID identifiers and Jaccard similarity scores) without the ToxRefDB filter and query the CompTox Chemicals Dashboard for additional information using the Batch search functionality.
- Physical Properties visualization and reporting. Physicochemical similarity across candidate source analogues can now be explored through the distributions of specific properties (predictions from the OPERA software tool, https://github.com/kmansouri/OPERA: LogKow, vapour pressure, Henry’s Law constant, melting point, boiling point, water solubility, as well as molecular weight). These are depicted as a series of boxplots/swarmplots launched as a popup visualization from within Panel 1. Values are also tabulated in Panel 4’s data matrix view. The view is intended to provide additional context for analogue evaluation by showing to what extent physical property values are consistent and comparable across analogues relative to the target chemical of interest.
- Neighborhood explorer graph visualization tool. This network tool enables a side-by-side exploration of the top 3 source analogues across different fingerprint (FP) types. It is a popup from within Panel 1. Source analogues and their next generation analogues can be compared on the basis of different FPs from one view. The data underpinning the network view can also be downloaded as a JSON object, which permits a user to analyze the network with third-party tools such as Cytoscape or NetworkX.
- Vendor-specific ToxCast fingerprints for a subset of vendors, including Attagene and Bioseek. Some substances have been tested across many more assays than others, which may overestimate the similarity for chemicals that have only been tested in a limited set of assays.
- Use of the genra-py library (Shah et al., 2021) to support different data types. This allows both continuous and binary data to be used in the GenRA approach.
- Prediction of continuous values. Prediction of in vivo toxicity potencies is now feasible, in addition to binary toxicity predictions. The potency predictions rely on dose values aggregated from ToxRefDB and make use of the genra-py library. A proof-of-concept data matrix view has been developed to capture potency value ranges.
- Generalizing for multiple data streams. Infrastructure for additional predictions/aggregations has been added for future use. As a first use case, predictions on the basis of ToxCast hit calls are now possible. Panel 1’s radial plot can now return those analogues with available ToxCast outcome data, which seed the subsequent panel views and facilitate an assay-level prediction on the basis of chemical fingerprints.
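To make the distinction between binary and continuous predictions concrete, here is a minimal sketch of a similarity-weighted read-across estimate. It assumes a prediction formed as the similarity-weighted average of analogue values; this is an illustrative formulation and does not use or reproduce the genra-py API. The function name and all similarity, activity, and potency values are hypothetical.

```python
def similarity_weighted_prediction(analogues):
    """Similarity-weighted average over (similarity, value) pairs.

    Illustrative only: assumes the prediction is
    sum(s_i * x_i) / sum(s_i) over the source analogues.
    """
    total_weight = sum(sim for sim, _ in analogues)
    if total_weight == 0:
        raise ValueError("at least one analogue needs non-zero similarity")
    return sum(sim * value for sim, value in analogues) / total_weight


# Binary data: 1 = toxicity observed, 0 = not observed (made-up values).
binary_analogues = [(0.9, 1), (0.6, 0), (0.5, 1)]
binary_score = similarity_weighted_prediction(binary_analogues)

# Continuous data: e.g. a potency value on some dose scale (made-up values).
continuous_analogues = [(0.9, 1.2), (0.6, 2.0), (0.5, 1.6)]
potency_estimate = similarity_weighted_prediction(continuous_analogues)
```

With binary inputs the result is a score between 0 and 1 that can be thresholded into a positive/negative call; with continuous inputs the same weighting yields a potency estimate directly.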