Bioactivity API
Center for Computational Toxicology and Exposure
Source: vignettes/Bioactivity.Rmd
Introduction
This vignette explores the CTX Bioactivity API.
NOTE: Please see the introductory vignette for an overview of the ctxR package and for initial setup instructions, including API key storage.
Data provided by the API’s Bioactivity endpoints are sourced from ToxCast’s invitrodb.
US EPA’s Toxicity Forecaster (ToxCast) program makes in vitro medium- and high-throughput screening assay data publicly available for prioritization and hazard characterization of thousands of chemicals.
The ToxCast pipeline (tcpl) is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, invitrodb. ToxCast assays comprise Tier 2-3 of the new Computational Toxicology Blueprint and employ automated chemical screening technologies to evaluate the effects of chemical exposure on living cells and biological macromolecules, such as proteins (Thomas et al., 2019). More information on the ToxCast program can be found at https://www.epa.gov/comptox-tools/toxicity-forecasting-toxcast.
This flexible analysis pipeline is capable of efficiently processing and storing large volumes of data. The diverse data, received in heterogeneous formats from numerous vendors, are transformed to a standard computable format and loaded into the invitrodb database by vendor-specific R scripts. Once the data are loaded into the database, ToxCast uses the generalized processing functions provided in tcpl to process, normalize, model, qualify, and visualize the data.
The Bioactivity API endpoints are organized into two different resources, “Assay” and “Data”. “Assay” resource endpoints provide assay metadata for specific or all invitrodb ‘aeids’ (assay endpoint ids). These include annotations from invitrodb’s assay, assay_component, assay_component_endpoint, assay_list, assay_source, and gene tables, all returned in a by-aeid format.
“Data” resource endpoints are split into summary data (by ‘aeid’) and bioactivity data by ‘m4id’ (i.e., by both ‘aeid’ and ‘spid’). The summary endpoint returns the number of active hits and the total number of chemicals tested in multiple- and single-concentration screening for specific ‘aeids’. The other endpoints return chemical information, level 3 concentration-response values, level 4 fit parameters, level 5 hit parameters, and level 6 flags for individual chemicals tested for given ‘aeids’, ‘m4ids’, ‘spids’, or ‘DTXSIDs’.
Regular ToxCast/invitrodb users may find it easier to use tcpl, which has integrated ctxR’s bioactivity functions to make the API data retrievable in a familiar format. See the tcpl vignette regarding data retrieval via API for more information.
Functions
Several ctxR functions are used to access the CTX Bioactivity API data.
Bioactivity Assay Resource
Two kinds of functions are available: one retrieves annotations for specific assays, and the other retrieves annotations for all available assays that have data.
Get annotation by aeid
get_annotation_by_aeid() retrieves annotation for a specific assay endpoint id (aeid).
assay <- get_annotation_by_aeid(AEID = "891")
get_annotation_by_aeid_batch() retrieves annotation for a list (or vector) of assay endpoint ids (aeids).
assays <- get_annotation_by_aeid_batch(AEID = c(759, 700, 891))
# return is in list form by aeid, convert to table for output
assays <- data.table::rbindlist(assays)
printFormattedTable(assays, c(4, 18, 19, 33, 51)) # printed using custom formatted table
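The list-to-table step above can be tried without an API key. The following is a minimal sketch that uses a mock list in place of the real return value, assuming the batch functions return a list named by aeid; the column names are illustrative placeholders, not the actual API fields.

```r
# Mock stand-in for the list returned by a *_batch() call: one element per
# requested aeid. Column names are illustrative, not the real API fields.
mock_assays <- list(
  "759" = data.frame(aeid = 759L, endpoint_name = "mock_endpoint_a"),
  "700" = data.frame(aeid = 700L, endpoint_name = "mock_endpoint_b"),
  "891" = data.frame(aeid = 891L, endpoint_name = "mock_endpoint_c")
)

# Stack the per-aeid elements into a single table. idcol adds a column that
# records the list name each row came from.
assay_table <- data.table::rbindlist(mock_assays, idcol = "requested_aeid")

print(assay_table)
```

Keeping the list names in an id column can make it easier to trace each row back to the aeid that was requested.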
Get all assay annotations
get_all_assays() retrieves annotations for all available assays.
all_assays <- get_all_assays()
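Once all annotations are retrieved, a common next step is to narrow them to the assays of interest. The sketch below assumes the return value is table-like and uses a mock data frame with hypothetical column names (`aeid`, `endpoint_name`) and made-up endpoint names; substitute the real annotation columns after inspecting `all_assays`.

```r
# Mock stand-in for the table returned by get_all_assays().
# Column names and endpoint names are hypothetical placeholders.
mock_all_assays <- data.frame(
  aeid = c(1L, 2L, 3L, 4L),
  endpoint_name = c("SRC1_mock_up", "SRC1_mock_down",
                    "SRC2_mock_up", "SRC3_mock_ratio"),
  stringsAsFactors = FALSE
)

# Keep only endpoints whose name starts with the prefix of interest.
keep <- grepl("^SRC1_", mock_all_assays$endpoint_name)
src1_assays <- mock_all_assays[keep, ]

# Collect the matching aeids for use in later requests.
aeids_of_interest <- src1_assays$aeid
print(aeids_of_interest)
```

A vector such as `aeids_of_interest` could then be passed to one of the batch functions described in this vignette.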
Bioactivity Data Resource
Several functions retrieve bioactivity data by a variety of identifier types (e.g., DTXSID, aeid).
Get summary data
get_bioactivity_summary() retrieves a summary of the number of active hits compared to the total number tested for both multiple and single concentration by aeid.
summary <- get_bioactivity_summary(AEID = "891")
get_bioactivity_summary_batch() retrieves a summary for a list (or vector) of assay endpoint ids (aeids).
summary <- get_bioactivity_summary_batch(AEID = c(759, 700, 891))
summary <- data.table::rbindlist(summary)
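Because the summary reports active hits alongside totals tested, a hit rate per aeid follows directly. This is a minimal sketch on mock data; the column names `active_mc` and `total_mc` and all counts are assumptions made for illustration, so check the names in the actual return before adapting it.

```r
# Mock stand-in for a combined summary table. Column names and counts are
# hypothetical; they are not taken from the API.
mock_summary <- data.frame(
  aeid = c(759L, 700L, 891L),
  active_mc = c(120L, 45L, 0L),   # active hits, multiple-concentration
  total_mc = c(1000L, 900L, 0L)   # chemicals tested, multiple-concentration
)

# Fraction of tested chemicals that were active. Rows with nothing tested
# get NA instead of a division by zero result.
mock_summary$hit_rate_mc <- ifelse(
  mock_summary$total_mc > 0,
  mock_summary$active_mc / mock_summary$total_mc,
  NA_real_
)

print(mock_summary)
```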
Get data
get_bioactivity_details() retrieves all available multiple-concentration data by assay endpoint id (aeid), sample id (spid), Level 4 ID (m4id), or chemical DTXSID. The return includes chemical information, level 3 concentration-response values, level 4 fit parameters, level 5 hit parameters, and level 6 flags for the individual chemicals tested. An example for each request parameter is provided below:
# By spid
spid_data <- get_bioactivity_details(SPID = 'TP0000904H05')
# By m4id
m4id_data <- get_bioactivity_details(m4id = 739695)
# By DTXSID
dtxsid_data <- get_bioactivity_details(DTXSID = "DTXSID30944145")
# By aeid
aeid_data <- get_bioactivity_details(AEID = 704)
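Each example above supplies a single identifier type. When requests are built programmatically, it can help to check that exactly one identifier is supplied before calling the API. The helper below, `one_identifier()`, is hypothetical and not part of ctxR; it is a sketch that assumes one identifier per call, as in the examples.

```r
# Hypothetical helper (not part of ctxR): return the single supplied
# identifier as a named list, or stop if zero or several were supplied.
one_identifier <- function(DTXSID = NULL, AEID = NULL, SPID = NULL, m4id = NULL) {
  args <- list(DTXSID = DTXSID, AEID = AEID, SPID = SPID, m4id = m4id)
  supplied <- args[!vapply(args, is.null, logical(1))]
  if (length(supplied) != 1L) {
    stop("Supply exactly one of DTXSID, AEID, SPID, or m4id.")
  }
  supplied
}

request_args <- one_identifier(AEID = 704)
print(request_args)

# With ctxR loaded and an API key stored, the request could then be made as:
# aeid_data <- do.call(get_bioactivity_details, request_args)
```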
Similar to the other _batch functions, get_bioactivity_details_batch() retrieves data for a list (or vector) of assay endpoint ids (aeid), sample ids (spid), Level 4 IDs (m4id), or chemical DTXSIDs.
aeid_data_batch <- get_bioactivity_details_batch(AEID = c(759, 700, 891))
aeid_data_batch <- data.table::rbindlist(aeid_data_batch, fill = TRUE)
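The `fill = TRUE` argument matters when the list elements do not all share the same columns, which is presumably why it is used here. The sketch below shows the behavior on mock data; the column names and values are illustrative only.

```r
# Mock stand-in for a batch return in which one element lacks a column.
# Column names and values are illustrative, not the real API fields.
mock_details <- list(
  "759" = data.frame(aeid = 759L, hitc = 1, flag = "mock flag"),
  "700" = data.frame(aeid = 700L, hitc = 0)
)

# Without fill = TRUE, rbindlist() stops because the column counts differ.
strict_result <- tryCatch(
  data.table::rbindlist(mock_details),
  error = function(e) e
)

# With fill = TRUE, columns are matched by name and gaps are filled with NA.
combined <- data.table::rbindlist(mock_details, fill = TRUE)

print(combined)
```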
Conclusion
This vignette presented a variety of functions that access the different types of data available through the Bioactivity endpoints of the CTX APIs. We encourage the reader to explore the data accessible through these endpoints and to work with it to get a better understanding of what data are available. Additionally, experienced ToxCast/invitrodb users may find it easier to continue to use tcpl, which has integrated ctxR’s bioactivity functions to make the API data retrievable in a familiar format.