Integrated DataFrames of Most Prevalent Cancer Types - TopN [Top6/Top21]
(TopN Cancer Types)

Dataset Description

This asset contains five files. The TopN DataFrames for the Cellular-Level Pilot combine drug response data, gene expression data, and drug molecular descriptors into a single DataFrame, which supports building binary classification or regression machine learning models that predict drug response. Each DataFrame covers the top N cancer types (Top6 or Top21), ranked by the number of cell lines for which both RNA-Seq and drug response data are available. For more information, refer to the following source links:

Source CCLE

https://portals.broadinstitute.org/ccle/data

Source CTRP

https://portals.broadinstitute.org/ctrp/

Source GDSC

https://www.cancerrxgene.org/downloads/bulk_download

Source NCI-60 – DTP

https://dtp.cancer.gov/databases_tools/bulk_data.htm

Source gCSI

https://pharmacodb.pmgenomics.ca/datasets/4

Content Type
RNA-Seq
Drug Response
Drug Molecular Descriptors
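
The description above says each TopN DataFrame joins drug response, gene expression, and drug molecular descriptors into one table that can feed either a regression or a binary classification model. The following is a minimal sketch of that idea using pandas on toy data. Every table, column name (cell_line, drug, auc, gene_A, gene_B, mol_weight), value, and the 0.5 threshold is a hypothetical placeholder chosen for illustration; none of them is taken from the actual files, whose layout may differ.

```python
import pandas as pd

# Toy stand-ins for the three content types. All names and values here are
# hypothetical; the real files may use different columns and response metrics.
response = pd.DataFrame({
    "cell_line": ["CL1", "CL1", "CL2", "CL3"],
    "drug": ["D1", "D2", "D1", "D2"],
    "auc": [0.35, 0.80, 0.55, 0.20],  # assumed drug response measure
})
expression = pd.DataFrame({
    "cell_line": ["CL1", "CL2", "CL3"],
    "gene_A": [1.2, 0.4, 2.1],
    "gene_B": [0.0, 3.3, 1.5],
})
descriptors = pd.DataFrame({
    "drug": ["D1", "D2"],
    "mol_weight": [310.4, 452.9],
})

# One row per (cell line, drug) pair, carrying that cell line's expression
# features and that drug's descriptor features.
combined = (
    response
    .merge(expression, on="cell_line", how="inner")
    .merge(descriptors, on="drug", how="inner")
)

# Regression target: the continuous response value.
y_regression = combined["auc"]

# Binary target: threshold the response. The cutoff is an arbitrary assumption.
AUC_THRESHOLD = 0.5
y_binary = (combined["auc"] < AUC_THRESHOLD).astype(int)

# Feature matrix: everything except identifiers and the response itself.
X = combined.drop(columns=["cell_line", "drug", "auc"])
```

With a combined table like this, the same feature matrix X can be paired with y_regression or y_binary depending on which kind of model is being trained.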