Cancer Drug Response Prediction Dataset
CDRP
Short Description:

Collection of metadata and DataFrames used by machine learning models in the Cellular-Level Pilot project to predict drug response in various cancer cell lines

Long Description:

This dataset contains:

  • DataFrames and supporting metadata used by Combo, Single Drug Response Predictor (formerly P1B3), Uno, UNOMT, CLRNA, and benchmarking machine learning models in the Cellular-Level Pilot project to predict drug response in various cancer cell lines.
  • Gene expression and drug response data for cancer cell lines from the NCI-60 Human Cancer Cell Line Screen (NCI-60), NCI ALMANAC, NCI Sarcoma (SCL), NCI Small Cell Lung Cancer (SCLC), Cancer Cell Line Encyclopedia (CCLE), Genomics of Drug Sensitivity in Cancer (GDSC), Genentech Cell Line Screening Initiative (gCSI), and Cancer Therapeutics Response Portal (CTRP) studies, along with drug molecular descriptors generated using the Dragon 7.0 and Mordred software packages.
  • Relevant metadata for the cancer cell lines and drug compounds.
  • A list of genes from the Library of Integrated Network-Based Cellular Signatures (LINCS) 1000 study. The LINCS1000 gene set was used as a reference to filter cancer cell line data.
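As a rough illustration of the gene-set filtering mentioned in the last item, the sketch below keeps only reference genes in each cell line's expression profile. It uses plain Python dictionaries so that it runs without dependencies. The gene names, cell line IDs, and the function `filter_to_reference` are hypothetical placeholders, not names taken from the dataset or the pilot code.

```python
# Hypothetical stand-in for the LINCS1000 gene list (placeholder gene names).
reference_genes = {"GENE_A", "GENE_B", "GENE_C"}

# Hypothetical expression profiles: cell line ID -> {gene: expression value}.
expression = {
    "CELL_1": {"GENE_A": 5.1, "GENE_B": 2.3, "GENE_X": 7.8},
    "CELL_2": {"GENE_A": 4.7, "GENE_C": 1.9, "GENE_Y": 6.2},
}


def filter_to_reference(profiles, genes):
    """Keep only the genes found in the reference set for every cell line."""
    return {
        cell: {gene: value for gene, value in profile.items() if gene in genes}
        for cell, profile in profiles.items()
    }


filtered = filter_to_reference(expression, reference_genes)
```

Here `GENE_X` and `GENE_Y` are dropped because they are not in the reference set. The actual preprocessing in the pilot repositories may differ.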

The TopN DataFrames for the Cellular-Level Pilot combine drug response data, gene expression data, and drug molecular descriptors into a single DataFrame to support building binary classification or regression machine learning models that predict drug response. Each DataFrame covers the top N cancer types, that is, the N cancer types with the most cell lines for which both RNA-Seq and drug response data are available. The models can be further evaluated and improved with learning curves, an empirical method. For more information, refer to the following links.
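The TopN construction described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline. It uses plain Python records in place of DataFrames so that it runs without dependencies. All IDs, column names (`cell`, `drug`, `auc`, `GENE_A`, `DESC_X`), and the function `build_top_n` are hypothetical.

```python
from collections import Counter

# Hypothetical toy inputs; real tables are far larger and use different names.
cell_meta = {"C1": "lung", "C2": "lung", "C3": "breast",
             "C4": "colon", "C5": "breast", "C6": "lung"}
rnaseq = {"C1": {"GENE_A": 1.0}, "C2": {"GENE_A": 2.0}, "C3": {"GENE_A": 3.0},
          "C4": {"GENE_A": 4.0}, "C5": {"GENE_A": 5.0}}  # C6 has no RNA-Seq
descriptors = {"D1": {"DESC_X": 0.1}, "D2": {"DESC_X": 0.2}}
response = [
    {"cell": "C1", "drug": "D1", "auc": 0.8},
    {"cell": "C2", "drug": "D2", "auc": 0.4},
    {"cell": "C3", "drug": "D1", "auc": 0.6},
    {"cell": "C4", "drug": "D1", "auc": 0.9},
    {"cell": "C5", "drug": "D2", "auc": 0.3},
    {"cell": "C6", "drug": "D1", "auc": 0.7},
]


def build_top_n(response, rnaseq, descriptors, cell_meta, n):
    """Join response, expression, and descriptors for the top-n cancer types."""
    # Cell lines that have both drug response and RNA-Seq data.
    usable = {row["cell"] for row in response if row["cell"] in rnaseq}
    # Rank cancer types by how many usable cell lines they contain.
    counts = Counter(cell_meta[cell] for cell in usable)
    top_types = {cancer_type for cancer_type, _ in counts.most_common(n)}
    rows = []
    for row in response:
        cell, drug = row["cell"], row["drug"]
        if cell in usable and cell_meta[cell] in top_types and drug in descriptors:
            # One flat record per (cell line, drug) pair.
            rows.append({**row, "cancer_type": cell_meta[cell],
                         **rnaseq[cell], **descriptors[drug]})
    return rows


top2 = build_top_n(response, rnaseq, descriptors, cell_meta, n=2)
```

With these toy inputs, the two retained cancer types are lung and breast. C4 is dropped because colon falls outside the top two, and C6 is dropped because it has no RNA-Seq profile. The response column (`auc` here) could then serve directly as a regression target or be thresholded into a binary label.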

GitHub repository links:

CLRNA

https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Semi-Supervised-Feature-Learning-with-Center-Loss

Combo

https://github.com/CBIIT/NCI-DOE-Colab-Pilot1-Combo-combination-drug-response-predictor

Learning Curve

https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Learning-Curve

Single Drug Response Predictor

https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Single-Drug-Response-Predictor

Uno

https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Unified-Drug-Response-Predictor

Source links:

Aspuru-Guzik VAE

https://github.com/aspuru-guzik-group/chemical_vae

CCLE

https://portals.broadinstitute.org/ccle/data

CTRP

https://portals.broadinstitute.org/ctrp/

Dose Response AUC

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5753377/

GDC

https://portal.gdc.cancer.gov/

GDSC

https://www.cancerrxgene.org/downloads/bulk_download

LINCS1000

http://lincsportal.ccs.miami.edu/dcic-portal/

NCI ALMANAC

https://dtp.cancer.gov/ncialmanac/initializePage.do

NCI PDMR

https://pdmdb.cancer.gov/web/apex/f?p=101:41

NCI Sarcoma

https://sarcoma.cancer.gov/sarcoma/downloads.xhtml

NCI Small Cell Lung Cancer

https://sclccelllines.cancer.gov/sclc/

NCI-60 - CellMiner

https://discover.nci.nih.gov/cellminer/loadDownload.do

NCI-60 - DTP

https://dtp.cancer.gov/databases_tools/bulk_data.htm

gCSI

https://pharmacodb.pmgenomics.ca/datasets/4

VERSION: Version 1
CONTENT TYPE: RNA-Seq, Drug Response, Drug Molecular Descriptors, SMILES
CDRP Models & Software