Enhanced Co-Expression Extrapolation (E-COXEN) | Computational Resources for Cancer Research

Short Description

Extends the original COXEN method to select genes that are predictive of the efficacies of multiple drugs for building general drug response prediction models that are not specific to a particular drug.

User Community

Users interested in the following topics:

Primary: Cancer biology data modeling
Secondary: Machine learning, bioinformatics, and computational biology

Impact

Enables building of anti-cancer drug response prediction models using selected genes and drugs.

Description

The Enhanced Co-Expression Extrapolation (E-COXEN) method extends the original COXEN method to select genes that are predictive of the efficacies of multiple drugs, so that general drug response prediction models can be built that are not specific to a particular drug. It is designed for applications in which the drug efficacy data of one set of cancer cases are used to predict the response of another set of cancer cases.

Hypothesis/Objective

The objective was to create a method that selects genes predictive of response to multiple anticancer drugs, rather than to a single anticancer drug as in previous work.

Resource Role

This resource can be used for gene feature selection in drug response models such as Single Drug Response Predictor, Unified Drug Response Predictor, and Combination Drug Response Predictor.

Uniqueness

The original COXEN method has been successfully used in multiple studies to select genes for predicting the response of tumor cells to a specific drug treatment. The enhanced COXEN method instead selects genes that are predictive of the efficacies of multiple drugs, for building general drug response prediction models that are not specific to a particular drug. It first ranks the genes by their predictive power for each individual drug and takes the union of the top predictive genes across all drugs. From this union, the algorithm then selects the genes whose co-expression patterns are well preserved between the cancer case sets, and these genes are used to build prediction models.
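The two stages described above (per-drug ranking followed by a union, then filtering by co-expression preservation) can be sketched as follows. This is a minimal illustration and not the code in EnhancedCOXEN_Functions.py. It assumes absolute Pearson correlation as the measure of predictive power, and a correlation between co-expression profiles as the measure of preservation. The function names and the parameters top_per_drug and n_final are hypothetical.

```python
import numpy as np


def abs_pearson(expr, y):
    """Absolute Pearson correlation of every gene (column of expr) with y.

    Assumes no gene is constant across cases (a constant gene yields NaN).
    """
    xc = expr - expr.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(xc.T @ yc / denom)


def candidate_genes(expr1, resp1, top_per_drug):
    """Union of the top-ranked genes over all drugs, as sorted column indices.

    expr1: (cases, genes) expression matrix of set 1.
    resp1: (cases, drugs) response matrix of set 1; NaN marks a missing value.
    """
    candidates = set()
    for d in range(resp1.shape[1]):
        y = resp1[:, d]
        measured = ~np.isnan(y)  # a case may lack a response for some drugs
        score = abs_pearson(expr1[measured], y[measured])
        top = np.argsort(-score)[:top_per_drug]
        candidates.update(top.tolist())
    return np.array(sorted(candidates))


def preserved_genes(expr1, expr2, candidates, n_final):
    """Keep the candidates whose co-expression profile agrees best across sets.

    Assumes at least three candidate genes, so each profile has two or more
    entries once the gene's correlation with itself is removed.
    """
    c1 = np.corrcoef(expr1[:, candidates], rowvar=False)
    c2 = np.corrcoef(expr2[:, candidates], rowvar=False)
    m = len(candidates)
    agreement = np.empty(m)
    for k in range(m):
        others = np.arange(m) != k  # drop the gene's correlation with itself
        agreement[k] = np.corrcoef(c1[k, others], c2[k, others])[0, 1]
    keep = np.argsort(-agreement)[:n_final]
    return np.sort(candidates[keep])
```

A call such as preserved_genes(expr1, expr2, candidate_genes(expr1, resp1, 200), 100) would return 0-based column indices of the retained genes. The cutoffs 200 and 100 are placeholders, not values recommended by the package.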

Usability

To use the software package in this repository for enhanced COXEN analyses, users must meet the following criteria:

Possess the basic skills to run Python scripts.
Be able to process the gene expression data and drug response data into the data format accepted by the enhanced COXEN package.

Level of Documentation

Minimal

Components

The scripts folder in this repository (https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Enhanced-COXEN/blob/main/Scripts) includes the following Python scripts:

EnhancedCOXEN_Functions.py provides all the functions used by the enhanced COXEN method.
Example_Run.py provides example code demonstrating how to use the functions for enhanced COXEN analysis.

The data folder in this repository (https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Enhanced-COXEN/blob/main/Data) includes a small dataset, composed of the following data files, for demonstrating the utility of this software package:

Gene_Expression_Data_Of_Set_1.txt provides the gene expression data of cancer case set 1.
Drug_Response_Data_Of_Set_1.txt provides the drug response data of cancer cases in set 1.
Gene_Expression_Data_Of_Set_2.txt provides the gene expression data of cancer case set 2, for which drug response needs to be predicted.

Inputs

The required input data are:

Gene expression data of cancer case set 1 and cancer case set 2.
Drug response data of cancer case set 1.
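Before gene selection, the three inputs generally need to share the same genes, and set 1 needs matching cases in its expression and response tables. The sketch below shows one way to do this with pandas. It assumes tab-delimited text files with a header row, row labels in the first column, cases as rows, and genes or drugs as columns. That layout is an assumption made for illustration; the format the package actually expects is described in its readme file and example data. The helper names read_table and align_inputs are hypothetical.

```python
import pandas as pd


def read_table(source):
    """Read a tab-delimited table whose first column holds the row labels."""
    return pd.read_csv(source, sep="\t", index_col=0)


def align_inputs(expr1, resp1, expr2):
    """Restrict to genes shared by both sets and to set-1 cases with responses.

    Assumed layout: cases as rows, genes (or drugs) as columns.
    """
    shared_genes = expr1.columns.intersection(expr2.columns)
    shared_cases = expr1.index.intersection(resp1.index)
    return (
        expr1.loc[shared_cases, shared_genes],
        resp1.loc[shared_cases],
        expr2.loc[:, shared_genes],
    )
```

After alignment, the two expression tables have identical gene columns in the same order, so a gene index refers to the same gene in both sets.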

Input Data Type

RNA-Seq

Drug Molecular Descriptors

Input Data Format

Tabular

Results

The results demonstrate that genes selected by the enhanced COXEN method always provide a statistically significant improvement in prediction performance (adjusted p-value ≤ 0.05) and increase the power of gene expression data for drug response prediction.

Primary Publication

Enhanced Co-Expression Extrapolation (COXEN) Gene Selection Method for Building Anti-Cancer Drug Response Prediction Models

Outputs

The output includes the indices of selected genes. For details of input and output data, such as data format, refer to the readme file, code comments, and example data included in the package.
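As an illustration of how the selected gene indices might be used downstream, the sketch below subsets both expression matrices to the selected genes and fits a simple model on set 1 to predict responses for set 2. It assumes 0-based column indices and NumPy arrays with cases as rows; the package may use a different indexing convention, so check the readme file. Ridge regression is only a stand-in here and is not the model used by the authors. The names subset_genes and fit_ridge are hypothetical.

```python
import numpy as np


def subset_genes(expr, gene_indices):
    """Keep only the selected gene columns (0-based indices assumed)."""
    idx = np.asarray(gene_indices, dtype=int)
    if idx.size and (idx.min() < 0 or idx.max() >= expr.shape[1]):
        raise IndexError("gene index outside the expression matrix")
    return expr[:, idx]


def fit_ridge(x, y, alpha=1.0):
    """Closed-form ridge regression; returns (weights, intercept)."""
    x_mean, y_mean = x.mean(axis=0), y.mean()
    xc, yc = x - x_mean, y - y_mean
    weights = np.linalg.solve(xc.T @ xc + alpha * np.eye(x.shape[1]), xc.T @ yc)
    return weights, y_mean - x_mean @ weights
```

Predictions for set 2 then follow as subset_genes(expr2, selected) @ weights + intercept, using the same index list for both sets.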

Available on GitHub

https://github.com/CBIIT/NCI-DOE-Collab-Pilot1-Enhanced_COXEN