AIDR Challenge Tier 1: McFarlin
(AIDR)

Short Description

This is a winning submission from the 2024 AI Data Readiness Challenge.

Description and Impact
Impact

Explores the AI data readiness of Cancer Research Data Commons (CRDC) data.

Hypothesis/Objective

This asset contains the submission from Agnes McFarlin, the second-place winner of the Tier 1: Single Modal Data Challenge. In this tier, participants must train an AI/ML model using data from a single data class.

Technical Elements
Uniqueness

Use case: Tier 1 (Single modal data), Category 4 (Diagnosis) 

General use case: Classify cancer cells versus healthy cells in a specific tissue 

Specific use case: Use of radiological images from the Imaging Data Commons, The Cancer Imaging Archive, and the Cancer Data Access System to identify cancerous lung nodules

Usability

A data scientist can run the provided scripts after obtaining the appropriate data from the Cancer Genomics Cloud (CGC).

Components

The documentation, pre-processing files, and model-related files are available in the Model and Data Clearinghouse (MoDaC). The data can be accessed via the Cancer Genomics Cloud (CGC).

Results
Outputs

Assessment of dataset readiness and model predictions.