Data for the Common Good (D4CG) applies our expertise in data sharing, as well as the streamlined and scalable infrastructure and processes that we have developed for the Pediatric Cancer Data Commons (PCDC), to other areas where we can make a difference. Our portfolio of projects includes major government initiatives, a tool to facilitate clinical trials matching, data commons for rare diseases, and more. With our unique approach to data sharing, which prioritizes relationship-building, data quality, and sustainability, we are proud to work in partnership with researchers, clinicians, and patients to drive new science and improve lives.

Data Commons

PREDICT
Monogenic diabetes consortium and data commons

Monogenic diabetes, a subtype of diabetes caused by changes to a single gene, represents 1 to 4 percent of cases of diabetes in the US. Because the condition is rare, a single source for patient data will be a critical resource for researchers to advance science and clinical practice. Working closely with the University of Chicago Kovler Diabetes Center, PREDICT (PREcision DIabetes ConsorTium) has brought together stakeholders from more than a dozen institutions to build a commons that will include clinical data, patient-reported outcomes, and data from wearable devices such as continuous glucose monitors. The consortium is applying D4CG methods to develop a data dictionary and implement governance structures.

Monogenic Epilepsy Data Commons
EEG data commons for pediatric monogenic epilepsies

A new initiative launched in summer 2024 is extending D4CG data commons development to the field of pediatric monogenic epilepsies. With funding from the Chan Zuckerberg Initiative, we are poised to create the world’s largest electroencephalogram (EEG) data commons for these rare and challenging epilepsies caused by individual gene mutations. This project will bring together medical centers and patient advocacy groups to collect and harmonize EEG data, addressing the urgent need for a unified, high-quality, and highly annotated data source.

Partnering with Dr. Doug Nordli, a leading pediatric epilepsy expert and co-director of the University of Chicago’s Comprehensive Epilepsy Center, D4CG will leverage the University’s expertise in the quantitative analysis of EEGs to build a data commons using the same methodologies that were successful in building the PCDC. This effort will integrate raw and analyzed EEG data with genomic information and crucial clinical details to enhance the richness of these data sources and open new research avenues. In future stages, we plan to collect cognitive and outcome measures, as well as treatment data via linkage with electronic health records, in order to further increase the value of the data commons and expand the scope of research it can support.

Sociome Data Commons
Studying the social determinants of health

The Sociome refers to the non-clinical aspects of life affecting health: social, environmental, behavioral, psychological, and economic factors. Integrating these social determinants of health with clinical data can create new opportunities for valuable research, but requires that they be comprehensively collected in a way that is suitable for large-scale analysis. D4CG is part of a multidisciplinary consortium in Chicago working to build the Sociome Data Commons, a resource that will allow researchers to integrate the social context of disease with clinical and genomic data to better understand, predict, and treat numerous conditions and improve human health.

Learn more at the Institute for Translational Medicine (ITM)

Clinical Trials Matching

GEARBOx
Clinical trials matching tool

As part of The Leukemia & Lymphoma Society (LLS) PedAL initiative, a pillar of The Dare to Dream Project, we developed a clinical trials matching tool for clinicians to rapidly and accurately match children with relapsed acute myeloid leukemia to targeted treatments in North America. GEARBOx, launched in 2022, is a web-based tool that uses a matching algorithm to identify potentially appropriate clinical trials based on COG eligibility criteria and the patient’s clinical data, immunophenotype, and genomic profile. We are now working on improving GEARBOx by adding new features, integrating more trials, and extending this tool to additional tumor types.
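As a rough illustration of what criteria-based matching can look like, the sketch below checks a patient record against each trial's eligibility rules and returns the trials where every rule passes. It is not GEARBOx's actual algorithm or data model: the field names (`age_years`, `mutations`, `cd33_positive`), the trial IDs, and the criteria themselves are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

# All field names, trial IDs, and criteria below are hypothetical illustrations;
# they are not GEARBOx's data model or real eligibility criteria.
Patient = dict
Criterion = Callable[[Patient], bool]


@dataclass
class Trial:
    trial_id: str
    criteria: list = field(default_factory=list)  # list of Criterion


def match_trials(patient: Patient, trials: list) -> list:
    """Return the IDs of trials whose every criterion the patient satisfies."""
    return [t.trial_id for t in trials if all(c(patient) for c in t.criteria)]


trials = [
    Trial("TRIAL-A", [lambda p: p["age_years"] < 18,
                      lambda p: "FLT3-ITD" in p["mutations"]]),
    Trial("TRIAL-B", [lambda p: p["age_years"] < 18,
                      lambda p: p["cd33_positive"]]),
]

# Made-up patient record combining clinical, immunophenotype, and genomic fields.
patient = {"age_years": 9, "mutations": {"FLT3-ITD"}, "cd33_positive": False}
matched = match_trials(patient, trials)
```

Expressing each criterion as a small predicate keeps the matching loop unchanged when trials or criteria are added, which is one plausible way to support integrating more trials over time.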

Explore GEARBOx

National Cancer Institute Projects

D4CG and the PCDC are an important part of a national data sharing ecosystem through the National Cancer Institute’s Childhood Cancer Data Initiative (CCDI). We are proud to contribute to the CCDI’s efforts to improve cancer prevention, treatment, quality of life, and survivorship and to ensure that researchers learn from every child with cancer.

CCDI Data Federation

With a collaborative group of institutions, we are working to make the data commons involved in CCDI interoperable by developing and implementing a common harmonized data model and an API for queries across the PCDC, St. Jude, Seven Bridges, and the Gabriella Miller Kids First Data Resource Center.
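The idea of a harmonized data model can be sketched in a few lines: each node's local field names are mapped into shared names, after which one query can run across all nodes. This is only a toy under stated assumptions. The node names (`node_a`, `node_b`), the field names, and the records are invented for illustration and do not reflect the CCDI data model, its API, or any participating institution's data.

```python
# Hypothetical per-node mappings from local field names to shared field names.
# None of these names come from the CCDI data model or any participating node.
FIELD_MAPS = {
    "node_a": {"dx": "diagnosis", "age": "age_at_diagnosis_years"},
    "node_b": {"diagnosis_name": "diagnosis", "age_dx": "age_at_diagnosis_years"},
}


def harmonize(node: str, record: dict) -> dict:
    """Rename a node's local fields to the shared model, dropping unmapped fields."""
    mapping = FIELD_MAPS[node]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def federated_count(records_by_node: dict, diagnosis: str) -> dict:
    """Count matching records per node after harmonizing each record."""
    return {
        node: sum(
            1 for record in records
            if harmonize(node, record).get("diagnosis") == diagnosis
        )
        for node, records in records_by_node.items()
    }


# Made-up records, each using its node's own local field names.
records_by_node = {
    "node_a": [{"dx": "AML", "age": 7}, {"dx": "ALL", "age": 4}],
    "node_b": [{"diagnosis_name": "AML", "age_dx": 12, "site": "X"}],
}
counts = federated_count(records_by_node, "AML")
```

In this sketch the records never leave their node's structure until query time; only the mapping tables need to be agreed on in advance.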

We have received $722,292 in funding for this project, 100% of which is financed with federal money.

C3DC

We are participating in developing the Childhood Clinical Data Commons (C3DC), a data node of the CCDI that will act as the primary source of individual-level data describing participants’ demographic and clinical characteristics. C3DC will interoperate with other CCDI data type-specific nodes such as genomics, imaging, and proteomics.

We have received $1,039,349 in funding for this project, with an Option Period that would extend it for an additional $823,921. The total anticipated budget for this project is $1,863,270, 100% of which is financed with federal money.

Past Projects

CCDH

The Center for Cancer Data Harmonization (CCDH) was developed to drive the interoperability and accessibility of the data within the NCI Cancer Research Data Commons (CRDC). From 2019 to 2021, D4CG co-led the development of the CCDH alongside four other institutions, with a focus on community development and data model harmonization.

PCDC-H

In 2022 we concluded an 18-month project that established the foundations for integrating the PCDC with the NCI Cancer Research Data Commons (CRDC). By developing a common PCDC-H data model aligned with the NCI Thesaurus and mapping data to it, we made it possible to link PCDC data with data in other CRDC nodes across the country, creating a robust, integrated resource for pediatric cancer research.

DI-Cubed

In partnership with Leidos Biomedical, in 2019 we developed a process for integrating radiology images into a data commons as part of the NCI Data Integration and Imaging Informatics (DI-Cubed) Project. Our pilot project tested the feasibility of linking image data to clinical data in a commons environment and serving this information to researchers in real time, with the INRG Data Commons serving as the paradigm system.