The groundwork for building a disease-specific pediatric cancer data commons was first laid by the INRG Data Commons, which launched in 2014 based on work dating back to 2004, and continued with the development of the INSTRuCT Data Commons in 2017-2018. Building on these successes, we have significantly streamlined the process of creating a new disease commons and have rapidly begun work in additional disease areas, with more to come.

Our Progress

Visual tracker showing progress for eight cancer groups toward the following milestones: Stakeholders Engaged, Data Dictionary Established, Data Contributors Committed, Consortium MOU Signed, Cases in Commons, Analyses in Progress, and Papers Published.

Acute Lymphoblastic Leukemia: stakeholder engagement complete; data dictionary in progress.
Acute Myeloid Leukemia: stakeholder engagement complete; data dictionary V1 complete; 4 data contributors committed; consortium MOU in progress (INTERACT).
Bone Tumors: stakeholder engagement complete; data dictionary in progress; 9 data contributors committed; consortium MOU in progress (HIBiSCus).
Central Nervous System Tumors: stakeholder engagement complete; data dictionary in progress.
Germ Cell Tumors: stakeholder engagement complete; data dictionary V1 established; 8 data contributors committed; consortium MOU complete (MaGIC); 13 analyses in progress; 33 papers published.
Hodgkin Lymphoma: stakeholder engagement complete; data dictionary V1 established; 2 data contributors committed; consortium MOU in progress (NODAL).
Neuroblastoma: stakeholder engagement complete; data dictionary V2 established; 4 data contributors committed; consortium MOU complete (INRG); 22,000 cases in commons; 10 analyses in progress; 17 papers published.
Soft Tissue Sarcoma: stakeholder engagement complete; data dictionary V2 established; 5 data contributors committed; consortium MOU complete (INSTRuCT); 4,600 cases in commons; 7 analyses in progress; 4 papers published.

What do these milestones mean?

Stakeholders Engaged

In some cases, pre-established consortia have approached the PCDC to create a disease-specific commons. In other cases, we work closely with leaders in a specific pediatric cancer type to identify interested cooperative research groups that are willing to form a consortium to guide development of the commons and steer future progress.

Data Dictionary Established

The PCDC data standards team joins a work group of researchers and clinicians with disease-specific knowledge to develop the data dictionary, by which all data in the commons will be standardized. This data dictionary is created using case report forms from existing studies and is balloted and agreed upon by disease group experts and statisticians. The work group also establishes a change control process for future updates.

The number in the progress tracker represents the version of the data dictionary currently in use.

Data Contributors Committed

The initial data in the commons is contributed by the participating cooperative groups, which have already collected clinical trials data from their part of the world in a centralized way. Each cooperative group contributing data to the commons signs a Data Contributor Agreement (DCA), a legal contract between the contributing institution and the University of Chicago.

The number in the progress tracker represents the number of committed data contributors.

Consortium MOU Signed

A Memorandum of Understanding (MOU) is agreed upon and signed by all committed parties, and the consortium is officially formed. An executive committee is established, composed of representatives from each cooperative group and experts from key disciplines. This committee is responsible for strategic planning as well as reviewing and approving new members, contributions of data to the commons, and requests from researchers for access to data. Operating procedures are established, and work groups are created to drive specific areas of work such as governance and data dictionary development.

Cases in Commons

Once the consortium and data dictionary have been established and DCAs have been signed, data can be ingested into the commons. Cooperative groups within the consortium provide mapped and harmonized data from their closed clinical trials. The PCDC team performs quality control checks and integrates each set of contributed data into the commons, where it can be combined and cross-referenced with other data. Our technology team also sets up a cohort discovery tool to make it easier for researchers to explore the commons.

The number in the progress tracker represents the number of cases currently available in the commons.
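
For readers curious what a quality control check against a data dictionary might look like in practice, the following is a minimal sketch only. The field names, permitted values, and function names are hypothetical and chosen for illustration; they are not taken from any PCDC data dictionary or from the PCDC's actual tooling.

```python
# Hypothetical data dictionary: field name -> set of permitted values.
# These fields and values are illustrative assumptions, not the PCDC dictionary.
DATA_DICTIONARY = {
    "sex": {"Male", "Female", "Unknown"},
    "vital_status": {"Alive", "Dead", "Unknown"},
}


def check_record(record, dictionary=DATA_DICTIONARY):
    """Return a list of problems found in one contributed case record."""
    problems = []
    for field, permitted in dictionary.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif record[field] not in permitted:
            problems.append(f"invalid value for {field}: {record[field]!r}")
    return problems


def split_by_quality(records):
    """Separate records that pass every check from those needing review."""
    accepted, needs_review = [], []
    for record in records:
        problems = check_record(record)
        if problems:
            needs_review.append((record, problems))
        else:
            accepted.append(record)
    return accepted, needs_review


if __name__ == "__main__":
    contributed = [
        {"sex": "Female", "vital_status": "Alive"},
        {"sex": "F"},  # not harmonized: wrong code and a missing field
    ]
    accepted, needs_review = split_by_quality(contributed)
    print(len(accepted), "accepted;", len(needs_review), "need review")
```

In this sketch, a record is accepted only when every dictionary field is present and holds a permitted value; anything else is returned with a list of problems so that it can be corrected before integration.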

Analyses in Progress

To receive access to data in the commons, investigators submit a project proposal to the consortium executive committee. If the proposal is approved, the University of Chicago and the requesting institution sign a Data Use Agreement (DUA), a legal document outlining how the data may be used. The relevant dataset is then securely provided to the researcher.

The number in the progress tracker represents the number of projects currently in progress.

Papers Published

Finally, investigators use data from the commons to conduct analyses and publish or present their findings. These discoveries, made from data that might otherwise have sat siloed and unused, can now inform future research and clinical practices such as risk stratification, ultimately leading to better outcomes for children with cancer.

The number in the progress tracker represents the number of published manuscripts resulting from data obtained from the commons.