Data commons can change how the world does research.

A collaborative data-sharing approach has the potential to change the entire landscape of cancer research. As we build the PCDC, our mission includes sharing our methods and lessons learned with the scientific community and assessing further opportunities to advance the state of the art in this field.

Automated Matching of Patients to Clinical Trials: A Patient-Centric Natural Language Processing Approach for Pediatric Leukemia
Kaskovich S, Wyatt KD, Oliwa T, Graglia L, Furner B, Lee J, et al. JCO Clin Cancer Inform. 2023;e2300009.

We discuss the development of an automated tool for processing free-text clinical trial inclusion and exclusion criteria and matching patients to relevant clinical trials. Full text

Creating a data commons: The INternational Soft Tissue SaRcoma ConsorTium (INSTRuCT)
Wyatt KD, Birz S, Hawkins DS, Minard-Colin V, Rodeberg DA, Sparber-Sauer M, et al. Pediatr Blood Cancer. 2022;e29924.

We discuss the genesis, evolution, and progress of INSTRuCT, including challenges and research priorities, the development of the consortium, and how INSTRuCT aims to address key research priorities. Full text

Mapping Pediatric Oncology Clinical Trial Collaborative Groups on the Global Stage
Major A, Palese M, Ermis E, James A, Villarroel M, Klussmann FA, et al. JCO Glob Oncol. 2022;8:e2100266.

We describe pediatric cancer clinical trial groups on the international stage, with the goal of identifying the structure and function of these consortia, as well as the clinical data sources they collect, to reveal opportunities for collaborative efforts within these regions. Full text

Pediatric Cancer Data Commons: Federating and Democratizing Data for Childhood Cancer Research
Plana A, Furner B, Palese M, Dussault N, Birz S, Graglia L, et al. JCO Clin Cancer Inform. 2021;5:1034-1043.

We present our experience constructing the Pediatric Cancer Data Commons to highlight the significance of developing a rich and robust data ecosystem for pediatric oncology and to provide essential information to those creating resources in other disease areas. Full text

Using big data in pediatric oncology: Current applications and future directions
Major A, Cox SM, Volchenboum SL. Sem Oncol. 2020;47(1):56-64.

We discuss the uses of big data in pediatric cancer, existing pediatric cancer registry initiatives and research, the challenges in harmonizing data to improve accessibility for study, and the future opportunities we see for innovation in this area. Full text

Data Commons to Support Pediatric Cancer Research
Volchenboum SL, Cox SM, Heath A, Resnick A, Cohn SL, Grossman R. Am Soc Clin Oncol Educ Book. 2017;37:746–752.

We describe current data commons and how they operate in the oncology landscape, and offer a practical paradigm for developing new commons. By centralizing data, processing power, and tools, there is a valuable opportunity to share resources and thus increase the efficiency, power, and impact of research. Full text

Tailoring Therapy for Children With Neuroblastoma on the Basis of Risk Group Classification: Past, Present, and Future
Liang WH, Federico SM, London WB, Naranjo A, Irwin MS, Volchenboum SL, Cohn SL. JCO Clin Cancer Inform. 2020;4:895-905.

In this review, the authors discuss the history of neuroblastoma risk classification in North America and Europe and highlight efforts by the International Neuroblastoma Risk Group (INRG) Task Force to develop a consensus approach for pretreatment stratification using seven risk criteria including an image-based staging system—the INRG Staging System. Full text

  • Watkins M, Furner B, Li M, et al. Harmonizing Genetic Data Element Modeling Across Cancer Trials. Presented at the 54th Congress of the International Society of Paediatric Oncology; October 2022.
  • Furner B, Graglia L, Sathar S, et al. Genomic Eligibility Algorithm At Relapse For Better Outcomes (GEARBOx): A decision support tool for matching children with relapsed acute myeloid leukemia to clinical trials. Presented at the 53rd Congress of the International Society of Paediatric Oncology; October 2021.
  • Graglia L, Sathar S, Palese M, Furner B, Volchenboum S. The Pediatric Cancer Data Commons: A Demonstration of a Novel Implementation and Extension of the Gen3 Infrastructure for Cohort Discovery and Data Sharing. Presented at AMIA 2021 Virtual Informatics Summit; March 2021.
  • Volchenboum S, Cohn S, Furner B, et al. INRG visualization and analytics platform. Presented at Advances in Neuroblastoma Research; January 2021.
  • Plana A, Palese M, Furner B, et al. The Pediatric Cancer Data Commons: A centralized system for aggregating and sharing pediatric cancer data. Presented at the 52nd Congress of the International Society of Paediatric Oncology; October 2020.
  • Plana A, Furner B, Palese M, Kolb EA, Nichols G, Volchenboum S. Building International Pediatric Cancer Data Commons: The Pediatric Acute Leukemia (PedAL) Initiative. Presented at the 51st Congress of the International Society of Paediatric Oncology; October 2019; Lyon, France.
  • Furner B, Oliwa T, Graglia L, et al. Linking clinical trials data with images via the Pediatric Cancer Data Commons and the National Biomedical Imaging Archive (NBIA). Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.
  • Furner B, Plana A, Palese M, Nichols G, Kolb EA, Volchenboum S. The Pediatric Acute Leukemia (PedAL) Initiative – an innovative platform for real-time matching of children with relapsed AML to early-phase clinical trials. Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.
  • Plana A, Furner B, Birz S, Palese M, Volchenboum S. A Synthesized Common Data Model and Data Standards for the University of Chicago’s Pediatric Cancer Data Commons. Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.
  • Plana A, Furner B, Palese M, Birz S, Hawkins DS, Volchenboum S. Rapid consensus building and development of the International Soft Tissue Sarcoma Consortium (INSTRuCT) data commons. Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.

Transforming the Way Researchers Share Data
Presented by Sam Volchenboum, MD, PhD

April 2022

FAQ for Researchers

How can researchers get data from the PCDC?

Users can explore and request data via the PCDC Data Portal.

For detailed information about what data are available and how to request a dataset, please see PCDC Data Access and Governance.

How can researchers contribute data to the PCDC?

Contributing data to the PCDC

Collaborators interested in contributing data for a disease area that is already part of the PCDC may contact our team to discuss key governance and data considerations as well as estimated project timelines. If approved, the process of creating, updating, and executing contracts, data sharing agreements, and Memoranda of Understanding (as applicable) typically takes one to six months, depending on the country, consortium, legal teams, and type(s) of data involved. After key governance processes have been completed, the collaborator works with PCDC developers to discuss data harmonization and integration into the commons and to resolve harmonization and quality control concerns. Depending on several factors, this harmonization and integration process often takes roughly one to six months. Adding data to the commons may further depend on funding to support the transformation of data into a common data model.

Adding new cancers to the PCDC

Over time, we plan to extend the PCDC to include additional pediatric cancers. Collaborators interested in creating a data commons for a new cancer should contact the PCDC team to learn more about the key steps, timeline, costs, and sustainability considerations. Time from initial discussions to the establishment of a disease-group consortium and creation of a Memorandum of Understanding (MOU) may range from three months to a year. Next, data use agreements (DUAs) are created, a data dictionary is established, and consortium leaders establish working groups to drive ongoing development and productivity. Once DUAs have been signed and the data dictionary has been established, PCDC developers can begin to map and harmonize the data and perform quality control checks. While the PCDC has been able to streamline the process of commons development, the ultimate pace at which a commons is developed and launched depends heavily on the frequency and intensity of involvement from collaborators, the number and diversity of consortium members involved, and the amount of initial funding available. With substantial dedication from consortium leaders, the process from MOU sign-off to commons launch can be completed in six months to a year. Additional commons development, consortium meetings, and data integration continue at regular intervals from this point onward.

Can individual researchers or sites contribute data to the PCDC?

To date, the PCDC has only accepted data from multi-site cooperative groups or, in some cases, nationwide research organizations. Because of the time, resources, and effort involved in integrating each entity into the commons, the PCDC primarily engages with cooperative research groups. If you are an individual researcher or research site interested in joining the PCDC, please contact our team to discuss further.

What quality assurance and harmonization measures are taken when data are added to the commons?

Each data source is vetted for policies and procedures that help ensure high-quality data. In addition, the PCDC runs a series of QA/QC scripts to confirm the internal consistency of the data. Our team works closely with data managers and stakeholders from all groups to help with the QC processes.
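
For readers unfamiliar with internal-consistency checks, the following is a minimal illustrative sketch in Python, not the PCDC's actual QA/QC scripts. The field names (subject_id, date_of_birth, date_of_diagnosis, date_of_last_follow_up, vital_status, date_of_death) and the rules themselves are assumptions chosen for the example and are not taken from the PCDC data dictionary.

```python
from datetime import date

# Hypothetical record layout: the field names below are illustrative
# assumptions, not the PCDC data dictionary.


def check_record(record):
    """Return a list of internal-consistency problems found in one record."""
    problems = []
    birth = record.get("date_of_birth")
    diagnosis = record.get("date_of_diagnosis")
    last_seen = record.get("date_of_last_follow_up")

    # Dates must be present and in a plausible order.
    if birth is None or diagnosis is None:
        problems.append("missing required date")
    elif diagnosis < birth:
        problems.append("diagnosis precedes birth")

    if diagnosis is not None and last_seen is not None and last_seen < diagnosis:
        problems.append("follow-up precedes diagnosis")

    # Related fields must agree with each other.
    if record.get("vital_status") == "dead" and record.get("date_of_death") is None:
        problems.append("death recorded without a date")

    return problems


def check_dataset(records):
    """Map each failing subject_id to its problems, flagging duplicate IDs."""
    report = {}
    seen = set()
    for record in records:
        subject_id = record.get("subject_id")
        problems = check_record(record)
        if subject_id in seen:
            problems.append("duplicate subject_id")
        seen.add(subject_id)
        if problems:
            report.setdefault(subject_id, []).extend(problems)
    return report
```

Calling check_dataset on a list of records returns an empty dictionary when no rule is violated. A non-empty report is the kind of output a data manager and a developer could review together during quality control.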

How are PCDC consortia formed?

In some cases, pre-established consortia have approached the PCDC to create a disease-specific commons. In other cases, the PCDC works closely with leaders in a specific pediatric cancer type to identify interested cooperative research groups who are willing to form a consortium to guide commons development and steer future progress. This process may take months of emails, calls, and occasional international face-to-face meetings. Eventually, all committed parties sign a Memorandum of Understanding and establish the necessary contracts, data sharing agreements, and governance mechanisms to guide the consortium moving forward.