Data commons can change how the world does research.

A collaborative data-sharing approach has the potential to change the entire landscape of cancer research. As we build the PCDC, our mission includes sharing our methods and lessons learned with the scientific community and assessing further opportunities to advance the state of the art in this field.

Using big data in pediatric oncology: Current applications and future directions
Major A, Cox SM, Volchenboum SL. Semin Oncol. 2020;47(1):56-64.

We discuss the uses of big data in pediatric cancer, existing pediatric cancer registry initiatives and research, the challenges in harmonizing data to improve accessibility for study, and the future opportunities we see for innovation in this area.

Data Commons to Support Pediatric Cancer Research
Volchenboum SL, Cox SM, Heath A, Resnick A, Cohn SL, Grossman R. Am Soc Clin Oncol Educ Book. 2017;37:746-752.

We describe current data commons and how they operate in the oncology landscape, and offer a practical paradigm for developing new commons. By centralizing data, processing power, and tools, there is a valuable opportunity to share resources and thus increase the efficiency, power, and impact of research.

Tailoring Therapy for Children With Neuroblastoma on the Basis of Risk Group Classification: Past, Present, and Future
Liang WH, Federico SM, London WB, Naranjo A, Irwin MS, Volchenboum SL, Cohn SL. JCO Clin Cancer Inform. 2020;4:895-905.

In this review, the authors discuss the history of neuroblastoma risk classification in North America and Europe and highlight efforts by the International Neuroblastoma Risk Group (INRG) Task Force to develop a consensus approach for pretreatment stratification using seven risk criteria, including an image-based staging system (the INRG Staging System).

  • Graglia L, Sathar S, Palese M, Furner B, Volchenboum S. The Pediatric Cancer Data Commons: A Demonstration of a Novel Implementation and Extension of the Gen3 Infrastructure for Cohort Discovery and Data Sharing. Presented at AMIA 2021 Virtual Informatics Summit; March 2021.
  • Volchenboum S, Cohn S, Furner B, et al. INRG visualization and analytics platform. Presented at Advances in Neuroblastoma Research; January 2021.
  • Plana A, Palese M, Furner B, et al. The Pediatric Cancer Data Commons: A centralized system for aggregating and sharing pediatric cancer data. Presented at the 52nd Congress of the International Society of Pediatric Oncology; October 2020.
  • Plana A, Furner B, Palese M, Kolb EA, Nichols G, Volchenboum S. Building International Pediatric Cancer Data Commons: The Pediatric Acute Leukemia (PedAL) Initiative. Presented at the 51st Congress of the International Society of Pediatric Oncology; October 2019; Lyon, France.
  • Furner B, Oliwa T, Graglia L, et al. Linking clinical trials data with images via the Pediatric Cancer Data Commons and the National Biomedical Imaging Archive (NBIA). Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.
  • Furner B, Plana A, Palese M, Nichols G, Kolb EA, Volchenboum S. The Pediatric Acute Leukemia (PedAL) Initiative – an innovative platform for real-time matching of children with relapsed AML to early-phase clinical trials. Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.
  • Plana A, Furner B, Birz S, Palese M, Volchenboum S. A Synthesized Common Data Model and Data Standards for the University of Chicago’s Pediatric Cancer Data Commons. Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.
  • Plana A, Furner B, Palese M, Birz S, Hawkins DS, Volchenboum S. Rapid consensus building and development of the International Soft Tissue Sarcoma Consortium (INSTRuCT) data commons. Presented at Childhood Cancer Data Initiative Symposium; July 2019; Washington, DC.

Transforming the Way Researchers Share Data: Lessons from the Pediatric Cancer Data Commons
Presented by Sam Volchenboum, MD, PhD

University of Chicago Department of Pediatrics Grand Rounds, November 2020

FAQ for Researchers

How can researchers get data from the PCDC?

The PCDC is designed to protect the data of research subjects while maximizing the benefit to researchers. Any researcher can freely register to use the PCDC. Once registered, they can view aggregate counts, and soon visualizations, within the commons platform. Line-level data sets can be requested through the commons portals; access is governed by the consortium that contributed the data. After a request is submitted, the executive committee of the relevant consortium approves or denies it, and PCDC staff follow up with the researcher accordingly.

Can researchers get data from the PCDC across multiple disease groups (e.g., survival data for a genomic finding found in both liquid and solid tumors)?

Yes, with the approval of the executive committee of each relevant disease group consortium. This approval process will be as streamlined as possible: under the governance plan now being developed and refined, such project requests will come through the PCDC Executive Committee but will still require approval from the individual disease group executive committees. While this may appear onerous, it is the only way to ensure that each disease group retains its autonomy in deciding how its data are used for research. Thus far, the disease consortia have supported this vision.

How do collaborators join the PCDC Consortium?

Contributing data to the PCDC

Collaborators interested in contributing data for a disease area that is already part of the PCDC may contact our team to discuss key governance and data considerations as well as estimated project timelines. If approved, the process of creating, updating, and executing contracts, data sharing agreements, and Memoranda of Understanding (as applicable) typically takes one to six months, depending on the country, consortium, legal teams, and type(s) of data involved. After key governance processes have been completed, the collaborator works with PCDC developers to plan data harmonization and integration into the commons and to resolve harmonization and quality control concerns. This harmonization and integration process often takes another one to six months, depending on several factors. Adding data to the commons may further depend on funding to support the transformation of data into a common data model.
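As a rough illustration of what transforming contributed data into a common data model can involve, the sketch below renames source fields and translates local codes into shared terms. It is a minimal sketch only: the field names, code lists, and the `harmonize` helper are hypothetical and are not taken from the PCDC data dictionary or tooling.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from one contributor's field names to common-model names.
FIELD_MAP = {
    "pt_id": "subject_id",
    "sex": "sex",
    "dx_age_days": "age_at_diagnosis_days",
}

# Hypothetical mapping from local codes to common-model permissible values.
VALUE_MAP = {
    "sex": {"M": "Male", "F": "Female", "1": "Male", "2": "Female"},
}


def harmonize(
    source_record: Dict[str, Any],
    field_map: Dict[str, str],
    value_map: Dict[str, Dict[str, str]],
) -> Tuple[Dict[str, Any], List[str]]:
    """Map one source record onto the common model.

    Returns the harmonized record plus a list of fields or values that
    could not be mapped and need review by a data manager.
    """
    harmonized: Dict[str, Any] = {}
    unmapped: List[str] = []
    for src_field, value in source_record.items():
        target = field_map.get(src_field)
        if target is None:
            # No agreed target field: flag it rather than guess.
            unmapped.append(src_field)
            continue
        codes = value_map.get(target)
        if codes is not None:
            key = str(value)
            if key not in codes:
                # Unknown local code: flag the field and value for review.
                unmapped.append(f"{src_field}={value!r}")
                continue
            value = codes[key]
        harmonized[target] = value
    return harmonized, unmapped


record = {"pt_id": "A-001", "sex": "1", "dx_age_days": 730, "site_note": "x"}
clean, needs_review = harmonize(record, FIELD_MAP, VALUE_MAP)
```

Collecting unmapped fields and codes, instead of silently dropping them, reflects the back-and-forth between collaborators and developers described above.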

Adding new cancers to the PCDC

Over time we plan to extend the PCDC to include additional pediatric cancers. Collaborators interested in creating a data commons for a new cancer should contact the PCDC team to learn more about the key steps, timeline, costs, and sustainability considerations. Time from initial discussions to the establishment of a disease-group consortium and creation of a Memorandum of Understanding (MOU) may range from three months to a year. Next, data use agreements (DUAs) are created, a data dictionary is established, and consortium leaders establish working groups to drive ongoing development and productivity. Once DUAs have been signed and the data dictionary has been established, PCDC developers can begin to map and harmonize the data and perform quality control checks. While the PCDC has been able to streamline the process of commons development, the ultimate pace at which a commons is developed and launched depends heavily on the frequency and intensity of involvement from collaborators, the number and diversity of consortium members involved, and the amount of initial funding available. With substantial dedication from consortium leaders, the process from MOU sign-off to commons launch may take as few as six to twelve months. Additional commons development, consortium meetings, and data integration continue at regular intervals from this point onward.

Can individual researchers or sites contribute data to the PCDC?

To date, the PCDC has accepted data only from multi-site cooperative groups or, in some cases, nationwide research organizations. Because of the time, resources, and effort involved in integrating each entity into the commons, the PCDC primarily engages with cooperative research groups. If you are an individual researcher or research site interested in joining the PCDC, please contact our team to discuss further.

What quality assurance and harmonization measures are taken when data are added to the commons?

Quality assurance is central to the PCDC. Each data source is vetted for policies and procedures that help ensure high quality. In addition, the PCDC runs a series of QA/QC scripts to confirm the internal consistency of the data. Our team works closely with data managers and stakeholders from all groups to help with the QC processes.
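To give a sense of what an internal-consistency check can look like, here is a minimal sketch. The field names (`subject_id`, `diagnosis_date`, `last_followup_date`, `vital_status`, `death_date`) and the specific rules are assumptions chosen for illustration; they are not the PCDC's actual QA/QC scripts.

```python
from datetime import date
from typing import Any, Dict, Iterable, List


def check_record(rec: Dict[str, Any]) -> List[str]:
    """Return the internal-consistency problems found in one record."""
    problems: List[str] = []
    dx = rec.get("diagnosis_date")
    last = rec.get("last_followup_date")
    status = rec.get("vital_status")
    death = rec.get("death_date")

    if dx is None:
        problems.append("missing diagnosis_date")
    if dx is not None and last is not None and last < dx:
        problems.append("last_followup_date precedes diagnosis_date")
    if status == "Dead" and death is None:
        problems.append("vital_status is Dead but death_date is missing")
    if status == "Alive" and death is not None:
        problems.append("vital_status is Alive but death_date is present")
    return problems


def check_dataset(records: Iterable[Dict[str, Any]]) -> Dict[Any, List[str]]:
    """Run record-level checks and flag duplicate subject identifiers."""
    report: Dict[Any, List[str]] = {}
    seen = set()
    for rec in records:
        sid = rec.get("subject_id")
        problems = check_record(rec)
        if sid in seen:
            problems.append("duplicate subject_id")
        seen.add(sid)
        if problems:
            report.setdefault(sid, []).extend(problems)
    return report


records = [
    {"subject_id": "S1", "diagnosis_date": date(2015, 3, 1),
     "last_followup_date": date(2019, 6, 1),
     "vital_status": "Alive", "death_date": None},
    {"subject_id": "S2", "diagnosis_date": date(2016, 5, 1),
     "last_followup_date": date(2016, 1, 1),
     "vital_status": "Dead", "death_date": None},
]
report = check_dataset(records)
```

In this example only `S2` appears in `report`, because its follow-up date precedes its diagnosis date and its death date is missing. A report keyed by subject is one convenient shape for returning findings to a contributing group's data managers.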

How are PCDC consortia formed?

In some cases, pre-established consortia have approached the PCDC to create a disease-specific commons. In other cases, the PCDC works closely with leaders in a specific pediatric cancer type to identify interested cooperative research groups who are willing to form a consortium to guide commons development and steer future progress. This process may take months of emails, calls, and occasional international face-to-face meetings. Eventually, all committed parties sign a Memorandum of Understanding and establish the necessary contracts, data sharing agreements, and governance mechanisms to guide the consortium moving forward.

Will non-pediatric cancer disease groups be included in the PCDC?

At the moment, the PCDC is focused on pediatric cancers. We recognize, however, that this approach to building commons is applicable across many other disease groups, especially other rare diseases. We encourage researchers working in other cancers or in rare disease specialties to contact us to discuss future opportunities and solutions.