Volchenboum Lab is housed within the University of Chicago Department of Pediatrics, where our director Dr. Sam Volchenboum is a pediatric oncologist and researcher. Our flagship project, the Pediatric Cancer Data Commons (PCDC), began as an initiative of the University of Chicago Center for Research Informatics (CRI), which was directed by Dr. Volchenboum from 2013 through early 2019. The CRI has deep expertise in data harmonization, standardization, and governance, as well as experience building web-based tools to search, access, and share data between institutions. Our lab continues to collaborate closely with experts in the CRI, and PCDC data is stored on their secure, HIPAA-compliant, high-performance data infrastructure.

Building the PCDC

The origins of the PCDC lie in the INRG Data Commons, now the largest collection of neuroblastoma patient clinical data in the world. In collaboration with the INRG Consortium, the CRI launched the public INRG Data Commons and cohort discovery tool in 2013, leveraging consortium-building, regulatory, and data harmonization work that had been under way since 2004. The success of this project, which has resulted in more than 25 high-impact publications, led Dr. Volchenboum to explore expanding these methods to new disease areas.

The INRG Consortium and Data Commons, as well as other tracking and analysis platforms developed by the CRI, paved the way for rapid development of additional data commons. INSTRuCT, the second major initiative to become part of the PCDC, is an international consortium and data commons for clinical data related to soft-tissue sarcomas. Work on INSTRuCT began in 2017, and the data commons and cohort discovery tool launched in 2018 with funding from the Rally Foundation for Childhood Cancer Research.

While growing the INRG and establishing INSTRuCT, we also began to partner with existing research groups to apply the PCDC methodology to additional forms of pediatric cancer. In 2018, we began to work with acute myeloid leukemia data by establishing a partnership with the Leukemia and Lymphoma Society to develop the infrastructure for the Pediatric Acute Leukemia (PedAL) Initiative. In 2019, we partnered with the MaGIC consortium to update and enhance a data dictionary for germ cell tumors and transfer their consortium data to the PCDC, and we began similar data dictionary work with other groups addressing acute lymphoblastic leukemia, Ewing sarcoma, osteosarcoma, and Hodgkin lymphoma.

In autumn 2019, thanks to funding from St. Baldrick’s Foundation, we began bringing together all of this related work to form the Pediatric Cancer Data Commons Consortium. The PCDC Consortium is now developing a common core data dictionary and common governance structure spanning seven pediatric cancers: neuroblastoma, soft-tissue sarcoma, acute myeloid leukemia, acute lymphoblastic leukemia, germ cell tumors, bone tumors, and Hodgkin lymphoma. This work will enable innovative cross-disease research as well as set a standard for future cancer data commons endeavors.

Cancer Data and Research Ecosystem Projects

Alongside our core work of building and growing the PCDC, Volchenboum Lab takes pride in applying our expertise in data standards and infrastructure to nationwide efforts to make cancer data more accessible and impactful. In 2018, we partnered with the NIH-funded Gabriella Miller Kids First Pediatric Research Program to build the infrastructure for their large-scale pediatric cancer and structural birth defect data resource. In 2019, we received NIH funding to help lead the development of the Center for Cancer Data Harmonization alongside Oregon State University, Oregon Health & Science University, Johns Hopkins University, and the University of North Carolina.


INRG Task Force formed


INRG database and cohort discovery tool launch


INSTRuCT formed; work begins on database and cohort discovery tool


CRI/Volchenboum Lab begin work on Gabriella Miller Kids First Pediatric Research Program

Volchenboum Lab pilots integration of radiology images into the INRG Data Commons with funding from NIH

Work begins on in-browser data visualizations for INRG Data Commons

With funding from Rally Foundation, INSTRuCT database and cohort discovery tool are completed for internal launch to INSTRuCT Executive Committee

Leukemia and Lymphoma Society Pediatric Acute Leukemia (PedAL) Initiative data infrastructure collaboration begins


Partnership established with the MaGIC consortium to include germ cell tumors in the PCDC

AML data dictionary and data commons work expands to include European partners

Volchenboum Lab receives funding from St. Baldrick’s Foundation to create the Pediatric Cancer Data Commons Consortium

First PCDC All-Commons Workshop is held at the SIOP Conference in Lyon

Data dictionary work begins for ALL, Ewing sarcoma, osteosarcoma, and Hodgkin lymphoma

Volchenboum Lab receives NIH funding to create the Center for Cancer Data Harmonization with partner institutions

AML data dictionary complete