About Us

The Pediatric Cancer Data Commons (PCDC) is building the future of pediatric cancer research.

The PCDC is headquartered at the Center for Research Informatics (CRI) at the University of Chicago and works in collaboration with research institutions and consortia throughout North America, Europe, and Japan. We bring together clinical, genomic, and imaging data from institutions around the world that are working alongside us to transform pediatric cancer research and outcomes. With other international leaders, we are working to develop and apply uniform data standards so that data collected from different sources can be combined, compared, and analyzed.

By harmonizing existing clinical research data and leading international efforts to standardize data collection, we are breaking down long-standing barriers that have held back advancements in research on rare diseases. Our aim is to enable new and meaningful discoveries about pediatric cancers that could not occur without a collaborative, consortium-based approach.

Why a data commons?

Cancer researchers are poised to make new discoveries thanks to a revolution in our ability to collect and analyze high-density genomic data. However, significant advancement in pediatric cancer research continues to be limited by the small number of patients in any given data source and a lack of available clinical data to study alongside genomic information. The rarity of pediatric cancers makes acquisition of sufficiently sized data sets a particular challenge for researchers. Data remain siloed in disease-specific databases, national and international registries, and institutional data warehouses, and data sharing can be cumbersome or even impossible due to a lack of common data standards.

Our Pediatric Cancer Data Commons offers a powerful solution by centralizing data from many disparate sources. The adoption of common data standards makes it possible for data to be shared easily and for different types of data — clinical, biospecimen, imaging, and genomic — to be connected and searchable in real time. An endeavor of this magnitude requires robust international partnerships, expertise in data standardization, close attention to data governance and regulatory requirements, and the ability to build and deploy effective tools for querying, accessing, and visualizing large amounts of data. The PCDC leverages University of Chicago expertise in clinical cancer research and the CRI’s robust data infrastructure and cutting-edge skills in application development to challenge how researchers think about the limitations of pediatric cancer data.

By co-locating previously siloed data types and sources, the PCDC is vastly improving the ability of researchers to study the causes, treatments, and long-term effects of pediatric cancer and to make meaningful contributions to scientific literature and clinical practice. Eventually, when consensus data standards are adopted by the medical community and applied prospectively as new trials are developed, the available data will be more complete and robust, catalyzing further pediatric cancer discoveries.

The PCDC Team

Sam Volchenboum, MD, PhD

Principal Investigator, CRI Director, Pediatric Oncologist


Monica Palese, MPH

Program Manager


Jian Tian

Senior Programmer


Julie Johnson, PhD, MPH, RN

Healthcare Business Analyst


Brian Furner, MS

Director of Applications Development


Luca Graglia, MS

Senior Programmer


Suzi Birz, MScMI

Regulatory and Data Governance Consultant


Alejandro Plana

Clinical Data Analyst, 4th-year Medical Student


Our History

The PCDC is housed within the University of Chicago’s Center for Research Informatics (CRI). Founded in 2011 and directed by pediatric oncologist Dr. Samuel Volchenboum, the CRI has deep expertise in data harmonization, standardization, and governance, as well as experience building innovative web-based technical tools to search, access, and share data between institutions. Notably, the CRI has built the University of Chicago Clinical Research Data Warehouse (2012); SIMPL, a molecular pathology specimen tracking system (2017); and the tracking and analysis platforms that underpin the multi-institutional Genomic Assessment Improves Novel Therapy (GAIN) Consortium study (2015-2017). The CRI maintains a secure, HIPAA-compliant, high-performance data infrastructure that is used for PCDC data.

The origins of the PCDC lie in the INRG Data Commons, now the largest collection of neuroblastoma patient information in the world. In collaboration with the INRG Consortium, the CRI launched the public INRG Data Commons and cohort discovery tool in 2013, leveraging consortium-building, regulatory, and data harmonization work that has been ongoing since 2004. The success of this project, which has resulted in more than 25 high-impact publications, led Dr. Volchenboum to explore expanding these methods across new disease areas.

The INRG Consortium and Data Commons, as well as SIMPL and GAIN, paved the way for rapid development of additional data commons. The PCDC’s second major initiative, INSTRuCT, is an international consortium and data commons for clinical data related to soft-tissue sarcomas. Work on INSTRuCT began in 2017, and the data commons and cohort discovery tool launched in 2018. Also in 2018, the PCDC established partnerships with the Leukemia and Lymphoma Society and with the NIH-funded Gabriella Miller Kids First Pediatric Research Program, for which it is building the infrastructure for a large-scale pediatric cancer and structural birth defect data resource.


INRG working group formed


University of Chicago Center for Research Informatics (CRI) founded


CRI’s Clinical Research Data Warehouse launches


INRG database and cohort discovery tool are developed


Work on SIMPL and GAIN begins


GAINCAST (GAIN tracking platform) launches


INSTRuCT formed, database and cohort discovery work begins
GAIN iCAT (GAIN analysis platform) launches
SIMPL launches


CRI work on the Gabriella Miller Kids First data commons begins
In partnership with Leidos, integration of radiology images into the INRG begins
INSTRuCT database and cohort discovery tool launch
LLS data infrastructure collaboration begins