Data for the Common Good (D4CG) is in the midst of our second year as a mentoring organization for Google Summer of Code (GSoC). After a successful inaugural year with the program last summer, we are excited to continue fostering the next generation of developers as they contribute to enhancing our projects and tools.
Google Summer of Code is a global, online program that connects students and new developers with open source organizations for 12+ week programming projects. For organizations like D4CG, the program is an opportunity to explore new development projects while providing valuable mentorship to emerging developers. The high level of interest in D4CG’s projects among GSoC applicants demonstrates the growing appeal of our mission-driven approach to technology: in 2024, our first year participating, we received 242 proposals and were allocated four contributor slots instead of the typical three for new organizations. This year, interest grew to 365 proposals, and we are again working with four contributors.
Our first year
In summer 2024, GSoC contributors Luwei Wang, Wenbo Qian, Zhen Jiang, and Paarth Agarwal worked with D4CG mentors on diverse projects, including:
- A configurable demographic table (Table 1) tool for the PCDC Data Portal
- A pipeline to automate testing procedures to support quality assurance and integration efforts
- A configuration file to improve the user-friendliness of the GEARBOx clinical trial matching platform
These 2024 GSoC projects provided valuable contributions to our technical infrastructure as well as meaningful open source development experience for our contributors.
This summer’s projects
Now, our 2025 contributors are working on a new set of exciting projects to enhance a variety of our healthcare and data analysis tools:

Developing Custom Jupyter Notebooks for AVRO File Processing and QA/QC Analysis
Contributor: Tushar Jamdade | Mentor: Steve Krasinsky
This project aims to create a custom Jupyter notebook that allows users to efficiently process and analyze the AVRO files provided by the PCDC Data Portal. It will include features for data retrieval from various sources, file format validation, schema verification, and anomaly detection. With an interactive, user-friendly interface, the notebook is intended to simplify AVRO file handling for both technical and non-technical users.
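To give a sense of what the file format validation step could involve, here is a minimal sketch, not the project's actual code. It relies on one fact from the Avro specification: an Avro object container file starts with the four bytes `Obj` followed by the byte 1. The function name `looks_like_avro` and the sample byte strings are hypothetical and exist only for illustration.

```python
import io
from typing import BinaryIO

# Per the Avro specification, object container files begin with these 4 bytes.
AVRO_MAGIC = b"Obj\x01"


def looks_like_avro(stream: BinaryIO) -> bool:
    """Return True if the stream starts with the Avro container magic bytes.

    This only checks the file header; it does not validate the schema or data.
    """
    return stream.read(len(AVRO_MAGIC)) == AVRO_MAGIC


# Hypothetical in-memory samples standing in for real files.
avro_like = looks_like_avro(io.BytesIO(AVRO_MAGIC + b"header and blocks"))
csv_like = looks_like_avro(io.BytesIO(b"id,name\n1,example\n"))

print("Avro-like sample accepted:", avro_like)
print("CSV sample accepted:", csv_like)
```

A check like this is cheap to run before handing a file to a full Avro reader, which would then handle schema verification and record-level analysis.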

Enhancing SMART-on-FHIR Frontend for Patient-Controlled Data Sharing
Contributor: Aryan Kumar | Mentor: Tianyun Zhang
This project aims to improve the front-end user interface for a SMART-on-FHIR application that enables patients to access, manage, and share their electronic health records (EHRs). The focus is on making the UI more intuitive and accessible, so that users have a seamless experience when connecting to the back-end service.

Enhancing the Cohort Discovery Chatbot
Contributor: Regina Huang | Mentor: Jooho Lee
This project enhances the PCDC cohort discovery tool by developing a large language model (LLM)-based query agent for more flexible and efficient cohort discovery. To prototype this query agent, the project focuses on refining natural language understanding (NLU), improving query accuracy, supporting more complex filters, and integrating feedback mechanisms to learn from user interactions.
2024 contributor Wenbo Qian is also contributing to this project.

Developing a FHIR Resource Tabular Viewer for Efficient Data Exploration
Contributor: Manjula Kudapa | Mentor: Paul Murdoch
This project aims to build a FHIR Resource Tabular Viewer, an application that transforms complex, nested FHIR data structures into an easy-to-navigate tabular format. This tool will allow users such as researchers, clinicians, and developers to efficiently search, filter, and analyze FHIR resources, improving the accessibility and usability of healthcare data.
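As a rough illustration of the nested-to-tabular idea, the sketch below flattens a FHIR-style JSON resource into a single row keyed by dotted paths. It is an assumption about one possible approach, not the viewer's actual design; the function name `flatten_resource` and the hand-written Patient example are hypothetical.

```python
from typing import Any


def flatten_resource(resource: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten a nested dict into one row whose keys are dotted paths.

    Nested objects extend the path with ".key"; list items add "[index]".
    """
    row: dict[str, Any] = {}
    for key, value in resource.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten_resource(value, path))
        elif isinstance(value, list):
            for index, item in enumerate(value):
                item_path = f"{path}[{index}]"
                if isinstance(item, dict):
                    row.update(flatten_resource(item, item_path))
                else:
                    row[item_path] = item
        else:
            row[path] = value
    return row


# Hypothetical, hand-written Patient resource used only for illustration.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Chalmers", "given": ["Peter", "James"]}],
    "birthDate": "1974-12-25",
}

row = flatten_resource(patient)
for column, cell in row.items():
    print(f"{column}: {cell}")
```

Each flattened path (for example `name[0].family`) can serve as a column header, so a collection of resources becomes a table that is straightforward to search and filter.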
As the work of our contributors and mentors progresses this summer, we are grateful to be part of the GSoC program. In this mutually beneficial relationship, we are able to explore promising initiatives like the ones above while contributors gain hands-on experience with meaningful, real-world projects that have direct applications in healthcare and research. By investing in mentorship, we have the opportunity to help build the next generation of developers who will work together for the common good.