Graham School News

Health Informatics Students Analyze Data to Optimize Disease Management in Capstone Project

Philip Baker

Erika Lin and Trevor Peters built their capstone project around a shared interest: complementing their biology and laboratory experience with the ability to analyze large amounts of data and draw conclusions through coding. The pair used clinical trial data in their Master of Science in Biomedical Informatics capstone project to conduct a patient-level meta-analysis comparing the efficacy of combination therapy relative to monotherapy in the treatment of inflammatory bowel disease (IBD).

With backgrounds spent predominantly in wet labs, the data-focused dry lab scenario provided them with a new experience as well as the challenge of putting their newly acquired coding skills to the test. 

“That’s exactly what Trevor and I came to the BMI program to learn,” said Erika, who saw the BMI program as a great way to acquire valuable new skill sets before moving on to medical school. “We both have primarily biology backgrounds, so needing to use R to draw conclusions from the data forced us to get up to speed with it really fast.”

“It’s the best way to learn it,” added Trevor, who had no prior experience with R—a statistical language and computational environment—before their capstone began. “In certain ways, it’s like learning a spoken language. Being immersed in an environment where you need to use it means you have no choice but to learn it.”

To assist them with the clinical side of their project, Erika and Trevor were introduced to Atsushi Sakuraba, MD, PhD, a gastroenterologist and assistant professor at UChicago Medicine who specializes in the diagnosis and management of IBD. His expertise was pivotal as they moved forward on their capstone project, which involved assessing whether the data from eight clinical trials showed better outcomes in reducing IBD-related inflammation with combination therapy or with monotherapy.

Specifically, their research centered on whether using a higher dosage of the immunomodulator azathioprine in combination with infliximab, a therapy commonly administered for the treatment of IBD, was more effective than using infliximab on its own.

“In the beginning we were emailing Dr. Sakuraba constantly,” said Trevor. “Despite his extremely busy schedule, he was great at getting back to us. Plus, his familiarity with similar projects that had reported conflicting results in the past meant he could steer us in the right direction and answer questions we had about the data.”

The Challenges and Rewards of Real-World Data

During the first quarter of the project, Erika and Trevor undertook a literature review of IBD, focusing on the specific areas and therapies their project would cover. Vital during this time was the assistance they received from their science advisor, Matthew Dapas, PhD, who had experience working in the area and whose R coding expertise proved a particular boon.

“By the end of the first quarter we’d put together the proposal for our project,” Erika said. “We worked closely with Matt in laying out our game plan and he was especially helpful when it came to preparing us for the programming challenges up ahead. Getting the R script up and running was one of the biggest challenges of the project for us.”

The data for the eight clinical trials was provided by the Yale Open Data Access (YODA) Project, a data storage platform that makes clinical data readily available to researchers and physicians for research purposes. To access the clinical trial data, they logged on remotely from UChicago computers located at NBC Towers, whose large screens were essential for reviewing the many columns of data. A key hurdle during this time was missing data sets, which led them to contact the YODA Project and ask that it track the data sets down.

“But the missing data sets were only part of the challenge,” said Trevor. “More significant was that the eight clinical trials had different objectives and they organized their data in very different ways, so putting together the data tables was hard work. It was a great lesson in what working with real data entails.”

For Erika, too, this was a surprise and a key learning point that she took away from the project. In fact, in her present position at the Ann & Robert H. Lurie Children’s Hospital, she’s creating a database from the ground up and taking particular care that the variables she selects will be useful to anyone who uses the database in the future.

“That was really a major lesson,” she said. “No data is perfect. It brought home for me the need to think about the big picture when developing a prospective clinical trial. It’s a matter of organizing the data so that it will be most useful for anyone looking to use and understand it down the line.”

Drawing Conclusions and Looking Ahead

Having taken on the major challenge presented by the data during the second quarter, they were not entirely out of the woods once the third quarter began. As the YODA Project continued to track down and send them the missing data sets they’d requested, incorporating the new data at this late stage often came with unsettling consequences for results they believed were certain.

“It actually became a little exhausting,” Trevor said about the experience of writing and re-writing their capstone report to accommodate the new data. “But it was a great lesson, too. That’s what happens in data science work. No matter what you do, the variables and conclusions are liable to change at any time as new data come in. It’s definitely part of the value of the capstone project that you come away with lessons like that.”

In the end, they concluded that a high dosage of azathioprine was associated with more cases of remission than low dosages of azathioprine in combination therapy. However, these results may have a different explanation than the one they’d originally anticipated. Based on previous clinical trials investigating the efficacy of combination therapy, they expected to find that azathioprine would increase blood levels of infliximab. Their results instead suggested that there may be other explanations, which led Erika and Trevor to speculate about their findings as well as possible next steps.

“I believe we used the data we had to its full potential,” Erika said. “As for what the next step could be, ideally a prospective clinical trial could be developed using uniform data capture practices. But securing the necessary funding and time isn’t easy, so, in lieu of that path, tracking down the missing data sets could be another way to yield additional significant results.”