P4C—Challenge Accepted: Understanding How Data Shape Medical Research

November 15, 2016

What are the technical and scientific capabilities of data in life science? How can we incorporate data in medical research in a way that is clinically significant and meaningful? John Wilbanks, Chief Commons Officer of Sage Bionetworks, and his fellow panelists on the “Challenge Accepted: Understanding How Data Shape Medical Research” panel sought to answer these questions. In this highly interactive session, three themes bubbled to the surface.

1. It’s data trends that make health data intriguing.

Wilbanks shared an anecdote from his experience on the Parkinson’s mPower project: “We run a small Parkinson’s study via iPhones and use a lot of these methods to find patterns within data. In looking at collected data alongside patient-input free text, we saw that in groups of patients, words like ‘race relations’ and ‘news’ were affecting daily health information; the culprit was stress.” This trend was much more telling than the variance in empirical data alone.

Environmental data are another source of interesting trends. Russ Altman, professor of bioengineering, genetics, and medicine at Stanford University, described a future in which researchers collect massive data sets by mapping a population, include environmental data, and thereby passively create a longitudinal cohort. With such a well-characterized data set, they can run studies retrospectively, a potentially quick and efficient method. A current example of this type of effort is the newly announced ECHO program, which National Institutes of Health Director Francis Collins highlighted from the audience.

2. Incorporating data scientists and outside thinkers into the medical system is key.

Organizations should involve data scientists early in a research project. As highlighted by Leslie Fine, vice president of data and analytics at Salesforce, “People are bringing in data scientists at the end and are saying: ‘I have a screw, a bolt and a block of wood. Tell me how I can create a car.’ To avoid this, you have to bring them in at the beginning.”

Organizations should involve not only data scientists but also people with other skills, such as pharmacists, to help physicians and patients collect, curate and package medical data, according to Altman. The burden cannot be placed on physicians or patients alone. Physicians need help integrating data and information systems, and patients need new tools. Patients who have genomic or wearables data need tools to put those data in one central place, clean them, and annotate them.

3. We have much to learn.

The personal health data that we have today, generated from sources such as wearable devices, do not always add up. Rachel Kalmar, currently a fellow at Berkman Klein Center for Internet & Society at Harvard University and previously a data scientist at Misfit Wearables, is often asked why step counts from different devices do not match. Every small difference in the setup, including the device used and the sensor and algorithms inside it, affects your personal health data. To extract meaning from these data, it is important to understand their source and to thoughtfully calibrate the questions you ask of them.

Perhaps the greatest source of learning for medical researchers is the consumer world, where companies such as Salesforce can create holistic views of customers. Fine shared that Salesforce is starting to look at people and their health information the way companies look at customers. She said, “When we have a holistic view of people, we have a better characterized population, which can help facilitate things like improved targeting for clinical trials.”

Although the challenges to using data science to augment medical research are great, beginning to understand how to tackle these challenges is an important first step toward leveraging data to accelerate cures.