How Your Electronic Health Records Could Change Genetics Research

A study of Neanderthal genes demonstrates the research power of electronic health records.
Can we finally get rid of these? (Photo: newtown_grafitti/Flickr)

Last month, scientists published yet another study about the modern genetic legacy of prehistoric interbreeding between humans and Neanderthals. The study presented new evidence that genetic variants inherited from Neanderthals affect the health of people today, by increasing their risk for medical conditions ranging from depression to skin lesions. While the story about how our distant ancestors shape who we are today is fascinating, the study's most important result isn't really about Neanderthals.

The study's true significance involves something that may sound mundane, but is, in fact, a powerful new tool in biomedical research: electronic health records. Though not explicitly designed for research, electronic health records promise to solve a tough problem in medical genetics—as long as we figure out how to use them securely and ethically.

Electronic health records are something we typically associate with medical bureaucracy, not cutting-edge science. They're designed to help medical practitioners do a better job of keeping track of a patient's care, thereby making medicine cheaper, safer, and more efficient. But it turns out that electronic health records have another important use as well: They may help solve a critical research problem, one that has hampered our ability to take full advantage of the much-hyped DNA technologies that are supposed to transform 21st-century medicine.

The problem is that the field of medical genetics suffers from a massive data imbalance: Thanks to the plummeting costs of sequencing DNA, we're drowning in genetic data, but there has been no comparable reduction in the cost of gathering large-scale health data. It is now roughly 10,000 times cheaper to sequence a human genome than it was in 2006, and scientists have collected DNA data for tens of thousands of human subjects. But this DNA data is only half of what you need for a successful genetic study.

To discover links between the risk for certain diseases and particular genetic variants or mutations, you also need data about the health of the study participants. But the cost of collecting that health data has not fallen dramatically. This means that, despite stunning advances in DNA technology, it is still very difficult and expensive to carry out the large, statistically powerful studies that will be necessary for learning how our genes affect our health. This is especially true of the very large studies needed to investigate common diseases like diabetes, where complex genetic and environmental factors play a role.

And yet extensive, long-term health data for hundreds of millions of people in the United States is out there. Each time we visit a doctor, take a laboratory test, or stay at a hospital, our medical providers make a record of it. What if that data—with our consent—could be easily passed on to researchers? This is not feasible with traditional, paper-based medical records, but networked electronic health records make this possible. Researchers are already able to collect and analyze DNA samples from tens of thousands of volunteers for a reasonable cost; if those volunteers could also easily share their long-term health data, then very large genetic studies suddenly become much more affordable.

The recent Neanderthal study demonstrates how powerful this approach can be. Earlier studies had suggested that genetic variants inherited from Neanderthals might affect certain health-related traits among certain populations today, but these studies lacked the data to show this conclusively. To get the necessary data, the researchers turned to an existing project—led by scientists at Vanderbilt University—that is piloting the use of electronic medical records in research. The scientists used medical billing codes in the electronic health records of nearly 30,000 volunteers to gather information on more than 1,000 different health conditions. Because these volunteers had also contributed DNA data to the project, the researchers could efficiently look for links between Neanderthal genes and health conditions in their subjects. They found statistically significant associations between the genes and several health conditions, including nicotine addiction, skin lesions, a blood coagulation disorder, depression, and urinary tract problems.

The exact results of this study are mainly of interest to biologists who study the role of Neanderthals in recent human evolution. But the general idea of combining electronic health records with genetic data will soon play a big role in a much, much larger biomedical research project—the current administration's new Precision Medicine Initiative, the centerpiece of which is a giant study cohort of one million people. The National Institutes of Health has announced that it plans to recruit the first 79,000 participants this year. This enormous biomedical research project will necessarily rely heavily on electronic health records to track the long-term health of the participants.

But are we ready to do this? From a scientist's point of view, giving researchers access to electronic health records sounds like a great idea. However, as with any other tool used in research that involves human subjects, we need to ensure that proper safeguards are in place.

Last fall, an advisory committee for the Precision Medicine Initiative noted that "the success and longevity of the [Precision Medicine Initiative] will be heavily influenced by the laws, regulations, and policies surrounding research, data security and privacy, and access and interoperability of [Electronic Health Records]." Right now, those laws, regulations, and policies are still evolving. What security standards will researchers and data repositories need to meet, for example, to avoid the kinds of massive security breaches which hit Target and Home Depot? Though identifying data are typically stripped from health records when they're shared with researchers, we already know that it's possible to identify someone using anonymous genetic data—and that means a security breach of even anonymous electronic health record data could be extremely damaging.

And then there is the issue of who is included in studies that use electronic health records. Biomedical research in the U.S. has a well-known problem of producing studies that focus too heavily on white men. We need to ensure that understudied minority populations, who may lack access to health-care networks with state-of-the-art electronic health records, aren't left out. Thanks to new federal requirements, health-care networks are rapidly transitioning to electronic health records, but differences in the quality or interoperability of electronic health record systems could affect who is included in studies.

For nearly two decades, scientists have been making promises about the great benefits that would flow from new, DNA-focused medical research. But it's been difficult to deliver on those promises, in part because scientists haven't been able to collect health data on the same scale as they generate DNA data. Electronic health records may finally fix that imbalance.