With Great Health Data Comes Great Potential for Bias

Are we creating health management systems that are responsive to differences, or ones that spit back content riddled with historical bias and discrimination?

Technology is putting health data directly in the hands of individuals, allowing them to gain new insight into their bodies and minds. Mobile apps and wearable devices, for instance, collect and share data on everything from a person’s diet and exercise routines to fluctuations in their mood or changes in their medications. Online disease communities, meanwhile, provide support for people navigating complex health concerns. The data derived from these sources, and the content it generates for users, offer a treasure trove of opportunities for improved care and personal wellness.

On the flip side, the volume and sensitivity of health data generated by these apps, devices, and platforms raise important privacy questions about the boundaries of personal space. Equally important, but less researched, are questions about whether the health-related content and recommendations given back to consumers reflect different cultures and communities.

In other words: Are we creating health management systems that are responsive to differences, or ones that spit back content riddled with historical bias and discrimination? This question has implications far beyond an individual’s health; the data generated by consumer-facing health technologies frequently ends up informing broader health practices and public-health research.

Most of the personal health data in the commercial environment isn’t regulated, meaning most of it is fair game for unrestrained collection, use, and sharing. You might not be surprised to hear that companies from Google to Facebook to Apple are actively curating personal health data from a variety of sources, including social media posts, discussions in online support communities, and mobile health apps. But what happens next may be less familiar. This data—along with a wide range of other personal information about consumers’ purchases, income, and behavior, as well as demographic information on race and ethnicity—is then used to develop fine-grained data profiles of individuals. To squeeze further value out of this data, companies then apply automated processes, like data analytics, that harness statistics, algorithms, and other mathematical techniques to convert it into usable knowledge, such as recommendations about dieting or exercise, or to predict (and target advertising based on) potential health conditions.

While statistical analyses like these have become ubiquitous in daily life, the data and technology powering them are often opaque and can produce unexpected results, amplifying historical biases in ways that are difficult to identify or monitor. For instance, predictive analyses of these data sets might estimate the likelihood of depression or chronic illness in your future, which might then be used to determine your eligibility for life insurance. If the underlying data is biased or otherwise flawed, the results are likely to be skewed against already vulnerable or underrepresented groups. The diversity of a population can be lost, generating recommendations that skew “Western, educated, industrialized, rich, and democratic”—or WEIRD.

It’s worrying, then, that bias in this data, reused across multiple studies, might affect not only an individual’s treatment but also our overall understanding of a population’s health. Traditional health-care providers see the benefit of the torrent of data flowing from consumers and are beginning to rely on commercial data and algorithms to personalize treatment in a more granular way. At the doctor’s office, for example, automated systems informed by commercial data sets may lead your provider to advocate for one medication or treatment over another, taking into account your overall characteristics, the costs associated with your care, and potential outcomes. Indeed, the line between traditional medicine and personalized health has become increasingly muddied by the introduction of commercial services into the ecosystem and the co-mingling of private and public health information.

This isn’t to suggest that data is inherently dangerous. The potential for technology from private companies to improve individual- and population-level health is immense, from providing basic incentives for wellness, to tailoring health interventions, to spurring the medical and international aid communities to respond more rapidly to the spread of disease. Even so, this potential can be undermined when data results in disparate outcomes for different populations. The marriage of private data with public health care makes it even more critical to understand how bias is introduced into the commercial health data supply chain.

Unfortunately, there’s no easy way to dig into these processes without more transparency from companies (though not necessarily algorithmic transparency, which is difficult at best and nonsensical at worst). Instead, we need to pull back the curtain on the design, building, implementation, and testing of automated processes. Apple’s introduction of HealthKit, a framework that lets apps collect and share health data in conjunction with its Health app, is a well-known example of how bias can inadvertently influence product design. To the surprise of many, the tech giant didn’t include a way for women to track their menstrual cycles in HealthKit. Though Apple eventually corrected the problem, it never explained how such a basic health metric was overlooked. Companies have both a moral and an economic incentive to help solve these problems: beyond limiting their ability to understand the health of certain populations, inadvertent bias invites accusations that a product or service is automating and mechanizing inequality and health disparities.

When a company with enormous reach decides not to show certain information, changes a policy or practice, or launches a new product or service, the impact is a meteor-sized crater—the meteor just happens to be unseen. Commercial entities operating in the health space ought to be mindful of our multicultural world, the sensitivity of health data, and the possibility that, without checks and balances, they may be reinforcing harmful bias. Adding transparency and other safeguards against bias and discrimination becomes, in this light, more than just a market imperative; it becomes an essential public good.

This story originally appeared in New America’s digital magazine, New America Weekly, a Pacific Standard partner site.
