
Artificial Intelligence Will Be as Biased and Prejudiced as Its Human Creators

The reason is simple: AI learns from us.

By Nathan Collins


(Photo: Jeff J Mitchell/Getty Images)

The optimism around modern technology lies in part in the belief that it’s a democratizing force—one that isn’t bound by the petty biases and prejudices that humans have learned over time. But for artificial intelligence, that’s a false hope, according to new research, and the reason is boneheadedly simple: Just as we learn our biases from the world around us, AI will learn its biases from us.

There was plenty of reason to think AI could be unbiased. Since it’s based on mathematical algorithms, AI doesn’t start off with any explicit preference for white-sounding names or a belief that women should stay at home. To guard against implicit biases—biases that programmers might build into AI without realizing it, sort of like how standardized tests are biased in favor of whites—some have recommended transparent algorithms, more diverse development teams, and so on.

But while such approaches may help, Princeton University computer scientists Aylin Caliskan-Islam, Joanna Bryson, and Arvind Narayanan argue they couldn’t possibly do enough. “We document machine prejudice that derives so fundamentally from human culture that it is not possible to eliminate it through strategies such as the above,” the team writes.

The computer scientists took a fairly novel approach to reach that conclusion: In essence, they gave a computer algorithm a series of implicit association tests (IATs), a now-standard psychological method for assessing racial and gender bias, and in doing so replicated the results of IATs given to real-life humans.

The algorithm of choice was GloVe, a state-of-the-art method for extracting word meanings based on their contexts—that is, how often they appear with other words in a body of text. According to GloVe and related approaches, the more often two words appear in similar contexts, the more closely related they are. For example, if “woman” appears frequently with words related to family, then the concept “woman” is closely related to the concept “family.”
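The idea that words sharing contexts end up "close" to each other can be illustrated with a minimal sketch. This is not GloVe itself, which learns dense vectors from co-occurrence statistics over a very large corpus; it is a simplified stand-in that uses raw co-occurrence counts and cosine similarity. The corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=2):
    """For each word, count how often every other word appears within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy corpus, invented for this sketch (not the data used in the research).
toy_corpus = [
    "the cat sleeps on the sofa",
    "the dog sleeps on the sofa",
    "the car drives on the road",
    "the truck drives on the road",
]

vecs = cooccurrence_vectors(toy_corpus)
sim_same_context = cosine(vecs["cat"], vecs["dog"])   # identical contexts
sim_diff_context = cosine(vecs["cat"], vecs["car"])   # only partly shared contexts
print(sim_same_context > sim_diff_context)
```

In this toy setup, "cat" and "dog" occur in identical contexts and so score higher than "cat" and "car," which share only some neighboring words. A real system would work from billions of words rather than four sentences.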

Using a version of GloVe trained on 840 billion words of text from the Internet, the team set about replicating some of the most famous demonstrations of human prejudices, among them a 2002 experiment that used IATs to reveal, perhaps unsurprisingly, that we tend to associate typically female names with family and typically male names with careers. Caliskan-Islam, Bryson, and Narayanan’s word-similarity approach yielded the same result: Names like Amy and Joan were more similar to family words like home and parents. In contrast, John and Paul were more similar to career words such as corporation and salary.

The researchers also found that white-sounding names were more similar to pleasant words like joy and peace, while African-American-sounding names were more similar to unpleasant words like agony and war, a conceptual replication of a 2004 experiment that showed companies were more likely to hire job candidates with white-sounding names.
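The comparisons described above, in which two groups of target words are checked against two groups of attribute words, can be sketched as a simple difference of average similarities. The article does not spell out the exact statistic the team computed, so the functions `association` and `differential_association` below are an assumed, simplified version of the idea. The two-dimensional vectors are made-up numbers, not real GloVe embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word_vec, attr_a, attr_b):
    """Mean similarity of one word to attribute set A minus its mean similarity to set B."""
    mean_a = sum(cosine(word_vec, a) for a in attr_a) / len(attr_a)
    mean_b = sum(cosine(word_vec, b) for b in attr_b) / len(attr_b)
    return mean_a - mean_b

def differential_association(targets_x, targets_y, attr_a, attr_b):
    """Positive when X leans toward A (and Y toward B); negative for the reverse."""
    score_x = sum(association(x, attr_a, attr_b) for x in targets_x)
    score_y = sum(association(y, attr_a, attr_b) for y in targets_y)
    return score_x - score_y

# Hypothetical 2-D vectors standing in for word embeddings (assumed values).
targets_x = [(0.9, 0.1), (0.8, 0.3)]   # e.g. one group of names
targets_y = [(0.2, 0.9), (0.1, 0.8)]   # e.g. another group of names
attr_a = [(1.0, 0.0), (0.9, 0.2)]      # e.g. one set of attribute words
attr_b = [(0.0, 1.0), (0.2, 0.9)]      # e.g. a contrasting set of attribute words

score = differential_association(targets_x, targets_y, attr_a, attr_b)
print(score > 0)
```

A score near zero would suggest that neither group of target words leans toward either attribute set. With real embeddings, the size of the score would need a significance test before drawing any conclusion.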

It’s important to note that the results don’t mean GloVe or related algorithms are inherently biased—indeed, it’s hard to see how they would be. Instead, it’s our language and culture that’s biased, and as long as that’s true, Caliskan-Islam, Bryson, and Narayanan write, artificial intelligence “can acquire prejudicial biases from training data that reflect historical injustice.”