Reproducibility is one of the cornerstones of modern science. If one scientist can’t reproduce another’s results, then at the very least there’s something in the research worthy of further analysis. Apparently, there’s something fishy going on in the field of psychology: In one new study, researchers were able to replicate fewer than half of the 100 studies placed under scrutiny.
“Reproducibility is a central feature of science,” says Brian Nosek, lead author of the new paper and a professor of psychology at the University of Virginia, but “there’s growing concern that reproducibility may be lower than expected or desired.” Yet, Nosek says, that could be an opportunity.
Fears that something was rotten in academic psychology came to a head in 2012, when (formerly) respected psychologist Daryl Bem claimed to have found evidence, since debunked, that extrasensory perception was real. That same year, Nobel Prize winner Daniel Kahneman, worried that certain social psychology research wasn’t up to snuff, challenged his fellow psychologists to improve their work and, in particular, to make a greater effort to replicate one another’s results.
“We so much want to have clear answers, but science does not provide certainty, at least not immediately.”
Concerns about reproducibility led the Association for Psychological Science to create the Registered Replication Reports project in 2013. That was the same year Nosek launched the Center for Open Science, which organized a gargantuan effort to learn what it would take to replicate past research. Nosek and his team’s results will be published tomorrow in Science.*
In all, 270 researchers attempted to replicate 100 studies culled from three top journals, including the Association for Psychological Science’s flagship Psychological Science. Nosek’s team, the Open Science Collaboration, worked with the original studies’ authors to replicate their experiments as closely as possible, using criteria including statistical significance and effect sizes to determine whether the studies’ results were reproducible.*
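To make those criteria concrete, here is a minimal sketch, in Python, of how a single two-group replication could be scored on two such measures: whether it reaches statistical significance (p < .05) and whether its effect size points in the same direction as the original. The function names and structure are hypothetical; this is not the Open Science Collaboration's actual analysis code.

```python
import numpy as np
from scipy import stats

def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups (Cohen's d)."""
    pooled_sd = np.sqrt((np.var(group_a, ddof=1) + np.var(group_b, ddof=1)) / 2)
    return (np.mean(group_a) - np.mean(group_b)) / pooled_sd

def score_replication(rep_a, rep_b, original_d, alpha=0.05):
    """Simple pass/fail indicators for a hypothetical two-group replication."""
    t_stat, p_value = stats.ttest_ind(rep_a, rep_b)   # significance criterion
    rep_d = cohens_d(rep_a, rep_b)                     # effect-size criterion
    return {
        "significant": p_value < alpha,
        "same_direction": np.sign(rep_d) == np.sign(original_d),
        "replication_d": rep_d,
        "original_d": original_d,
    }
```

In practice the Collaboration used several complementary criteria (including comparing effect sizes and subjective judgments of success), so a sketch like this captures only the flavor of the scoring, not its full detail.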
Nosek says he expected to replicate only about half of the experiments. After all, there’s always the potential for false positives (finding a psychological effect where there is none, through no fault of the researchers) in the original studies, and for false negatives (failing to find a real effect) in the new ones. Accounting for those statistical realities alone, the team calculated it should expect to replicate about 89 of the 100 studies.
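As a rough illustration of that arithmetic, the back-of-the-envelope calculation below shows how imperfect statistical power and a small false-positive rate push the expected count just below 100. The specific rates are assumed for the example, not figures reported by the study.

```python
# Illustrative estimate with assumed numbers (not the study's own):
# how many of 100 replications would reach p < .05 even if everything worked as intended?

n_studies = 100
p_true_effect = 0.95      # assumed share of original findings that reflect real effects
replication_power = 0.93  # assumed chance a replication detects a real effect
alpha = 0.05              # chance a replication is "significant" when no effect exists

expected_hits = n_studies * (
    p_true_effect * replication_power   # true effects successfully detected
    + (1 - p_true_effect) * alpha       # false positives replicating by chance
)
print(round(expected_hits))  # ~89 under these assumptions
```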
The results didn’t look nearly that good. “[W]e were able to reproduce less than half of the original 100 findings,” Nosek says. By the statistical significance criterion, the team managed to replicate just 35 studies, mostly those whose original findings had larger effect sizes and stronger statistical support.
Still, those results don’t mean science is somehow broken. “We so much want to have clear answers, but science does not provide certainty, at least not immediately. Science is a process of uncertainty reduction,” Nosek says, and no single study will be the final word.
Indeed, a failed replication could mean that researchers don’t understand the psychology and conditions of the original study well enough to replicate it; in that case, the failure could be an opportunity for researchers to come to a deeper understanding of their results, Nosek says.
“That’s the way science works,” adds Association for Psychological Science executive director Alan Kraut, noting that some core ideas in psychology were once treated as flukes. “The early work in cognitive dissonance was plagued” by failures to replicate, he says, but has since become universally accepted by psychologists. “It’s not only a core concept in psychological science, it’s one of those concepts that’s become a core in our culture more generally.”*
Other researchers have praised Nosek and his colleagues’ study. “This is … one for the history books—the first large-scale systematic and empirical study of reproducibility, in any area of experimental science,” Alex Holcombe, an associate professor of psychology at the University of Sydney and co-editor of the Association for Psychological Science’s Registered Replication Reports, writes in an email. The study’s approach, which included publicly posting experimental methods and data analysis procedures, “is the way forward to improve our sciences,” Holcombe writes.
“It’s going to raise more questions than it answers, and that’s good,” says Sanjay Srivastava, an associate professor of psychology at the University of Oregon. “It’s a huge achievement.”
*UPDATE — August 27, 2015: This article had mistakenly attributed the Registered Replication Reports project, Psychological Science, and executive director Alan Kraut to the American Psychological Association.