In recent years, we’ve been hearing that science—especially biomedical and social science—is plagued by flawed, irreproducible results. This claim has been anxiously discussed in scientific journals and widely covered in the media. Last year, the two top officials at the National Institutes of Health agreed that flawed science is a major problem. “The recent evidence showing the irreproducibility of significant numbers of biomedical-research publications,” they wrote, “demands immediate and substantive action.”
The stunningly high attrition rate of experimental drugs and therapies—90 percent fail to successfully traverse the path from initial development to Food and Drug Administration approval—is often cited as evidence of the problem. While a drug can fail for many reasons, the poor quality of published studies is now widely believed to be a major factor. In 2011, scientists at the pharmaceutical company Bayer claimed that the company’s scientific teams could not replicate more than two-thirds of published findings which they had considered promising starting points for drug development. Scientists at Amgen, who tried to replicate studies in cancer research, reported even less success, leading them to conclude that “a significant contributor to failure in oncology trials is the quality of published preclinical data.”
Behind these calls to clean up science is the hope that we can more efficiently translate discoveries in the lab into treatments in the clinic, thereby lowering the costs of drug development and increasing the returns on the government’s investment in research. But according to bioethicists Alex London and Jonathan Kimmelman, reducing failure in science is not necessarily something we should be aiming for. Writing in eLife, they claim that “the failure of well-designed studies benefits both researchers and healthcare systems.” Failure, in other words, isn’t always a sign of bad science—it can also be a sign that science is working as it should.
Why should we embrace failure in science? Because, London and Kimmelman argue, failure “is actually a necessary part of a rational and efficient approach to building a robust understanding of the diseases we are trying to treat.” To understand the valuable role of failure in medical research, they explain, we need to take a broader view of what counts as success. The purpose of clinical and pharmaceutical research is not simply to produce more drugs—it’s also to produce knowledge about how diseases and drugs work. Drugs by themselves are not useful; they are only therapeutic when we understand how and when they should be applied. While some of this knowledge can be worked out in animal studies, ultimately “in vivo studies in humans represent the only way of confirming or discrediting emerging theories.” Without failures, “clinicians lack the knowledge of how far they can extend the application of a new drug while preserving its desirable effects.” If designed and reported properly, a clinical trial that fails to produce an effective new treatment can still be successful by improving our knowledge of disease and guiding drug development in the future.
Such an optimistic view of failure might seem hard to accept: drug development programs and clinical trials cost money, take time, and involve risks to real human beings. And, of course, uninformative failures due to sloppy, low-quality research are inexcusable. To the degree that recent efforts to root out irreproducible results focus on poor scientific practices, they should be encouraged. However, the goal shouldn’t be to reduce failure; rather, it should be to make failure more efficient and productive.
One way to do this, according to London and Kimmelman, is to clarify the distinction between exploration and confirmation in preclinical research. Exploratory research is necessary because “identifying promising interventions is akin to exploring a vast, multidimensional landscape of agents, doses, disease indications and treatment schedules.” In order to explore this landscape in a way that’s economically feasible, researchers rely on smaller, methodologically flexible studies that tend to use fewer lab mice and aren’t always designed to test one specific hypothesis. As a result, even good exploratory studies are statistically limited and prone to false positive results. To achieve statistical rigor, exploratory studies should be followed by larger, statistically powerful “confirmatory” studies that conclusively test the clinical relevance of new hypotheses and theories. Because new hypotheses typically reflect the cutting edge of our knowledge, we should expect many of them to fail—and when they do, we can improve our understanding.
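To see why a small exploratory study is prone to false positives, and why a confirmatory follow-up helps, it can be useful to run the arithmetic. The sketch below is only an illustration: the function name and every number in it (the share of hypotheses that are true, the statistical power of each kind of study, the false positive rate) are hypothetical assumptions chosen for the example, not figures from London and Kimmelman.

```python
def positive_predictive_value(prior: float, power: float, alpha: float) -> float:
    """Fraction of positive results that reflect a real effect.

    prior: assumed share of tested hypotheses that are actually true
    power: chance a study detects an effect that is really there
    alpha: chance a study reports an effect that is not there
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical assumptions, for illustration only.
PRIOR = 0.10               # 1 in 10 new hypotheses is true
EXPLORATORY_POWER = 0.20   # small, flexible study
CONFIRMATORY_POWER = 0.80  # large, statistically powerful study
ALPHA = 0.05               # conventional false positive rate

# How trustworthy is a positive result from an exploratory study alone?
after_exploration = positive_predictive_value(PRIOR, EXPLORATORY_POWER, ALPHA)

# A confirmatory study tests only the hypotheses that survived exploration,
# so the exploratory result becomes its starting point.
after_confirmation = positive_predictive_value(
    after_exploration, CONFIRMATORY_POWER, ALPHA
)

print(f"After exploration alone:  {after_exploration:.2f}")   # 0.31
print(f"After confirmation too:   {after_confirmation:.2f}")  # 0.88
```

Under these assumed numbers, roughly two out of three positive exploratory findings would be wrong even though nobody did anything sloppy, and a single well-powered confirmatory study would move the odds strongly in the other direction. Different assumptions would give different numbers, but the pattern is the point.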
Too often, exploratory studies are mistaken for confirmatory studies and then wrongly flagged as a source of irreproducibility in biomedical research. As Kimmelman and his colleagues argued in a paper published last year, “We suggest that the ostensibly poor performance of many preclinical studies may in fact reflect strengths and intrinsic properties of what we call ‘exploratory investigation’….”
Exploratory investigations are meant to push the boundaries of our knowledge—if such studies don’t produce some ideas that turn out to be wrong, then we’re not pushing the boundaries hard enough. The problem is not that some findings are wrong, but that they’re not seen as provisional. Unfortunately, researchers and scientific journals tend to blur the lines between exploration and confirmation, often hyping the potential medical significance of an exciting new finding of an exploratory study before it has been rigorously confirmed. Both scientists and policymakers, Kimmelman argues, need to do a better job of recognizing the limitations of exploratory research, and of encouraging less glamorous but equally essential confirmatory studies.
As new graduate students quickly learn, failure is integral to science. The world is complex, and our knowledge is incomplete. And so, new ideas are usually wrong, and new experiments usually fail. The pioneering microbiologist Oswald Avery used to tell his students that “disappointment is my daily bread.” The Nobel Prize-winning chemist Linus Pauling, who published a famously wrong structure of DNA two months before Watson and Crick published the correct one, used to tell colleagues, “The best way to have good ideas is to have lots of ideas and throw away the bad ones.” This advice applies not only to individuals but also to scientific communities.
As we worry about how to improve reproducibility in science, it’s critical that we recognize the important role of failure as researchers sift through a vast set of possible hypotheses about how the world works. We shouldn’t tolerate avoidable errors or fraud, but the failure of well-designed studies is acceptable, and even necessary.
Inside the Lab explores the promise and hype of genetics research and advancements in medicine.