Is Psychological Science Bad Science?
Psychology researchers failed to replicate over half of 100 published studies.
Posted Aug 28, 2015
Scientists from dozens of laboratories recently attempted to replicate one hundred studies that had been published in three top psychology journals in 2008. The results were so startling that they have been reported all over the media: They found that only about 1 in 3 studies could be replicated, and the overall size of the reported effects was about half of that found in the original studies.
So does this mean psychological science is just a bunch of nonsense? Decidedly not. And here is why.
It isn't just psychological science
The researchers pointed out that they chose to investigate the reproducibility rate of psychology not because there is something special about psychology, but because they themselves are psychologists. But concerns about reproducibility are widespread across many scientific disciplines.
Consider, for example, biomedical research, which directly impacts the lives and health of millions. More than half of biomedical findings cannot be reproduced. For instance, the pharmaceutical company Bayer recently reported that it failed to replicate about two-thirds of published studies identifying possible drug targets (Nature Reviews Drug Discovery, vol 10, p. 712). During the decade he served as head of global cancer research at the pharmaceutical company Amgen, C. Glenn Begley and his team sought to replicate 53 landmark papers on cancer research published in top journals and conducted by reputable labs. They found that 47 of the 53 could not be replicated (Nature, vol 483, p. 531).
In 2012, researchers from Nanjing University published a paper on genetics that showed a microRNA in rice could regulate genes in the liver of mice that had eaten the rice (Cell Research, 22:107-26, 2012). The result was of enormous importance in the field of transgenic crops. But the result could not be replicated by other labs. The researchers who attempted the replication concluded that the published findings must have resulted from a nutritional imbalance caused by the experimental diet fed to the mice.
It's the nature of the scientific beast
The researchers pointed out:
Because reproducibility is a hallmark of credible scientific evidence, it is tempting to think that maximum reproducibility of original results is important from the onset of a line of inquiry through its maturation. This is a mistake. If initial ideas were always correct, then there would hardly be a reason to conduct research in the first place. A healthy discipline will have many false starts as it confronts the limits of present understanding.
Scientific American blogger Jared Horvath describes three famous replication failure cases from the history of science:
At the turn of the 17th century, Galileo rolled a brass ball down a wooden board and concluded that the acceleration he observed confirmed his theory of the law of the motion of falling bodies. Several years later, Marin Mersenne attempted the same experiment and failed to achieve similar precision, causing him to suspect that Galileo fabricated his experiment.
Early in the 19th century, after mixing oxygen with nitrogen, John Dalton concluded that the combinatorial ratio of the elements proved his theory of the law of multiple proportions. Over a century later, J. R. Partington tried to replicate the test and concluded that “…it is almost impossible to get these simple ratios in mixing nitric oxide and air over water.”
At the beginning of the 20th century, Robert Millikan suspended drops of oil in an electric field, concluding that electrons have a single charge. Shortly afterwards, Felix Ehrenhaft attempted the same experiment and not only failed to arrive at an identical value, but also observed enough variability to support his own theory of fractional charges.
Science proceeds by these fits and starts, and replication failures don't always spell doom for a scientific endeavor.
In fact, Dr. John Ioannidis, a professor of medicine at Stanford, has argued for years that most scientific results are less robust than researchers believe. In a recent interview with the Washington Post, Ioannidis praised this large-scale study on replication, and claimed it should have repercussions beyond the field of psychology.
We like to read about and invest in whiz-bang results.
Science is an expensive endeavor. It requires commitment of funds from the public and private sectors. And attracting that funding typically means persuading non-scientists who hold the purse strings that a line of research is worthy of investment. Workaday progress in a discipline rarely does the trick. Instead, "whiz-bang," never-before-seen, startling results are what attract attention. The end result is that scientific journals and popular media increasingly prioritize novelty over replication when deciding which papers to publish. (The journal Psychological Science is notorious in this regard.) Positive results are a must; negative results rarely see the light of day.
Scientists must publish or perish, now more than ever.
Over coffee with a colleague recently, our conversation turned to the pressure newly minted PhDs face in finding ever-vanishing tenure-track science positions in academia. The number of these positions has shrunk by more than 50% in the past decade or so, yet academia is where the majority of publicly funded scientific research is conducted. He pointed out that it can now take close to eight years to complete a PhD in neuroscience, and dozens of publications are absolutely necessary to be considered competitive even for temporary post-doctoral fellowship positions.
That conversation reminded me of one I had early in my career in the late '80s. My senior colleagues openly admitted that they would never have made tenure had they been held to the standard to which they were required to hold their junior colleagues. And these were men whose wives were full-time homemakers, meaning they themselves rarely needed to cook dinner, wash clothes, or look after children.
Things have only gotten worse over the past three decades. We have raised the bar for hiring, tenure, promotion, and grant seeking to such dizzying heights that the pressure to produce a 100+ publication vita has led to the inevitable: sloppy work and/or cheating. As the replication researchers point out, much of the problem is the outcome of "a skewed system of incentives that has academics cutting corners to further their careers." Studies are conducted with small sample sizes, time is rarely taken to replicate an effect in one's own lab before rushing to publication, and, in rare but disconcerting cases, outright fraud is committed in the form of data tampering.
In my opinion, these unrealistic standards of scientific productivity lie at the heart of the "gender gap" in scientific recognition. Here is how it goes: A committee is formed to put together a scientific symposium, panel, or conference. The first order of business is to attract "big names" as keynote speakers, and that means scientists who have the vaunted 100+ publication vita combined with multi-million dollar grants. And, inevitably, very few women satisfy those criteria.
Why? Because most female scientists refuse to sacrifice their reproductive effort to science. To put that in less scientific terms, women will dial back their productivity in order to have (or adopt) and raise children. Once their children are older, they swing back into gear. But those "dialed down years" mean fewer publications and grants than those of the macho men who put career ahead of everything. Make no mistake: Female scientists are equally brilliant, and the work they accomplish greatly informs and enhances the body of knowledge in their fields. But their contributions look "smaller and weaker" than those of their male counterparts because of this commitment to family.
The simplest way to address the "failure to replicate" crisis in science is to allow researchers the time necessary to replicate their own studies in order to ensure that the results are real BEFORE rushing to publication. The way to address the "gender gap" is to bring our standards of scientific excellence down from the clouds to something more humanly realistic. And that means recognizing that productivity will decline from its current dizzying heights.
Meanwhile, here are some findings from psychological science that have been replicated so many times that they are considered facts:
About 65% of us will obey an authority's orders to harm an innocent person
Problem-solving is a search for means to transform current states into goal states, and this process can be automated
It is possible to learn without forming conscious memories
Perception isn't veridical; it is your brain's best decision about the input it received from your senses
The accuracy of memory is a U-shaped function (first and last events remembered better than the middle) regardless of whether you are trying to remember a short list or a series of events taking place over years.
When we are given minor inducements to act in ways that are at odds with our beliefs, we change our beliefs to bring them more in line with our actions.
Probability and utility are processed by separate areas of the brain.
Babies are not born as tabulae rasae. They divide the world into agents and objects, and friends and foes. They understand that the behavior of objects is constrained by simple physical principles, and that agents are motivated by internal states. They prefer agents who help others and agents who show preferential treatment to their own in-group members.
Copyright Dr. Denise Cummins August 28, 2015
Dr. Cummins is a research psychologist, an elected Fellow of the Association for Psychological Science, and the author of Good Thinking: Seven Powerful Ideas That Influence the Way We Think.