The Decision Tree

Decision-making from all perspectives.

The Statistical Error Everyone Makes

About half of the studies in a new survey make one elementary error.

Most people have nightmares about showing up for class naked. Scientists have nightmares about making elementary statistics errors in their published work. Publication in a scientific journal is supposed to be ironclad. Publication is part of a dialogue going back to, let's say, Aristotle. Every statement is supposed to be backed up with a citation or with new data. And every piece of new data is supposed to be (at a very minimum) analyzed correctly. Making a mistake there is bad dream material.

Now it seems the nightmare is real.

Sander Nieuwenhuis and two colleagues, all Dutch neuroscientists and psychologists, have written a paper in this month's issue of Nature Neuroscience exposing the prevalence of one particularly heinous statistical error. They surveyed neuroscience and psychology papers in five top neuroscience journals and found that about half committed the error.

What was the mistake?

In almost all experiments, we compare the effects of some manipulation on an experimental group and a control group. The manipulation could be a drug to make rats smarter, it could be a math problem so difficult it reduces self-control in undergraduates, or it could be a drug that makes undergraduates smart enough to solve the math problem without compromising their self-control.

Then we test for significance, looking for the all-important criterion p<0.05. Often the experimental group and the control group both show an effect in the same direction, but the effect reaches significance only in the experimental group; in the control group it is weaker and does not cross the threshold. A statistically naïve person might stop there and conclude that the manipulation had an effect. Not so fast. One significant result and one non-significant result do not, by themselves, show that the two groups differ. You cannot draw that conclusion unless the difference between the two effects is itself statistically significant. And half the studies in the survey did not even test for this. (We are excluding from consideration studies where this type of error is not possible because a different approach was taken.)
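To make the point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the effect estimates and standard errors are made-up numbers, not data from the paper, and the function names are hypothetical. It uses a simple normal (z) approximation for two independent groups rather than whatever test a given study would actually need.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z statistic under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_test(effect, se):
    """Test a single effect estimate against zero."""
    z = effect / se
    return z, two_sided_p(z)


def difference_test(effect_a, se_a, effect_b, se_b):
    """Test whether two independent effect estimates differ from each other."""
    diff = effect_a - effect_b
    # Standard error of a difference between independent estimates.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return z_test(diff, se_diff)


# Hypothetical numbers, chosen only to illustrate the error.
exp_effect, exp_se = 25.0, 10.0  # effect in the experimental group
ctl_effect, ctl_se = 10.0, 10.0  # weaker effect in the control group

_, p_exp = z_test(exp_effect, exp_se)
_, p_ctl = z_test(ctl_effect, ctl_se)
_, p_diff = difference_test(exp_effect, exp_se, ctl_effect, ctl_se)

print(f"experimental group: p = {p_exp:.3f}")
print(f"control group:      p = {p_ctl:.3f}")
print(f"difference:         p = {p_diff:.3f}")
```

With these invented numbers, the experimental effect clears p<0.05 and the control effect does not, yet the test on the difference between them does not clear it either. That third test is the one the erroneous papers skipped.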

This is brain science, but it's not rocket science. It's Stats 101. 

This does not mean that these papers are all wrong, though. Nieuwenhuis and colleagues think that about a third of the erroneous papers would have shown a significant difference in effect sizes had the authors bothered to do the tests. Many of the others may have been fine too; we just don't know. And that's a problem.

What's the solution here? Requiring authors to be clearer about their statistical methods is one thing. Hiring statisticians to review papers might help too. Requiring all grad students to take a serious statistics course or two would also be good. Departments could each keep a statistics consultant on staff. These things would be difficult and expensive, but they wouldn't be a nightmare.

For more:

http://www.nature.com/neuro/journal/v14/n9/full/nn.2886.html#/t1




Ben Hayden, Ph.D., is an Assistant Professor of Brain and Cognitive Sciences at the University of Rochester.
