False Positive Psychology
Researchers make unacknowledged decisions that may increase false positives.
Posted Nov 20, 2011
You've got to accentuate the positive,
Eliminate the negative,
And latch on to the affirmative.
Don't mess with Mister In-Between
- Johnny Mercer (1944)
This essay is not about positive psychology, false or true, although I hope the deliberately ambiguous title drew some of you readers here.
The essay is instead about a recent and provocative article by Joseph Simmons, Leif Nelson, and Uri Simonsohn (2011) on the problem of false positives in research. A false positive is a conclusion that some effect occurred when in fact it did not.
False positives, when reported in the research literature, have a cascading cost. The erroneous conclusion may persist because subsequent researchers do not challenge it or - if they do - because they have trouble publishing "no findings" (even when correct). Scientific journals, at least in psychology, are biased against publishing papers describing exact replications and, even more so, against papers reporting no differences. The field and the larger society suffer because resources are wasted following up a false finding and perhaps making ill-informed policy changes on its basis.
All of us who have taken a research methods or statistics course in college know that false positives can occur simply because of the luck of the draw of a particular sample. Remember the statistical significance level of p < .05, which means that if no real effect exists, a result at least as extreme as the one observed would occur by chance less than one time out of twenty. So even when nothing is going on, about one test in twenty will still come out "significant," and when we consider how many thousands and thousands of studies are conducted and published every year in psychology, the number of erroneous conclusions is obviously large and disconcerting.
But things can be even worse. The article by Simmons and colleagues addresses reasons for false positives above and beyond the well-known one-in-twenty risk. When conducting a study and analyzing its results, a researcher makes many decisions - usually unacknowledged in the final research report - that may increase the likelihood of false positives.
None of these entails deliberate fabrication of data. That's an altogether different matter. Rather, these are ostensibly innocent decisions that inadvertently bias the conclusions, often in a direction that produces false positives.
So, a researcher may focus on some variables that were measured in a study and not others, or on some experimental conditions and not others. Sometimes the variables or conditions not in focus are not even mentioned in the final report. False positives are obviously more likely when the sole focus is on the variables and conditions that yield statistically significant results. Given the biases of journals, this practice is likely common. Along these lines, a researcher may control statistically for the effects of some "third variables" but not others, again with a bias toward yielding statistically significant results.
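To get a rough feel for how much selective focus can matter, here is a minimal sketch in Python. It rests on assumptions of my own, not on anything in the article by Simmons and colleagues: a study measures k outcomes, none of them has a real effect, the outcomes are statistically independent, and the researcher reports the study as a success if any one of them reaches p < .05. The function names are made up for illustration.

```python
import random

ALPHA = 0.05  # conventional significance level


def chance_of_any_hit(k, alpha=ALPHA):
    """Probability that at least one of k independent null tests is significant."""
    return 1.0 - (1.0 - alpha) ** k


def simulated_chance(k, n_studies=20000, alpha=ALPHA, seed=0):
    """Check the formula by simulation.

    When there is no real effect, the p-value of a continuous test is
    uniformly distributed between 0 and 1, so each test is simulated
    here as a single uniform random draw (an idealization).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        p_values = [rng.random() for _ in range(k)]
        # The hypothetical researcher reports only the best-looking outcome.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, round(chance_of_any_hit(k), 3), round(simulated_chance(k), 3))
```

Under these assumptions, the formula gives a false positive risk of about 5 percent with one outcome measure, about 23 percent with five, and about 40 percent with ten. Real outcome measures are usually correlated, so the actual inflation in a given study would differ.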
Or a researcher may drop the data from certain participants because they were too extreme or because the scientific protocol seemed not to have been followed for a given participant. There are legitimate reasons for excluding data, but these should be spelled out in the final research report, and Simmons and colleagues urge researchers to report results with and without excluded data.
Or a researcher may stop gathering data in a study if the research appears to be working as intended - i.e., yielding statistically significant results. Obviously, a study needs to stop at some point, but the rule for stopping should be specified before the study starts and not be a function of what the early returns seem to show.
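The cost of stopping when the early returns look good can also be sketched in a small simulation. Everything specific here is an assumption of mine chosen for illustration: two groups drawn from the same normal distribution (so any "effect" is a false positive), a simple z-test that treats the standard deviation as known to be 1, and a hypothetical researcher who checks the results after every 10 participants per group and stops at the first p < .05.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means (SD assumed known, = 1)."""
    n = len(group_a)
    diff = sum(group_a) / n - sum(group_b) / n
    z = diff / math.sqrt(2.0 / n)
    return math.erfc(abs(z) / math.sqrt(2.0))


def study_is_significant(rng, checkpoints, alpha=0.05):
    """Run one null study, testing at each checkpoint (sample size per group).

    Returns True as soon as any check gives p < alpha, which is when the
    hypothetical researcher would stop and write up the result.
    """
    a, b = [], []
    for n in checkpoints:
        while len(a) < n:
            a.append(rng.gauss(0.0, 1.0))
            b.append(rng.gauss(0.0, 1.0))
        if two_sided_p(a, b) < alpha:
            return True
    return False


def false_positive_rate(checkpoints, n_studies=2000, seed=1):
    """Share of simulated null studies that end up 'significant'."""
    rng = random.Random(seed)
    hits = sum(study_is_significant(rng, checkpoints) for _ in range(n_studies))
    return hits / n_studies


fixed_rule = false_positive_rate([50])                  # one test at the planned n
peeking = false_positive_rate([10, 20, 30, 40, 50])     # test after every 10 per group

if __name__ == "__main__":
    print("stopping rule fixed in advance:", fixed_rule)
    print("stopping on early returns:", peeking)
```

With the stopping rule fixed in advance, the false positive rate should land near the nominal 5 percent. With repeated peeking it should come out noticeably higher, even though every individual test uses the same p < .05 threshold.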
Analyzing the data from a study is complex, and it can take different forms. One cynical description I have heard distinguishes among data analysis (the procedures that are taught in a research methods or statistics class); data fishing; data mining; data massaging; and data mugging. These differ along a dimension of the likelihood of false positives.
I believe that researchers should accentuate the positive, meaning what was found in a study, but also the negative, meaning what was not found. And researchers should certainly be more careful than they sometimes are about keeping the positive and the negative straight by describing their procedures - not just for gathering data but for analyzing data - thoroughly. Journal editors should not expect perfect results or those that make one exclaim "gee whiz!" upon reading about them. These expectations encourage data fishing and mining as well as data massaging and mugging.
The recommendations by Simmons and colleagues are worth heeding, but all researchers need to follow them for any meaningful changes to occur. As Simmons and colleagues observed, the credibility of psychology as a scientific field is at stake.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359-1366.