Theory of a Scientific Revolution
Is the Credibility Revolution guided by anything?
Posted Nov 26, 2020
The new sociology manuscript on the current scientific reform movement ends with a damning passage: “Metascience has been a frantically productive empirical field devoid of theory. A final question: How much longer can that last?” As I’ve reviewed elsewhere, I think Peterson and Panofsky raise some valid critiques of Metascience as a field, particularly the idea that any recommendations could apply to All Science (as opposed to just swaths of social science and medicine). But are they right that metascience has no theory?
First, consider the normal pattern for working psychologists trying to understand some topic. They often start by publishing research papers reporting an effect—which we can think of as a fact about the relationship between two things. For example, activating smile muscles puts you in a better mood. These facts are then used to support broader attempts to build theory. For example, we understand how exclusion works by looking at all the things exclusion has been found to influence.
If you read a lot of psychology papers, you find that they generally have this form. You can infer that what psychologists are trying to do is establish facts and build an understanding of how those facts fit together. I’ll call these the Unstated Goals of Psychologists. The current reform movement, because it has come bottom-up from the ranks of young working psychologists, understands these unstated goals. Rather than prescribing what goals psychologists should have, the reform movement takes these as given. It then examines how successfully research practices serve these goals.
Often this evaluation is based on statistics applied to the most common experimental designs. Psychologists say that they are using tests with only a 5% false positive rate, which suggests that the facts they establish will only be wrong 5% of the time. Yet the False Positive Psychology paper demonstrated—using both simulations and real data the researchers collected—that a combination of commonly used statistical “tweaks” could lead to a false positive rate of over 60%. More than half the time, researchers could be setting up experiments with no effect, but still be reporting that they found one. As the authors of the paper later put it, “we thought it was like jaywalking, but it was more like robbing a bank.”
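To see how this kind of inflation arises, here is a minimal simulation sketch in the spirit of the False Positive Psychology argument (my own illustration, not the paper's actual code or its exact set of tweaks). There is no true effect in the simulated data, so every "significant" result is a false positive; the simulated researcher tests two correlated outcome measures plus their average, and collects more data if the first pass isn't significant.

```python
# Sketch of how "researcher degrees of freedom" inflate the false positive
# rate. Illustrative only: the specific tweaks and numbers are assumptions,
# not a reproduction of the original paper's simulations.
import random
from statistics import NormalDist, mean, stdev

rng = random.Random(42)
PHI = NormalDist().cdf

def p_two_sample(a, b):
    """Two-sided p-value via a z approximation to the two-sample t-test."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - PHI(abs(z)))

def draw_group(n):
    """Each subject yields two correlated outcomes; the null is true."""
    out = []
    for _ in range(n):
        shared = rng.gauss(0, 1)
        out.append((shared + rng.gauss(0, 1), shared + rng.gauss(0, 1)))
    return out

def one_experiment(hacked):
    g1, g2 = draw_group(20), draw_group(20)
    if not hacked:  # honest analysis: one pre-specified outcome, fixed n
        return p_two_sample([x for x, _ in g1], [x for x, _ in g2]) < 0.05

    def any_sig(a, b):
        # Tweak 1: test outcome 1, outcome 2, and their average; keep the best
        tests = [([x for x, _ in a], [x for x, _ in b]),
                 ([y for _, y in a], [y for _, y in b]),
                 ([(x + y) / 2 for x, y in a], [(x + y) / 2 for x, y in b])]
        return any(p_two_sample(u, v) < 0.05 for u, v in tests)

    if any_sig(g1, g2):
        return True
    # Tweak 2: not significant yet, so run 10 more subjects per group and retest
    return any_sig(g1 + draw_group(10), g2 + draw_group(10))

n_sims = 2000
honest = sum(one_experiment(False) for _ in range(n_sims)) / n_sims
hacked = sum(one_experiment(True) for _ in range(n_sims)) / n_sims
print(f"honest false positive rate: {honest:.3f}")  # near the nominal 0.05
print(f"hacked false positive rate: {hacked:.3f}")  # well above 0.05
```

Even with just these two tweaks the false positive rate climbs well past the nominal 5%; stacking more of them (dropping conditions, choosing covariates after the fact) is what pushed the paper's combined rate above 60%.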
The theory, in this case, is that researchers want to use their normal statistical methods to establish true facts about the world. However, mathematical simulations show that these normal methods lead to many errors. The theory is tested by going out and seeing if redoing studies without allowing for these kinds of statistical tweaks—but instead using precisely the recipe prescribed in the published literature—leads to many of these facts turning out to be unreliable. In 2015, the Open Science Collaboration found that, in the case of 100 studies, the theory turned out to be right. Only about 37% of the results replicated.
Many objections were raised to this conclusion, and potential alternative explanations proposed that needed to be ruled out. For example, some researchers suggested that effect sizes might differ depending on the specific labs running the studies. The Many Labs 2 project sought to address this by running 28 identical replication studies in 125 different labs. Very little variability was found in the effects studied across the different sites.
Researchers who use undergraduate students in studies often think students who sign up later in the semester are less reliable, because they aren’t paying as close attention to the study instructions. Maybe this influenced replication. The Many Labs 3 project addressed this, finding that the point in the semester at which participants signed up had little influence on the effects studied.
Some researchers believe that subtle factors known only to experienced researchers in a domain (and not reported in original manuscripts) are needed to get results to work. The Many Labs 4 and 5 projects both examined this question, by having research groups run studies with and without the consultation of experts. Results in both projects indicated that having access to expert advice had little effect on whether a result replicated or not.
Metascientists could have raised new theoretical objections throughout their last decade of inquiry. Should psychologists really be spending all their time establishing new effects, or should they instead be carefully homing in on mathematical models that explain exactly how one effect works? Should we worry about whether established effects are reliable, or should we instead consider whether we’re measuring things that matter? Should we test whether there really are hidden tricks experienced researchers use to get studies to work, or should we argue that failing to report methods needed to get a study to work is scientific malpractice?
Instead of arguing for what the goals of psychology research should be, the psychology-focused metascience research group at the Center for Open Science (COS) has sought to listen to and respect the claims being made by researchers in their area, and to conduct studies that evaluate these claims. Research questions arose organically, often by attending to the concerns of critics. This line of work demonstrates that (at least in psychology), there is a theoretical through-line to metascience. It is an approach that should be familiar to anthropologists and sociologists conducting ethnographic research: Listen to what a community says is important, and treat this as your object of study.
There is definitely room to ask broader questions about what psychology research’s goals should be. Indeed, I believe the next wave of scientific reform will likely be focused on these deeper questions. It also makes sense to question how broadly insights gleaned from psychology and medicine apply to other sciences. Yet to conclude that Metascience is atheoretical (and therefore headed for a trainwreck) dismisses the work of hundreds of researchers trying to better understand and improve the work of their own scientific community.