Maybe the world is constantly improving, or maybe we just think it is. Certainly, scientists seem to subscribe to "Whig history," the idea that historical change involves inevitable and inexorable progress. Take, for example, the way they talk about their work.
In 1974, one in 50 journal abstracts used complimentary terms to describe research. By 2014, such praise featured in one of every six abstracts, an increase of nearly 900 percent. The term “innovative” alone had become 2,500 percent more common, without any obvious indication that the research described was 25 times more groundbreaking. It appears that scientists perceived the caliber of their output as steadily improving with every passing year.
However, the decades between 1974 and 2014 were almost precisely those during which disquiet about the quality of published science reached fever pitch. Concerns about unchecked publication bias, underpowered sampling, and many other problems led observers to question the standing of published research. Several landmark papers appeared, such as the John P. A. Ioannidis classic, “Why Most Published Research Findings Are False.”
It seems that the more we learn about the weakness of our research, the stronger we think it is. This cognitive habit is surely troubling. We should bear it in mind whenever we are told that psychology's replication problems are being solved. Unfortunately, we cannot simply wish those problems away.
* * *
In fact, science—including psychological science—might be getting worse and worse, instead of better and better. When the worth of a university employee is counted in grant dollars and citations, what is good for the individual researcher is not necessarily good for their research. An obsession with output quantity tends to render rigor maladaptive, and instead to favor the natural selection of bad science.
Given the recent surge in media interest in psychology's iffy replication record, it is easy to form the impression that our state of crisis is something new. In fact, our field has been grappling with several interwoven crises for decades:
- theoretical fragmentation (a paradigmatic crisis)
- reductionism (a measurement crisis)
- sloppy approaches to significance and effect sizes (a statistical crisis)
- a tendency to focus on a tiny sliver of the human population (a sampling crisis)
- premature optimism about the progress made by psychology, both in basic science and in resolving its reproducibility problems (an exaggeration crisis, if you will).
In my new book about this topic, Psychology in Crisis, I systematically dissect each of the crises above and several others.
I learned a lot from writing Psychology in Crisis. For example, I learned that whenever anyone mentions the word “crisis,” there will be people who ask, “Crisis? What crisis?” There will always be folks desperate to wish the crisis away.
In psychology, public clashes between self-flagellators and their rose-tinted colleagues have inevitably led one headline-writer to quip that psychology is now “in crisis over whether it’s in crisis,” a literary flourish that carries more than a ring of truth.
* * *
Psychologists have made significant progress in tightening up the field, and it is important to acknowledge that. Nonetheless, in my view, we seriously need to avoid being lulled by optimism. We cannot let our guard down just yet.
This is because, despite our efforts to improve things, we have done little or nothing to address the fundamental force that feeds our replication problems—the perverse incentives that cultivated the natural selection of bad science in the first place.
- Journals continue to prioritize statistically significant results over the reporting of null effects, thereby encouraging sloppy practices such as p-hacking and 'HARKing' (hypothesizing after the results are known), and perpetuating the file-drawer effect. They do this because it makes them more successful. The market hungers for statistical significance; the journals feed that hunger in order that they themselves can survive.
- Citations (and h-indexes) are still routinely used to assess the output of individual researchers even though everyone knows that such measures say nothing about (and so fail to promote) research quality. In fact, metrics often indicate the opposite of quality—truly bad studies regularly go viral. This persistence with using citations as a measure of researcher prowess—on the part of tenure committees and grant agencies, among others—encourages salami-slicing, gratuitous self-citation, and other destructive habits that serve to distort research.
- Despite many condemnations, Journal Impact Factors remain the main unit of currency by which journals are valued. The adverse effects of JIFs are very well documented. They put pressure on journal editors to turn blind eyes toward salami-slicing and citation-padding. They also encourage publishers to hold papers ‘in press’ or ‘online ahead of print’ for months (if not years) on end, further clouding the eventual record of research outputs in a given field. A competitive form of criterion-chasing, rather than a genuine desire to maximize the quality of science, is what drives behavior in this industry. Its crises aren't going away any time soon.
- Several problematic authorship conventions continue unimpeded. Many relate to freeloading (the assignment of authorship credit where it is not warranted). Again, this is enabled by arbitrary conventions and industry expedience. For example, in standard résumés and personal profiles (such as on Google Scholar), individual author metrics fail to control for the fact that most psychology papers are team efforts. For a given paper, each co-author is thus credited with having produced one full publication (rather than a share of one) and deemed to have attracted all of its citations (rather than just a portion). In any other work-productivity context, the output of a team would count, logically, as one single output. It would never be counted as one-multiplied-by-the-number-of-team-members outputs. In psychology, as elsewhere in science, a kind of infinite scalability drives extensive authorship freeloading (‘honorary’ or ‘ghost’ authorships are still a thing), creating a slippery slope towards a generalized contempt for research ethics. (Consider: if it is okay to ignore the ethical norms on authorship, then what other ethical norms is it okay to disregard?) Such sloppiness is the very antithesis of rigor, and perniciously feeds bad science in all its forms.
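
To make the authorship arithmetic concrete, here is a minimal sketch comparing the usual "full" counting with one possible alternative, in which each paper is split into equal shares. The papers, author names, and citation counts are invented for illustration, and the equal split is an assumption of this sketch, not a scheme proposed in the text; real contributions are rarely equal.

```python
from collections import defaultdict

# Hypothetical records: (paper title, co-authors, citation count).
papers = [
    ("Paper A", ["Ana", "Ben", "Caz", "Dev"], 40),
    ("Paper B", ["Ana"], 10),
    ("Paper C", ["Ben", "Caz"], 6),
]

def tally(papers, fractional):
    """Return {author: (publication credit, citation credit)}."""
    pubs = defaultdict(float)
    cites = defaultdict(float)
    for _title, authors, citations in papers:
        # Full counting gives every co-author a weight of 1.
        # Fractional counting (assumed equal shares) splits one unit among them.
        weight = 1 / len(authors) if fractional else 1.0
        for author in authors:
            pubs[author] += weight
            cites[author] += weight * citations
    return {author: (pubs[author], cites[author]) for author in pubs}

full = tally(papers, fractional=False)
shared = tally(papers, fractional=True)

for author in sorted(full):
    print(author, "full:", full[author], "fractional:", shared[author])

# Three papers exist, yet full counting hands out 7 publication credits;
# fractional counting hands out exactly 3.
print(sum(p for p, _ in full.values()), sum(p for p, _ in shared.values()))
```

Under full counting, adding a co-author costs the existing authors nothing, which is the scalability described above. Under the equal-share assumption, every added name dilutes everyone else's credit.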
* * *
Pre-registration of research protocols will surely help deal with the file-drawer problem. However, by and large, the registration of research remains optional rather than compulsory. Psychologists can easily pursue research programs without bothering to pre-register.
When it comes to tenure or promotions, few universities (if any) provide bonus points for publishing pre-registered studies as opposed to the traditional, unregistered kind. So while registered reports are important for good science, the incentives needed to encourage scientists to produce them remain extremely weak.
I am not aware of comprehensive statistics on the matter, but I would be surprised if pre-registered research makes up even 1 percent of what will be published in psychology journals this year. From a baseline of zero not so long ago, that represents progress. But let's not get carried away celebrating our bold new world just yet.
* * *
Failure to dismantle the distorted reward architecture that shapes research in psychology (and other sciences) ensures that we will continue to see the same dynamics that, over the past century, led us to our current disarray.
Claims that we have fixed our problems (or, more subtly, that we have overstated them) are counterproductive because they lull us into unwarranted optimism. They make us take our eyes off the prize.
Instead, we should invest effort in keeping our focus razor-sharp. Let’s not celebrate the end of the crisis prematurely. Let’s not succumb to crisis-denial or get bogged down disputing the premise. Let’s try to avoid the tailspin of cognitive dissonance, optimistic self-delusion, and wonky reinforcement that caused the crisis in the first place.
Otherwise, we could end up in an ever-deeper type of turmoil, one entirely of our own creation—in crisis about whether we’re even in crisis over whether we’re in crisis.