Alexander Danvers Ph.D.

How Do You Know?

Scientific Pressure and the Yale COVID Poop Study

Flaws revealed in another new COVID study from an elite university.

Posted May 27, 2020

 Photo by Hafidz Alifuddin from Pexels
New research tries to predict COVID outbreaks from analyzing sewage.
Source: Photo by Hafidz Alifuddin from Pexels

When a person with COVID-19 poops, some of the virus’ genetic material can be detected through chemical testing. You can also detect COVID-19 RNA in an area’s sewage. This led Yale researchers to come up with a clever idea: What if you could predict outbreaks of COVID-19 from analyzing an area’s sewage? As Yale researcher Jordan Peccia and colleagues report in a new preprint, you can. And not only does it work, but poop analysis *nearly perfectly* predicts outbreaks three days later.

This finding is already starting to make the rounds of the hype train, with a Fox News headline and nearly 2 million views of the study on Twitter (according to a researcher who tweeted out a link to the study). But—and stop me if you see a theme here—the statistics were incorrect and misleading, hugely overestimating the ability of poop analysis to predict COVID-19 outbreaks. As in several recent studies out of Stanford, the need to make headlines seemed to trump the need for caution in reporting results.

First, a brief, not-too-technical explanation of the issue: The authors report that the amount of COVID-19 RNA in poop predicts the number of cases in an area with an R2 = .99. This statistic represents how much of the variability in an outcome (in this case, number of COVID-19 cases) can be predicted by other variables (in this case, the amount of RNA in poop). The scale is 0-1, meaning that you can roughly interpret the value reported by the Yale researchers as saying their prediction is 99% accurate.

Screenshot by A. Danvers
Screenshot of a tweet promoting the key graph--with smoothing--in the COVID poop study.
Source: Screenshot by A. Danvers

That’s insanely high. In psychology, we’d consider an R2 = .10 to be high. That’s about the relationship you get between someone saying they’re extraverted and how much time they spend talking in their daily life. The Yale researchers reported “too good to be true” results. And like most “too good to be true” results, they weren’t true.

The problem was that, instead of calculating the correlation using the actual data points they collected, the Yale researchers first smoothed the data—which removes variability and makes it much easier to predict. Instead of analyzing their raw data—the actual measurements they made, without any manipulation—they decided to analyze data that removed most of the unpredictable parts of their measurement. This is a misleading and inappropriate approach.

You can see this in the screenshots I've got here: The one with just the curves makes it look like all the data lines up nearly perfectly. But that's not a representation of the actual measurements. The actual measurements are the dots in the screenshot below—and they don't line up perfectly on the curves at all. What should have been analyzed are the measurements (the dots), not the curves.

Screenshot by A. Danvers
A more accurate representation of the data (the dots) shows a much less perfect relationship.
Source: Screenshot by A. Danvers

This was immediately pointed out by other researchers on twitter. A small group even tried to reverse engineer the data (through automatic extraction of data from the graph in the paper) and calculate what the appropriate result would be. Their conclusion: somewhere in the neighborhood of 15-40% of the variance in COVID-19 cases explained. In other words, the accuracy reported was off by potentially 84%.

Why do we see Yale researchers, like recent work by Stanford researchers, using sloppy statistics to make overhyped claims? In part, it’s because of what it takes to become a Yale or Stanford researcher. Being a research professor at an elite U.S. university means continually churning out groundbreaking research findings. Researchers at elite universities work under the Paradigm of Routine Discovery: Their colleagues, university officials, granting bodies, and even they themselves seem to expect that big scientific findings can be reliably generated over and over by the same small group of elite researchers. The implicit idea is that a Big Name researcher at a Big Name university can figure out a system or program to routinely push the outer bounds of human knowledge—and if these researchers can’t, the university can always hire another set of researchers who can.

Yet expecting researchers to deliver groundbreaking new effects every year is out of line with the pace of insight and progress in science. In his book Representing and Intervening, philosopher of science Ian Hacking gives a ballpark estimation that all of progress in physics (from Newton up to 1983, when the book was published) was based on 40 effects. Coming out with even one of these in an entire scientific career, therefore, would be a huge accomplishment, never mind one a year (or more) until you’re up for tenure.

My concern is that, when this kind of productivity pressure is placed on researchers, sloppiness becomes inevitable. I believe that you can have poorly done science without having a lot of “bad” scientists. Given the wrong incentive system, you can blindly select for research that is poorly done. In that environment, each level of promotion in the academic chain is potentially weeding out more people who work slowly but carefully—and this effect may be accentuated at elite universities.

The key takeaway here isn’t that the Yale researchers who conducted the poop study are bad. The work is thoughtful and creative, and even being able to predict COVID-19 outbreaks with 15% accuracy is useful. The takeaway is that the behavior of scientists is governed by incentives, and these need to be better aligned with the production of high-quality science. Until they are, people need to bear these incentives in mind when they read about new findings.