
Verified by Psychology Today

Aaron Fisher Ph.D.
Behaviorism

Can We Get It Right Without Needing to Be Right?

Thoughts on making a good argument and why we make bad ones.

Hello. Welcome to my blog.

For those familiar with my work, the title Personal and Precise shouldn’t come as much of a surprise. My research is focused on developing methods for better measuring, modeling, and understanding human behavior on a person-by-person basis. I believe that pursuing research at the individual level—i.e. Idiographic Science—is a powerful approach that, among other things, can help us to expand the idea of Precision Medicine to include mental and behavioral health. I’ll have much more on that specific position in an upcoming post, so stay tuned.

Of course, personalization and idiography aren’t the only things I care about. Perhaps more than anything else, I’m concerned about the many ways in which scientists—and science as an enterprise—devote time and attention to matters immaterial to innovation, discovery, convergence, consensus, and progress. I want to start this blog by focusing on two related issues that I believe are particularly obstructive to scientific progress: partisanship and (poor) argumentation.

In the spirit of strong argumentation, let me establish my position and see if I can argue it effectively.

1. I believe that partisanship—devotion to an intellectual position due to group affiliation and/or personal interests—is an obstacle to scientific progress. At the risk of sounding naïve or impractically idealistic, I would argue that we should be exclusively concerned with accuracy vs. inaccuracy in our scientific pursuits. Of course, accuracy and inaccuracy are debatable positions, and there are certainly many times when competing models can make equal claims to legitimacy. What I mean to say is that we should eschew self-interest and political posturing as guiding principles in scientific debate. Our interest should be in the collective merit of our ideas. We should see science as an endeavor of and for the common good and not a means for self-promotion or personal enrichment. We should be invested in the idea of accuracy as a prerequisite, and strive for impartiality in our debates, regardless of the passion for and degree of investment in our initial position. Quite simply, we should want to get it right, not be right.

How do we get better? How do we learn more? How do we leave something meaningful and useful for those who come after us? Unfortunately, the staunch defense of intellectual camps, the utilization of prestige and power dynamics to dismiss or overwhelm viable arguments, the employment of bad faith arguments, and the reliance on fallacy too often stand in the way of productive intellectual debate.

2. To the lattermost point above, I believe that strong argumentation is necessary in all areas of disagreement and debate, and that any argument lacking a strong underlying structure is inherently insufficient. Moreover, the structure of an argument is oftentimes more important than its content. You may be correct in your position, wholly and unequivocally, but if you leave holes in your argument, your opponents will distort, misrepresent, and undermine your position to the point of irrelevance. Sound, thoughtful argumentation with pertinent and compelling evidence is worth getting right. Kurt Lewin once said, “there is nothing as practical as a good theory,” because good theory guides hypothesis generation, study design, and interpretation. However, as we look to disseminate the products of our science, I would offer that there is nothing as powerful as a strong argument.

Let’s take an example. Because I’ve already stuck my neck out on the issue, I’ll focus on a recent paper by Funder and Ozer, which deals with the practice of squaring correlations. The practice in question relates to ways in which scientists try to understand the magnitude of their findings—how important or meaningful they are, and how much of the world they explain. Funder and Ozer contend that the practice of squaring the correlation is 'actively misleading.'

I addressed my statistical concerns with this premise in a recent Twitter thread, and I’ll omit most of those points here. All the reader needs to know is that Funder and Ozer argued that squaring the correlation is inappropriate and misleading, and that I counter-argued that it is neither. I showed my work. Twitter consensus largely agreed with me. However, some were quick to point out that Ozer had previously written a separate paper in 1985 which more effectively and soundly argued the position. Apparently, the 1985 paper argues that there are specific conditions under which it is inappropriate to square the correlation. Outside of those conditions, it is otherwise appropriate.

I haven’t had a chance to read the 1985 paper. But let me do this: Let’s assume for argument’s sake that it is completely, totally, and inarguably correct. Seriously. That is better for the point I want to make here. So, under those conditions, my tweetstorm was only conditionally correct and otherwise incorrect. On statistical grounds. But let’s talk about argument.

As many of you know, a correlation reflects the strength and direction of association between two variables. A correlation of 1 means that X and Y move up and down in perfect unison. A correlation of -1 means that X and Y are likewise perfectly associated, but in opposite directions: as X goes up, Y goes down to an identical degree. Squaring this value tells you the proportion of variance the two variables explain in each other. A correlation of .50 means that you can explain 25% of the information in X by knowing the state of Y and, conversely, you can explain 25% of the information in Y by knowing the state of X. [For a more technical, detailed explanation, see my thread.]
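As a concrete illustration (my own sketch, not drawn from either paper), the squared correlation lines up exactly with the proportion of variance explained by a simple linear fit, which is easy to verify with simulated data:

```python
import numpy as np

# Sketch: for simple linear regression, the squared Pearson correlation
# equals the proportion of variance in Y explained by the fit on X.
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 0.5 * x + rng.normal(size=10_000)   # X and Y share some variance

r = np.corrcoef(x, y)[0, 1]             # Pearson correlation
slope, intercept = np.polyfit(x, y, 1)  # ordinary least-squares fit
resid = y - (slope * x + intercept)
r_squared = 1 - resid.var() / y.var()   # proportion of variance explained

print(round(r**2, 4), round(r_squared, 4))  # the two quantities agree
```

The agreement is exact (up to floating-point error), which is why "variance explained" is the standard reading of the squared correlation.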

What should be clear here is that the basis of this argument—one way or the other—is mathematical. Yet, in the opening paragraph of their four-paragraph argument, Funder and Ozer take issue with the word “only.”

“For example, an r of .30, squared, yields the number .09 as the ‘proportion of variance explained,’ and this conversion, when reported, often includes the word ‘only,’ as in ‘the .30 correlation explained only 9% of the variance.’”

The issue at hand is whether it is mathematically appropriate to square the correlation value. Yet, the basis of this argument is that we shouldn’t square the correlation because it changes the perspective of the finding (and allows others to emphasize this shift in perspective with the word ‘only’). This is a poor argument and an example of special pleading. That is, squaring the correlation is an accepted and statistically supported practice. The authors call for an exemption to that practice on semantic (and not empirical or argumentative) grounds.

Then things get especially problematic. They continue:

“We suggest that this calculation has become widespread for three reasons. First, it is easy arithmetic that gives the illusion of adding information to a statistic. Second, the common terminology of variance explained makes the number sound as if it does precisely what one would want it to do, the word explained evoking a particularly virtuous response. Third, the context in which this calculation is often deployed allows writers to disparage certain findings that they find incompatible with their own theoretical predilections.”

On the surface, this may seem to be a compelling set of arguments, but let's pull at some of the threads...

“It is easy arithmetic that gives the illusion of adding information…”

This is a bad faith argument that relies on ambiguity and begs the question. The terms ‘easy arithmetic’ and ‘illusion’ are vague enough to make what seems like a substantive point without committing to one. The conclusion they want us to draw—that this procedure does not provide additional information—is presupposed rather than supported or substantiated.

“…the word explained evoking a particularly virtuous response.”

This is an odd framing for a scientific argument. Funder and Ozer seem to be implying that other researchers pursue a course of action because of perceived virtue. The assertion here is that other researchers are basing their choices on a fallacy—that they perceive virtue in an act that, in actuality, is not virtuous. Virtue, of course, is not a well-operationalized term. This is both a straw man and a red herring. The presence or absence of virtue is immaterial to the position.

“The context in which this calculation is often deployed allows writers to disparage certain findings that they find incompatible with their own theoretical predilections.”

First and foremost, this is an ad hominem attack: Other writers and researchers do not square the correlation in pursuit of sound science, they do so to disparage. Beyond poor argumentation, the hypocrisy of this statement is rather jarring. The authors are demonstrating the very action they are agitating against, using semantic framing to dismiss an ostensibly viable position—literally disparaging something they find incompatible with their own predilections.

Again, let’s take as a given that Ozer (1985) makes an unassailable and entirely accurate case for the conditional inappropriateness of squaring the correlation. I was somewhat disingenuous when I said I haven’t had the chance to read this paper. It’s only eight pages long. I could read it right now. But I don’t want to run the risk that I disagree with it! I want it to be right for my purposes here. So, again, Ozer is correct, and I am conditionally incorrect. No problem. But if that is the case, why all of this poor argumentation? Why not simply say that it has been previously demonstrated that squaring the correlation is inappropriate under condition A and appropriate under condition B? The bad faith arguments and fallacies in the recent paper allow someone like me to thoroughly undermine the intended position. So why take such a risk?

To answer this question, I risk making a hasty generalization or worse, an ad hominem attack of my own. Let me say then that what follows is a matter of conjecture, a measured, educated assumption, but an assumption nevertheless.

It seems to me that this is an issue of partisanship and self-interest. Much like many of the disappointing responses to the Open Science Movement and replication crisis, Funder and Ozer appear to be trying to give cover to their own work. The argument about squaring the correlation is actually a fairly minor part of their paper. The majority of the work seeks to establish benchmarks for interpreting correlational effects. In so doing, Funder and Ozer assert that not only should correlations of .20 be considered “medium,” but that effects larger than .40 should be considered a “gross overestimate” that “will rarely be found.” Coincidentally, Funder and Ozer work in an area of psychology that routinely produces correlations smaller than .40.

I would like to refer readers to a paper I published last year in the Proceedings of the National Academy of Sciences. This paper contained six separate samples from the United States and the Netherlands and included clinical, naturalistic, self-reported, and physiological data. The paper exclusively contains bivariate correlations, measured across nine variable pairs. Moreover, the paper includes comparisons between repeatedly sampled cross-sectional estimates (i.e. nomothetic) and intraindividual estimates (i.e. idiographic). Finally, for the idiographic data, we provided raw correlations and correlations after removing all serial dependence. Importantly, correlations larger than .40 (and smaller than -.40) abound. They are present at both the nomothetic and idiographic levels, in clinical, naturalistic, and physiological data.

Of course that is one paper. But it should be stressed that the fields of clinical psychology and psychiatry are full of correlations greater than or equal to .40. Depression and anxiety are famously and consistently correlated somewhere between .50 and .75, depending on the nature of the sample and the operationalization of the constructs (e.g. clinical disorders, dimensional measures, etc.). Simply put, the assertions that correlations larger than .40 are overestimates or rare are not supported by the data.

Finally, let’s address the notion that a correlation of .20 should be considered medium-sized. I would like to formally disagree. First, let’s consider the conditions under which the explanatory value of this correlation is appropriately measured by squaring it. This tells us that X explains only 4% of Y and Y explains only 4% of X, leaving 96% of the information in each variable unexplained. Now, let’s take Ozer (1985) at face value and proceed with an interpretation that uses the raw correlation to reflect the explanatory value. Under these conditions, X explains 20% of Y and Y explains 20% of X. This is much better, but still leaves 80% of the information in each variable unexplained. Eighty percent. Unexplained. Your model doesn’t tell me 80% of what’s going on and I’m supposed to accept that that is a medium effect? I’m incredulous! Which, for the record, is another logical fallacy—the personal incredulity fallacy. So, my reaction does not merit a rejection of the premise. Really, the assertion that anything is small, medium, or large is unfalsifiable, so I’m not sure how we could support or reject the assertion. To Funder and Ozer’s credit, they acknowledge as much and work to use existing data to establish their benchmarks.

Psychology has, for many years, tried to shake the ‘soft science’ moniker, a classification that belies the statistical and computational rigors of our field. One of the purported strengths of the ‘hard’ sciences is their explanatory power. Perhaps unfairly, physics is often held as the measuring stick. After all, special relativity yielded the atom bomb, and that is just one of the many impressive explanatory feats of the field (albeit, a morally ambiguous contribution to humankind). Psychology has made worthy contributions to society. Personally, I believe operant conditioning has been our most lasting and important contribution, namely because the principles of punishment and reinforcement have demonstrated impressive explanatory power. Principles of behaviorism, generally, have been useful, underlying effective treatments such as exposure therapy for fear-based disorders. But these contributions are decades old now. Meanwhile, the prevalence of depression has remained stable over the past 50 years, and treatments for distress disorders (e.g. generalized anxiety disorder, major depressive disorder) are only effective for approximately 50% of patients. Thus, I don’t believe we should be extolling the virtues of small effects. I think we should be greatly interested in uncovering big effects, predictive effects, explanatory effects. If Funder and Ozer want to argue that small effects make an important contribution, I won’t argue the point. However, I don’t think we should be satisfied with such middling explanatory power.

I understand that there are those who will argue that there is a place for academic work to be just that, academic. But I will risk embarrassing myself with earnestness when I say that I believe we owe the world more than that. We have too many theories, too many terms, too many redundancies and reinventions of the wheel. And, frankly, we have too many fictions.

I am never happy to be wrong. I’m terribly thin-skinned and defensive. Being wrong is an unpleasant experience for me, as I’m sure it is for most people. But I understand that I need to be wrong sometimes or else my work is not falsifiable, and of little value. I try to give myself opportunities to be wrong—for instance, I share my data and code online whenever I publish a paper. As a recent example, I may need to soon change my perspective on multilevel modeling. I’ve long maintained that the homogeneity assumptions of multilevel modeling are violated by nonergodicity. But recent comments from Aidan Wright, John Medaglia, and Donnie Williams on Twitter have led me to consider changing my tune. There’s no denying that, in many cases, a multilevel approach would be far more efficient. Idiographic methods require generating and interpreting models person-by-person. In large samples this can be quite time-consuming. Thus, if the multilevel approach can produce accurate results, there is no reason for me to stand in the way. The only reason would be to preserve my own self-concept or self-interests. In fact, at this very moment I can feel myself unwilling to change my mind on the issue simply because I’ve felt so right about it for so long.

As I said at the top, I’m concerned about the many ways in which scientists devote time and attention to matters immaterial to innovation, discovery, convergence, consensus, and progress. More than anything, progress is what I’m most interested in. I feel incredibly lucky to have a tenured position at a well-regarded institution, and I am aware of the privilege that provides. Waxing philosophical about eschewing self-interest and self-promotion is a luxury. Some may not be heard at all without persistent self-promotion. And sometimes, the line between promoting your ideas and promoting yourself is hard to see. Still, I leave you with this appeal: where matters of substance are involved, where we can set the bar higher if we abandon our self-interests and predilections, where we have the opportunity to promote the common good, this should be our goal. Sometimes being wrong is the best way to make sure we get it right.

About the Author
Aaron Fisher Ph.D.

Aaron Fisher, Ph.D., is an associate professor of psychology at the University of California, Berkeley, where he heads The Idiographic Dynamics Lab.
