J. Krueger

If your experiment needs statistics, you ought to have done a better experiment. ~ Ernest Rutherford

Many psychology students (and readers of Psychology Today) hate statistics, p < .05. During my first semester of studying psychology at the University of Bielefeld in what was then West Germany (1977), my fellow students and I learned that there were to be two mandatory courses in statistics, one in semester II of year I and another in semester I of year II. Most of my classmates were hoping to specialize in clinical psychology or counseling and thought they wouldn’t, couldn’t, and shouldn’t be bothered with statistics. Statistics, they thought, is about numbers, applied math, and soulless aggregates. Psychology is (or should be, in their view) about people, and only individuals are real people. Studying statistics would be a waste of time, particularly for all those who aspired to be working in the world of one-on-one encounters. Add math phobia to that and you have a potent cocktail of discontent and resistance.

I didn’t care much at the time because I had no interest in becoming a clinician or counselor. I saw my future in organizational psychology, and statistics seemed to be of some relevance there. So I figured, ‘Bring it on.’ And they did. Professor Ulrich Schulz, who had come to Bielefeld from Marburg, a bastion of quantitative psychology, took a no-nonsense approach to teaching. He may not have been the most approachable person, but we considered him tough and fair. He had us delve quite deeply into the world of Fisher, Kolmogoroff, and Pearson. Chebyshev’s inequality cast a big shadow over the semester. The first stats course in particular was so demanding and time-consuming that we would joke about having ended up in a dual-degree program in psychology and statistics.

When the two stats courses were completed and most of us had passed, we needed a third methods course and a comprehensive exam. A popular third course was Wolf Nowack’s ‘test construction,’ which was light on the math but strong on hands-on experience. Mathias Geyer’s ‘test theory’ was far less well attended mainly because (I think) it was firmly grounded in Marburg math. My own attitude toward stats and methods gradually improved, not because I struggled (or perhaps because of that if dissonance reduction played a role), but because I thought that stats might be the thing that could make psychology hard and respectable. Many seminar discussions were so free-flowing that it seemed any point of view could be defended. With stats, I figured, bad ideas could be weeded out.

A seminar on social cognition, hosted by Andrea Abele, brought a whole new perspective. We read the hot-off-the-press Human Inference by Nisbett & Ross (1980), a tractate with a driving force somewhere between Marx & Engels’ Communist Manifesto and the Apocalypse according to John of Patmos. Channeling Tversky & Kahneman’s then-recent work on heuristics and biases, Nisbett & Ross presented a new view of social cognition. Our perceptions and judgments are fundamentally flawed, they asserted, not because we are inert or emotional, but because we fail to think like statisticians. All of a sudden, what seemed to be two different worlds of psychology and statistics in semester II merged into one, and statistics ruled. Statistics set the standard; it provided hypotheses of how people should think, which could then be rejected using statistics. To me, this was a decisive moment. I had stumbled into a research paradigm that made psychology hard in theory (stats as the norm) and in practice (stats as the tool), and it generated a wealth of phenomena (errors and irrationalities) that had both surprise value in conversation and the promise to make something happen (educate people).

Since then – as some of you may know – my enthusiasm for the heuristics and biases school has declined, mainly because I realized that its focus on the downside of heuristic thinking neglects many of its successes. One might even say that this research strategy produces its own systematic error. We might call it success neglect, where success refers to the adaptive and rewarding judgments and decisions that can be made with the use of heuristics that do not meet the unforgiving criteria of statistical rationality. 

More importantly, though, it became clear that there is no ‘statistics’ in any singular, conceptual sense. There are, and always have been, competing – even warring – schools of thought in the statistical field. They agree on very little, not even on the meaning of their foundational term: probability. To be sure, statistics can work very well within a particular school and within a particular frame of reference. This is true for many kinds of scientific endeavor. Expert work within relativity or within quantum theory can yield insights that are astonishing and useful, aesthetical even. But as soon as you get the adherents of these paradigms to debate fundamental assumptions, paradise (and tranquility) is lost (Felin, Koenderink & Krueger, 2017).

And here is my point (it is not an original one): The selection of paradigm and the attainment of any semblance of consensus among ‘experts’ is a social process. If you think Kuhn, I think Fleck, who was Kuhn’s Vordenker [predecessor in thought]. Ludwik Fleck (Austrian, Jew, Pole, and master of the German language) coined the now-forgotten word Denkstil, which is a way of thinking, a way of perceiving, and a way of asking questions. In Kuhn’s hands, Denkstil became paradigm. No matter, statisticians make deep assumptions about what varies: the data or the hypotheses, and whether probability is ob- or subjective – among other dolores de cabeza. They can then accuse one another of ignorance, when what is the case is a failure to share core assumptions.

Of late, I have gotten back into the game of thinking about statistics. Patrick Heck and I published a paper on the heuristic value of p in inductive statistical inference in Frontiers in Psychology (Krueger & Heck, 2017). The p value, a probability that falls out of most statistical tests, has taken much on its little chin because it isn’t perfect. Nothing in the world of induction is. If you don’t like it, do deduction, although you won’t learn anything new. But p has ‘heuristic’ value. It does a pretty good job overall, while accepting systematic bias. In other words, the p value behaves like any other psychological heuristic.

And so the circle closes. From a time when statistics claimed to be the royal path to truth, and the human mind had better get on board with that, we have come to a time when we see that statistics cannot get off the ground without a psychology providing energy and direction.

Felin, T., Koenderink, J., & Krueger, J. I. (2017). Rationality, perception, and the all-seeing eye. Psychonomic Bulletin & Review. Online first, December 7, 2016. https://doi.org/10.3758/s13423-016-1198-z

Krueger, J. I., & Heck, P. R. (2017). The heuristic value of p in inductive statistical inference. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2017.00908

Nisbett, R., & Ross, L. (1980). Human inference. Englewood Cliffs, NJ: Prentice-Hall.
