Robyn Dawes

Robyn Dawes 1936 - 2010

There is no evidence that [unstructured] interviews yield important information beyond that of past behavior - except whether the interviewer likes the interviewee, which is important in some contexts.

~ Dawes, 1988, p. 206, emphasis his

I had the good fortune of counting Robyn Dawes among my mentors in graduate school and among my friends later. One of my (few) regrets is that I did not take his course on judgment and decision-making when I had the chance. In 1988, Robyn published Rational Choice in an Uncertain World, a book that I believe emerged from his lecture notes. Robyn taught that decisions are rational if they are coherent, that is, if they do not involve contradictions. Of course, some decisions are rational but morally bad. Facing this inconvenient truth, Robyn argued that the Kommandant of the Auschwitz death factory could be seen as a rational man. Robyn cared deeply about good decisions, and especially about decisions that avoid harm to others. He tirelessly exposed and opposed shoddy, self-serving, and irrational practices in the fields of mental health, education, and the law. His papers revealing the dangers of clinical judgment and prediction are celebrated today, at least in academic circles. Robyn also showed that there is a better way, and that this way is quite simple. Once empirically valid predictors are identified, they can be used in a spreadsheet for forecasting. In today’s world, there ought to be apps. At the core of Robyn’s efforts lay the insight that past behavior is the best (if regressive) predictor of future behavior. People may change, but not wildly so, and the way in which they change is hard to predict.

Now, three years after his passing, Jason Dana has published Dawes’s last paper (Dana, Dawes, & Peterson, 2013). It shows that unstructured selection interviews can actually degrade the quality of prediction: not just fail to provide incremental validity, but degrade validity. Understanding the poverty of the unstructured interview is important because the practice is so widespread and because practitioners are so enamored with it. Many flatter themselves that they can discriminate between winners and losers after spending 15 minutes in conversation with them. The illusion of validity has many fathers. For example:

[1] Affect. In getting-to-know-you interviews, as in most situations of interpersonal contact, people respond emotionally, and they typically accept emotions as valid responses to something that matters [I always had a strange feeling about Jack, but love Mary]. Discounting one’s own emotions feels like self-betrayal.

[2] Sampling. If those who are interviewed are already highly selected with regard to valid and quantitative criteria (e.g., test scores), those who are rejected after the interview would have been as good on the job as those who were accepted, but the interviewers will never know. Seeing those whom they selected succeed will tempt interviewers to [irrationally] infer that they are maestros of interviewing.

[3] Memory. Some of the selected individuals will do poorly. That is so inasmuch as the quantitative predictors are not perfectly valid (and how could they be?); hence regression to the mean. Looking back, interviewers will most likely be able to recruit memories that postfirm the failures, which lay in the future then but are known now.

[4] Self-justification. Interviewing is a costly enterprise. It devours time, money, and effort. Interviewers who find a way to justify these investments can shore up their professional self-respect and reputation.

Dana, Dawes, and Peterson (2013) conducted three experiments to expose the validity-eroding effect of unstructured interviewing. They provided their participants with the interviewees’ prior GPAs and asked them to predict the interviewees’ current GPAs. The correlation between the two is quite high (about .5, with variation from study to study). In theory, interviewers could do better if they asked the right questions. In practice, they did worse. Although their predictions did correlate with prior GPAs (i.e., they did not ignore that information), the correlations between predictions and current GPAs were about half the size of the correlations between prior and current GPAs. In other words, the interviews diluted the effect of a predictor (prior GPA) that would be quite valid if left alone. In an interesting twist, Dana et al. provided some interviewers with random responses, that is, with responses that were guaranteed to be worthless even if the non-random responses had some validity. If interviewers were attuned to the quality of the information they elicited – as most of them thought they were – then those in the random condition should have relied less on the interview and more on the prior GPAs. But they did not. The correlations arising in the random condition were similar to the correlations in the baseline condition. Dana, Dawes, and Peterson conclude that interviewers engage in online sensemaking. They assimilate whatever information emerges into an evolving schema and a coherent story about the person. Sensemaking is a kind of coherence that may seem rational, but it amplifies inaccuracy if there is no validity to begin with.
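The dilution effect can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a reconstruction of the experiments: the data are synthetic, the interview impression is modeled as pure noise, and the weight of 1.7 on that impression is a hypothetical value chosen so that predictions correlate about .5 with prior GPA.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # synthetic "students", standardized scores

prior = [rng.gauss(0, 1) for _ in range(n)]
# Current GPA: built to correlate about .65 with prior GPA (assumed value).
current = [0.65 * p + sqrt(1 - 0.65 ** 2) * rng.gauss(0, 1) for p in prior]
# Interview impression: assumed to carry no information about current GPA.
impression = [rng.gauss(0, 1) for _ in range(n)]
# Prediction: prior GPA blended with the impression (hypothetical weight).
prediction = [p + 1.7 * i for p, i in zip(prior, impression)]

r_prior_current = pearson(prior, current)
r_pred_prior = pearson(prediction, prior)
r_pred_current = pearson(prediction, current)

print(f"prior vs. current:      {r_prior_current:.2f}")
print(f"prediction vs. prior:   {r_pred_prior:.2f}")
print(f"prediction vs. current: {r_pred_current:.2f}")
```

In this setup the predictions do not ignore prior GPA, yet they track current GPA far less well than prior GPA does by itself, because the worthless impression only adds noise.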

A few years ago, I did an hour-long interview with Robyn. It is published on DVD as part of his Festschrift (Krueger, 2008). The experience was rather emotional, as many personal encounters are. It also produced a wealth of information, most of which helped me make sense of Robyn’s contributions. The interview was not intended to make any kind of forecast. It could not reveal, for example, the appearance of Dawes’s last paper in 2013.

Afteranalysis. The data can also be looked at in terms of partial correlations. When the effect of prior GPA on current GPA is statistically controlled, the correlations between respondents’ predictions and current GPA are near zero (with a little variation over conditions). In other words, what little validity post-interview predictions have is entirely explained by the fact that interviewers did not ignore prior GPAs. Conversely, the correlations between prior and current GPAs remain virtually unchanged when interviewers’ predictions are controlled.
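The partialling step can be sketched in a few lines of Python using the standard formula for a first-order partial correlation. The function name `partial_corr` is my own, and the inputs are illustrative rather than the published results: .65 and .5 are the rounded figures for the first study, and .33 is a hypothetical stand-in for a prediction-criterion correlation that is "about half the size."

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Illustrative values only (assumptions, not the published data).
r_pred_current = 0.33   # predictions vs. current GPA (hypothetical)
r_pred_prior = 0.50     # predictions vs. prior GPA
r_prior_current = 0.65  # prior GPA vs. current GPA

# Predictions vs. current GPA, with prior GPA controlled.
pred_given_prior = partial_corr(r_pred_current, r_pred_prior, r_prior_current)
# Prior GPA vs. current GPA, with predictions controlled.
prior_given_pred = partial_corr(r_prior_current, r_pred_prior, r_pred_current)

print(round(pred_given_prior, 2), round(prior_given_pred, 2))
```

With these inputs the first value is close to zero, while the second stays near .6: almost nothing is left of the predictions once prior GPA is removed, but most of prior GPA’s validity survives the removal of the predictions.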

We can turn this around and ask how large the unique validity of the interview would have to be for the correlation between interviewers’ predictions and current GPA to equal the correlation between prior and current GPA. Indexing predictions as 1, current GPA as 2, and prior GPA as 3, we find that r[1,2] = r[2,3] if the partial r[1,2.3] = (r[2,3] – r[1,3] × r[2,3]) / √((1 – r[1,3]^2) × (1 – r[2,3]^2)). If, for example, prior GPA is correlated with current GPA at r[2,3] = .65 and with interviewers’ predictions at r[1,3] = .5 – as in Dana et al.’s first study – the partial correlation between predictions and current GPA, r[1,2.3], must be greater than about .49 for the interview-based predictions to be better than the predictions based on prior GPA alone. This result may be surprising if one thought that as long as the interview has some unique validity and as long as prior GPA scores are not entirely ignored, predictions that combine the two will outperform predictions based on prior GPA scores alone.
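For readers who want to try other values, the threshold can be computed directly. The sketch below simply substitutes r[1,2] = r[2,3] into the standard partial-correlation formula. The function name `required_partial` is hypothetical, and the inputs are the rounded correlations quoted above, so the result is approximate.

```python
from math import sqrt

def required_partial(r13, r23):
    """Partial r[1,2.3] at which r[1,2] would equal r[2,3].

    Index 1 = predictions, 2 = current GPA, 3 = prior GPA.
    """
    return (r23 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Rounded inputs from the text: r[1,3] = .5, r[2,3] = .65.
threshold = required_partial(r13=0.50, r23=0.65)
print(f"{threshold:.2f}")
```

With these rounded inputs the threshold comes out just under .5. Lowering r[1,3], that is, letting predictions lean less on prior GPA, raises the bar further.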

Dana, J., Dawes, R. M., & Peterson, N. (2013). Belief in the unstructured interview: The persistence of an illusion. Judgment and Decision Making, 8, 512-520.

Krueger, J. I. (2008). Rationality and social responsibility: Essays in honor of Robyn M. Dawes. New York, NY: Psychology Press.
