The Age of Centaurs

Instead of building smarter machines, let's build machines that make us smarter.

Posted Oct 06, 2017

The media keep serving up the tiresome theme of man versus machine, human intelligence versus Artificial Intelligence, Kasparov versus Deep Blue.

But look at what Garry Kasparov did after he lost his chess match against IBM’s Deep Blue computer. As Clive Thompson tells the story in his 2013 book Smarter Than You Think, Kasparov became fascinated with ways to partner with chess computers so that the combination of [grandmaster plus computer] would be better than either one alone. This kind of hybrid intelligence is sometimes referred to as a “centaur,” a composite of two different entities. These centaurs blend human intuition and creativity with the brute force calculation of moves and countermoves that computers do so easily.

And now there are chess tournaments, called freestyle tournaments, pitting different centaurs against each other. Clive Thompson describes one top centaur, Team Anson Williams, which is a three-person team consisting of a data analyst, a software engineer, and a math whiz, none of whom are outstanding chess players. They use a chess computer for things like handling the openings (a fairly rote part of the game) and then the humans collaborate with the computer when they get into the middle game where strategy comes into play. Computers like Deep Blue don’t do strategy; they just examine enormous numbers of possibilities. It’s easy to get overawed by the number of positions a chess computer examines, but most of these positions are blind alleys. Humans know what to ignore, but chess computers don’t. During one stretch, Team Anson Williams won 23 games, drew 27 games, and lost only one game. They went 2-0 against grandmasters.

In another freestyle contest that Thompson presents, the winning team consisted of two young amateurs working with three run-of-the-mill computers running inexpensive software. The team knew the strengths and weaknesses of each of their computers, and sometimes selected low-ranked moves just to confuse their opponents. They beat a version of Hydra, the strongest chess computer in existence at that time (and stronger than Deep Blue).

The humans don’t add much to the computers in blitz chess because there’s just not enough time. Even with 60-minute games, the centaurs don’t outperform the machines. But at longer time intervals, the centaurs dominate.

Now let’s shift to weather forecasting, another area in which computer analytics have made tremendous strides, relying on nonlinear partial differential equations. I am basing this discussion on the wonderful new book by Hoffman et al. (2017) Minding the Weather: How Expert Forecasters Think. The machine forecasts keep getting better and better. However, computer models do not produce weather forecasts. They generate predictions of the values of certain atmospheric parameters such as surface temperatures and wind directions at various altitudes. The humans are using the model outputs plus lots of other data to churn out the forecasts.

Weather forecasters run several computer models to see if they give the same analyses. The forecasters also judge the quality of different computer runs. For example, the forecasters might take into account the time since the last weather balloon went up. The weather balloons, carrying telemetry instrument packages, only get launched twice a day, so computer models become less accurate as the day goes on. As the skilled weather forecasters learn the idiosyncrasies of a computer model, they discover how to compensate for the tendencies—perhaps we should call them biases—baked into that model.

But this is not a situation in which it is getting harder and harder for the human to “improve” on the computer outputs, or produce forecasts that “beat” the computers. It is not a competition. Forecasters use the computer models for what they are, tools in a very large toolkit. As the computer models get better, the human forecasters become more skilled at interpreting them, comparing them, modifying them, and using them. For weather forecasting, much of the value added of the human comes from handling high-impact events and from recognizing when the machines are making larger errors.

That’s why skilled weather forecasters add value: they can improve on the computer outputs by 10 to 20 percent, sometimes by more than 20 percent, sometimes by just 5 percent. Even a 5 percent gain makes a difference over time. In a chess game, if the humans on the centaur team improved on the computer recommendation by 5 percent for critical moves, and there were perhaps 10 critical moves in a game, the cumulative effect would be roughly 50 percent a game, which would clearly give the centaur a winning edge.
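The arithmetic behind that estimate can be sketched in a few lines. This is only a rough illustration: it assumes the gains on each critical move are independent and that "improvement" can be treated as a single number, and the variable names are invented for the example. Simply adding the gains gives the figure quoted above; if each gain instead built on the previous one, the total would come out somewhat higher.

```python
# Hypothetical illustration of how a small edge per critical move adds up.
gain_per_move = 0.05   # 5 percent edge on each critical move (figure from the text)
critical_moves = 10    # roughly 10 critical moves per game (figure from the text)

# Simple sum: the back-of-the-envelope estimate used in the text.
additive_gain = gain_per_move * critical_moves

# Compounded version: an assumption that each gain multiplies the previous ones.
compounded_gain = (1 + gain_per_move) ** critical_moves - 1

print(f"additive:   {additive_gain:.0%}")
print(f"compounded: {compounded_gain:.0%}")
```

Either way, the point stands: a modest edge on the moves that matter accumulates into a large advantage over a whole game.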

Further, the value-added research in fields such as weather forecasting examines the average improvement of the human+computer team over the computer alone. The average may be less relevant than the maximum. Some forecasters must be more effective centaurs than others, and it is their value added that we should aspire to.

In response to the question, “Will computers replace human forecasters?” Hoffman et al. state that “the question is not about man versus machine but about man plus machine” (p. 335).

And that raises a few challenges. First, can we develop better methods for preparing the human part of the centaur to better manage the machine intelligence? One reason why humans don’t have a greater value added is that intelligent systems get replaced more and more frequently, giving humans less time to learn their foibles. We need to speed up that learning curve. Mueller, Klein & Burns (2009) developed an Experiential User Guide for just that purpose.

A second challenge is to build intelligent machines that are easier to learn and that explain their reasoning better. A few decades ago, when expert systems were coming into play, attempts to build expert systems for physicians failed because the physicians didn’t understand the way the devices thought—and that was in the day of production rules and straightforward logic. Today, with machine learning and deep neural nets, even the developers don’t know how their systems think. The systems have become much more inscrutable. (I am working with Robert Hoffman and Shane Mueller on a DARPA project called XAI, Explainable Artificial Intelligence, aimed at making these systems less inscrutable.) Therefore, we would expect that physicians would be less likely to rely on the systems than before, and the same holds for any profession in which the human is personally, professionally and legally responsible for making the decisions.

When the consequences are important, the human part of the centaur has to be firmly in control, because the humans will be responsible for the decisions that get made.

The centaurs of the future will consist of humans who are quicker to grasp how the machines think, and machines that are more easily understood by their handlers. As stated above, instead of building smarter machines we need to build machines that make us smarter.


Hoffman, R.R., LaDue, D.S., Mogil, H.M., Roebber, P.J., & Trafton, J.G. (2017). Minding the weather: How expert forecasters think. Cambridge, MA: MIT Press.

Mueller, S.T., Klein, G., & Burns, C. (2009). Experiencing the tool without experiencing the pain: Concepts for an Experiential User Guide. Proceedings of NDM9, the Ninth International Conference on Naturalistic Decision Making, London, UK.

Thompson, C. (2013). Smarter than you think: How technology is changing our minds for the better. New York, NY: The Penguin Press.