When Human Judgment Works Well, and When It Doesn’t

My last post here, the descriptively titled “Big Data’s Biggest Challenge? Convincing People NOT to Trust Their Judgment,” generated a fair amount of commentary. So I think it’s worthwhile to devote a couple of follow-on posts to the reactions, questions, and objections raised in response to my contention, which was (and is) that we should generally be relying a lot less on the judgments, diagnoses, and forecasts of human ‘experts,’ and a lot more on the outputs of cold, hard, data-driven algorithms.

A good place to start is with the simple question of where this contention comes from — why am I so convinced that we should be relying less on experts and more on algorithms? The simple answer is that both the theory and the data support this conviction.

Let’s take the data first: In my previous post I highlighted that there have been a raftload of studies comparing the predictions of human experts vs. those of algorithms, and that in the great majority of them the algorithms have been at least as good as or significantly better than the humans. In a meta-analysis conducted by William Grove and colleagues of 136 research studies, for example, expert judgments were clearly better than their purely data-driven equivalents in only eight cases.

Most of these studies took place in messy, complex, real-world environments, not stripped-down laboratory settings. Commenter Sean Kennedy pointed out that “… many of our decisions have to be made under much less than ideal ‘big data’ conditions. Data is often lacking, low-quality, or conflicting.” This is true, and what’s amazing is that these are exactly the conditions under which algorithms do better than people.

Why is this? Let’s turn to the theory.

A number of people noted that Nobel Prize winner Daniel Kahneman’s work, nicely summarized in his 2011 book Thinking, Fast and Slow, influenced their thinking a great deal. Me, too: Kahneman made gigantic contributions, and his book should be required reading for anyone seeking to understand how to make themselves and their organizations work better.

For our purposes here, Chapter 22 is paydirt. It’s titled “Expert Intuition: When Can We Trust It?” Kahneman conducted a lot of the work underlying it with Gary Klein, who was and is quite fond of experts and their intuitive abilities — much more so than Kahneman. What’s really interesting, though, is that the two of them ended up in complete agreement about the conditions required for good intuition to develop. There are two of them:

  • an environment that is sufficiently regular to be predictable
  • an opportunity to learn these regularities through prolonged practice

Medicine meets the first of these criteria, since human biology changes very slowly, but (Kahneman contends) the stock market doesn’t, because it’s just too chaotic and unpredictable. And within medicine, some specialties provide better and faster learning opportunities (the second criterion) than others. As the chapter states, “Among medical specialties, anesthesiologists benefit from good feedback, because the effects of their actions are likely to be quickly evident. In contrast, radiologists obtain little information about the accuracy of the diagnoses they make and about the pathologies they fail to detect. Anesthesiologists are therefore in a better position to develop useful intuitive skills.”

Kahneman drives this point about learning home with his conclusion that “Whether professionals have a chance to develop intuitive expertise depends essentially on the quality and speed of feedback, as well as on sufficient opportunity to practice.”

With this background, we can now see two main reasons why algorithms beat people. The first is that, as Kahneman writes, “Statistical algorithms greatly outdo humans in noisy environments for two reasons: they are more likely than human judges to detect weakly valid cues and much more likely to maintain a modest level of accuracy by using such cues consistently.” In other words, people often miss cues (i.e., data) in the environment that would be useful to them, and even when they are aware of such cues they don’t use them the same way every time. So the fact that most real-world environments are messy and noisy does not favor human experts over algorithms; in fact, it does just the opposite.
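For readers who like to see the consistency argument in numbers, here is a toy simulation in Python. It is only a sketch under stated assumptions, not anything taken from Kahneman’s book or from the studies cited above: the function name, the parameters, and the way the “judge” is modeled (same cues as the rule, but with weights that wobble randomly from case to case) are all made up for illustration. It covers only the consistency half of the argument, not the detection of cues that humans miss.

```python
import random


def simulate(n_cases=20000, n_cues=5, cue_validity=0.2, judge_noise=1.0, seed=0):
    """Compare a consistent rule with an inconsistent judge on the same weak cues.

    All names and parameter values are hypothetical, chosen for illustration.
    Returns (rule_accuracy, judge_accuracy).
    """
    rng = random.Random(seed)
    rule_hits = 0
    judge_hits = 0
    for _ in range(n_cases):
        # Each cue carries only a little information about the outcome.
        cues = [rng.gauss(0, 1) for _ in range(n_cues)]
        # The outcome is a weak signal from the cues buried in noise.
        outcome = cue_validity * sum(cues) + rng.gauss(0, 1) > 0

        # The rule weights every cue equally, the same way every time.
        rule_pred = sum(cues) > 0

        # The judge sees the same cues, but the weight placed on each one
        # drifts randomly from case to case (assumed model of inconsistency).
        judge_score = sum((1 + rng.gauss(0, judge_noise)) * c for c in cues)
        judge_pred = judge_score > 0

        rule_hits += rule_pred == outcome
        judge_hits += judge_pred == outcome
    return rule_hits / n_cases, judge_hits / n_cases


if __name__ == "__main__":
    rule_acc, judge_acc = simulate()
    print(f"consistent rule: {rule_acc:.3f}")
    print(f"wobbling judge:  {judge_acc:.3f}")
```

Neither predictor is impressive here, because the cues are weak by construction. But the rule that applies them the same way every time should score higher than the judge who has exactly the same information and uses it inconsistently, which is the “modest level of accuracy” Kahneman describes.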

The second reason is that fast, accurate feedback is not always available to a human expert. To continue Kahneman’s example, a radiologist won’t always know if the lump she was looking at eventually turned out to be cancer (the patient might have moved on to another care provider, for example), and she certainly won’t know quickly. Similarly, an interviewer won’t always get the feedback that the person he hired flamed out on the job two years down the road.

But well-designed algorithms can and do incorporate feedback and results over a long time frame, which helps explain why algorithmic approaches to pathology and talent management work so much better.

So where does this leave us? Well, if Kahneman’s theory is right, and if people don’t have any inherent data collection or processing superiority over automatic means, then we’re in this situation:


But if there is still something special about our innate data collection and/or processing abilities (and I think there is, at least for now), then we’re here:


Which one do you think it is?
