Statisticians, it has been shown, tend to leave their brains in the classroom and engage in the most trivial inferential errors once they are let out on the streets.
Speaking, of course, is Nassim Taleb (The Black Swan, chapter 5 ‘Confirmation Shmonfirmation!’), never missing an opportunity to rail against the Evil Worshippers of the Gaussian. But for once, it’s not just a baseless accusation that the reader is supposed to accept without proof or argument. No, this is a rare occasion where Taleb actually makes a serious attempt at substantiating his claim, by referring to a specific and verifiable scientific study.
The study is one of the famous Kahneman and Tversky (K&T) studies conducted in the 1970s, revealing our cognitive biases when we make decisions under uncertainty. The studies show that we employ so-called “heuristics”, mental shortcuts if you wish, which cause us to make systematic errors against the rules of probability theory and statistics.
For example, in the experiment cited by Taleb, the subjects – professors of statistics according to Taleb – were asked the following question:
Assume that you live in a town with two hospitals – one large, the other small. On a given day 60 percent of those born in one of the two hospitals are boys. Which hospital is it likely to be? (Taleb, ibid.)
If you know the Law of Large Numbers (LLN) – I mean, not if you’ve merely heard about it or can parrot its definition, but if you really understand what it means – then you know the answer must be the smaller hospital. That’s because the LLN tells us that large deviations from the mean (here: mean = 50%) are more likely in small samples than in large ones. Incidentally, it’s a bit ironic that Taleb, while complaining that statisticians rely too heavily on the normal distribution and its underlying principle, the LLN, here takes them to task for exactly the opposite: not using it where it is appropriate.
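To make that concrete, here’s a quick back-of-the-envelope check. Taleb’s retelling doesn’t specify the hospitals’ sizes, so let’s borrow the figures from K&T’s original formulation – roughly 45 births per day in the large hospital and 15 in the small one – and assume each birth is a boy with probability 0.5. A minimal Python sketch, under those assumptions:

```python
from scipy.stats import binom

# Chance that strictly more than 60% of a day's births are boys,
# assuming independent births, each a boy with probability 0.5.
# Daily sizes (45 and 15) are borrowed from K&T's original problem.
for n in (45, 15):
    k = 3 * n // 5                 # 60% of n, in exact integer arithmetic
    p = binom.sf(k, n, 0.5)        # P(X > k), i.e. P(X/n > 0.6)
    print(f"{n} births/day: P(more than 60% boys) = {p:.3f}")
```

The small hospital comes out at roughly a 15% chance of such a skewed day, against roughly 7% for the large one – more than twice as likely, exactly as the LLN predicts.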
There are a few pertinent questions that an alert reader might ask, even before checking the facts. First of all, how can a classroom experiment, asking subjects to answer a hypothetical question (what Taleb calls a ‘stylized problem’), prove that the subjects “leave their brains in the classroom when let out on the streets”? Shouldn’t they be tested on the streets, rather than in the classroom? Secondly, you don’t need to know a lot about statistics to feel a little suspicious about Taleb’s assertion. If you can’t ask a professor of statistics what the correct answer to a statistical question is, who is to decide what it should be? Fat Tony, perhaps? The fictional character (probably Taleb’s alter ego) that embodies the anti-statistician, anti-nerd attitude glorified in The Black Swan?
We don’t need to rely on mere logic or arguments about plausibility. As it happens, the plain facts are within everyone’s reach. Many of K&T’s oft-cited papers are compiled in the book Judgment under uncertainty: Heuristics and biases (Cambridge University Press, 1982). Given the dearth of rebuttals to Taleb’s strong claims, I wonder how many people have actually read any of these papers. The large-or-small-hospital test is described in chapter 3 (‘Subjective probability: A judgment of representativeness’), page 44. On page 43 we discover who the test subjects were:
Subjects were 97 Stanford undergraduates with no background in probability or statistics (emphasis added).
So much for professors of statistics. When I use this example as a clear demonstration of Taleb’s indifference to facts, people sometimes attempt to rescue him. Granted, they argue, perhaps in this particular study the subjects were not statisticians. But K&T have emphasized that sophisticated scientists are prone to the same type of errors when asked to make intuitive judgments. Take for example the study described in chapter 2 of the book, ‘Belief in the law of small numbers’. Subjects in that study were “[participants] at meetings of the Mathematical Psychology Group and of the American Psychological Association”. If I were in the mood for nitpicking, I could object that being a member of such a group (or a participant at one of its meetings) does not make you a professor of statistics. Besides, throughout the chapter K&T refer to the subjects as “psychologists”, not as statisticians, much less professors of statistics. Anyhow, I’m happy to concede that at least some of them (especially in the Mathematical Psychology Group), if not all, can reasonably be expected to have had training in probability and statistics. But there’s something much more devastating to Taleb’s assertion than a mere argument about semantics. Namely: the precise question that was posed to the respondents. I quote, verbatim, the very first paragraph of ‘Belief in the law of small numbers’:
Suppose you have run an experiment on 20 subjects, and have obtained a significant result which confirms your theory (z = 2.23, p < .05, two-tailed). You now have cause to run an additional group of 10 subjects. What do you think the probability is that the results will be significant, by a one-tailed test, separately for this group?
Gulp… I can imagine psychologists, perhaps even statisticians, giving an incorrect answer to this question. But is this “a trivial inferential error”? Made “on the streets”? The large-or-small-hospital question, that was trivial. This one is more in the realm of five-star-difficulty puzzles for probability prodigies, quite the opposite of trivial. And again, the million-dollar question remains: if you can’t ask a professor of statistics, who’s to say what the correct answer is?
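For the record, here’s how one might actually work it out, the statistician’s way. One standard approach – and, as far as I can tell, the one K&T had in mind – is to take the observed effect as the best estimate of the true effect and compute the power of the replication. A minimal Python sketch, under that assumption:

```python
from math import sqrt
from scipy.stats import norm

# Replication question from 'Belief in the law of small numbers'.
# Hedged assumption: the effect observed in the first study is the
# true effect; the new group's z is then approximately normal around
# its expected value, with unit variance.
z1, n1, n2 = 2.23, 20, 10
effect = z1 / sqrt(n1)                 # per-subject effect implied by study 1
expected_z2 = effect * sqrt(n2)        # expected z for the new group of 10
crit = norm.ppf(0.95)                  # one-tailed 5% cutoff, about 1.645
prob = norm.sf(crit - expected_z2)     # chance the new z clears the cutoff
print(f"expected z = {expected_z2:.2f}, P(significant) = {prob:.2f}")
```

Under those assumptions the answer comes out at roughly 0.47 – barely better than a coin flip, and far lower than intuition suggests. Slow, painstaking and formula-driven: precisely the kind of reasoning nobody performs “on the streets”.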
Whereas Taleb’s mantra is that we should shun everything presented with mathematical symbols or equations (while at the same time (ab)using the language of mathematics himself to support his theses), K&T come to a totally different conclusion (chapter 3):
We surely do not mean to imply that man is incapable of appreciating the impact of sample size on sample variance. People can be taught the correct rule, perhaps even with little difficulty.
Fair enough, they do insist that experienced researchers are prone to the same biases as lesser mortals – when they think intuitively. The qualification at the end is important. Statisticians are humans too. Studying probability theory helps you refine your intuition about uncertainty. But the whole point of mathematics is to serve as a tool assisting the mind in cases where intuition fails. (This is a topic I intend to elaborate on in future blog posts.) Studying statistics does not turn you into an android that spews out precise answers off the top of its head. But it does provide you with the skills to work out the correct answer the mathematician’s way: slowly, painstakingly, very often using long, complex formulas. That’s what mathematics is: a prosthetic device that’s extremely useful in organizing our thoughts, and in protecting us against cognitive pitfalls and blind reliance on intuition. And it works: statisticians are much better at avoiding basic mistakes than non-statisticians. One final quote from K&T:
Statistical tests, therefore, protect the scientific community against overly hasty rejections of the null hypothesis (seeing a significant result where none exists, JV).
So then, where are the statisticians in the K&T experiments? As should be clear now, they are everywhere. They came up with the test questions. They provided the benchmark (normative model, in K&T’s words) against which to evaluate the subjects’ responses. Finally, you will find the book very difficult to read if you’re not familiar with basic concepts from statistics and probability theory, such as prior probability, p-value, z-score, binomial distribution, sample variance, etc. Could that be one of the reasons Taleb dissuades his readers from studying statistics? Because doing so would allow them to actually check his dubious claims?