Nassim Taleb’s Army Of Straw Men

[Taleb’s] entire claim to fame and genius consists of continually yelling at the world that Not All Distributions Are Gaussian, thereby bravely correcting the erroneous belief to the contrary that had been held by, approximately, none people.

Sonic Charmer, “Which Taleb output isn’t a hoax” ¹

This chapter is about Taleb’s foes: historians, statisticians, economists, quants, risk managers, professors of finance, bankers, nerds, people who wear ties and many others suffering from one cognitive impairment or another. Not only are they intellectually inferior to über-intellectual Nassim Taleb, but their professional activities also have disastrous consequences for society. Fortunately we now have a brave hero to protect us. All quotes from Taleb in this chapter are from the second edition of The Black Swan (TBS), Penguin, 2010.

The Great Quixotic Joke

Taleb obsessively condemns the normal, or Gaussian, distribution, one of the best-known probability distributions in statistics. He calls it the Great Intellectual Fraud. The Gaussian distribution should not be used in social science, he says (and at points he seems to be saying the same of the physical sciences too). One example to illustrate his point is the distribution of wealth, a severely skewed distribution with very heavy tails. Suppose you have a sample of 1000 poor fellows, and calculate their average wealth. Then add Bill Gates to your sample and recalculate the sample average. Your previous estimate of average wealth now looks ridiculous. Bill Gates is an outlier, but that’s exactly Taleb’s point: unlike averages drawn from a Gaussian distribution, averages of wealth or income are extremely sensitive to outliers. Then there are other “things that can’t be modeled by a Gaussian”, like the size of cities, popularity of websites, book sales etc. But his wealth/income example seems to carry most of the argumentative weight: it occurs many times throughout The Black Swan.

This could all be interesting, actually, were it not for the fact that Taleb has other things in mind than educating his statistically untrained readers. The reason he writes about these topics with such indignation is that, in his mind, he belongs to a small circle of thinkers who are aware that you can’t fit a bell curve to the distribution of wealth. Everybody else (statisticians, economists, …) uses only Gaussians. In a book of nearly 400 pages, you would think there is ample space for a specific example of a statistician inappropriately using a Gaussian. Alas, the only thing the reader gets, ad nauseam, is vague, general accusations like: “The bell curve is not ubiquitous in real life, only in the minds of statisticians”. Now here’s a challenge for Taleb (or for you, if you love Taleb): find one example – just one – of a paper, article or textbook where the author uses a Gaussian to describe the income or wealth distribution of a non-cherry-picked group of people (cherry-picked would be, e.g., the machine operators of Acme’s plant X). All the usual examples would do: GDP of countries, income distribution within countries, among the self-employed, artists, … I’m waiting, but I’m not holding my breath.

Ever wondered why the phrase ‘median income’ is much more common than ‘average income’? (Even when people mention ‘average’ income, they often mean median income.) Because the skewness of income distributions is generally recognized by even the least bright among economists; because everybody knew, long before Taleb, that the median is a much more stable summary of a skewed distribution than the mean (= average). It’s such a basic fact that it’s even taught in high school (at least I learned it in high school).
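The Bill Gates thought experiment is easy to reproduce. What follows is only a minimal sketch with made-up numbers: the wealth figures for the 1000 "poor fellows" and the round 100 billion for the outlier are assumptions chosen for illustration, not data.

```python
from statistics import mean, median

# Hypothetical sample: 1000 people with modest wealth (10,000 up to 29,980).
poor = [10_000 + 20 * i for i in range(1000)]
# Assumed order of magnitude for the single outlier.
gates = 100_000_000_000

before_mean, before_median = mean(poor), median(poor)

sample = poor + [gates]
after_mean, after_median = mean(sample), median(sample)

print(f"mean:   {before_mean:>15,.0f} -> {after_mean:>15,.0f}")
print(f"median: {before_median:>15,.0f} -> {after_median:>15,.0f}")
```

With these assumed figures the mean jumps by several orders of magnitude, while the median moves by no more than the gap between two neighbouring observations. That is the whole reason median income is the figure people usually quote.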

It’s astonishing that Taleb can make such a preposterous joke of economics and get away with it. Really, if you are convinced Taleb is on to something, do the test: find some information about income distributions (or sizes of cities or book sales or…) on the internet, in textbooks or wherever. You may actually learn something interesting.

The Myth of “once-in-a-billion-years”

Anyone who has studied finance has come across the following example. Suppose stock market returns follow a normal distribution, and suppose you estimate the average return and standard deviation from past returns. Now that you’ve fully specified the probability distribution, you can work out the odds of, say, a return of less than -5% in one day, or less than -10%, and so on. It turns out those probabilities are ridiculously low. For example, using an annual volatility of 20% (which is already quite high for a broad market index), the probability of seeing a one-day return of less than -5% is only 0.0036%. Assuming 250 trading days in a year, such a negative return should occur only once every 110 years. In reality it happens far more often. The probability of Black Monday (the 1987 Wall Street crash)? A probability of 2.57 × 10⁻⁷². It should happen only once every 1.54 × 10⁶⁹ years, vastly longer than the age of the universe. And it just happened when we were around, ain’t that tough luck!!
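The classroom calculation can be sketched in a few lines. The function names, the zero daily mean, the square-root-of-time scaling of volatility and the -22.6% figure for Black Monday (roughly the Dow's one-day drop) are my assumptions here; the exact output depends on them, so it need not match the figures above to the last digit.

```python
import math

def normal_cdf(z: float) -> float:
    """P(Z <= z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def gaussian_odds(daily_return: float, annual_vol: float = 0.20,
                  trading_days: int = 250, daily_mean: float = 0.0):
    """Return (probability of a worse daily return, expected years between such days)."""
    daily_vol = annual_vol / math.sqrt(trading_days)   # assumed sqrt-of-time scaling
    p = normal_cdf((daily_return - daily_mean) / daily_vol)
    return p, 1.0 / (p * trading_days)

# -0.226 is an assumed size for the Black Monday drop.
for r in (-0.05, -0.10, -0.226):
    p, years = gaussian_odds(r)
    print(f"return < {r:+.1%}: probability {p:.3g}, once every {years:.3g} years")
```

Under these assumptions a -5% day comes out as a roughly once-a-century event and a Black Monday as something that should essentially never happen. Since both have been observed, the Gaussian premise is what has to go.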

The example is meant to point out that no, stock market returns do not follow a normal distribution, and it’s often used in classrooms or textbooks to introduce more realistic models. True, Mandelbrot was one of the first to study the empirical distribution of stock market returns. But Taleb’s assertion that Mandelbrot was ignored or even ridiculed in the world of finance is a blatant denial of history. Here’s a short excerpt from Donald MacKenzie’s superb ‘An Engine Not A Camera’:

Early in 1962, Mandelbrot was in Chicago and took the opportunity to visit the Graduate School of Business. Merton Miller and his colleagues were impressed – “We all wanted to make him an offer [of a professorship]” (Miller interview) – and a productive collaboration, albeit an informal and largely long-distance one, ensued. “Benoit had a great influence on a lot of us, not just the fat tails but the first really rigorous treatment of expectations [Mandelbrot 1966], and so on… I’m, to this day, a Mandelbrotian.”

Many economists learned about the fat tails of financial markets from Eugene Fama, a Nobel Prize laureate whose influential Ph.D. thesis (1964) focused on the behavior of stock market prices. Fama makes no secret of the fact that he’s heavily indebted to Benoit Mandelbrot, who was a sort of unofficial advisor for his Ph.D. research. Later research by Fama (and by others) offered alternatives to Mandelbrot’s extreme infinite-variance solution, which may have driven a wedge between Mandelbrot and mainstream economists. It’s important to appreciate the difference between these two things: the uncontroversial fact that financial markets exhibit fat tails on the one hand, and the question of which distribution best describes them on the other. Taleb makes it seem as if the recognition of fat tails immediately leads to Mandelbrot’s stable distribution, which is not true. If there’s still a debate going on, it’s about the latter point, not the former. So it’s disingenuous, to say the least, to allege that economists still haven’t embraced the fact that stock market returns don’t follow a normal distribution.

Why Mandelbrot’s solution was found very interesting initially, but later rejected in favor of less extreme models, is a question I will return to in one of the following chapters. But I can already say this: the explanation, attested by countless papers referring to Mandelbrot, contradicts Taleb’s unsubstantiated narrative about otherworldly academics. And it’s certainly not this:

According to the circumstances of 1987 [sic], people accepted that rare events take place and are the main source of uncertainty. They were just unwilling to give up on the Gaussian as a central measurement tool – “Hey, we have nothing else.” (TBS, emphasis added)

Since Taleb doesn’t name any of those “people”, I assume they’re made of straw. He does quote Paul Cootner, who (in the 1960s, as Taleb mentions correctly) expresses his dissatisfaction with Mandelbrot’s hypothesis, indeed warning his colleagues of its dramatic consequences for centuries of econometric work. But that was not the main reason for Cootner’s discomfort. It is characteristic of Taleb that he doesn’t mention any of Cootner’s material criticisms, but just takes one short out-of-context quote from the concluding paragraphs of Cootner’s paper. In that paper, Cootner denounces Mandelbrot’s lack of rigor in presenting his evidence, and his failure to investigate plausible alternative explanations for non-Gaussian behavior in commodity prices (namely, non-stationarity due to seasonal price factors).

Besides, by 1987 many economists and other intelligent researchers had developed advanced analytic techniques to mitigate the “dramatic consequences”: even if the economic variables follow a stable distribution, there are now plenty of statistical tools to analyze such data. But most importantly, although Cootner disagreed with Mandelbrot’s specific diagnosis (stable distributions), he agreed with the observed symptoms (fat tails). He argued the data were just not convincing enough to abandon other – more benign – solutions. Later research confirmed some aspects of stable distributions to be inconsistent with the data.

Much of Taleb’s anger is targeted at the inventors of the famous Black-Scholes-Merton option formula, two of whom were awarded a Nobel Prize (Fischer Black was already dead when Myron Scholes and Robert Merton were nominated). If I were a psychoanalyst of sorts, I would trace Taleb’s anger and frustration back to that infamous 1997 Nobel Prize. A profound envy led to a troubled psychology characterized by “noisy ape behavior with little personal control” (his own words, except that he applies them to his adversaries rather than to himself). But don’t worry, I’m not a psychoanalyst.

Anyway, it was none other than Fischer Black who wrote “The Holes in Black-Scholes”, pointing out the defects in the model and suggesting remedies. Many of the later criticisms of the model, including Taleb’s, may simply have been copied from Black’s article, directly or indirectly. Amusingly, at some point Black refers to the formula as the ‘Black & Holes’ formula, demonstrating that his humor too was more developed than Taleb’s. In any case, it shows that Taleb’s intellectual enemies are not the ivory-tower theorists he makes them out to be. More than anyone else, they are aware of the limitations of their models. (Speaking of ivory towers, it’s rather Taleb’s antiquated philosophy that’s useless when dealing with real-life problems, as I will argue further on.)

In a footnote in the chapter called The Bell Curve, That Great Intellectual Fraud, Taleb claims:

One of the most misunderstood aspects of a Gaussian is its fragility and vulnerability in the estimation of tail events. The odds of a 4 sigma move are twice that of a 4.15 sigma. The odds of a 20 sigma are a trillion times higher than those of a 21 sigma! It means that a small measurement error of the sigma will lead to a massive underestimation of the probability. We can be a trillion times wrong about some events (emphasis added).

Misunderstood by whom? By the anonymous statistician, no doubt, that fictional character relentlessly tormenting Taleb’s mind. By the way, what’s the importance of the difference between a 20 sigma and a 21 sigma event? Both probabilities are equal to 0%. Well, not ‘exactly’; one differs from zero in the don’t-know-how-many-hundredth decimal, the other in the don’t-know-how-many-hundredth-more decimal. Maybe it is a trillion times more; I can’t verify it, because Excel doesn’t go beyond the hundredth decimal. But come on, who cares? Unless the distribution is far from Gaussian, but then this talk of sigmas is meaningless anyway.
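The ratios in the footnote can be checked outside a spreadsheet. Here is a rough sketch, assuming a strict standard normal and one-sided tails (the function name `upper_tail` is mine); it relies on Python's `math.erfc`, which stays usable far into the tail, where `1 - cdf` would simply round to zero.

```python
import math

def upper_tail(k: float) -> float:
    """P(Z > k) for a standard normal, computed directly in the tail."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

ratio_4 = upper_tail(4.0) / upper_tail(4.15)
ratio_20 = upper_tail(20.0) / upper_tail(21.0)

print(f"P(>4 sigma)  / P(>4.15 sigma) = {ratio_4:.2f}")
print(f"P(>20 sigma) / P(>21 sigma)   = {ratio_20:.3g}")
print(f"P(>20 sigma) itself           = {upper_tail(20.0):.3g}")
```

By my arithmetic the first ratio is indeed close to two, while the second comes out nearer a billion than a trillion. Either way, both tail probabilities are so small that the ratio has no practical meaning.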

Turning back to the “once-in-a-billion-years” illustration: when I first heard it, I immediately recognized it for what it was: an empirical argument following a modus tollens logic:
- IF Gaussian, THEN low frequency of rare events
- Fact: High frequency of rare events
- Therefore: Not Gaussian
I wonder if there was anybody in the classroom who understood it differently. Perhaps something like this:

Teacher: “Suppose financial returns follow a Gaussian distribution…”

Student: “Hmmm, OK, I have to believe financial returns follow a Gaussian distribution, if I want to pass the exam.” (FYI: student didn’t pass).

Hippie on acid (very strong acid): “Wow this is amazing!! Everybody out there believes the Gaussian reigns over financial markets, but I’m privileged to be admitted into this secret society of sages who have privileged access to transcendent truths and… and, like, other cool stuff!! Hell yeah, let’s kick some economist’s ass!!! Hey look, there’s a pink swan!”

Fat Tony & the Mediterranean Comedian

Fat Tony is a fictional character whom Taleb pits against Dr. John (also fictional), Tony’s exact opposite. Dr. John – no, I don’t feel insulted, because I haven’t got a Ph.D. – is a caricature of the nerdy engineer. Even his physical appearance matches standard stereotypes: “thin, wiry, wears glasses and a dark suit” (Taleb hates people in suits, even though he’s often caught wearing one). Dr. John is “a master of the schedule”; “as predictable as a clock”; “reads the paper on the train to Manhattan” (aha! the train!), and “neatly folds his newspaper”; “packs his sandwich every morning”, sorry, I forgot an important modifier: “meticulously packs his sandwich”. And of course, there’s a “fruit salad in his plastic container”. How else could he stay so thin?

Taleb carefully pumps the reader’s intuition, inflating any seed of prejudice to grotesque proportions. His goal is obvious: the reader has to dislike Dr. John; the reader has to believe, uncritically, that people like Dr. John (engineers, nerds or whatever) are out of touch with reality.

So, very predictably (as usual in this sort of brainless comedy), Dr. John is going to say something stupid, and Taleb – sorry, Fat Tony – is going to make fun of him. They are both asked this question by professor Taleb:

Assume that a coin is fair, i.e., has an equal probability of coming up heads or tails. I flip it ninety-nine times and get heads each time. What are the odds of getting tails on the next throw?

Surprisingly, Dr. John gives the correct answer: 50%, because, as assumed, the coin is fair. Draws are independent, meaning that past draws don’t affect future ones. To Taleb (or his alter ego Fat Tony), however, the answer is wrong! Says Fat Tony to Dr. John:

“You are either full of crap or a pure sucker to buy that “50 pehcent” business. The coin gotta be loaded. It can’t be a fair game.”

The question is a nice example of a stylized problem, to borrow Taleb’s own words. Stylized problems, though rarely directly applicable to the real world, are very useful for sharpening one’s thinking. They help students understand concepts and practise analytic methods before moving to the deep end of the swimming pool. But Taleb has no patience with them. Even though fictional, the story reveals an interesting trait of Taleb’s: he is wont to change the rules in the middle of the game to make himself look good, and his opponent look bad. What I mean by that is this: first, Taleb asks us to imagine a fair coin. And then he asks us to imagine that same fair coin landing heads up 99 times in a row. Since this is just a thought experiment, there’s nothing really difficult about it. Imagining a fair coin that lands heads up 99 times in a row is just as easy as imagining a flying pig or 17 fairies on the back of a unicorn. A six-year-old can do it. But then suddenly, we’re supposed to imagine the coin is not fair. Well, that’s a degree of fantasy attainable only by experts in Alice-in-Wonderland logic: nothing is what it is, because everything is what it isn’t. And contrariwise, what it is, it ain’t.

OK, maybe Taleb means something like this: imagine you have a coin that, superficially, looks fair. But then you toss it 99 times, and it lands heads up each time. How would you assess the probability of the next toss being heads up as well? Fruit salad notwithstanding, Dr. John is sufficiently familiar with very basic sampling theory and hypothesis testing to give the right answer. The null hypothesis is that the coin is fair. Under the null hypothesis, the sample is not just extremely, but astronomically unlikely. So the null hypothesis can be rejected. Based on the sample frequencies, the probability of heads on the next toss is about 100%. If you don’t want to discount the possibility of the coin being fair altogether, then give it some non-zero prior probability and apply Bayesian reasoning. The result will still be close to 100%. To repeat: this is elementary statistics, covered in one of the first chapters of any introductory textbook. What’s really stretching the reader’s imagination here is that a Ph.D. wouldn’t be acquainted with it. That is undoubtedly the reason why Taleb has to resort to fictional characters, for want of examples from real life.
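The Bayesian version of the argument fits in a few lines. This is only a sketch under a deliberately crude assumption: there are just two hypotheses, a fair coin and a loaded coin that always lands heads. The function name, the two-hypothesis setup and the prior values are mine, chosen for illustration.

```python
def prob_next_heads(prior_fair: float, heads_in_a_row: int = 99,
                    loaded_p_heads: float = 1.0) -> float:
    """Posterior predictive probability of heads after a run of heads.

    Two hypotheses only (an assumption): the coin is fair, or it is loaded
    and lands heads with probability loaded_p_heads.
    """
    like_fair = 0.5 ** heads_in_a_row
    like_loaded = loaded_p_heads ** heads_in_a_row
    evidence = prior_fair * like_fair + (1.0 - prior_fair) * like_loaded
    post_fair = prior_fair * like_fair / evidence
    return post_fair * 0.5 + (1.0 - post_fair) * loaded_p_heads

# Even a prior of 99.9999% on "fair" is overwhelmed by 99 heads in a row.
print(prob_next_heads(prior_fair=0.999999))
# A prior of exactly 100% on "fair" is the stylized exam question.
print(prob_next_heads(prior_fair=1.0))
```

The second call is the point of the stylized problem: if fairness is given as certain, no run of heads can move the answer away from 50%. The first call is the real-life version, where the run of heads swamps almost any prior belief in fairness.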

Imagine a student in a physics exam, responding to the question: “how many seconds does it take for an object at a height of 10 meters to reach the ground, if there’s no opposing force from air resistance?”. Student’s response: “You stupid teacher, stop asking such silly questions. Do you seriously think there’s no air resistance? On what planet do you live, for Christ’s sake?!” Taleb would praise the student for thinking “out-of-the-box”. But this cheap imitation of out-of-the-box thinking (any idiot can do it) gets you nowhere, unless your ultimate goal in life is to annoy people as much as you can.

The following story is also purely fictional. Any resemblance to other fictional persons, living or dead, is purely coincidental.

Fat Tony is an army recruit learning how to shoot. After a day’s training, his fellow recruits tell their family at home about what they learned: how to hold their rifles steady, how to aim and focus, and so on. But Fat Tony complains to Dr. John: “I don’t get it… the army officers must be total morons. Can you imagine, they think the enemy is a round, flat object with circles painted on it!!”. Dr. John writes in his report: “Patient is unfit for service. Advise to relieve Tony Corpulent of his duties”.

The fallacy of the Ludic Fallacy

Fat Tony and Dr. John make their appearance in the chapter The Ludic Fallacy, or the Uncertainty of the Nerd. ‘Ludic’ is derived from the Latin word for ‘game’, ludus. Introductory probability courses typically start with examples from gambling, like throwing dice or tossing coins. But also more advanced concepts, for example in stochastic processes, often rely on such examples. The examples are important tools to teach formal, abstract concepts in an intuitive way. More than that, they are also important benchmarks for analyzing data and testing hypotheses. How can you tell if a coin is fair? Well, first you need to know what, in terms of probabilities, you can expect from a fair coin. Then you can compare the observations of the real coin with the expectations of the fair one. Only if the observations are too different can you reject your assumption of the real coin being fair.
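The benchmark logic described above (work out what a fair coin would do, then compare) can be sketched as an exact two-sided binomial test. The function name and the example counts are assumptions for illustration; the conventional 5% cut-off mentioned afterwards is also just a convention.

```python
from math import comb

def two_sided_p_value(heads: int, tosses: int) -> float:
    """Probability, under a fair coin, of a head count at least as far
    from tosses/2 as the one observed (exact binomial calculation)."""
    distance = abs(heads - tosses / 2)
    extreme = sum(comb(tosses, k) for k in range(tosses + 1)
                  if abs(k - tosses / 2) >= distance)
    return extreme / 2 ** tosses

print(two_sided_p_value(55, 100))   # mildly lopsided sample
print(two_sided_p_value(99, 99))    # Fat Tony's coin
```

A result of 55 heads in 100 tosses is entirely unremarkable for a fair coin, so the fairness assumption survives at any conventional threshold such as 5%. For 99 heads in 99 tosses it does not, and no statistician would pretend otherwise.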

It often happens that college freshmen completely miss the point of such examples. “But Mrs. White, how do you know the probability of H and T is really 50%? Just assuming it won’t make an unfair coin fair!” “Be patient Tony boy, we will relax that assumption in lecture 5. We must first learn to walk before we can run.” Nassim Taleb, despite being called a statistician by reporters obviously oblivious to such basic insights (I assume the same reporters refer to Deepak Chopra as a physicist), also misses the point entirely. He believes statisticians (and economists and so on) are prone to the Ludic Fallacy, the illusion that “the attributes of the uncertainty we face in real life [are connected] to the sterilized ones we encounter in exams and games”. As usual, not a single specific case to prove his point, just vague generalizations and unverifiable anecdotes.

In real life, Taleb warns us, “you don’t know the odds; you need to discover them”. Hey, isn’t that exactly what Mrs. White teaches her students in lecture 5 of her introductory statistics course? But Taleb and his friend Tony dropped out after the first lecture; they are now seeking wisdom in the library of the Philosophy department.

Taleb seems to take the distinction ‘classroom v. real life’ very seriously, because in a footnote in the same chapter, he gives another example of the Ludic Fallacy:

Organized competitive fighting trains the athlete to focus on the game and, in order not to dissipate his concentration, to ignore the possibility of what is not specifically allowed by the rules, such as kicks to the groin, a surprise knife, et cetera. So those who win the gold medal might be precisely those who will be most vulnerable in real life. (emphasis added)

In that case, practitioners of Taekwondo, the martial art with the most restrictive rules of competition (no holds, no throws, no low kicks, no hits with the knees or elbows, no punches to the head, …) must be extremely easy to defeat in a real fight. Kick ’em in the groin, the suckers won’t expect it! But you should first get close to them, which can be quite dangerous given their intense practice of fast, powerful and hard-to-dodge high kicks in the artificial environment of the dojang.

Unsurprisingly, Taleb can’t give any examples of martial artists being defeated in real fights due to their exposure to “artificial, sterilized environments”. However, continuing the footnote, he takes the point even further, making the reader wonder: surely you must be kidding, Mr. Taleb?

Likewise, you see people with huge muscles […] who can impress you in the artificial environment of the gym but are unable to lift a stone (emphasis added).

To be sure, strength isn’t necessarily embodied in huge muscles, and muscles can be further pumped up by illegal substances. But huge muscles without strength? That sounds more like Photoshop than the gym.

Sure, tennis players had better not spend their entire training time in the gym. But does that mean that any time in the gym hinders their overall performance? Nobody’s saying that all classroom knowledge is directly applicable to the real world. But it doesn’t follow that it’s completely irrelevant. On the contrary, the domain-unspecificity of abstract knowledge, often illustrated in the classroom by means of easy and intuitive examples, is what lends it its generality and applicability in a host of diverse areas.

Wrapping up, where are all the “idiots” (economists, professors of finance, statisticians, …), and what exactly did they do wrong? Let’s not forget Nassim Taleb distrusts induction, the inference from many cases to all. He prefers ‘intellectual thinking’. That must be the reason why he generalizes from a sample of zero, and lets his imagination produce evidence in support of his allegations.

Chapter synopsis: Black and Scholes did not believe there are no costs to trading, and let no Distinguished Professor of Risk Engineering tell you otherwise.

¹ The web page I took this quote from isn’t available anymore, unfortunately.

QUIXOTIC FINANCE

To protect our windmills against Don Quixote and his Knights of the Holy Black Swan


FOOLED BY A RED HERRING – CHAPTER 2: FAT TONY IS A PHONY
