Taleb 2009
Vardi for the material on the events of Sept 18, 2008.
2 "Dr. Doom", New York Times, August 15, 2008
(2007) and other ideas came from the academic establishment, not from the popular press.
do not seem to be aware of the norm they are using, thus misestimating volatility, see Goldstein and Taleb (2007).
9 Practitioners have blamed the naive L-2 reliance in the

The first type of decisions is simple: it aims at "binary" payoffs, i.e., you just care whether something is true or false.
Clearly these are not very prevalent in life; they mostly exist in laboratory experiments and in research papers.

The second type of decisions depends on more complex payoffs. The decision maker does not just care about the frequency but about the impact as well or, even more complex, some function of the impact. So there is another layer of uncertainty, that of the impact. These decisions depend on higher moments of the distribution. When one invests, one does not care about the frequency, how many times one makes or loses money; one cares about the expectation: how many times money is made or lost times the amount made or lost. We will see that there are even more complex decisions.

More formally, where p[x] is the probability distribution of the random variable x and D the domain on which the distribution is defined, the payoff λ(x) is defined by integrating on D as:

$$\lambda(x) = \int_D f(x)\, p[x]\, dx$$

Note that we can incorporate utility or nonlinearities of the payoff in the function f(x). But let us ignore utility for the sake of simplification.

For a simple payoff, f(x) = 1. So λ(x) becomes the simple probability of exceeding x, since the final outcome is either 1 or 0 (or 1 and -1).

For more complicated payoffs, f(x) can be complex. If the payoff depends on a simple expectation, i.e., λ(x) = E[x], the corresponding function is f(x) = x, and we need to ignore frequencies since it is the payoff that matters. One can be right 99% of the time, but it does not matter at all since, with some skewed distribution, the consequence of the 1% error on the expectation can be too large. Forecasting typically has f(x) = x, a linear function of x, while measures such as least squares depend on the higher moments, f(x) = x^2.

11 The difference can be best illustrated as follows. One of the most erroneous comparisons encountered in economics is the one between “wine rating” and “credit rating” of complex securities. Errors in wine rating are hardly consequential for the buyer (the “payoff” is binary); errors in credit ratings bankrupted banks, as these carry massive payoffs.

12 More formally, a linear function with respect to the variable x has a zero second derivative; a convex function is one with a positive second derivative. By expanding the expectation of f(x) we end up with $E[f(x + \Delta x)] = f(x) + f'(x)\,E[\Delta x] + \tfrac{1}{2} f''(x)\,E[\Delta x^2] + \dots$, hence higher orders matter to the extent of the importance of higher derivatives.

BINARY PAYOFF                     LINEAR PAYOFF                               NONLINEAR PAYOFF
f(x)=0                            f(x)=1                                      f(x) nonlinear (= x^2, x^3, etc.)
Medicine (health, not epidemics)  Finance: nonleveraged investment            Derivative payoffs
Psychology experiments            Insurance, measures of expected shortfall   Dynamically hedged portfolios
Bets (prediction markets)         General risk management                     Leveraged portfolios (around the loss point)
Binary/Digital derivatives        Climate                                     Cubic payoffs (strips of out of the money options)
Life/Death                        Economics (policy)                          Errors in analyses of volatility
                                  Security: terrorism, natural catastrophes   Calibration of nonlinear models
                                  Epidemics                                   Expectation weighted by nonlinear
                                  Casinos                                     Kurtosis-based positioning (“volatility trading”)
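The distinction between a binary payoff and a payoff that depends on the expectation or on higher moments can be made concrete with a small numerical sketch. The two-outcome distribution below (a gain of 1 with probability 0.99, a loss of 200 with probability 0.01) is a hypothetical assumption chosen only for illustration, as are the function names; the sum is a discrete stand-in for the integral of f(x) p[x] over D.

```python
# Hypothetical two-outcome distribution (an assumption, not data from the paper):
# gain 1 with probability 0.99, lose 200 with probability 0.01.
outcomes = [(0.99, 1.0), (0.01, -200.0)]


def payoff(f, distribution):
    """Discrete analogue of integrating f(x) * p[x] over the domain D."""
    return sum(p * f(x) for p, x in distribution)


def binary(x):
    """Binary payoff: 1 if the outcome is favourable, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def linear(x):
    """Linear payoff, f(x) = x: the plain expectation."""
    return x


def quadratic(x):
    """Nonlinear payoff, f(x) = x^2: depends on the second moment."""
    return x * x


frequency_right = payoff(binary, outcomes)   # how often the bet "works"
expectation = payoff(linear, outcomes)       # what the bet earns on average
second_moment = payoff(quadratic, outcomes)  # dominated by the rare loss

print("frequency of being right:", frequency_right)
print("expectation:", expectation)
print("second moment:", second_moment)
```

Under these assumed numbers the bet is right 99% of the time, yet its expectation is negative, and the second moment is driven almost entirely by the rare loss.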
Figure 4 The Confirmation Bias At Work. The shaded area shows what tends to be missing from the observations. For negatively-skewed, fat-tailed distributions, we do not see much of the negative outcomes for surviving entities AND we have a small sample in the left tail. This illustrates why we tend to see a better past for a certain class of time series than warranted.

Figure 5 Outliers don’t Predict Outliers. The plot shows (in logarithmic scale) a shortfall in one given year against the shortfall in the following one, repeated throughout for the 43 variables. A shortfall here is defined as the sum of deviations in excess of 7%. Past large deviations do not appear to predict future large deviations, at different lags.

The passage from theory to the real world presents two distinct difficulties: "inverse problems" and "pre-asymptotics".

Inverse Problems. It is the greatest difficulty one can encounter in deriving properties. In real life we do not observe probability distributions. We just observe events. So we do not know the statistical properties until, of course, after the fact, as we can see in Figure 1. Given a set of observations, plenty of statistical distributions can correspond to the exact same realizations; each would extrapolate differently outside the set of events on which it was derived. The inverse problem is more acute when more theories, more distributions, can fit a set of data, particularly in the presence of nonlinearities or nonparsimonious distributions13.

So this inverse problem is compounded by two problems:
+ The small sample properties of rare events, as these will be naturally rare in a past sample. It is also acute in the presence of nonlinearities, as the families of possible models/parametrizations explode in number.
+ The survivorship bias effect of high impact rare events. For negatively skewed distributions (with a thicker left tail), the problem is worse. Clearly, catastrophic events will necessarily be absent from the data, since the survivorship of the variable itself will depend on such an effect. Thus, for left-tailed distributions, the observed sample will lead us to 1) overestimate the mean; 2) underestimate the variance and the risk.
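The small-sample effect described above can be sketched with a back-of-the-envelope calculation. The numbers are hypothetical assumptions (a rare loss of 150 with probability 0.01, a gain of 1 otherwise, and a history of 50 observations), and the function names are illustrative; the point is only that a history with no rare event shows a mean above, and a variance below, the true ones.

```python
def prob_no_rare_event(p_rare, n_obs):
    """Probability that a history of n_obs independent draws contains no rare event."""
    return (1.0 - p_rare) ** n_obs


def true_mean(p_rare, gain, loss):
    """Mean of a variable worth +gain with prob. 1 - p_rare and -loss with prob. p_rare."""
    return (1.0 - p_rare) * gain - p_rare * loss


def true_variance(p_rare, gain, loss):
    """Variance of the same two-outcome variable."""
    m = true_mean(p_rare, gain, loss)
    return (1.0 - p_rare) * (gain - m) ** 2 + p_rare * (-loss - m) ** 2


# Hypothetical parameters, chosen only for illustration.
p_rare, gain, loss, n_obs = 0.01, 1.0, 150.0, 50

clean_history = prob_no_rare_event(p_rare, n_obs)
mean_true = true_mean(p_rare, gain, loss)
variance_true = true_variance(p_rare, gain, loss)

# In a history with no rare event every observation equals +gain,
# so the sample mean is gain and the sample variance is zero.
mean_seen = gain
variance_seen = 0.0

print("P[no rare event in sample]:", clean_history)
print("true mean:", mean_true, "observed mean:", mean_seen)
print("true variance:", variance_true, "observed variance:", variance_seen)
```

With these assumed parameters, more than half of all 50-observation histories contain no loss at all, so the typical observer sees a positive mean and no variance for a variable whose true mean is negative.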
Figure 6 Regular Events Predict Regular Events. This plot shows, by comparison with Figure 5, how, for the same variables, mean deviation in one period predicts the one in the subsequent period.

Pre-asymptotics. Theories can be extremely dangerous when they were derived in idealized situations, the asymptote, but are used outside the asymptote (its limit, say infinity or the infinitesimal). Some asymptotic properties do work well preasymptotically (as we’ll see, with type-1 distributions), which is why casinos do well, but others do not, particularly when it comes to the class of fat-tailed distributions.

Most statistical education is based on these asymptotic, laboratory-style Platonic properties, yet we take economic decisions in a real world that very rarely resembles the asymptote. Most of what students of statistics do is assume a structure, typically with a known probability. Yet the problem we have is not so much making computations once you know the probabilities, but finding the true distribution.

V- THE TWO PROBABILISTIC STRUCTURES

There are two classes of probability domains, very distinct qualitatively and quantitatively, according to precise mathematical properties. The first, Type-1, we call “benign” thin-tailed nonscalable; the second, Type-2, “wild” thick-tailed scalable, or fractal (the attribution “wild” comes from the classification of Mandelbrot [1963]).

Taleb (2009) shows that one of the mistakes in the economics literature that “fattens the tails”, with two main classes of nonparsimonious models and processes (the jump-diffusion processes of Merton, 1973 14 or stochastic volatility models such as Engle’s ARCH15) is to believe that the second type of distributions are

As we saw from the data presented, this classification, “fat tails”, does not just mean having a fourth moment worse than the Gaussian. The Poisson distribution, or a mixed distribution with a known Poisson jump, would have tails thicker than the Gaussian; but this mild form of fat tails can be dealt with rather easily, since the distribution has all its moments finite. The problem comes from the structure of the decline in probabilities for larger deviations and the ease with which the tools at our disposal can be tripped into producing erroneous results from observations of data in a finite sample and jumping to wrong decisions.

The scalable property of Type-2 distributions: Take a random variable x. With scalable distributions, asymptotically, for x large enough (i.e. “in the tails”), the ratio $\frac{P[X > n x]}{P[X > x]}$ depends on n, not on x (the same property can hold for P[X<n x] for negative values). This induces statistical self-similarities. Note that owing to the finiteness of the realizations of random variables, and the lack of samples in the tails, we might not be able to observe such a property, yet not be able to rule it out.

For economic variables, there is no fundamental reason for the ratio of “exceedances” (i.e., the cumulative probability of exceeding a certain threshold) to decline as both the numerator and the denominator are multiplied by 2.

This self-similarity at all scales generates power-law, or Paretian, tails, i.e., above a crossover point, $P[X > x] = K x^{-\alpha}$.17 18

16 Makridakis et al (1993), Makridakis and Hibon (2000) present evidence that more complicated methods of forecasting do not deliver superior results to simple ones (already bad). The obvious reason is that the errors in calibration swell with the complexity of the model.
17 Scalable discussions: introduced in Mandelbrot (1963),
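The scalable property stated above, that the ratio of exceedance probabilities P[X > n x] / P[X > x] depends on n but not on x, can be checked numerically. The sketch below compares a Paretian tail with a standard Gaussian; the exponent alpha = 2, the constant K = 1, the multiplier n = 2, and the function names are all assumptions made for illustration.

```python
import math


def pareto_survival(x, alpha=2.0, k=1.0):
    """P[X > x] = K * x**(-alpha), taken as valid above the crossover point."""
    return k * x ** (-alpha)


def gaussian_survival(x):
    """P[X > x] for a standard Gaussian variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def exceedance_ratio(survival, x, n):
    """Ratio P[X > n x] / P[X > x] for a given survival function."""
    return survival(n * x) / survival(x)


thresholds = (1.0, 2.0, 4.0)
pareto_ratios = [exceedance_ratio(pareto_survival, x, 2) for x in thresholds]
gaussian_ratios = [exceedance_ratio(gaussian_survival, x, 2) for x in thresholds]

for x, rp, rg in zip(thresholds, pareto_ratios, gaussian_ratios):
    print(f"x={x}: Pareto ratio={rp:.4f}, Gaussian ratio={rg:.3e}")
```

For the Paretian tail the ratio stays at n^(-alpha) whatever the threshold, while for the Gaussian it collapses as x grows, which is the sense in which the former is scalable and the latter is not.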
2) “Atypicality” of Moves. For thin tailed domains, the conditional expectation of a random variable X, conditional on its exceeding a number K, converges to K for larger values of K.

For instance, the conditional expectation for a Gaussian variable (assuming a mean of 0), conditional on the variable exceeding 0, is approximately .8 standard deviations. But with K equal to 6 standard deviations, the conditional expectation converges to 6 standard deviations. The same applies to all the random variables that do not have a Paretian tail. This induces some “typicality” of large moves.

For fat-tailed variables, such a limit does not seem to hold: for a Paretian tail with exponent $\alpha > 1$, $E[X \mid X > K] = \frac{\alpha}{\alpha - 1} K$, which remains a fixed multiple of K however large K becomes.

The following table shows the evidence of lack of convergence to thin tails, hence lack of “typicality” of the moves. We stopped for segments for which the number of observations becomes small, since lack of observations in the tails can provide the illusion of “thin” tails.

Table 3- Conditional expectation for moves > K, 43 economic variables

K (Mean Deviations)   Mean Move (in MAD) in excess of K   n
1                     2.01443                             65958
2                     3.0814                              23450
3                     4.19842                             8355
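The contrast between “typical” and “atypical” large moves can be sketched with the closed-form conditional expectations. For a standard Gaussian, E[X | X > K] is the density at K divided by the probability of exceeding K; for a Paretian tail with exponent alpha > 1 it is alpha K / (alpha - 1). The exponent alpha = 3 and the function names are assumptions chosen only for illustration.

```python
import math


def gaussian_conditional_mean(k):
    """E[X | X > k] for a standard Gaussian: density at k over P[X > k]."""
    density = math.exp(-0.5 * k * k) / math.sqrt(2.0 * math.pi)
    survival = 0.5 * math.erfc(k / math.sqrt(2.0))
    return density / survival


def pareto_conditional_mean(k, alpha):
    """E[X | X > k] for a Pareto tail with exponent alpha > 1 (k above the crossover)."""
    if alpha <= 1.0:
        raise ValueError("the conditional mean is infinite for alpha <= 1")
    return alpha * k / (alpha - 1.0)


alpha = 3.0  # hypothetical tail exponent
levels = (1.0, 3.0, 6.0)
gaussian_ratios = [gaussian_conditional_mean(k) / k for k in levels]
pareto_ratios = [pareto_conditional_mean(k, alpha) / k for k in levels]

print("Gaussian E[X | X > 0]:", gaussian_conditional_mean(0.0))
for k, g, p in zip(levels, gaussian_ratios, pareto_ratios):
    print(f"K={k}: Gaussian E/K={g:.4f}, Pareto E/K={p:.4f}")
```

The Gaussian ratio E[X | X > K] / K falls toward 1 as K grows, while the Paretian ratio stays at alpha / (alpha - 1): exceeding a large threshold says little about how far beyond it the move will go.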
discussed in Bak et al (1987, 1988), Bak (1996), as power laws arise from conditions of self-organized criticality.
19 For the definition of Value at Risk, Jorion (2001); critique: Joe Nocera, “Risk Mismanagement: What led to the Financial Meltdown”, New York Times Magazine, Jan 2, 2009
Distribution 1 (“thin tailed”): First Quadrant: Extremely Safe; Second Quadrant: Safe
(1993), Harte et al. (1999), Solé et al (1999), Ritchie et al (1999), Enquist and Niklas (2001).
April meeting, 2008.
29 http://www.liveleak.com/view?i=ca2_1234032281
Harte, J; Kinzig, A; Green, J. Self-similarity in the distribution and abundance of species. Science. 1999 Apr 9;284(5412):334–336.

Haug, E. G. (2007): Derivatives Models on Models, New York, John Wiley & Sons

Jorion, Philippe, 2001, Value-at-Risk: The New Benchmark for Managing Financial Risk, McGraw Hill.

Kahneman, D. (1999), Objective happiness. In Well Being: Foundations of Hedonic Psychology, edited by Kahneman, D., Diener, E., and Schwartz, N., New York: Russell Sage Foundation

Levi, Isaac: “The Paradoxes of Allais and Ellsberg,” Economics and Philosophy, 2(1986), 23-53.

Solé, RV; Manrubia, SC; Benton, M; Kauffman, S; Bak, P. Criticality and scaling in evolutionary ecology. Trends Ecol Evol. 1999 Apr;14(4):156–160.

Sornette, Didier, 2004, Critical Phenomena in Natural Sciences: Chaos, Fractals, Self-organization and Disorder: Concepts and Tools, 2nd ed. Berlin and Heidelberg: Springer.

Stanley, H.E., L.A.N. Amaral, P. Gopikrishnan, and V. Plerou, 2000, “Scale Invariance and Universality of Economic Fluctuations”, Physica A, 283, 31–41

Taleb, N. (2007a) The Black Swan: The Impact of the Highly Improbable, Random House (US) and Penguin (UK).