Probability Distributions
Probability Distributions
R-forge project
LATEXpowered
Contents
Introduction
Discrete distributions
27
II
34
Continuous distributions
35
47
56
75
84
8 Pareto family
88
108
111
3
III
CONTENTS
116
117
12 Multivariate distributions
133
13 Misc
135
Conclusion
137
Bibliography
137
A Mathematical tools
141
Introduction
This guide is intended to provide a quite exhaustive (at least as I can) view on probability distributions. It is constructed in chapters of distribution family with a section for each distribution.
Each section focuses on the tryptic: definition - estimation - application.
Ultimate bibles for probability distributions are Wimmer & Altmann (1999) which lists 750
univariate discrete distributions and Johnson et al. (1994) which details continuous distributions.
In the appendix, we recall the basics of probability distributions as well as common mathematical functions, cf. section A.2. And for all distribution, we use the following notations
X a random variable following a given distribution,
x a realization of this random variable,
f the density function (if it exists),
F the (cumulative) distribution function,
P (X = k) the mass probability function in k,
M the moment generating function (if it exists),
G the probability generating function (if it exists),
the characteristic function (if it exists),
Finally all graphics are done the open source statistical software R and its numerous packages
available on the Comprehensive R Archive Network (CRAN ). See the CRAN task view on probability distributions to know the package to use for a given non standard distribution, which is
not in base R.
http://cran.r-project.org
http://cran.r-project.org/web/views/Distributions.html
Part I
Discrete distributions
Chapter 1
1.1.1
The discrete uniform distribution can be defined in terms of its elementary distribution
(sometimes called mass probability function):
0.14
1
,
n
0.12
P (X = k) =
0.08
P(X=k)
where k S = {k1 , . . . , kn } (a finite set of ordered values). Typically, the ki s are consecutive positive integers.
0.06
F (k) =
0.10
1.1
1X
11(ki k) ,
n
i=1
10
G(t) = E(tX ) =
1 X ki
t ,
n
i=1
with the special cases where the ki s are {1, . . . , n}, we get
G(t) = z
7
1 zn
,
1z
when z 6= 1.
Finally, the moment generating function is expressed as follows
4
M (t) = E(tX ) =
1 X tki
e ,
n
i=1
1.1.2
t 1etn
1et
Properties
P
The expectation is X n , the empirical mean: E(X) = n1 ni=1 ki . When S = {1, . . . , n}, this is just
n+1
1 Pn
n2 1
2
i=1 (ki E(X) which is 12 for S = {1, . . . , n}.
2 . The variance is given by V ar(X) = n
1.1.3
Estimation
Since there is no parameter to estimate, calibration is pretty easy. But we need to check that
sample values are equiprobable.
1.1.4
Random generation
1.1.5
Applications
A typical application of the uniform discrete distribution is the statistic procedure called bootstrap
or others resampling methods, where the previous algorithm is used.
1.2
1.2.1
Bernoulli/Binomial distribution
Characterization
9
mass probability function
0.25
0.30
B(10,1/2)
B(10,2/3)
P(X=k)
0.15
0.20
P (X = k) = Cnk pk (1 p)nk ,
0.00
0.05
0.10
n!
,
where Cnk is the combinatorial number k!(nk)!
k N and 0 < p < 1 the success probability. Let us notice that the cumulative distribution function has no particular expression. In
the following, the binomial dsitribuion is denoted by B(n, p). A special case of the bino0
2
4
6
8
10
mial dsitribution is the Bernoulli when n = 1.
k
This formula explains the name of this distribution since elementary probabilities P (X = k) Figure 1.2: Mass probability function for binomial
are terms of the development of (p + (1 p))n distributions
according the Newtons binom formula.
Another way to define the binomial distribution is to say thats the sum of n identically and
independently Bernoulli distribution B(p). Demonstration can easily be done with probability
generating function. The probability generating function is
G(t) = (1 p + pz)n ,
while the moment generating function is
M (t) = (1 p + pet )n .
The binomial distribution assumes that the events are binary, mutually exclusive, independent
and randomly selected.
1.2.2
Properties
The expectation of the binomial distribution is then E(X) = np and its variance V ar(X) =
np(1 p). A useful property is that a sum of binomial distributions is still binomial if success
L
10
1.2.3
Estimation
Bernoulli distribution
Let (Xi )1im be an i.i.d. sample of binomial distributions B(n, p). If n = 1 (i.e. Bernoulli
distribution, we have
m
1 X
pm =
Xi
m
i=1
is the unbiased and efficient estimator of p with minimum variance. It is also the moment-based
estimator.
There exists a confidence interval for the Bernoulli distribution using the Fischer-Snedecor
distribution. We have
"
1
1 #
mT
mT +1
, 1+
,
I (p) =
1+
f2(mT +1),2T, 2
f
T
T + 1 2(mT ),2(T +1), 2
P
where T = m
i=1 Xi and f1 ,2 , the 1 quantile of the Fischer-Snedecor distribution with 1 and
2 degrees of freedom.
We can also use the central limit theorem to find an asymptotic confidence interval for p
u p
u p
I (p) = pm
pm (1 pm ), pm +
pm (1 pm ) ,
n
n
where u is the 1 quantile of the standard normal distribution.
Binomial distribution
When n is not 1, there are two cases: either n is known with certainty or n is unknown. In the
first case, the estimator of p is the same as the Bernoulli distribution. In the latter case, there are
no closed form for the maximum likelihood estimator of n.
One way to solve this problem is to set n
to the maximum number of success at first. Then
we compute the log likelihood for wide range of integers around the maximum and finally choose
the likeliest value for n.
Method of moments for n and p is easily computable. Equalling the 2 first sample moments,
we have the following solution
(
2
Sm
p = 1 X
m
,
n
= Xpm
with the constraint that n
N.
Exact confidence intervals cannot be found since estimators do not have analytical form. But
we can use the normal approximation for p and n
.
1.2.4
11
Random generation
1.2.5
Applications
The direct application of the binomial distribution is to know the probability of obtaining exactly
n heads if a fair coin is flipped m > n times. Hundreds of books deal with this application.
In medecine, the article Haddow et al. (1994) presents an application of the binomial distribution
to test for a particular syndrome.
In life actuarial science, the binomial distribution is useful to model the death of an insured or
the entry in invalidity/incapability of an insured.
1.3
1.3.1
Characterization
mass probability function
M (t) =
0.3
P(X=k)
0.0
G(t) =
0.2
0.4
Cnk pk (1 p)nk
,
1 (1 p)n
B(10,2/3)
B(10,2/3,0)
B(10,2/3,1/4)
0.1
P (X = k) =
0.5
and
12
if k = 0
otherwise
1
p
where K is the constant 1(1p)
are the parameters. In terms of probability/moment
n , n, p, p
generating functions we have:
1.3.2
Properties
The expectation and the variance for the zero-truncated version is E(X) =
np(1p(1p+np)(1p)n )
(1(1p)n )2
np
1(1p)n
and V ar(X) =
Knp(1 p).
1.3.3
Estimation
From Cacoullos & Charalambides (1975), we know there is no minimum variance unbiased estimator
for p. NEED HELP for the MLE... NEED Thomas & Gart (1971)
Moment based estimators are numerically computable whatever we suppose n is known or
unknown.
Confidence intervals can be obtained with bootstrap methods.
1.3.4
Random generation
13
if U < p, then X = 0
otherwise
do; generate X binomially distributed B(n, p); while X = 0
return X
1.3.5
Applications
Human genetics???
1.4
1.4.1
Quasi-binomial distribution
Characterization
The quasi-binomial distribution is a small pertubation of the binomial distribution. The mass
probability function is defined by
P (X = k) = Cnk p(p + k)k1 (1 p k)nk ,
where k {0, . . . , n}, n, p usual parameters and ] np , 1p
n [. Of course, we retrieve the binomial
distribution with set to 0.
1.4.2
Properties
NEED REFERENCE
1.4.3
Estimation
NEED REFERENCE
1.4.4
Random generation
NEED REFERENCE
1.4.5
Applications
NEED REFERENCE
14
1.5
1.5.1
Poisson distribution
Characterization
0.4
k
e ,
P (X = k) =
k!
M (t) = e(e
0.0
0.1
G(t) = e(t1) ,
0.2
P(X=k)
0.3
10
(ct)k ct
e ,
k!
since the interoccurence are i.i.d. positive random variables with the property of lack of memory .
1.5.2
Properties
The Poisson distribution has the interesting but sometimes annoying property to have the same
mean and variance. We have E(X) = = V ar(X).
The sum of two independent Poisson distributions P() and P() (still) follows a Poisson
distribution P( + ).
Let N follows a Poisson distribution P().PKnowing the value of N = n, let (Xi )1in be a
sequence of i.i.d. Bernoulli variable B(q), then ni=1 Xi follows a Poisson distribution P(q).
1.5.3
15
Estimation
I () = n
m , n +
m ,
n
n
where u is the 1 quantile of the standard normal distribution.
1.5.4
Random generation
1.5.5
Applications
TODO
1.6
1.6.1
16
The zero-truncated version of the Poisson distribution is defined the zero-truncated binomial
distribution for the Poisson distribution. The
elementary probabilities is defined as
1.0
0.8
P(1/2)
P(1/2,0)
P(1/2,1/4)
0.6
k
1
,
k! (e 1)
ee 1
et 1
and
M
(t)
=
.
e 1
e 1
0.0
G(t) =
0.4
0.2
where k N . We can define probability/moment generating functions for the zerotruncated Poisson distribution P0 ():
P(X=k)
P (X = k) =
1p
where K is the constant 1e
. The generating functions for the zero-modified Poisson distribution P(, p) are
t
G(t) = p + K(et 1) and M (t) = p + K(ee 1).
1.6.2
Properties
The expectation of the zero-truncated Poisson distribution is E(X) = 1e and K for the zeromodified version. While the variance are respectively V ar(X) = (1e )2 and K + (K K 2 )2 .
1.6.3
Estimation
T
n (1
T
n (1
t1
2 Sn1
t
2 Sn
N1
T )
T
n
,
1e
17
P
where T = ni=1 Xi , 2 Snk denotes the Stirling number of the second kind and N1 the number of
observations equal to 1. Stirling numbers are costly do compute, see Tate & Goen (1958) for
approximate of theses numbers.
1.6.4
Random generation
1.6.5
Applications
NEED REFERENCE
1.7
Quasi-Poisson distribution
18
1.7.1
Characterization
TODO
1.7.2
Properties
TODO
1.7.3
Estimation
TODO
1.7.4
Random generation
TODO
1.7.5
1.8
1.8.1
Applications
Geometric distribution
Characterization
0.6
0.5
G(1/2)
G(1/3)
G(1/4)
0.3
0.2
0.1
F (k) = 1 (1 q)k+1 .
0.0
P(X=k)
0.4
P (X = k) = q(1 q)k ,
19
q
,
1 (1 q)t
1.8.2
q
.
1 (1 q)et
Properties
1q
q
1q
.
q2
The sum of n i.i.d. geometric G(q) random variables follows a negative binomial distribution
N B(n, q).
The minimum of n independent
geometric G(qi ) random variables follows a geometric distribuQ
tion G(q. ) with q. = 1 ni=1 (1 qi ).
The geometric distribution is the discrete analogue of the exponential distribution thus it is
memoryless.
1.8.3
Estimation
1
n ,
1+X
NEED REFERENCE
1.8.4
Random generation
TOIMPROVE WITH Devroye, L. (1986) Non-Uniform Random Variate Generation. SpringerVerlag, New York. Page 480.
20
1.8.5
Applications
1.9
1.9.1
G(1/3)
G(1/3,0)
G(1/3,1/4)
0.5
P (X = k) = p(1 p)k1 ,
0.3
0.2
0.1
F (k) = 1 (1 p)k .
0.0
P(X=k)
0.4
k
The zero-modified version of the geometric
distribution is characterized as follows
p
if k = 0
Figure 1.7: Mass probability function for zeroP (X = k) =
,
k
Kq(1 q)
otherwise
modified geometric distributions
1.9.2
Properties
1
p
1p
.
p2
1.9.3
21
Estimation
Snt1
,
Snt
P
1 Pn
nk C k (k + t 1) . The maximum
where t denotes the sum ni=1 Xi , Snt is defined by n!
t
n
k=1 (1)
likelihood estimator of q is given by
1
q = ,
Xn
which is also the moment based estimator. By the uniqueness of the unbiased estimator, q is a
biased estimator.
n )2
(X
2 .
Sn
n
X
2
Sn
NEED REFERENCE
1.9.4
Random generation
For the zero-truncated geometric distribution, a basic algorithm is to use i.i.d. Bernoulli variables
as follows
initialize X to 1 and generate U from an uniform distribution,
while U > q do ; generate U from an uniform distribution; X = X + 1;
return X.
While for the zero-modified geometric distribution, it is a little bit tricky
generate U from an uniform distribution
if U < p, then X = 0
otherwise
where Cnk s are the binomial coefficient and (n)m is the falling factorial.
22
1.9.5
Applications
NEED REFERENCE
1.10
1.10.1
Characterization
1.10.2
Characterization
mass probability function
0.15
0.10
0.05
0.00
P(X=k)
0.20
k
P (X = k) = Cm+k1
pm (1 p)k ,
k
s are combinatorial numwhere k N, Cm+k1
bers and parameters m, p are constraint by
0 < p < 1 and m N . However a second
parametrization of the negative binomial distribution is
r
k
(r + k)
1
P (X = k) =
,
(r)k!
1+
1+
NB(4,1/2)
NB(4,1/3)
NB(3,1/2)
0.25
0.30
The negative binomial distribution can be characterized by the following mass probability
function
23
One may wonder why there are two parametrization for one distribution. Actually, the first
parametrization N B(m, p) has a meaningful construction: it is the sum of m i.i.d. geometric G(p)
random variables. So it is also a way to characterize a negative binomial distribution. The name
comes from the fact that the mass probability function can be rewritten as
1 p k 1 mk
k
P (X = k) = Cm+k1
,
p
p
which yields to
k
P (X = k) = Cm+k1
P k Qmk .
1.10.3
Properties
m(1p)
p2
m(1p)
p
or (r(1 + )).
Let N be Poisson distributed P() knowing that = where is gamma distributed G(a, a).
Then we have N is negative binomial distributed BN (a, a ).
1.10.4
Estimation
2
Sn
Xn
1 and r =
n
X
.
NEED REFERENCE
1.10.5
Random generation
1.10.6
Applications
From Simon (1962), here are some applications of the negative binomial distribution
number of bacterial colonies per microscopic field,
quality control problem,
claim frequency in non life insurance.
24
1.11
1.11.1
Characterization
P (X = k) =
(1 (t 1))r (1 + )r
.
1 (r + )r
1p
,
1
)r
1( 1+
if k = 0
otherwise
1 r
1
r
) (
) ,
1 (t 1)
1+
M (t) = (
1
1 r
)r (
)
t
1 (e 1)
1+
1.11.2
Properties
r
1(r+)r
1.11.3
r(1+(1++r)(1+)r )
(1(r+)r )2
and
Estimation
According to Cacoullos & Charalambides (1975), the (unique) minimim variance unbiased estimator
of p for the zero-truncated geometric distribution is
p = t
t1
Sr,n
,
Snt
25
P
where t denotes the sum ni=1 Xi , Snt is defined by
likelihood estimator of q is given by
1
n!
nk C k (k + t 1) .
t
n
k=1 (1)
Pn
The maximum
1
q = ,
Xn
which is also the moment based estimator. By the uniqueness of the unbiased estimator, q is a
biased estimator.
1.11.4
Random generation
1.11.5
Applications
1.12
Pascal distribution
1.12.1
Characterization
The negative binomial distribution can be constructed by summing m geometric distributed variables G(p). The Pascal distribution is got from summing n geometrically distributed G0 (p) variables.
Thus possible values of the Pascal distribution are in {n, n + 1, . . . }. The mass probability function
is defined as
n1 n
P (X = k) = Ck1
p (1 p)kn ,
where k {n, n + 1, . . . }, n N and 0 < p < 1. The probability/moment generating functions are
G(t) =
1.12.2
pt
1 (1 p)t
n
and M (t) =
pet
1 (1 p)et
n
.
Properties
For the Pascal distribution Pa(n, p), we have E(X) = np and V ar(X) = n(1p)
. The link between
p2
Pascal distribution Pa(n, p) and the negative binomial distribution BN (n, p) is to substract the
constant n, i.e. if X Pa(n, p) then X n BN (n, p).
where Cnk s are the binomial coefficient and (n)m is the increasing factorial.
26
1.12.3
Estimation
1.12.4
Random generation
1.12.5
Applications
1.13
Hypergeometric distribution
1.13.1
Characterization
k C nk
Cm
N m
,
n
CN
n
n
t
CN
CN
m 2 F1 (n, m; N m n + 1; t)
m 2 F1 (n, m; N m n + 1; e )
and
M
(t)
=
,
n
n
CN
CN
1.13.2
Properties
nm
N
and V ar(X) =
nm(N n)(N m)
.
N 2 (N 1)
m
We have the following asymptotic result: H(N, n, m) 7 B(n, N
) when N and m are large such
m
that N 0 < p < 1.
N +
1.13.3
Estimation
1.13.4
Random generation
1.13.5
Applications
Let N be the number of individuals in a given population. In this population, m has a particular
m
. If we draw n individuals among this population, the random
property, hence a proportion of N
variable associated with the number of people having the desired property follows a hypergeometric
n
distribution H(N, n, m). The ratio N
is called the survey rate.
Chapter 2
Conway-Maxwell-Poisson distribution
Characterization
TODO
2.1.2
Properties
TODO
2.1.3
Estimation
TODO
2.1.4
Random generation
TODO
27
28
2.1.5
2.2
2.2.1
Applications
Delaporte distribution
Characterization
TODO
2.2.2
Properties
TODO
2.2.3
Estimation
TODO
2.2.4
Random generation
TODO
2.2.5
2.3
2.3.1
Applications
Engen distribution
Characterization
TODO
2.3.2
Properties
TODO
2.3.3
TODO
Estimation
2.3.4
Random generation
TODO
2.3.5
2.4
2.4.1
Applications
Logaritmic distribution
Characterization
TODO
2.4.2
Properties
TODO
2.4.3
Estimation
TODO
2.4.4
Random generation
TODO
2.4.5
2.5
2.5.1
Applications
Sichel distribution
Characterization
TODO
2.5.2
TODO
Properties
29
30
2.5.3
Estimation
TODO
2.5.4
Random generation
TODO
2.5.5
2.6
Applications
Zipf distribution
The name Zipf distribution comes from George Zipfs work on the discretized version of the Pareto
distribution, cf. Arnold (1983).
2.6.1
Characterization
2.6.2
Properties
TODO
2.6.3
Estimation
TODO
2.6.4
TODO
Random generation
2.6.5
2.7
2.7.1
Applications
TODO
2.7.2
Properties
TODO
2.7.3
Estimation
TODO
2.7.4
Random generation
TODO
2.7.5
2.8
2.8.1
Applications
Rademacher distribution
Characterization
TODO
2.8.2
Properties
TODO
2.8.3
TODO
Estimation
31
32
2.8.4
Random generation
TODO
2.8.5
2.9
2.9.1
Applications
Skellam distribution
Characterization
TODO
2.9.2
Properties
TODO
2.9.3
Estimation
TODO
2.9.4
Random generation
TODO
2.9.5
Applications
2.10
Yule distribution
2.10.1
Characterization
TODO
2.10.2
TODO
Properties
2.10.3
Estimation
TODO
2.10.4
Random generation
TODO
2.10.5
Applications
2.11
Zeta distribution
2.11.1
Characterization
TODO
2.11.2
Properties
TODO
2.11.3
Estimation
TODO
2.11.4
Random generation
TODO
2.11.5
Applications
33
Part II
Continuous distributions
34
Chapter 3
Uniform distribution
Characterization
1
,
ba
U(0,1)
U(0,2)
U(0,3)
0.8
f (x) =
density function
0.2
0.4
f(x)
0.6
if x < a
0
xa
if a x b .
F (x) =
ba
1
otherwise
0.0
Another way to define the uniform distribution is to use the moment generating function
0.0
M (t) =
etb
eta
t(b a)
3.1.2
0.5
1.0
1.5
2.0
2.5
3.0
eibt eiat
.
i(b a)t
Properties
a+b
2
(ba)2
12 .
36
If U is uniformally distributed U(0, 1), then (ba)U +a follows a uniform distribution U(a, b).
The sum of two uniform distribution does not follow a uniform distribution but a triangle
distribution.
The order statistic Xk:n of a sample of n i.i.d. uniform U(0, 1) random variable is beta distributed
Beta(k, n k + 1).
Last but not least property is that for all random variables Y having a distribution function
FY , the random variable FY (Y ) follows a uniform distribution U(0, 1). Equivalently, we get that
the random variable FY1 (U ) has the same distribution as Y where U U(0, 1) and FY1 is the
generalized inverse distribution function. Thus, we can generate any random variables having a
distribution from the a uniform variate. This methods is called the inverse function method.
3.1.3
Estimation
For a sample (Xi )i of i.i.d. uniform variate, maximum likelihood estimators for a and b are respectively X1:n and Xn:n , where Xi:n denotes the order statistics. But they are biased so we can use
the following unbiased estimators
a
=
n2
1
1
n
n
X1:n +
Xn:n and b =
X1:n + 2
Xn:n .
2
2
1
1n
1n
n 1
3.1.4
Since this is the core distribution, the distribution can not be generated from another distribution.
In our modern computers, we use deterministic algorithms to generate uniform variate initialized
with the machine time. Generally, Mersenne-Twister algorithm (or its extensions) from Matsumoto
& Nishimura (1998) is implemented, cf. Dutang (2008) for an overview of random number generation.
3.1.5
Applications
The main application is sampling from an uniform distribution by the inverse function method.
3.2
3.2.1
Triangular distribution
Characterization
37
density function
1.0
2(xa)
if a x c
(ba)(ca)
f (x) =
,
2(bx)
if c x b
0.8
T(0,2,1)
T(0,2,1/2)
T(0,2,4/3)
0.4
0.2
0.0
(ba)(bc)
f(x)
(xa)2
if a x c
(ba)(ca)
F (x) =
.
2
(bx)
1
if c x b
0.6
(ba)(bc)
3.2.2
2(c a)ebt
(b c)eat (b a)ect
+
.
2(b a)(c a)(b c)t2 (b a)(c a)(b c)t2
Properties
3.2.3
a+b+c
3
Estimation
Maximum likelihood estimators for a, b, c do not have closed form. But we can maximise the loglikelihood numerically. Furthermore, moment based estimators have to be computed numerically
solving the system of sample moments and theoretical ones. One intuitive way to estimate the
parameters of the triangle distribution is to use sample minimum, maximum and mode: a
= X1:n ,
b = Xn:n and c = mode(X1 , . . . , Xn ), where mode(X1 , . . . , Xn ) is the middle of the interval whose
bounds are the most likely order statistics.
3.2.4
Random generation
The inverse function method can be used since the quantile function has a closed form:
(
p
a + u(b a)(c a)
if 0 u ca
1
ba
p
.
F (u) =
b (1 u)(b a)(b c)
if ca
1
ba
38
3.2.5
Applications
A typical of the triangle distribution is when we know the minimum and the maximum of outputs
of an interest variable plus the most likely outcome, which represent the parameter a, b and c. For
example we may use it in business decision making based on simulation of the outcome, in project
management to model events during an interval and in audio dithering.
3.3.1
The beta distribution of first kind is a distribution valued in the interval [0, 1]. Its density is
defined as
2.0
f(x)
0.0
1.5
xa1 (1 x)b1
,
(a, b)
B(2,2)
B(3,1)
B(1,5)
Arcsine
0.5
f (x) =
density function
1.0
3.3
0.0
0.2
0.4
0.6
0.8
1.0
39
(a, b, x)
,
(a, b)
where x [0, 1] and (., ., .) denotes the incomplete beta function. There is no analytical formula
for the incomplete beta function but can be approximated numerically.
There exists a scaled version of the beta I distribution. Let be a positive scale parameter.
The density of the scaled beta I distribution is given by
f (x) =
xa1 ( x)b1
,
a+b1 (a, b)
(a, b, x )
.
(a, b)
Beta I distributions have moment generating function and characteristic function expressed in
terms of series:
!
+ k1
X
Y a+r
tk
M (t) = 1 +
a + b + r k!
k=1
r=0
and
(t) = 1 F1 (a; a + b; i t),
where 1 F1 denotes the hypergeometric function.
40
3.3.2
Special cases
A special case of the beta I distribution is the arcsine distribution, when a = b = 12 . In this special
case, we have
1
f (x) = p
,
x(1 x)
from which we derive the following distribution function
F (x) =
2
arcsin( x).
Another special case is the power distribution when b = 1, with the following density
f (x) = axa1 and F (x) = xa ,
for 0 < x < 1.
3.3.3
Properties
a
a+b
and V ar(X) =
ab
(a+b)2 (a+b+1)
(and
a
a+b ,
( + )( + r)
,
( + + r)()
F
,
r,
+
,
.
E ((X E(X))r ) =
2 1
+
For the arcsine distribution, we have 12 and 81 respectively. Let us note that the expectation of
a arcsine distribution is the least probable value!
Let n be an integer. If we consider n i.i.d. uniform U(0, 1) variables Ui , then the distribution
of the maximum max Ui of these random variables follows a beta I distribution B(n, 1).
1in
3.3.4
Estimation
Maximum likelihood estimators for a and b do not have closed form, we must solve the system
1 P
1
n
i=1
n
P
i=1
41
3.3.5
n)
n
Xn (1 X
1X
b = a
1
and
n .
Sn2
X
Random generation
NEED REFERENCE
3.3.6
Applications
The arcsine distribution (a special case of the beta I) can be used in game theory. If we have two
players playint at head/tail coin game and denote by (Si )i1 the serie of gains of the first player
for the different game events, then the distribution of the proportion of gains among all the Si s
that are positive follows asymptotically an arcsine distribution.
3.4.1
(a, b, ( x ) )
F (x) =
,
(a, b)
3.0
2.5
f(x)
1.0
0.5
2.0
(x/)a1 (1 (x/))b1
(a, b)
x
B(2,2,2,2)
B(3,1,2,2)
B(3,1,1/2,2)
B(1/2,2,1/3,2)
0.0
f (x) =
density function
1.5
3.4
0.0
0.5
1.0
1.5
2.0
42
3.4.2
Properties
(a + r )
.
(a, b)
3.4.3
Estimation
Maximum likelihood estimators as well as moment based estimators have no chance to have explicit
form, but we can compute it numerically. NEED REFERENCE
3.4.4
Random generation
NEED REFERENCE
3.4.5
Applications
NEED REFERENCE
3.5
3.5.1
A generalization of the generalized beta distribution has been studied in Nadarajah & Kotz (2003).
Its density is given by
f (x) =
b(a, b) a+b1
x
2 F1 (1 , a, a + b, x),
(a, b + )
where 0 < x < 1 and 2 F1 denotes the hypergeometric function. Its distribution function is also
expressed in terms of the hypergeometric function:
F (x) =
b(a, b)
xa+b 2 F1 (1 , a, a + b + 1, x),
(a + b)(a, b + )
3.5.2
43
Special cases
Nadarajah & Kotz (2003) list specials cases of this distribution: If a + b + = 1 then we get
f (x) =
b(b)xa+b1 (1 x)a
.
(1 a)(a + b)
If a + b + = 2 then we get
f (x) =
b(a + b 1)(a, b)
(a + b 1, 1 a, x)
(a, 2 a)
If in addition
a + b 1 N, we have
b(a + b 1)(a, b)(a + b 1, 1 a)
f (x) =
(a, 2 a)
a+b1
X
i=1
(i a)
xi1 (1 x)1a
(1 a)(i)
x
1x
2
arctan
!
k1
X
p
x
x(1 x)
di (x, k)
1x
i=1
If = 0 then, we get
f (x) = b(a + b 1)(1 x)b1 (a + b 1, 1 b, x)
If in addition
a + b 1 N, we have
f (x) = b(a + b 1)(a, b)(a + b 1, 1 a) 1
a+b1
X
i=1
(i b)
xi1 (1 x)1b
(1 b)(i)
x
1x
a = k N, we have
(2k 1)(1/2, k 1/2)
f (x) =
4 1x
2
arctan
!
k1
X
p
x
x(1 x)
di (x, k)
1x
i=1
44
(a, , x)
(a, + 1)
If in addition
a N, we have
f (x) =
a
+1
1
i=1
N, we have
f (x) =
a
+1
a
X
( + i 1)
1
()(i)
a
X
(a + i 1)
i=1
(a)(i)
!
x
i1
(1 x)
!
xa (1 x)i1
a = = 1/2, we have
4
f (x) = arctan
x
1x
a
+1
2
arctan
!
j1
k1
X
X
p
x
x(1 x)
di (x, k) +
ci (x, k)
1x
i=1
i=1
(k + i 1)xk1/2 (1 x)i1/2
(k 1/2)(i + 1/2)
and
di (x, k) =
3.5.3
(i)xi1
.
(i + 1/2)(1/2)
Properties
b(a, b)
xa+b 3 F1 (1 , a, n + a + b + 1, a + b, n + a + b + 1, 1),
(n + a + b)(a, b + )
3.5.4
45
Estimation
NEED REFERENCE
3.5.5
Random generation
NEED REFERENCE
3.5.6
Applications
NEED REFERENCE
3.6
3.6.1
Kumaraswamy distribution
Characterization
density function
K(5,2)
K(2,5/2)
K(1/2,1/2)
K(1,3)
2.5
1.5
1.0
0.5
0.0
f(x)
A construction of the Kumaraswamy distribution use minimum/maximum of uniform samples. Let n be the number of samples (each with
m i.i.d. uniform variate), then the distribution
of the minimumm of all maxima (by sample)
is a Kumaraswamy Ku(m, n), which is also the
distribution of one minus the maximum of all
minima.
2.0
0.0
0.2
0.4
0.6
0.8
1.0
46
3.6.2
Properties
E(X ) = b(1 + , b)
a
when > a with (., .) denotes the beta function. Thus the expectation of a Kumaraswamy
2
1
2 2
distribution is E(X) = b(1+1/a)(b)
(1+1/a+b) and its variance V ar(X) = b(1 + a , b) b (1 + a , b).
3.6.3
Estimation
From Jones (2009), the maximum likelihood estimators are computable by the following procedure
3.6.4
n
a
1+
1
n
Pn
Pn
i=1 log(1
log Yi
i=1 1Yi
Pn
Yi log Yi
Pni=1 1Yi
i=1 log(1Yi )
1
Xia ) .
Random generation
3.6.5
Applications
From wikipedia, we know a good example of the use of the Kumaraswamy distribution: the storage
volume of a reservoir of capacity zmax whose upper bound is zmax and lower bound is 0.
Chapter 4
The normal distribution comes from the study of astronomical data by the German mathematician
Gauss. Thats why it is widely called the Gaussian distribution. But there are some hints to
think that Laplace has also used this distribution. Thus sometimes we called it the Laplace Gauss
distribution, a name introduced by K. Pearson who wants to avoid a querelle about its name.
Characterization
0.8
density function
0.2
f(x)
0.6
N(0,1)
N(0,2)
N(0,1/2)
N(-1,1)
0.4
4.1.1
0.0
which has no explicit expressions. Many softwares have this distribution function implemented, since it is The basic distribution. Generally, we denote by the distribution function
-4
-2
0
2
4
a N (0, 1) normal distribution, called the stanx
dard normal distribution. F can be rewritten
as
x
.
F (x) =
Figure 4.1: The density of Gaussian distributions
47
48
Finally, the normal distribution can also be characterized through its moment generating function
2 t2
M (t) = emt+ 2 ,
as well as its characteristic function
(t) = eimt
4.1.2
2 t2
2
Properties
It is obvious, but let us recall that the expectation (and the median) of a normal distribution
N (, 2 ) is and its variance 2 . Furthermore if X N (0, 1) we have that E(X n ) = 0 if x is odd
and (2n)!
2n n! if x is even.
The biggest property of the normal distribution is the fact that the Gaussian belongs to the
family of stable distribution (i.e. stable by linear combinations). Thus we have
if X N (, 2 ) and Y N (, 2 ), then aX + bY N (a + b, a2 2 + b2 2 + 2abCov(X, Y )),
with the special case where X, Y are independent cancelling the covariance term.
if X N (, 2 ), a, b two reals, then aX + b N (a + b, a2 2 ).
If we consider an i.i.d. sample of n normal random variables (Xi )1in , then the sample mean
2
2
X n follows a N (, n ) independently from the sample variance Sn2 such that Sn2n follows a chi-square
distribution with n 1 degrees of freedom.
A widely used theorem using a normal distribution is the central
Pnlimit theorem:
X nm L
2
If (Xi )1in are i.i.d. with mean m and finite variance s , then i=1sni
N (0, 1). If we
drop the hypothesis of identical distribution, there is still an asymptotic convergence (cf. theorem
of Lindeberg-Feller).
4.1.3
Estimation
Xn =
1
n
Sn2 =
1
n1
i=1 Xi
Pn
i=1 (Xi
q
( n1
)p 2
2
n = n1
Sn is the unbiased estimator with minimum variance of but we generally
2
( n
)
2
p
use Sn2 .
Confidence intervals for these estimators are also well known quantities
This estimator is not the maximum likelihood estimator since we unbias it.
I() = X n
I( 2 ) =
2n
Sn
zn1,/2
2
Sn
n tn1,/2 ; X n
;z
2n
Sn
n1,1/2
2
Sn
n tn1,/2
49
,
where tn1,/2 and zn1,/2 are quantiles of the Student and the Chi-square distribution.
4.1.4
Random generation
4.1.5
Applications
From wikipedia, here is a list of situations where approximate normality is sometimes assumed
In counting problems (so the central limit theorem includes a discrete-to-continuum approximation) where reproductive random variables are involved, such as Binomial random variables, associated to yes/no questions or Poisson random variables, associated to rare events;
In physiological measurements of biological specimens: logarithm of measures of size of living
tissue (length, height, skin area, weight) or length of inert appendages (hair, claws, nails,
teeth) of biological specimens, in the direction of growth; presumably the thickness of tree bark
also falls under this category or other physiological measures may be normally distributed,
but there is no reason to expect that a priori;
Measurement errors are often assumed to be normally distributed, and any deviation from
normality is considered something which should be explained;
Financial variables: changes in the logarithm of exchange rates, price indices, and stock
market indices; these variables behave like compound interest, not like simple interest, and
so are multiplicative; or other financial variables may be normally distributed, but there is
no reason to expect that a priori;
Light intensity: intensity of laser light is normally distributed or thermal light has a BoseEinstein distribution on very short time scales, and a normal distribution on longer timescales
due to the central limit theorem.
50
Characterization
One way to characterize a random variable follows a log-normal distribution is to say that its
logarithm is normally distributed. Thus the distribution function of a log-normal distribution
(LG(, 2 )) is
,
0.0
From this we can derive an explicit expression for the density LG(, 2 )
f (x) =
1.0
0.8
log(x)
LN(0,1)
LN(0,2)
LN(0,1/2)
0.6
F (x) =
f(x)
0.4
4.2.1
0.2
4.2
(log(x))2
1
e 22 ,
x 2
10
A log-normal distribution does not have fi- Figure 4.2: The density of log-normal distribunite characteristic function or moment generat- tions
ing function.
4.2.2
Properties
n2 2
2
From Klugman et al. (2004), we also have a formula for limited expected values
k 2
E (X L)k = ek(+ 2 (u k) + Lk (1 (u)),
where u =
log(L)
.
Since the Gaussian distribution is stable by linear combination, log-normal distribution is stable
by product combination. That is to say if we consider X and Y two independent log-normal
variables (LG(, 2 ) and LG(, 2 )), we have XY follows a log-normal distribution LG(+, 2 +2 ).
2
2
Let us note that X
Y also follows a log-normal distribution LG( , + ).
51
An equivalence of the Limit Central Theorem for the log-normal distribution is the product of
i.i.d. random variables (Xi )1in asymptotically follows a log-normal distribution with paramter
nE(log(X)) and nV ar(log(X)).
4.2.3
Estimation
1
n
c2 =
Pn
i=1 log(xi )
1
n1
Pn
is an unbiased estimator of ,
i=1 (log(xi )
)2 is an unbiased estimator of 2 .
One amazing fact about parameter estimations of log-normal distribution is that those estimators
are very stable.
4.2.4
Random generation
Once we have generated a normal variate, it is easy to generate a log-normal variate just by taking
the exponential of normal variates.
4.2.5
Applications
There are many applications of the log-normal distribution. Limpert et al. (2001) focuses on
application of the log-normal distribution. For instance, in finance the Black & Scholes assumes
that assets are log-normally distributed (cf. Black & Scholes (1973) and the extraordinary number
of articles citing this article). Singh et al. (1997) deals with environmental applications of the
log-normal distribution.
As for the 2 estimator of normal distribution, this estimator is not the maximum likelihood estimator since we
unbias it.
52
4.3
4.3.1
Characterization
0.8
density function
0.6
LN(0,1,0)
LN(0,1,1)
LN(0,1,1/2)
,
f (x) =
(log(x))2
1
2 2
e
,
(x ) 2
0.0
0.2
f(x)
F (x) =
log(x )
0.4
0.0
0.5
1.0
1.5
2.0
4.3.2
Properties
The expectation and the variance of a log-normal distribution are E(X) = +e+
2
(e
2+ 2
1)e
4.3.3
2 2
n+ n 2
2
2
and V ar(X) =
Estimation
An intuitive approach is to estimate with X1:n , then estimate parameters on shifted samples
(Xi )i .
4.3.4
Random generation
Once we have generated a normal variate, it is easy to generate a log-normal variate just by taking
the exponential of normal variates and adding the shifted parameter .
4.3.5
53
Applications
An application of the shifted log-normal distribution to finance can be found in Haahtela (2005) or
Brigo et al. (2002).
4.4.1
exp
,
f (x) =
2x3
2 2 x
0.0
2
1 1 2 it
( )
(t) = e
0.5
f(x)
+1 ,
x
x
InvG(1,2)
InvG(2,2)
InvG(1,1/2)
1.0
4.4
The moment generating function is ex- Figure 4.4: The density of inverse Gaussian distributions
pressed as
q
2
( ) 1 1 2 t
M (t) = e
.
4.4.2
Properties
3
.
(n+i)
2 i
i=0 (i+1)(ni) ( )
Pn1
54
4.4.3
Estimation
=X
!1
n
X
1
1
=n
and
.
Xi
i=1
4.4.4
follows a
Random generation
NEED
Mitchael,J.R., Schucany, W.R. and Haas, R.W. (1976). Generating random roots from variates
using transformations with multiple roots. American Statistician. 30-2. 88-91.
4.4.5
Applications
NEED REFERENCE
4.5
Characterization
density function
1.5
4.5.1
0.0
0.5
x1
2
1
+ x
,
f (x) =
exp
2 x
2K ( )
f(x)
1.0
A generalization of the inverse Gaussian distribution exists but there is no closed form for its
distribution function and its density used Bessel
functions. The latter is as follows
GIG(-1/2,5,1)
GIG(-1,2,3)
GIG(-1,1/2,1)
GIG(1,5,1)
55
4.5.2
2t
/2
p
K ( ( 2t))
.
K ( )
(4.1)
Properties
K+1 ( )
,
K ( )
n
2 K+n ( )
.
E(X ) =
K ( )
n
2
K+2 ( ) K+1 ( )
V ar(X) =
.
K ( )
K ( )
Furthermore,
dE(X )
E(log X) =
.
d =0
(4.2)
Note that numerical calculations of E(log X) may be performed with the integral representation as
well.
4.5.3
Estimation
NEED REFERENCE
4.5.4
Random generation
NEED REFERENCE
Chapter 5
Exponential distribution
Characterization
2.0
density function
E(1)
E(2)
E(1/2)
f (x) = ex ,
,
t
while its characteristic function is
(t) =
.
it
0.5
Since it is a light-tailed distribution, the moment generating function of an exponential distribution E() exists which is
1.0
f(x)
1.5
5.1.2
0.0
M (t) =
0.0
0.5
1.0
1.5
2.0
Properties
and
1
.
2
Furthermore
57
The exponential distribution is the only one continuous distribution to verify the lack of memory
property. That is to say if X is exponentially distributed, we have
P (X > t + s)
= P (X > t),
P (X > s)
where t, s > 0.
If we sum n i.i.d. exponentially distributed random variables, we get a gamma distribution
G(n, ).
5.1.3
Estimation
The maximum likelihood estimator and the moment based estimator are the same
= Pnn
i=1 Xi
1
,
Xn
for a sample (Xi )1in . But the unbiased estimator with mininum variance is
= Pnn 1 .
i=1 Xi
Exact confidence interval for parameter is given by
z2n, 2
z2n,1 2
P
, P
,
I () =
2 ni=1 Xi 2 ni=1 Xi
where zn, denotes the quantile of the chi-squared distribution.
5.1.4
Random generation
Despite the quantile function is F 1 (u) = 1 log(1 u), generally the exponential distribution
E() is generated by applying 1 log(U ) on a uniform variate U .
5.1.5
Applications
From wikipedia, the exponential distribution occurs naturally when describing the lengths of the
inter-arrival times in a homogeneous Poisson process.
The exponential distribution may be viewed as a continuous counterpart of the geometric distribution, which describes the number of Bernoulli trials necessary for a discrete process to change
state. In contrast, the exponential distribution describes the time for a continuous process to change
state.
In real-world scenarios, the assumption of a constant rate (or probability per unit time) is rarely
satisfied. For example, the rate of incoming phone calls differs according to the time of day. But
58
if we focus on a time interval during which the rate is roughly constant, such as from 2 to 4 p.m.
during work days, the exponential distribution can be used as a good approximate model for the
time until the next phone call arrives. Similar caveats apply to the following examples which yield
approximately exponentially distributed variables:
the time until a radioactive particle decays, or the time between beeps of a geiger counter;
the time it takes before your next telephone call
the time until default (on payment to company debt holders) in reduced form credit risk
modeling
Exponential variables can also be used to model situations where certain events occur with a
constant probability per unit distance:
the distance between mutations on a DNA strand;
the distance between roadkill on a given road;
In queuing theory, the service times of agents in a system (e.g. how long it takes for a bank
teller etc. to serve a customer) are often modeled as exponentially distributed variables. (The interarrival of customers for instance in a system is typically modeled by the Poisson distribution in most
management science textbooks.) The length of a process that can be thought of as a sequence of
several independent tasks is better modeled by a variable following the Erlang distribution (which
is the distribution of the sum of several independent exponentially distributed variables).
Reliability theory and reliability engineering also make extensive use of the exponential distribution. Because of the memoryless property of this distribution, it is well-suited to model the
constant hazard rate portion of the bathtub curve used in reliability theory. It is also very convenient because it is so easy to add failure rates in a reliability model. The exponential distribution is
however not appropriate to model the overall lifetime of organisms or technical devices, because the
failure rates here are not constant: more failures occur for very young and for very old systems.
In physics, if you observe a gas at a fixed temperature and pressure in a uniform gravitational
field, the heights of the various molecules also follow an approximate exponential distribution. This
is a consequence of the entropy property mentioned below.
5.2
5.2.1
59
Shifted exponential
Characterization
0.6
density function
0.5
E(1/2,0)
E(1/2,1)
E(1/2,2)
f(x)
0.2
F (x) = 1 e(x )
0.1
for x > .
M (t) = et
0.0
(t) = eit
.
it
5.2.2
0.3
0.4
f (x) = e(x )
Properties
and
1
.
2
Furthermore the n-th moment (for n integer) is computable with the binomial formula by
E(X n ) =
n
X
i=0
5.2.3
( )n
n!
.
(n i)! ( )i
Estimation
= Pn
= X1:n and
where Xi:n denotes the ith order statistic. Since the minimum X1:n follows a shifted exponential
distribution E(n, ), we have is biased but asympotically unbiased.
NEED REFERENCE for unbiased estimators
60
5.2.4
Random generation
The random generation is simple: just add to the algorithm of exponential distribution.
5.2.5
Applications
NEED REFERENCE
5.3.1
Inverse exponential
Characterization
0.6
0.5
e x,
x2
IE(1)
IE(2)
IE(3)
0.4
f (x) =
density function
f(x)
where x > 0 and > 0. The distribution function can then be derived as
0.3
5.3
0.2
F (x) = e x .
0.0
0.1
(t) = 2 itK1 2 it
and
M (t) = 2 itK1 2 t .
5.3.2
Properties
5.3.3
61
Estimation
Xi
!1
,
i=1
5.3.4
Random generation
1
,
5.3.5
Applications
NEED REFERENCE
5.4
5.4.1
Gamma distribution
Characterization
1.0
0.6
0.4
0.0
0.2
G(1,1)
G(2,1)
G(2,2)
G(1/2,1)
0.8
x 1
e
x
,
()
f(x)
f (x) =
density function
Figure 5.4: Density function for gamma distribuwhere (., .) is the incomplete gamma function. tions
62
1
X
i=0
(x)i x
e
.
i!
For the gamma distribution, the moment generating and characteristic functions exist.
(t) =
it
and
M (t) =
5.4.2
,
.
Properties
The expectation of a gamma distribution G(, ) is E(X) = , while its variance is V ar(X) =
.
2
( + r)
,
()
5.4.3
Estimation
n )2
(X
= Xn .
and
2
Sn
Sn2
X
X+Y
follows a
63
where (.) denotes the digamma function. The first equation can be solved numerically to get
= . But
is biased, so the unbiased estimator with minimum variance of is
and then
X
n
n 1 Xn
5.4.4
Random generation
Simulate a gamma G(, ) is quite tricky for non integer shape parameter. Indeed, if the shape
parameter is integer, then we simply sum exponential random variables E(). Otherwise we
need to add a gamma variable G(bc, ). This is carried out by an acceptance/rejection method.
NEED REFERENCE
5.4.5
Applications
NEED REFERENCE
Characterization
i=1
j=1,j6=i
density function
0.4
0.6
Erlang(1,2,3)
Erlang(1,2,4)
Erlang(1,3,5)
Erlang(2,3,4)
0.2
d
d
X
Y
i ei x ,
f (x) =
j i
0.0
5.5.1
f(x)
5.5
64
d
d
X
Y
j
F (x) =
(1 ei x ).
j i
i=1
j=1,j6=i
d
Y
j=1
5.5.2
d
Y
j
j
and M (t) =
.
j it
j t
j=1
Properties
d
P
i=1
V ar(X) =
d
P
i=1
5.5.3
1
i
1
.
2i
Estimation
NEED REFERENCE
5.5.4
Random generation
The algorithm is very easy simulate independently d random variables exponentially E(j ) distributed and sum them.
5.5.5
Applications
NEED REFERENCE
5.6
Chi-squared distribution
A special case of the gamma distribution is the chi-squared distribution. See section 6.1.
65
5.7
Inverse Gamma
5.7.1
Characterization
The inverse gamma distribution is the distribution of a random variable X1 when X is gamma
distributed. Hence the density is
1.5
InvG(3/2,1)
InvG(3/2,3/2)
InvG(1,3)
0.5
F (x) =
1.0
e x ,
+1
()x
f(x)
f (x) =
density function
0.0
2 it
Figure 5.6: Density function for inverse gamma
(t) =
K (2 it)
()
distributions
and
2 it
M (t) =
K (2 t).
()
5.7.2
Properties
The expectation exists only when > 1 and in this case E(X) =
finite if > 2 and V ar(X) =
5.7.3
2
(1)2 (2)
1 ,
Estimation
=2+
n )2
(X
=X
n (
and
1)
Sn2
n and Sn2 the sample mean and variance. If the variance does not exist, then will be 2, it
with X
means we must use the maximum likelihood estimator (which works also for 2).
66
P
log () = log( n1 ni=1
P
1
= n1 ni=1 X1
i
1
Xi )
1
n
Pn
1
i=1 log Xi
where (.) denotes the digamma function. The first equation can be solved numerically to get
5.7.4
Random generation
5.7.5
Applications
NEED REFERENCE
5.8
5.8.1
density function
x
(
)
(, ( x ) )
.
()
0.4
F (x) =
f(x)
0.6
0.8
0.2
( x ) 1 e
()
0.0
f (x) =
TG(3,1/2,1)
TG(3,1/2,1/3)
TG(3,1/2,4/3)
0
1
2
3
4
5
Obviously, a special case of the transformed
x
gamma is the gamma distribution with = 1.
But we get the Weibull distribution with = 1. Figure 5.7: Density function for transformed
gamma distributions
5.8.2
67
Properties
2 (+ 2 )
()
(+ 1 )
()
E 2 [X].
5.8.3
( + r )
,
()
> 0.
Estimation
n
n
1 P
1 P
log
X
log(
Xi )
()
log
i
n
n
i=1
i=1
n
n
n
1
n
1 P
1 P
1 P
1 P
,
= n
Xi n
Xi log Xi n
Xi
log Xi
n
i=1
i=1
i=1
i=1
1 P
i
n
i=1
where denotes the digamma function. This system can be solved numerically.
TODO : use Gomes et al. (2008)
5.8.4
Random generation
5.8.5
and multiply it by .
Applications
In an actuarial context, the transformed gamma may be useful in loss severity, for example, in
workers compensation, see Venter (1983).
68
5.9
5.9.1
density function
ITG(3,2,1)
ITG(3,2,1/2)
ITG(3,2,4/3)
5.9.2
2.5
2.0
f(x)
1.5
1.0
0.0
0.5
3.0
( ) e( x )
f (x) = x
,
x()
Properties
0.0
2 ( 2 )
()
( 1 )
()
0.5
1.0
1.5
2.0
2.5
3.0
E 2 [X].
5.9.3
Estimation
NEED REFERENCE
5.9.4
Random generation
Simply simulate a gamma G(, 1) distributed variable, inverse it, raise it to power
it by .
5.9.5
Applications
NEED REFERENCE
and multiply
5.10
Log Gamma
5.10.1
Characterization
69
f (x) =
ek
xa
e
b
xa
b
(k)
for x > 0, where a is the location parameter, b > 0 the scale parameter and k > 0 the shape
parameter. The distribution function is
xa
(k, e b )
F (x) =
,
(k)
for x > 0. This is the distribution of a + b log(X) when X is gamma G(k, 1).
5.10.2
Properties
The expectation is E(X) = a + b(k) and the variance V ar(X) = b2 1 (k) where is the digamma
function and 1 the trigamma function.
5.10.3
Estimation
NEED REFERENCE
5.10.4
Random generation
5.10.5
Applications
NEED REFERENCE
5.11
Weibull distribution
5.11.1
Characterization
70
0.8
0.6
f(x)
where x > 0 and , > 0. In terms of distribution function, the Weibull can be defined
as
( x )
F (x) = 1 e .
0.2
0.4
1 ( x )
x
e
,
0.0
f (x) =
W(3,1)
W(3,2)
W(4,2)
W(4,3)
1.0
10
f (x) = x1 e x ,
with the same constraint on the parameters , > 0. In this context, the distribution function is
F (x) = 1 e x .
We can pass from the first parametrization to the second one with
(
=
.
= 1
5.11.2
Properties
The expectation of a Weibull distribution W(, ) is E(X) = (1+ 1 ) and the variance V ar(X) =
+1 2
2 [( +2
) ( ) ]. In the second parametrization, we have E(X) =
1
2
( (1 + 2 ) (1 + 1 )2 ).
(1+ 1 )
1
and V ar(X) =
The rth raw moment E(X r ) of the Weibull distribution W(, ) is given by (1 + r ) for r > 0.
The Weibull distribution is the distribution of the variable
distribution E(1).
5.11.3
Estimation
We work in this sub-section with the first parametrization. From the cumulative distribution, we
have
log( log |1 F (x)|) = log x log .
71
Thus we can an estimation of and by regressing log( log | ni |) on log Xi:n . Then we get the
following estimators
b
= a
and = e a ,
where a
and b are respectively the slope and the intercept of the regression line.
The maximum likelihood estimators verify the following system
(
n
+
n
+1
Pn
n ln() +
i=1
P
n
(xi ) = 0
i=1 ln(xi )
Pn
xi
i=1 ln(xi )( )
=0
which can be solved numerically (with algorithm initialized by the previous estimators).
5.11.4
Random generation
1
Using the inversion function method, we simply need to compute ( log(1 U )) for the first
1
)
parametrization or log(1U
for the second one where U is an uniform variate.
5.11.5
Applications
5.12
5.12.1
Characterization
density function
InvW(3,1)
InvW(3,2)
InvW(4,2)
InvW(4,3)
2
1
F (x) = e( x ) .
f(x)
e( x )
f (x) =
,
x+1
0.0
0.5
1.0
x
1.5
2.0
72
5.12.2
Properties
5.12.3
Estimation
1
1 = 1 Pn
i=1 Xi
n
1 + log() = 1 Pn
log
i=1 Xi
n
Xi +
1
n
i=1 log(Xi )
n )2 (1 2 )
n )2 )2 (1 1 ) = (X
(Sn2 + (X
n
X
(1 1 )
Pn
5.12.4
Random generation
5.12.5
Applications
NEED REFERENCE
TODO Carrasco et al. (2008)
5.13
5.13.1
Characterization
1 |xm|
e ,
2 2
.3
f (x) =
density function
L(0,1)
L(0,1)
L(0,3)
73
if x < m
2e
F (x) =
.
1 xm
1 2e
otherwise
There exists a moment generating function for
this distribution, which is
M (t) =
emt
,
1 2 t2
5.13.2
Properties
The expectation for the Laplace distribution is given by E(X) = m while the variance is V ar(X) =
2 2 .
5.13.3
Estimation
m
=
:n
bn
c:n
2
if n is even
otherwise
1X
=
|Xi m|.
n
i=1
5.13.4
Random generation
74
5.13.5
Applications
NEED
The Double Exponential Distribution: Using Calculus to Find a Maximum Likelihood Estimator
Robert M. Norton The American Statistician, Vol. 38, No. 2 (May, 1984), pp. 135-136
Chapter 6
Chi-squared distribution
Characterization
0.5
Chisq(2)
Chisq(3)
Chisq(4)
Chisq(5)
0.4
k
X
density function
Xi2 ,
f(x)
0.2
0.3
i=1
0.0
0.1
f (x) =
x 2 1
k
( k2 )2 2
e 2 ,
10
where k is the so-called degrees of freedom and Figure 6.1: Density function for chi-squared disx 0. One can notice that is the density of a tributions
gamma distribution G( k2 , 12 ), so k is not necessarily an integer. Thus the distribution function
can be expressed with the incomplete gamma
function
( k , x )
F (x) = 2 k 2 .
( 2 )
75
76
Thirdly, the chi-squared distribution can be defined in terms of its moment generating function
k
M (t) = (1 2t) 2 ,
or its characteristic function
(t) = (1 2it) 2 .
6.1.2
Properties
The expectation and the variance of the chi-squared distribution are simply E(X) = k and
V ar(X) = 2k. Raw moments are given by
r k
( 2 + r)
1
E(X ) =
.
2
( k2 )
r
6.1.3
Estimation
6.1.4
Random generation
For an integer k, just sum the square of k normal variable. Otherwise use the algorithm for the
gamma distribution.
6.1.5
Applications
The chi-squared distribution is widely used for inference, typically as pivotal function.
77
6.2
Chi distribution
6.2.1
Characterization
0.6
density function
0.5
Chi(2)
Chi(3)
Chi(4)
Chi(5)
0.3
0.2
f(x)
0.4
i=1
k
1
2
x2
e
k
,
0.0
xk1
f (x) =
0.1
6.2.2
k 1 t2
, ,
2 2 2
k+1
2
.
+t 2
k2
Properties
The expectation and the variance of a chi distribution are given by E(X) =
k E 2 (X). Other moments are given by
r
E(X r ) = 2 2
for k + r > 0.
( k+r
2 )
( k2 )
2( k+1
)
2
( k2 )
and V ar(X) =
78
6.2.3
Estimation
n
log(2)
k
1X
+
=
log(Xi ),
2
2
n
i=1
where denotes the digamma function. This equation can be solved on the positive real line or
just the set of positive integers.
6.2.4
Random generation
6.2.5
Applications
NEED REFERENCE
6.3
6.3.1
Characterization
Xi2 ,
f (x) =
x+
1 x k2
4
e 2 I k 1
x ,
2
2
0.2
f(x)
0.3
i=1
where (Xi )i are independent normally distributed N (i , 1), i.e. non centered normal random variable. We generally define the non central chi-squared distribution by the density
Chisq(2)
Chisq(2,1)
Chisq(4)
Chisq(4,1)
0.1
k
X
density function
0.0
79
M (t) =
e 12t
k
(1 2t) 2
and the characteristic function
it
e 12it
(t) =
(1 2it) 2
from which we see it is a convolution of a gamma distribution and a compound Poisson distribution.
6.3.2
Properties
n1
X
j=1
(n 1)!2j1
(k + j)E(X nj ),
(n j)!
6.3.3
Estimation
6.3.4
Random generation
6.3.5
k , 1).
Applications
NEED REFERENCE
80
6.4
6.4.1
v
u k
uX
t
X 2,
i
i=1
where (Xi )i are i.i.d. normally distributed N (i , 1) and a given k. This is equivalent as the
distribution of a square root of a non central chi-squared distribution (hence the name).
We generally define the non central chi distribution by
f (x) =
xk
(x)
k
2
x2 +2
2
I k 1 (x),
2
where x > 0 and I. (.) denotes the modified Bessels function. The distribution function can be
expressed in terms of the gamma incomplete function
F (x) =??,
for x > 0.
6.4.2
Properties
6.4.3
Estimation
NEED REFERENCE
6.4.4
Random generation
NEED REFERENCE
6.4.5
81
Applications
NEED REFERENCE
6.5
density function
2.5
InvChisq(2)
InvChisq(3)
InvChisq(4)
InvChisq(2.5)
1
( k2 , 2x
)
( k2 )
1.5
1.0
,
0.0
F (x) =
0.5
f(x)
2.0
2 2 k2 1
x 2 e 2x ,
f (x) =
( k2 )
0.0
0.2
0.4
0.6
0.8
1.0
Figure 6.4: Density function for inverse chiThirdly, the chi-squared distribution can be squared distributions
defined in terms of its moment generating function
2
M (t) =
( k2 )
t
2
k
4
Kk
2t ,
6.5.1
it
2
k
4
Kk
2it .
Properties
The expectation and the variance of the chi-squared distribution are simply E(X) =
and V ar(X) = (k2)22 (k4) . Raw moments are given by
E(X r ) =??
1
k2
if k > 2
82
6.5.2
Estimation
= log(2)
log(xi ),
2
n
i=1
6.5.3
Random generation
6.5.4
Applications
NEED REFERENCE
6.6
6.6.1
TODO
6.6.2
Properties
TODO
6.6.3
Estimation
TODO
6.6.4
TODO
Random generation
6.6.5
TODO
Applications
83
Chapter 7
Student t distribution
Intro?
Characterization
There are many ways to define the student distribution. One can say that it is the distribution
of
dN
,
C
0.4
density function
0.3
f(x)
0.1
( d+1 )
f (x) = 2 d
d( 2 )
T(1)
T(2)
T(3)
T(4)
0.2
7.1.1
0.0
-4
-2
2
2
( d2 )
where 2 F1 denotes the hypergeometric function.
84
7.1.2
85
Properties
The expectation of a student distribution is E(X) = 0 if d > 1, infinite otherwise. And the variance
d
is given by V ar(X) = d2
if d > 2.
Moments are given by
r/2
Y
2i 1 r/2
E(X ) =
,
2i
r
i=1
7.1.3
Estimation
Maximum likelihood estimator for d can be found by solving numerically this equation
d+1
2
n
n
d
1X
Xi2
d + 1 X (Xi /d)2
=
log 1 +
,
2
n
d
n
1 + Xi2 /d
i=1
i=1
7.1.4
Random generation
7.1.5
dN .
C
Applications
The main application of the student is when dealing with a normally distributed sample, the
derivation of the confidence interval for the standard deviation use the student distribution. Indeed
for a normally distributed N (m, 2 ) sample of size n we have that
n m
X
p
n
Sn2
follows a student n distribution.
86
Cauchy distribution
7.2.1
Characterization
7.2.2
Characterization
0.6
0.5
f(x)
0.1
0.2
Cauchy(0,1)
Cauchy(1,1)
Cauchy(1,1/2)
Cauchy(1,2)
0.4
density function
0.3
7.2
0.0
2
,
[ 2 + (x )2 ]
-4
-2
Even if there is no moment generating function, the Cauchy distribution has a characteristic function
(t) = exp( i t |t|).
7.2.3
Properties
The Cauchy distribution C(, ) has the horrible feature not to have any finite moments. However, the Cauchy distribution belongs to the family of stable distribution, thus a sum of Cauchy
distribution is still a Cauchy distribution.
7.2.4
Estimation
1
1 P
=n
2 +(Xi )2
n
P
i=1
i=1
Xi
2 +(Xi )2
n
P
i=1
2 +(Xi )2
7.2.5
87
Random generation
Since the quantile function is F 1 (u) = + tan((u 1/2)), we can use the inversion function
method.
7.2.6
Applications
NEED REFERENCE
7.3
7.3.1
Fisher-Snedecor distribution
Characterization
TODO
7.3.2
Properties
TODO
7.3.3
Estimation
TODO
7.3.4
Random generation
TODO
7.3.5
TODO
Applications
Chapter 8
Pareto family
8.1
Pareto distribution
name??
Characterization
1.5
density function
f(x)
1.0
P1(1,1)
P1(2,1)
P1(2,2)
P1(2,3)
0.0
0.5
8.1.1
Pareto I
Figure 8.1: Density function for Pareto I distriThe Pareto type I distribution PaI (, ) is debutions
fined by the following survival function
x
F (x) =
,
89
still for x > . is the positive slope parameter (sometimes called the Paretos index) and is
the scale parameter. Pareto type I distribution is sometimes called the classical Pareto distribution
or the European Pareto distribution.
Pareto II
density function
1.0
f(x)
1.5
P2(2,1)
P2(2,2)
P2(2,3)
P2(3,2)
0.5
x 1
f (x) =
1+
,
2.0
,
F (x) = 1 +
0.0
for x > . We retrieve the Pareto I distribution with = , i.e. if X follows a Pareto I
distribution then + X follows a Pareto II
distribution. The Pareto II is sometimes called
the American Pareto distribution.
F (x) = 1 +
,
2.5
1.5
1.0
f(x)
2.0
P3(0,1,1)
P3(1,1,1)
P3(1,2,1)
P3(1,1,3/2)
0.5
density function
0.0
where x > . The Pareto III is not a generalisation of the Pareto II distribution, but from
0
1
2
3
4
5
these two distribution we can derive more genx
eral models. It can be seen as the following
Figure 8.3: Density function for Pareto III distritransformation + Z , where Z is a Pareto II
butions
PaII (0, 1, 1).
the slope of the Pareto chart log F (x) vs. log x, controlling the shape of the distribution.
90
Pareto IV
The Pareto type IV PaIV (, , , ) distribution is defined by
1 !
F (x) = 1 +
,
1.5
2.0
2.5
P4(0,1,1,1)
P4(,0,2,1,1)
P4(0,1,3/2,1)
P4(0,1,1,2)
1.0
f(x)
density function
0.0
0.5
for x > .
8.1.2
Properties
Equivalence
It is easy to verify that if X follows a Pareto I distribution PaI (, ), then log X follows a translated
exponential distribution T E(, ?).
The Pareto type III distribution is sometimes called the log-logistic distribution, since if X has
a logistic distribution then eX has a Pareto type III distribution with = 0.
Moments
Moments for the Pareto I distribution are given by E(X) =
and E(X ) =
for > and = 1.
if > 1, V ar(X) =
2
(1)2 (2)
Moments for the Pareto II, III can be derived from those of Pareto IV distribution, which are
E(X ) =
with 1 < < and = 0.
(1 + )( )
,
()
91
f (x) =
,
x(n)
where x > .
If we consider only independent Pareto I distribution PaI (i , i ), then we have for the density
of the product
n
X
i x i 1 Y k
f (x) =
,
i k
i=1
k6=i
Q
where x > ni=1 i .
Other Pareto distributions??
Order statistics
Let (Xi)i be a sample of Pareto distributions. We denote by (Xi:n)i the associated order statistics, i.e. X1:n is the minimum and Xn:n the maximum.
For the Pareto I distribution, the ith order statistic has the following survival function
F̄_{Xi:n}(x) = Σ_{j=1}^{i} (x/σ)^{−α(n−j+1)} Π_{l=1, l≠j}^{i} (n − l + 1)/(l − j),
and its moments are given by
E(X_{i:n}^δ) = σ^δ n! Γ(n − i + 1 − δα^{−1}) / ((n − i)! Γ(n + 1 − δα^{−1})),
for δ < α(n − i + 1).
For the Pareto II distribution, we get
F̄_{Xi:n}(x) = Σ_{j=1}^{i} (1 + (x − μ)/σ)^{−α(n−j+1)} Π_{l=1, l≠j}^{i} (n − l + 1)/(l − j),
where x > μ. Moments can be derived from those in the case of the Pareto I distribution using the fact that Xi:n = μ − σ + Yi:n, with Yi:n the order statistic for the Pareto I case.
For the Pareto III distribution, the ith order statistic follows a Feller-Pareto distribution FPa(μ, σ, γ, i, n − i + 1). Moments of order statistics can be obtained by using the transformation of Pareto II random variables:
E(Z_{i:n}^δ) = Γ(i + δγ) Γ(n − i + 1 − δγ) / (Γ(i) Γ(n − i + 1)).
For the Pareto IV distribution, the minimum of independent PaIV(μ, σ, γ, αi) variables still follows a Pareto IV distribution PaIV(μ, σ, γ, Σ_{i=1}^{n} αi). But the ith order statistic does not have a particular distribution. The intermediate order statistics can be approximated by the normal distribution with
X_{i:n} ≈ N(F^{−1}(i/n), i/n (1 − i/n) f^{−2}(F^{−1}(i/n)) n^{−1})
as n → +∞, where f and F denote respectively the density and the distribution function of the Pareto IV distribution. Moments for the order statistics are computable from the moments of the minima, since we have
E(X_{i:n}^δ) = Σ_{r=n−i+1}^{n} (−1)^{r−n+i−1} C_n^r C_{r−1}^{n−i} E(X_{1:r}^δ),
where E(X_{1:r}^δ) = E((μ + σ Z_{1:r})^δ), Z_{1:r} follows a PaIV(0, 1, γ, rα) distribution and E(Z_{1:r}^δ) = Γ(1 + δγ) Γ(rα − δγ)/Γ(rα).
Truncation
Let us denote by X|X > x0 the random variable X knowing that X > x0. We have the following properties (with x0 greater than the threshold of the distribution):
if X ~ PaI(σ, α) then X|X > x0 ~ PaI(x0, α),
if X ~ PaII(μ, σ, α) then X|X > x0 ~ PaII(x0, σ + x0 − μ, α).
In this case, the truncation is a rescaling. It comes from the lack-of-memory property of the corresponding log variable, since the log variable follows an exponential distribution.
Record values
Geometric minimization
8.1.3 Estimation
Estimation of the Pareto distribution in the context of actuarial science can be found in Rytgaard (1990).
Pareto I
Arnold (1983) notices that through a log transformation, the parameter estimation reduces to an estimation problem for translated exponentially distributed data. From this, we have the following maximum likelihood estimators for the Pareto I distribution
σ̂_n = X_{1:n},
α̂_n = [ (1/n) Σ_{i=1}^{n} log(Xi/X_{1:n}) ]^{−1},
where (Xi)_{1≤i≤n} denotes a sample of i.i.d. Pareto variables. Those estimators are strongly consistent estimators of σ and α. Let us note that for these estimators we have better than asymptotic normality (due to the maximum likelihood theory): their distributions are respectively Pareto I and gamma,
σ̂_n ~ PaI(σ, nα),
α̂_n^{−1} ~ G(n − 1, (nα)^{−1}).
From this, we can see these estimators are biased, but we can derive unbiased estimators with minimum variance:
σ̃_n = (1 − ((n − 1)α̂_n)^{−1}) σ̂_n,
α̃_n = ((n − 2)/n) α̂_n.
The method of moments yields
α̂_n^M = (n X̄_n − X_{1:n}) / (n (X̄_n − X_{1:n})),
σ̂_n^M = ((n α̂_n^M − 1)/(n α̂_n^M)) X_{1:n}.
Similar estimators can also be based on two empirical quantiles X_{⌊np1⌋:n} and X_{⌊np2⌋:n} with 0 < p1 < p2 < 1.
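A minimal sketch of these estimators in R (function name ours), directly transcribing the formulas above:

# Pareto I maximum likelihood estimators: sigma_hat = X_{1:n},
# alpha_hat = 1 / mean(log(Xi / X_{1:n})); plus bias-corrected versions.
pareto1_mle <- function(x) {
  n <- length(x)
  sigma <- min(x)
  alpha <- 1 / mean(log(x / sigma))
  list(mle = c(sigma = sigma, alpha = alpha),
       unbiased = c(sigma = (1 - 1 / ((n - 1) * alpha)) * sigma,
                    alpha = (n - 2) / n * alpha))
}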
Pareto II-III-IV
Estimation of the parameters for the Pareto II, III and IV distributions is more difficult. If we write the log-likelihood for a sample (Xi)_{1≤i≤n} Pareto IV distributed, we have
log L(μ, σ, γ, α) = (1/γ − 1) Σ_{i=1}^{n} log((xi − μ)/σ) − (α + 1) Σ_{i=1}^{n} log(1 + ((xi − μ)/σ)^{1/γ}) − n log γ − n log σ + n log α,
with the constraint that for all 1 ≤ i ≤ n, xi > μ. Since the likelihood is null when μ > x_{1:n}, the maximum likelihood estimator of μ is the minimum
μ̂ = X_{1:n}.
Then, if we subtract μ̂ from all observations, we get the log-likelihood
log L(σ, γ, α) = (1/γ − 1) Σ_{i=1}^{n} log(xi/σ) − (α + 1) Σ_{i=1}^{n} log(1 + (xi/σ)^{1/γ}) − n log γ − n log σ + n log α,
which can be maximised numerically. Since there is no closed form for the estimators of σ, γ, α, we do not know their distributions, but they are asymptotically normal.
We may also use the method of moments, where again μ̂ is X_{1:n}. Subtracting this value from all observations, we use the expressions of the moments above to get three equations, and finally solve the system numerically. A similar scheme can be used to estimate parameters with quantiles.
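A numerical maximisation sketch in R (function names ours; the starting values and the handling of the minimum observation are illustrative choices, not prescribed by the text):

# Log-likelihood of PaIV(mu, sigma, gamma, alpha) on y = x - mu (y > 0),
# as written above; maximised with optim().
pareto4_loglik <- function(par, y) {
  sigma <- par[1]; gamma <- par[2]; alpha <- par[3]
  if (any(par <= 0)) return(-1e10)              # keep parameters positive
  z <- (y / sigma)^(1 / gamma)
  sum((1 / gamma - 1) * log(y / sigma) - (alpha + 1) * log1p(z)) +
    length(y) * (log(alpha) - log(gamma) - log(sigma))
}
pareto4_mle <- function(x) {
  mu <- min(x)                                  # mu_hat = X_{1:n}
  y <- x[x > mu] - mu                           # drop the minimum to keep y > 0
  fit <- optim(c(1, 1, 1), pareto4_loglik, y = y,
               control = list(fnscale = -1))    # fnscale = -1: maximise
  c(mu = mu, sigma = fit$par[1], gamma = fit$par[2], alpha = fit$par[3])
}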
8.1.4 Random generation
It is very easy to generate Pareto random variates using the inversion method, since the quantile functions have closed forms:
for the PaI(σ, α) distribution, F^{−1}(u) = σ (1 − u)^{−1/α},
for the PaII(μ, σ, α) distribution, F^{−1}(u) = μ + σ ((1 − u)^{−1/α} − 1),
for the PaIII(μ, σ, γ) distribution, F^{−1}(u) = μ + σ ((1 − u)^{−1} − 1)^γ,
for the PaIV(μ, σ, γ, α) distribution, F^{−1}(u) = μ + σ ((1 − u)^{−1/α} − 1)^γ.
Since 1 − U is also uniform when U is, the algorithms for random generation are simply
for the PaI(σ, α) distribution, σ U^{−1/α},
for the PaII(μ, σ, α) distribution, μ + σ (U^{−1/α} − 1),
for the PaIII(μ, σ, γ) distribution, μ + σ (U^{−1} − 1)^γ,
for the PaIV(μ, σ, γ, α) distribution, μ + σ (U^{−1/α} − 1)^γ,
where U is a uniform variable.
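In R, a one-line sampler for the most general case (function name ours) covers all four types as special cases:

# Inversion sampler for PaIV(mu, sigma, gamma, alpha); gamma = 1 gives
# Pareto II, and mu = sigma, gamma = 1 gives Pareto I.
rpareto4 <- function(n, mu = 0, sigma = 1, gamma = 1, alpha = 1) {
  u <- runif(n)
  mu + sigma * (u^(-1 / alpha) - 1)^gamma
}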
8.1.5
Applications
From Wikipedia, we get the following possible applications of the Pareto distributions:
the sizes of human settlements (few cities, many hamlets/villages),
file size distribution of Internet traffic which uses the TCP protocol (many smaller files, few larger ones),
clusters of Bose-Einstein condensate near absolute zero,
the values of oil reserves in oil fields (a few large fields, many small fields),
the length distribution of jobs assigned to supercomputers (a few large ones, many small ones),
the standardized price returns on individual stocks,
sizes of sand particles,
sizes of meteorites,
numbers of species per genus (there is subjectivity involved: the tendency to divide a genus into two or more increases with the number of species in it),
areas burnt in forest fires,
severity of large casualty losses for certain lines of business such as general liability, commercial auto, and workers' compensation.
In the literature, Arnold (1983) uses the Pareto distribution to model the income of an individual, and Froot & O'Connell (2008) apply the Pareto distribution as the severity distribution in a context of catastrophe reinsurance. These are just a few applications; many others can be listed.
8.2 Feller-Pareto distribution
8.2.1 Characterization
The Feller-Pareto distribution FP(μ, σ, γ, δ1, δ2) can be defined by the distribution function
F(x) = β(δ1, δ2, y/(1 + y)) / β(δ1, δ2), with y = ((x − μ)/σ)^{1/γ},
with x ≥ μ, where β(·,·) denotes the beta function and β(·,·,·) the incomplete beta function.
We have the following density for the Feller-Pareto distribution FP(μ, σ, γ, δ1, δ2):
f(x) = ((x − μ)/σ)^{δ1/γ − 1} / (γσ β(δ1, δ2) (1 + ((x − μ)/σ)^{1/γ})^{δ1 + δ2}),
where x ≥ μ. Letting y be ((x − μ)/σ)^{1/γ}, the density can be rewritten as
f(x) = (1/β(δ1, δ2)) (y/(1 + y))^{δ1} (1/(1 + y))^{δ2} 1/(γ(x − μ)),
for x ≥ μ. In this expression, we see more clearly the link with the beta distribution as well as the transformation of the ratio U/V of two gamma variables.
There are a lot of special cases of the Feller-Pareto distribution FP(μ, σ, γ, δ1, δ2). When μ = 0, we retrieve the transformed beta distribution of Klugman et al. (2004), and if in addition γ = 1, we get the generalized Pareto distribution (as defined by Klugman et al. (2004)). Finally the Pareto IV distribution is obtained with δ1 = 1. Therefore we have the following equivalences:
PaI(σ, α) = FP(σ, σ, 1, 1, α),
PaII(μ, σ, α) = FP(μ, σ, 1, 1, α),
PaIII(μ, σ, γ) = FP(μ, σ, γ, 1, 1),
PaIV(μ, σ, γ, α) = FP(μ, σ, γ, 1, α).
8.2.2 Properties
The raw moments of the Feller-Pareto distribution are given by
E(X^r) = σ^r Γ(δ1 + rγ) Γ(δ2 − rγ) / (Γ(δ1) Γ(δ2)),
when μ = 0 and −δ1 < rγ < δ2.
8.2.3 Estimation
NEED REFERENCE
8.2.4 Random generation
Once we have simulated a beta I variable B, we get a beta II variable with B̃ = B/(1 − B); we can also use the ratio of two gamma variables to get the beta II variable. Finally we shift, scale and take the power X = μ + σ B̃^γ to get a Feller-Pareto random variable.
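A short R sketch of this construction (function name ours; the parameter order of rbeta follows the density given above):

# Feller-Pareto sampler: B ~ Beta(delta1, delta2), Btilde = B/(1-B) is
# beta II, then shift, scale and take the power.
rfellerpareto <- function(n, mu, sigma, gamma, delta1, delta2) {
  b <- rbeta(n, delta1, delta2)
  mu + sigma * (b / (1 - b))^gamma
}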
8.2.5 Applications
NEED REFERENCE
8.3 Inverse Pareto
8.3.1 Characterization
[Figure: density functions for inverse Pareto distributions InvP(1,1), InvP(2,1), InvP(2,2), InvP(1,2)]
The inverse Pareto distribution is the distribution of 1/X when X is Pareto distributed. It is characterized by the density function
f(x) = τλ x^{τ−1} / (x + λ)^{τ+1},
and the distribution function
F(x) = (x/(x + λ))^{τ},
for x > 0. We denote it by InvP(τ, λ).
8.3.2
Properties
8.3.3
Estimation
NEED REFERENCE
8.3.4
Random generation
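A minimal inversion sketch in R (function name ours), assuming the distribution function F(x) = (x/(x + λ))^τ given above, whose quantile function is F^{−1}(u) = λ u^{1/τ}/(1 − u^{1/τ}):

# Inverse Pareto sampler by inversion.
rinvpareto <- function(n, tau = 1, lambda = 1) {
  u <- runif(n)^(1 / tau)
  lambda * u / (1 - u)
}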
8.3.5
Applications
NEED REFERENCE
8.4 Generalized Pareto distribution
8.4.1 Characterization
The generalized Pareto distribution was introduced in Embrechts et al. (1997) in the context of extreme value theory. The standard generalized Pareto distribution is defined by the distribution function
F(x) = 1 − (1 + ξx)^{−1/ξ} if ξ ≠ 0,
F(x) = 1 − e^{−x} if ξ = 0,
where x ∈ R+ if ξ ≥ 0, and x ∈ [0, −1/ξ] otherwise. This distribution function is generally denoted by Gξ.
[Figure 8.6: Density function for standard generalized Pareto distributions GPD(0), GPD(1/2), GPD(1), GPD(2), GPD(3), GPD(-1/3), GPD(-2/3), GPD(-1), GPD(-5/4)]
We can see the impact of the shape parameter ξ on the figure above. The case where ξ = 0 can be seen as a limiting case of Gξ when ξ → 0.
To get the full generalized Pareto distribution, we introduce a scale parameter β and a location parameter ν. We get
F(x) = 1 − (1 + ξ(x − ν)/β)^{−1/ξ} if ξ > 0,
F(x) = 1 − e^{−(x−ν)/β} if ξ = 0,
F(x) = 1 − (1 + ξ(x − ν)/β)^{−1/ξ} if ξ < 0,
where x lies in [ν, +∞[, [ν, +∞[ and [ν, ν − β/ξ] respectively. We denote it by G_{ξ,ν,β}(x) (which is simply Gξ((x − ν)/β)). Let us note that when ξ > 0, we have a Pareto II distribution, when ξ = 0 a shifted exponential distribution and when ξ < 0 a generalized beta I distribution.
From these expressions, we can derive a density function for the generalized Pareto distribution
f(x) = (1/β)(1 + ξ(x − ν)/β)^{−1/ξ − 1} if ξ ≠ 0,
f(x) = (1/β) e^{−(x−ν)/β} if ξ = 0.
8.4.2 Properties
For a generalized Pareto distribution G_{ξ,0,β}, we have results on raw moments (for simplicity, ν = 0). The expectation E(X) is finite if and only if ξ < 1. In this case we have
E((1 + ξX/β)^{−r}) = 1/(1 + ξr), for r > −1/ξ,
E((log(1 + ξX/β))^k) = ξ^k k!, for k ∈ N,
E(X F̄(X)^r) = β/((r + 1 − ξ)(r + 1)), for r + 1 − ξ > 0,
E(X^k) = β^k Γ(ξ^{−1} − k) k! / (ξ^{k+1} Γ(1 + ξ^{−1})), for k < 1/ξ,
where Γ is the gamma function. This makes the link between the generalized Pareto distribution and the generalized extreme value distribution.
8.4.3 Estimation
A first approach is based on the mean excess function: for the generalized Pareto distribution, the mean of the excesses above a threshold u is
E(X − u | X > u) = (β + ξu)/(1 − ξ),
for a given u. This can be estimated by the empirical mean Ȳ_{Nu} of the sample of excesses (Yi)_{1≤i≤Nu}. Embrechts et al. (1997) warn us about the difficulty of choosing u, since there are many u for which the plot of (u, Ȳ_{Nu}) looks roughly linear.
Once we have found the threshold u, we can use conditional likelihood estimation on the sample (Yi)_{1≤i≤Nu}. Let τ be ξ/β. However we can also use a linear regression to fit the shape and the scale parameters. The maximum likelihood estimators solve the system
ξ̂ = (1/n) Σ_{i=1}^{n} log(1 + τ̂ Yi),
(1/ξ̂ + 1) (1/n) Σ_{i=1}^{n} τ̂ Yi/(1 + τ̂ Yi) = 1,
but the system may be unstable for ξ ≤ −1/2. When ξ > −1/2, we have the asymptotic normality
√n (ξ̂_n − ξ, β̂_n/β − 1) →_L N(0, M^{−1}),
where M denotes the Fisher information matrix.
Method of moments
From the properties, we know the theoretical expressions of E(X) and E(X F̄(X)). From these we get the relations
ξ = 2 − E(X)/(E(X) − 2E(X F̄(X))) and β = 2 E(X) E(X F̄(X))/(E(X) − 2E(X F̄(X))).
We simply replace E(X) and E(X F̄(X)) by their empirical estimators.
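A minimal method-of-moments sketch in R (function name ours; the empirical survival function is one of several reasonable plug-ins for F̄):

# GPD method of moments (nu = 0): plug empirical counterparts of E(X)
# and E(X * Fbar(X)) into the relations above.
gpd_mom <- function(x) {
  n <- length(x)
  a <- mean(x)
  b <- mean(x * (1 - rank(x) / (n + 1)))   # empirical E(X * Fbar(X))
  c(xi = 2 - a / (a - 2 * b), beta = 2 * a * b / (a - 2 * b))
}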
8.4.4 Random generation
The quantile function of the generalized Pareto distribution is
F^{−1}(u) = ν + (β/ξ)((1 − u)^{−ξ} − 1) if ξ ≠ 0,
F^{−1}(u) = ν − β log(1 − u) if ξ = 0,
thus we can use the inversion method to generate GPD variables.
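In R (function name ours):

# Inversion sampler for the GPD G_{xi, nu, beta}.
rgpd <- function(n, xi, nu = 0, beta = 1) {
  u <- runif(n)
  if (xi == 0) nu - beta * log(1 - u)
  else nu + beta * ((1 - u)^(-xi) - 1) / xi
}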
8.4.5
Applications
The main application of the generalized Pareto distribution is in extreme value theory, since there exists a link between the generalized Pareto distribution and the generalized extreme value distribution. Typical applications are the modelling of floods in hydrology, of natural disasters in insurance and of asset returns in finance.
8.5 Burr distribution
8.5.1 Characterization
[Figure: density functions for Burr distributions Burr(1,1,1), Burr(2,1,1), Burr(2,2,1), Burr(2,2,2)]
The Burr distribution Burr(α, λ, τ) is defined by the density function
f(x) = ατ (x/λ)^{τ−1} / (λ (1 + (x/λ)^τ)^{α+1}),
and the distribution function
F(x) = 1 − (1 + (x/λ)^τ)^{−α},
for x > 0.
8.5.2 Properties
The raw moments of the Burr distribution are given by
E(X^r) = λ^r Γ(1 + r/τ) Γ(α − r/τ) / Γ(α),
for −τ < r < ατ. Thus the variance is
Var(X) = λ² (Γ(1 + 2/τ)Γ(α − 2/τ)/Γ(α) − (Γ(1 + 1/τ)Γ(α − 1/τ)/Γ(α))²).
8.5.3 Estimation
The maximum likelihood estimators of α, λ and τ solve the system
n/α̂ = Σ_{i=1}^{n} log(1 + (Xi/λ̂)^{τ̂}),
n/τ̂ + Σ_{i=1}^{n} log(Xi/λ̂) = (α̂ + 1) Σ_{i=1}^{n} (Xi/λ̂)^{τ̂} log(Xi/λ̂) / (1 + (Xi/λ̂)^{τ̂}),
n/(α̂ + 1) = Σ_{i=1}^{n} (Xi/λ̂)^{τ̂} / (1 + (Xi/λ̂)^{τ̂}),
which has to be solved numerically.
8.5.4 Random generation
From the quantile function F^{−1}(u) = λ ((1 − u)^{−1/α} − 1)^{1/τ}, it is easy to generate Burr random variates with λ (U^{−1/α} − 1)^{1/τ}, where U is a uniform variable.
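In R (function name ours):

# Burr sampler by inversion, using the quantile function above.
rburr <- function(n, alpha = 1, lambda = 1, tau = 1) {
  u <- runif(n)
  lambda * (u^(-1 / alpha) - 1)^(1 / tau)
}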
8.5.5
Applications
NEED REFERENCE
8.6 Inverse Burr distribution
8.6.1 Characterization
[Figure: density functions for inverse Burr distributions InvBurr(1,1,1,0), InvBurr(1,2,1,0), InvBurr(2,2,1,0), InvBurr(1,2,2,0)]
The inverse Burr distribution is defined by the density function
f(x) = ατ λ^τ (x − μ)^{τα−1} / (λ^τ + (x − μ)^τ)^{α+1},
and the distribution function
F(x) = ((x − μ)^τ / (λ^τ + (x − μ)^τ))^{α},
for x ≥ μ. Here it is also clear that this is the inverse Burr distribution, since we recognize the survival function of the Burr distribution taken at 1/x (up to location and scale). We denote the inverse Burr distribution by IB(α, λ, τ, μ).
8.6.2 Properties
The raw moments of the inverse Burr distribution are given by
E(X^r) = λ^r Γ(α + r/τ) Γ(1 − r/τ) / Γ(α),
when μ = 0 and τ > r. Thus the expectation and the variance are
E(X) = μ + λ Γ(α + 1/τ)Γ(1 − 1/τ)/Γ(α)
and
Var(X) = λ² Γ(α + 2/τ)Γ(1 − 2/τ)/Γ(α) − λ² Γ²(α + 1/τ)Γ²(1 − 1/τ)/Γ²(α).
8.6.3
Estimation
Working with the transformed sample Yi = (Xi − μ̂)/λ̂, the maximum likelihood estimators of α, τ and λ solve the system
n/α̂ = Σ_{i=1}^{n} log(1 + Yi^{−τ̂}),
n/τ̂ + α̂ Σ_{i=1}^{n} log Yi = (α̂ + 1) Σ_{i=1}^{n} log(Yi) / (1 + Yi^{−τ̂}),
n α̂/(α̂ + 1) = Σ_{i=1}^{n} 1/(1 + Yi^{−τ̂}),
which has to be solved numerically.
8.6.4 Random generation
From the quantile function F^{−1}(u) = μ + λ (u^{−1/α} − 1)^{−1/τ}, we can use the inversion method to generate inverse Burr random variates.
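In R (function name ours), assuming the quantile function just derived:

# Inverse Burr sampler by inversion.
rinvburr <- function(n, mu = 0, lambda = 1, tau = 1, alpha = 1) {
  u <- runif(n)
  mu + lambda * (u^(-1 / alpha) - 1)^(-1 / tau)
}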
8.6.5 Applications
NEED REFERENCE
8.7 Beta type II distribution
8.7.1 Characterization
There are many ways to characterize the beta type II distribution. First we can say it is the distribution of X/(1 − X) when X is beta I distributed. But this is also the distribution of the ratio U/V when U and V are gamma distributed (G(a, 1) and G(b, 1) respectively). The distribution function of the beta of the second kind is given by
F(x) = β(a, b, x/(1 + x)) / β(a, b),
for x ≥ 0. The main difference with the beta I distribution is that the beta II distribution takes values in R+ and not [0, 1].
The density can be expressed as
f(x) = x^{a−1} / (β(a, b)(1 + x)^{a+b}),
or equivalently as
f(x) = (x/(1 + x))^{a−1} (1 − x/(1 + x))^{b−1} / (β(a, b)(1 + x)²).
8.7.2 Properties
The expectation and the variance of the beta II are given by E(X) = a/(b − 1) and Var(X) = a(a + b − 1)/((b − 1)²(b − 2)), when b > 1 and b > 2 respectively. Raw moments are expressed as follows
E(X^r) = Γ(a + r) Γ(b − r) / (Γ(a) Γ(b)),
for b > r.
8.7.3 Estimation
The maximum likelihood estimators of a and b solve the system
ψ(a) − ψ(a + b) = −(1/n) Σ_{i=1}^{n} (log(1 + Xi) − log(Xi)),
ψ(b) − ψ(a + b) = −(1/n) Σ_{i=1}^{n} log(1 + Xi),
where ψ denotes the digamma function. We may also use the moment based estimators given by
b̂ = 2 + X̄_n(X̄_n + 1)/S²_n and â = (b̂ − 1) X̄_n,
which have the drawback that b̂ is always greater than 2.
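A direct R transcription of the moment based estimators (function name ours):

# Beta II moment estimators: b_hat = 2 + xbar*(xbar+1)/s2, a_hat = (b_hat-1)*xbar.
beta2_mom <- function(x) {
  xbar <- mean(x); s2 <- var(x)
  b <- 2 + xbar * (xbar + 1) / s2
  c(a = (b - 1) * xbar, b = b)
}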
8.7.4 Random generation
We can simply use the construction of the beta II, i.e. the ratio X/(1 − X) where X is a beta I variable. However we may also use the ratio of two gamma variables.
8.7.5 Applications
NEED REFERENCE
Chapter 9
Logistic distribution
9.1 Logistic distribution
9.1.1 Characterization
The logistic distribution is characterized by the distribution function
F(x) = 1 / (1 + e^{−(x−μ)/s}),
where μ is a location parameter and s a scale parameter.
9.1.2
Properties
TODO
9.1.3
Estimation
TODO
9.1.4
Random generation
TODO
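In the meantime, a minimal inversion sketch in R (function name ours; base R also provides rlogis), using the quantile function F^{−1}(u) = μ + s log(u/(1 − u)) implied by the distribution function above:

# Logistic sampler by inversion.
rlogistic_inv <- function(n, mu = 0, s = 1) {
  u <- runif(n)
  mu + s * log(u / (1 - u))
}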
9.1.5 Applications
9.2
9.2.1
Characterization
9.2.2
Properties
9.2.3
Estimation
9.2.4
Random generation
9.2.5
Applications
9.3
9.3.1
Characterization
9.3.2
Properties
9.3.3
Estimation
9.3.4
Random generation
9.3.5
Applications
9.4
9.4.1
Characterization
9.4.2
Properties
9.4.3
Estimation
9.4.4
Random generation
9.4.5
Applications
9.5
Paralogistic distribution
Chapter 10
Gumbel distribution
10.1 Gumbel distribution
10.1.1 Characterization
[Figure: density functions for Gumbel distributions Gum(0,1), Gum(1/2,1), Gum(0,1/2), Gum(-1,2)]
The standard Gumbel distribution (type I) is defined by the density function
f(x) = e^{−x} e^{−e^{−x}},
and the distribution function
F(x) = e^{−e^{−x}},
for x ∈ R. Adding a location parameter μ and a scale parameter σ, the density becomes
f(x) = (1/σ) e^{−(x−μ)/σ} e^{−e^{−(x−μ)/σ}},
with distribution function F(x) = e^{−e^{−(x−μ)/σ}}.
The characteristic function of the Gumbel distribution of the first kind exists:
φ(t) = Γ(1 − iσt) e^{iμt},
while its moment generating function is
M(t) = Γ(1 − σt) e^{μt}.
10.1.2 Properties
The expectation of a standard Gumbel type I distribution is E(X) = γ, the Euler constant, roughly 0.57721. Its variance is Var(X) = π²/6. Thus for the Fisher-Tippett distribution (the location-scale version), we have E(X) = μ + σγ and Var(X) = σ²π²/6.
For the Gumbel type II, the expectation exists if a > 1 and the variance if a > 2.
10.1.3 Estimation
The maximum likelihood estimators μ̂ and σ̂ solve the system
(1/n) Σ_{i=1}^{n} e^{−(Xi−μ̂)/σ̂} = 1,
X̄_n − σ̂ = (1/n) Σ_{i=1}^{n} Xi e^{−(Xi−μ̂)/σ̂},
which has to be solved numerically. The method of moments gives the estimators
σ̂ = √(6 S²_n)/π and μ̂ = X̄_n − γ σ̂,
where γ is the Euler constant.
10.1.4 Random generation
The quantile function of the Gumbel I distribution is simply F^{−1}(u) = μ − σ log(− log(u)), thus we can use the inversion method.
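In R (function name ours):

# Gumbel sampler by inversion: F^{-1}(u) = mu - sigma * log(-log(u)).
rgumbel <- function(n, mu = 0, sigma = 1) {
  mu - sigma * log(-log(runif(n)))
}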
10.1.5
Applications
The Gumbel distribution is widely used in natural catastrophe modelling, especially for maximum flood levels. NEED REFERENCE
10.2 Fréchet distribution
A Fréchet-type distribution is defined by the distribution function
F(x) = e^{−((x−μ)/σ)^{−α}},
for x ≥ μ. One can notice this is the inverse Weibull distribution, see section 5.12 for details.
10.3 Weibull distribution
The Weibull distribution of extreme value theory is defined by the distribution function
F(x) = e^{−(−(x−μ)/σ)^{α}},
for x ≤ μ.
10.4 Generalized extreme value distribution
10.4.1 Characterization
The generalized extreme value distribution is defined by the following distribution function
F(x) = e^{−(1 + ξ(x−μ)/σ)^{−1/ξ}},
for 1 + ξ(x−μ)/σ > 0, with ξ the shape parameter, μ the location parameter and σ > 0 the scale parameter. We can derive a density function
f(x) = (1/σ)(1 + ξ(x−μ)/σ)^{−1/ξ − 1} e^{−(1 + ξ(x−μ)/σ)^{−1/ξ}}.
The distribution is denoted by H_{ξ,μ,σ}; the limiting case ξ → 0 is the Gumbel distribution H_{0,μ,σ}(x) = e^{−e^{−(x−μ)/σ}}.
10.4.2 Properties
The expectation and the variance are
E(X) = μ + (σ/ξ)(Γ(1 − ξ) − 1) and Var(X) = (σ/ξ)² (Γ(1 − 2ξ) − Γ²(1 − ξ)),
if they exist.
From extreme value theory, we have the following theorem. Let (Xi)_{1≤i≤n} be an i.i.d. sample and Xi:n the order statistics. If there exist two sequences (an)n and (bn)n valued in R+ and R respectively, such that (Xn:n − bn)/an has a limit in probability distribution, then the limiting distribution H for the maximum belongs to the type of one of the following three distribution functions*:
H(x) = e^{−x^{−α}}, x ≥ 0, α > 0 (MDA of Fréchet),
H(x) = e^{−(−x)^{−α}}, x ≤ 0, α < 0 (MDA of Weibull),
H(x) = e^{−e^{−x}}, x ∈ R, α = 0 (MDA of Gumbel),
where MDA stands for maximum domain of attraction. For every distribution, there is a unique MDA. We quickly see that the limiting distribution for the maximum is nothing else than the generalized extreme value distribution H_{ξ,0,1}. This theorem is the Fisher-Tippett-Gnedenko theorem.
For the minimum, assuming that P((X1:n − bn)/an ≤ x) has a limit, the limiting distribution H̃ belongs to the type of one of the following three distribution functions**:
H̃(x) = 1 − e^{−x^{α}}, x ≥ 0, α > 0,
H̃(x) = 1 − e^{−(−x)^{−α}}, x ≤ 0, α < 0,
H̃(x) = 1 − e^{−e^{x}}, x ∈ R, α = 0.
Again there are three types for the limiting distribution.
In the MDA of Fréchet, we have the Cauchy, the Pareto, the Burr, the log-gamma and the stable distributions, while in the Weibull MDA we retrieve the uniform, the beta and bounded-support power law distributions. Finally, the MDA of Gumbel contains the exponential, the Weibull, the gamma, the normal, the lognormal and the Benktander distributions.
From Embrechts et al. (1997), we also have some equivalences given a MDA:
a distribution function F belongs to the MDA of Fréchet if and only if 1 − F(x) = x^{−α} L(x) for some slowly varying function L,
a distribution function F belongs to the MDA of Weibull if and only if 1 − F(xF − 1/x) = x^{−α} L(x) for some slowly varying function L and xF < +∞,
a distribution function F belongs to the MDA of Gumbel if and only if there exists z < xF such that 1 − F(x) = c(x) e^{−∫_z^x g(t)/a(t) dt} for some auxiliary function a (and suitable functions c and g).
* Sometimes the distribution characterized by the distribution function e^{−e^{−x}} is called the extreme maximal-value distribution.
** Sometimes the distribution characterized by the distribution function 1 − e^{−e^{x}} is called the extreme minimal-value distribution.
10.4.3 Estimation
According to Embrechts et al. (1997), maximum likelihood estimation is not very reliable for fitting the generalized extreme value distribution. But that's not surprising, since the generalized extreme value distribution is a limiting distribution for very heterogeneous distributions, whether heavy tailed, light tailed or bounded.
We can use the weighted moment method, where we estimate the moments
ω_r(ξ, μ, σ) = E(X H^r_{ξ,μ,σ}(X))
by
ω̂_r = (1/n) Σ_{j=1}^{n} X_{j:n} U^r_{j:n},
where the U_{j:n} are the order statistics of a uniform sample (which can be replaced by their expectations, leading to the weights ((n−r−1)! (n−j)!)/((n−1)! (n−j−r)!)). Equating the theoretical and the empirical moments, we get that ξ̂ is a solution of
(3ω̂_2 − ω̂_0)/(2ω̂_1 − ω̂_0) = (3^ξ − 1)/(2^ξ − 1).
Then we estimate the other parameters with
σ̂ = (2ω̂_1 − ω̂_0) ξ̂ / (Γ(1 − ξ̂)(2^{ξ̂} − 1)) and μ̂ = ω̂_0 + (σ̂/ξ̂)(1 − Γ(1 − ξ̂)).
Random generation
The quantile function of the generalized extreme value distribution is F 1 (u) = + (( log u) )
1 for 6= 0. So we can use the inverse function method.
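In R (function name ours):

# GEV sampler by inversion; xi = 0 falls back to the Gumbel limit.
rgev <- function(n, xi, mu = 0, sigma = 1) {
  u <- runif(n)
  if (xi == 0) mu - sigma * log(-log(u))
  else mu + sigma / xi * ((-log(u))^(-xi) - 1)
}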
10.4.5 Applications
The obvious application of the generalized extreme value distribution is extreme value theory, which is relevant in many fields: natural disaster modelling, insurance/finance extreme risk management, etc.
10.5
Part III
Chapter 11
Generalization of common distributions
11.1 Generalized hyperbolic distributions
11.1.1 Characterization
The first way to characterize generalized hyperbolic (GH) distributions is to say that the random vector X follows a multivariate GH distribution if
X =_L μ + Wγ + √W AZ, (11.1)
where
1. Z ~ N_k(0, I_k),
2. A ∈ R^{d×k},
3. μ, γ ∈ R^d,
4. W ≥ 0 is a scalar-valued random variable which is independent of Z and has a Generalized Inverse Gaussian distribution, written GIG(λ, χ, ψ).
Note that there are at least five alternative definitions leading to different parametrizations.
Nevertheless, the parameters of a GH distribution given by the above definition admit the following interpretation:
λ, χ, ψ determine the shape of the distribution, that is, how much weight is assigned to the tails and to the center. In general, the larger those parameters the closer the distribution is to the normal distribution.
μ is the location parameter, Σ = AA′ the dispersion matrix and γ the skewness parameter. Conditionally on the mixing variable W, we have
X | W ~ N_d(μ + Wγ, WΣ). (11.2)
Another way to define a generalized hyperbolic distribution is to use the density. Since the conditional distribution of X given W is Gaussian with mean μ + Wγ and variance WΣ, the GH density can be found by mixing X|W with respect to W:
f_X(x) = ∫_0^{+∞} f_{X|W}(x|w) f_W(w) dw (11.3)
       = ∫_0^{+∞} (e^{γ′Σ^{−1}(x−μ)} / ((2π)^{d/2} |Σ|^{1/2} w^{d/2})) exp(−Q(x)/(2w) − w γ′Σ^{−1}γ/2) f_W(w) dw
       = ((ψ/χ)^{λ/2} (ψ + γ′Σ^{−1}γ)^{d/2−λ} / ((2π)^{d/2} |Σ|^{1/2} K_λ(√(χψ)))) K_{λ−d/2}(√((χ + Q(x))(ψ + γ′Σ^{−1}γ))) e^{(x−μ)′Σ^{−1}γ} / (√((χ + Q(x))(ψ + γ′Σ^{−1}γ)))^{d/2−λ},
where K_λ(·) denotes the modified Bessel function of the third kind and Q(x) denotes the Mahalanobis distance Q(x) = (x − μ)′Σ^{−1}(x − μ) (i.e. the distance with Σ^{−1} as norm). The domain of variation of the parameters λ, χ and ψ is given in section 11.1.2.
A last way to characterize generalized hyperbolic distributions is the usage of moment generating functions. An appealing property of normal mixtures is that the moment generating function is easily calculated once the moment generating function of the mixture is known. Based on equation (11.4) we obtain the moment generating function of a GH distributed random variable X as
M(t) = E(E(e^{t′X} | W)) = e^{t′μ} E(exp(W(t′γ + t′Σt/2)))
     = e^{t′μ} (ψ/(ψ − 2t′γ − t′Σt))^{λ/2} K_λ(√(χ(ψ − 2t′γ − t′Σt))) / K_λ(√(χψ)), ψ ≥ 2t′γ + t′Σt.
For moment generating functions of the special cases of the GH distribution we refer to Prause (1999) and Paolella (2007).
11.1.2
Parametrization
There are several alternative parametrizations for the GH distribution. In the R package ghyp the
user can choose between three of them. There exist further parametrizations which are not implemented and not mentioned here. For these parametrizations we refer to Prause (1999) and Paolella
(2007).
Table 11.1 describes the parameter ranges for each parametrization and each special case. Clearly, the dispersion matrices Σ and Δ have to fulfill the usual conditions for covariance matrices, i.e., symmetry and positive definiteness, as well as full rank.
(λ, χ, ψ, μ, Σ, γ)-parametrization:
ghyp:  λ ∈ R,          χ > 0,  ψ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
hyp:   λ = (d+1)/2,    χ > 0,  ψ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
NIG:   λ = −1/2,       χ > 0,  ψ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
t:     λ < 0,          χ > 0,  ψ = 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
VG:    λ > 0,          χ = 0,  ψ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d

(λ, ᾱ, μ, Σ, γ)-parametrization:
ghyp:  λ ∈ R,          ᾱ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
hyp:   λ = (d+1)/2,    ᾱ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
NIG:   λ = −1/2,       ᾱ > 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
t:     λ = −ν/2 < −1,  ᾱ = 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d
VG:    λ > 0,          ᾱ = 0,  μ ∈ R^d,  Σ ∈ A,  γ ∈ R^d

(λ, α, μ, Δ, δ, β)-parametrization:
ghyp:  λ ∈ R,          α > 0,  δ > 0,  μ ∈ R^d,  Δ ∈ Â,  β ∈ {x ∈ R^d : α² − x′Δx > 0}
hyp:   λ = (d+1)/2,    α > 0,  δ > 0,  μ ∈ R^d,  Δ ∈ Â,  β ∈ {x ∈ R^d : α² − x′Δx > 0}
NIG:   λ = −1/2,       α > 0,  δ > 0,  μ ∈ R^d,  Δ ∈ Â,  β ∈ {x ∈ R^d : α² − x′Δx > 0}
t:     λ < 0,          α = 0,  δ > 0,  μ ∈ R^d,  Δ ∈ Â,  β ∈ R^d
VG:    λ > 0,          α > 0,  δ = 0,  μ ∈ R^d,  Δ ∈ Â,  β ∈ {x ∈ R^d : α² − x′Δx > 0}

Table 11.1: The domain of variation for the parameters of the GH distribution and some of its special cases for different parametrizations. We denote the set of all feasible covariance matrices in R^{d×d} by A; furthermore, let Â = {A ∈ A : |A| = 1}.
(λ, χ, ψ, μ, Σ, γ)-parametrization
(λ, ᾱ, μ, Σ, γ)-parametrization
There is a more elegant way to eliminate the degree of freedom. We simply constrain the expected value of the generalized inverse Gaussian distributed mixing variable W to be 1 (cf. 4.5). This makes the interpretation of the skewness parameter γ easier and, in addition, the fitting procedure becomes faster (cf. 11.1.5).
We define
E(W) = √(χ/ψ) K_{λ+1}(√(χψ)) / K_λ(√(χψ)) = 1, (11.4)
and set
ᾱ = √(χψ). (11.5)
It follows that
χ = ᾱ K_λ(ᾱ)/K_{λ+1}(ᾱ) and ψ = ᾱ K_{λ+1}(ᾱ)/K_λ(ᾱ). (11.6)
(λ, α, μ, Δ, δ, β)-parametrization
When the GH distribution was introduced in Barndorff-Nielsen (1977), the following parametrization for the multivariate case was used:
f_X(x) = ((α² − β′Δβ)^{λ/2} / ((2π)^{d/2} |Δ|^{1/2} δ^λ α^{λ−d/2} K_λ(δ√(α² − β′Δβ)))) K_{λ−d/2}(α√(δ² + (x−μ)′Δ^{−1}(x−μ))) e^{β′(x−μ)} / (√(δ² + (x−μ)′Δ^{−1}(x−μ)))^{d/2−λ}, (11.7)
where the determinant of Δ is constrained to be 1. In the univariate case the above expression reduces to
f_X(x) = ((α² − β²)^{λ/2} / (√(2π) δ^λ α^{λ−1/2} K_λ(δ√(α² − β²)))) K_{λ−1/2}(α√(δ² + (x−μ)²)) e^{β(x−μ)} / (√(δ² + (x−μ)²))^{1/2−λ}. (11.8)
Switching between the parametrizations works as follows.
(λ, ᾱ, μ, Σ, γ) → (λ, χ, ψ, μ, Σ, γ): Use the relations in (11.6) to obtain χ and ψ. The parameters μ, Σ and γ remain the same.
(λ, χ, ψ, μ, Σ, γ) → (λ, ᾱ, μ, Σ, γ): Set k = √(χ/ψ) K_{λ+1}(√(χψ))/K_λ(√(χψ)). Then
ᾱ = √(χψ), Σ ← kΣ, γ ← kγ. (11.9)
(λ, χ, ψ, μ, Σ, γ) → (λ, α, μ, Δ, δ, β):
Δ = |Σ|^{−1/d} Σ, β = Σ^{−1}γ, δ = √(χ |Σ|^{1/d}), α = √(|Σ|^{−1/d}(ψ + γ′Σ^{−1}γ)). (11.10)
(λ, α, μ, Δ, δ, β) → (λ, χ, ψ, μ, Σ, γ):
Σ = Δ, γ = Δβ, χ = δ², ψ = α² − β′Δβ. (11.11)
11.1.3 Properties
Moments
The expected value and the variance are given by
E(X) = μ + E(W)γ, (11.12)
Var(X) = E(Cov(X|W)) + Cov(E(X|W)) (11.13)
       = Var(W) γγ′ + E(W) Σ.
Linear transformation
The GH class is closed under linear transformations: if X ~ GH_d(λ, χ, ψ, μ, Σ, γ) and Y = BX + b, where B ∈ R^{k×d} and b ∈ R^k, then Y ~ GH_k(λ, χ, ψ, Bμ + b, BΣB′, Bγ). Observe that by introducing a new skewness parameter β = Σ^{−1}γ, all the shape and skewness parameters (λ, χ, ψ, β) become location and scale-invariant, provided the transformation does not affect the dimensionality, that is B ∈ R^{d×d} and b ∈ R^d.
11.1.4 Special cases
The GH distribution contains several special cases known under special names.
If λ = (d+1)/2, the name generalized is dropped and we have a multivariate hyperbolic (hyp) distribution. The univariate margins are still GH distributed. Inversely, when λ = 1 we get a multivariate GH distribution with hyperbolic margins.
If λ = −1/2, the distribution is called Normal Inverse Gaussian (NIG).
11.1.5 Estimation
Numerical optimizers can be used to fit univariate GH distributions to data by means of maximum likelihood estimation. Multivariate GH distributions can be fitted with expectation-maximization (EM) type algorithms (see Dempster et al. (1977) and Meng & Rubin (1993)).
EM scheme
Assume we have i.i.d. data x1, ..., xn and parameters represented by Θ = (λ, ᾱ, μ, Σ, γ). The problem is to maximize
ln L(Θ; x1, ..., xn) = Σ_{i=1}^{n} ln f_X(xi; Θ). (11.14)
This problem is not easy to solve due to the number of parameters and the necessity of maximizing over covariance matrices. We can proceed by introducing an augmented likelihood function
ln L̃(Θ; x1, ..., xn, w1, ..., wn) = Σ_{i=1}^{n} ln f_{X|W}(xi | wi; μ, Σ, γ) + Σ_{i=1}^{n} ln f_W(wi; λ, ᾱ) (11.15)
and spend the effort on the estimation of the latent mixing variables wi coming from the mixture representation (11.2). This is where the EM algorithm comes into play.
E-step: Calculate the conditional expectation of the likelihood function (11.15) given the data x1, ..., xn and the current estimates of parameters Θ[k]. This results in the objective function
Q(Θ; Θ[k]) = E(ln L̃(Θ; x1, ..., xn, W1, ..., Wn) | x1, ..., xn; Θ[k]). (11.16)
The part of (11.15) coming from the GIG mixing density is, up to constants,
(λ/2) ln(ψ/χ) − ln(2 K_λ(√(χψ))) + (λ − 1) ln w − χ/(2w) − (ψ/2) w. (11.17)
Since the wi are latent, one has to replace w, 1/w and ln w with the respective expected values in order to maximize the log-likelihood function. Let
η_i[k] := E(wi | xi; Θ[k]), δ_i[k] := E(wi^{−1} | xi; Θ[k]), ξ_i[k] := E(ln wi | xi; Θ[k]). (11.18)
We have to find the conditional density of wi given xi to calculate these quantities.
MCECM estimation
In the R implementation a modified EM scheme is used, which is called the multi-cycle, expectation, conditional estimation (MCECM) algorithm (Meng & Rubin 1993, McNeil, Frey & Embrechts 2005a). The different steps of the MCECM algorithm are sketched as follows:
(1) Select reasonable starting values for Θ[k]. For example λ = 1, ᾱ = 1, μ is set to the sample mean, Σ to the sample covariance matrix and γ to a zero skewness vector.
(2) Calculate χ[k] and ψ[k] as a function of ᾱ[k] using (11.6).
(3) Use (11.18) to calculate the weights η_i[k] and δ_i[k], as well as the averages
δ̄[k] = (1/n) Σ_{i=1}^{n} δ_i[k] and η̄[k] = (1/n) Σ_{i=1}^{n} η_i[k]. (11.19)
(4) Update the skewness parameter:
γ[k+1] = (1/n) Σ_{i=1}^{n} δ_i[k] (x̄ − xi) / (δ̄[k] η̄[k] − 1). (11.20)
(5) Update location and dispersion:
μ[k+1] = ((1/n) Σ_{i=1}^{n} δ_i[k] xi − γ[k+1]) / δ̄[k], (11.21)
Σ[k+1] = (1/n) Σ_{i=1}^{n} δ_i[k] (xi − μ[k+1])(xi − μ[k+1])′ − η̄[k] γ[k+1] γ[k+1]′. (11.22)
(6) Recalculate the weights δ_i[k,2], η_i[k,2] and ξ_i[k,2] using (11.18) and the updated parameters.
(7) Maximize the second summand of (11.15) with density (11.17) with respect to λ, χ and ψ to complete the calculation of Θ[k,2], and go back to step (2). Note that the objective function must calculate χ and ψ in dependence of λ and ᾱ using relation (11.6).
11.1.6 Random generation
The mixture representation provides a direct algorithm: simulate W ~ GIG(λ, χ, ψ) and Z ~ N_k(0, I_k), then return
X = μ + Wγ + √W AZ,
where the matrix A is such that Σ = AA′.
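A short R sketch (function name ours), assuming the rgig sampler exported by the ghyp package cited above for the mixing variable:

# GH sampling via the normal mean-variance mixture X = mu + W*gamma + sqrt(W)*A*Z.
rgh <- function(n, lambda, chi, psi, mu, A, gamma) {
  w <- ghyp::rgig(n, lambda = lambda, chi = chi, psi = psi)  # W ~ GIG
  z <- matrix(rnorm(ncol(A) * n), nrow = ncol(A))            # Z ~ N_k(0, I_k)
  x <- mu + gamma %o% w + sweep(A %*% z, 2, sqrt(w), "*")    # d x n matrix
  t(x)                                                       # one row per draw
}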
11.1.7 Applications
Even though the GH distribution was initially invented to study the distribution of the logarithm of particle sizes, we will focus on applications of the GH distribution family in finance and risk measurement.
We have seen above that the GH distribution is very flexible in the sense that it nests several other distributions such as the Student t (cf. 7.1).
To give some references and applications of the GH distribution, let us first summarize some of its important properties. Besides the above mentioned flexibility, three major facts led to the popularity of the GH distribution family in finance:
(1) The GH distribution features both fat tails and skewness. These properties account for the stylized facts of financial returns.
(2) The GH family is naturally extended to multivariate distributions*. A multivariate GH distribution does exhibit some kind of non-linear dependence, for example tail dependence. This reflects the fact that extremes mostly occur for several risk drivers simultaneously in financial markets. This property is of fundamental importance for risk management, and can influence for instance the asset allocation in portfolio theory.
(3) The GH distribution is infinitely divisible (cf. Barndorff-Nielsen & Halgreen (1977)). This is a necessary and sufficient condition to build Lévy processes. Lévy processes are widespread in finance because of their time-continuity and their ability to model jumps.
Based on these properties one can classify the applications of the GH distributions into the fields empirical modelling, risk and dependence modelling, derivative pricing, and portfolio selection.
In the following, we try to assign papers to each of the classes of applications mentioned above. Rather than giving abstracts for each paper, we simply cite them and refer the interested reader to the bibliography and to the articles. Note that some articles deal with special cases of the GH distribution only.
Empirical modelling: Eberlein & Keller (1995), Barndorff-Nielsen & Prause (2001), Fergusson & Platen (2006).
Risk and dependence modelling: Eberlein et al. (1998), Breymann et al. (2003), McNeil et al. (2005b), Chen et al. (2005), Kassberger & Kiesel (2006).
Lévy processes: Barndorff-Nielsen (1997a,b), Bibby & Sorensen (1997), Madan et al. (1998), Raible (2000), Cont & Tankov (2003).
* The extension to multivariate distributions is natural because of the mixing structure (see eq. (11.2)).
11.2 Stable distribution
A detailed and complete review of stable distributions can be found in Nolan (2009).
11.2.1 Characterization
X has a stable distribution if for all positive constants a and b there exist a positive constant c and a constant d such that
aX̃ + bX̂ =_L cX + d,
where X̃ and X̂ are independent copies of the random variable X. This equation means stable distributions are distributions closed under linear combinations. For the terminology, we say X is strictly stable if d = 0, and symmetric stable if in addition we have X =_L −X. From Nolan (2009), we learn we use the word stable since the shape of the distribution is preserved under linear combinations.
Another way to define stable distributions is to use characteristic functions. X has a stable distribution if and only if its characteristic function is
φ(t) = e^{iδt − γ^α|t|^α (1 − iβ tan(πα/2) sign(t))} if α ≠ 1,
φ(t) = e^{iδt − γ|t| (1 + iβ (2/π) sign(t) log|t|)} if α = 1,
where α ∈ ]0, 2], β ∈ ]−1, 1[, γ > 0 and δ ∈ R are the parameters. In the following, we denote a stable distribution by S(α, β, γ, δ), where δ is a location parameter, γ a scale parameter, α an index of stability and β a skewness parameter. This corresponds to the parametrization 1 of Nolan (2009).
We know that stable distributions S(α, β, γ, δ) are continuous distributions whose support is
[δ, +∞[ if α < 1 and β = 1,
]−∞, δ] if α < 1 and β = −1,
]−∞, +∞[ otherwise.
11.2.2 Properties
If we work with standard stable distributions S(α, β, 1, 0), we have the reflection property: if X ~ S(α, β, 1, 0), then −X ~ S(α, −β, 1, 0). This implies the following constraints on the density and the distribution function:
f_X(x; α, β) = f_X(−x; α, −β) and F_X(x; α, β) = 1 − F_X(−x; α, −β).
From the definition, we have the obvious property on affine transformations. If X follows a stable distribution S(α, β, γ, δ), then aX + b follows a stable distribution with parameters
S(α, sign(a)β, |a|γ, aδ + b) if α ≠ 1,
S(1, sign(a)β, |a|γ, aδ + b − (2/π) β γ a log|a|) if α = 1.
Furthermore if X1 and X2 are independent and follow stable distributions S(α, βi, γi, δi) for i = 1, 2, then the sum X1 + X2 follows a stable distribution S(α, β, γ, δ) with
β = (β1 γ1^α + β2 γ2^α)/(γ1^α + γ2^α), γ = (γ1^α + γ2^α)^{1/α} and δ = δ1 + δ2.
11.2.3 Special cases
The normal distribution N(μ, σ²), with density e^{−(x−μ)²/(2σ²)}/(σ√(2π)), corresponds to S(2, 0, σ/√2, μ); the Cauchy distribution, with density (1/π) σ/(σ² + (x−μ)²), corresponds to S(1, 0, σ, μ); and the Lévy distribution, with density √(σ/(2π)) e^{−σ/(2(x−μ))}/(x−μ)^{3/2}, corresponds to S(1/2, 1, σ, μ).
11.2.4 Estimation
NEED REFERENCE
11.2.5 Random generation
Simulation of stable distributions is carried out by the following algorithm from Chambers et al. (1976). Let Θ be a uniform random variable U(−π/2, π/2) and W an exponential variable with mean 1, independent from Θ. For 0 < α ≤ 2, we have:
in the symmetric case,
Z = (sin(αΘ)/cos(Θ)^{1/α}) (cos((α − 1)Θ)/W)^{(1−α)/α}
follows a stable distribution S(α, 0, 1, 0), with the limiting case tan(Θ) when α → 1;
in the nonsymmetric case,
Z = (sin(α(Θ + θ0))/(cos(αθ0) cos(Θ))^{1/α}) (cos(αθ0 + (α − 1)Θ)/W)^{(1−α)/α}, with θ0 = arctan(β tan(πα/2))/α,
follows a stable distribution S(α, β, 1, 0), with the limiting case
Z = (2/π)((π/2 + βΘ) tan(Θ) − β log((π/2) W cos(Θ)/(π/2 + βΘ)))
when α = 1.
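A direct R transcription for α ≠ 1 (function name ours):

# Chambers-Mallows-Stuck sampler for a standard stable S(alpha, beta, 1, 0),
# alpha != 1, following the formulas above.
rstable_cms <- function(n, alpha, beta = 0) {
  theta <- runif(n, -pi / 2, pi / 2)               # Theta ~ U(-pi/2, pi/2)
  w <- rexp(n)                                     # W ~ Exp(1)
  xi <- atan(beta * tan(pi * alpha / 2)) / alpha   # theta0
  sin(alpha * (theta + xi)) / (cos(alpha * xi) * cos(theta))^(1 / alpha) *
    (cos(alpha * xi + (alpha - 1) * theta) / w)^((1 - alpha) / alpha)
}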
11.2.6
Applications
NEED REFERENCE
11.3
Phase-type distribution
11.3.1
Characterization
A phase-type distribution PH(π, T, m) is the distribution of the time to absorption in a Markov jump process with m transient states, one absorbing state, initial distribution π and sub-intensity matrix T. Its survival function is
F̄(x) = π e^{Tx} 1_m,
where 1_m is the column vector of ones and the matrix exponential is defined by e^{Tx} = Σ_{n=0}^{+∞} T^n x^n/n!.
The computation of the matrix exponential is studied in detail in appendix A.3, but let us notice that when T is a diagonal matrix, the matrix exponential is the exponential of its diagonal terms.
Let us note that there also exist discrete phase-type distributions, cf. Bobbio et al. (2003).
11.3.2 Properties
The moments of a phase-type distribution are given by E(X^n) = (−1)^n n! π T^{−n} 1_m. Since phase-type distributions are platykurtic or light-tailed distributions, the Laplace transform exists:
f̂(s) = π (sI_m − T)^{−1} t_0,
where I_m denotes the identity matrix and t_0 = −T 1_m the exit rate vector. Recall that the full intensity matrix of the underlying Markov chain is a matrix whose row sums are equal to 0, with positive elements everywhere except on its diagonal.
11.3.3 Special cases
Here are some examples of distributions which can be represented by a phase-type distribution:
exponential distribution E(λ): π = 1, T = −λ and m = 1.
generalized Erlang distribution G(n, (λi)_{1≤i≤n}): π = (1, 0, ..., 0), m = n and
T = ( −λ1  λ1   0    ...  0
       0   −λ2  λ2   ...  0
       ...           ...
       0    ...  0  −λ_{n−1}  λ_{n−1}
       0    ...  0    0   −λn ).
a mixture of exponential distributions with parameters (pi, λi)_{1≤i≤n}: π = (p1, ..., pn), m = n and
T = ( −λ1   0   ...  0
       0   −λ2  ...  0
       ...           ...
       0    ...  0  −λn ).
a mixture of 2 (or k) Erlang distributions G(ni, λi)_{i=1,2} with weights pi: π = (p1, 0, ..., 0, p2, 0, ..., 0) (p1 in position 1 and p2 in position n1 + 1), m = n1 + n2 and T block diagonal with the two Erlang blocks
Ti = ( −λi  λi  ...  0
       ...          ...
        0   ... −λi  λi
        0   ...  0  −λi ).
11.3.4 Estimation
The estimation based on moments can provide starting points for the parameters, but according to Feldmann & Whitt (1996) the fit is very poor. Feldmann & Whitt (1996) propose a recursive algorithm matching theoretical quantiles and empirical quantiles. They illustrate their method by approximating the Weibull and the Pareto distributions with a mixture of exponential distributions.
First Asmussen et al. (1996) and then Lee & Lin (2008) fit phase-type distributions with the EM algorithm. Lee & Lin (2008) also investigate goodness of fit and graphical comparison of the fit. Lee & Lin (2008) focus on mixtures of Erlang distributions while Asmussen et al. (1996) provide an algorithm for general phase-type distributions. Lee & Lin (2008) illustrate their algorithm with uniform, Weibull, Pareto and log-normal distributions.
11.3.5 Random generation
From Neuts (1981), we have the following algorithm to generate phase-type distributed random variates. Let S be the state of the underlying Markov chain, with state 0 the absorbing state and Λ the full intensity matrix:
S initialized from the discrete distribution characterized by π,
X initialized to 0,
while S ≠ 0 do
generate U from a uniform distribution,
X = X − log(U)/λ_S, where λ_S = −Λ_{S,S} is the rate of departure from state S,
generate the next state from the transition probabilities of the embedded chain
P_{ij} = 1 if i = j = 0, P_{ij} = 0 if i = 0 and j ≠ 0, P_{ij} = 0 if i > 0 and j = i, P_{ij} = −Λ_{ij}/Λ_{ii} if i > 0 and j ≠ i.
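A compact R sketch of this walk (function name ours; prob is π, Tm the sub-intensity matrix):

# Phase-type sampler: sum exponential sojourn times until absorption.
rphtype <- function(n, prob, Tm) {
  m <- length(prob)
  t0 <- -rowSums(Tm)                      # exit rates t0 = -T %*% 1
  one <- function() {
    s <- sample.int(m, 1, prob = prob)    # initial state from pi
    x <- 0
    while (s != 0) {
      rate <- -Tm[s, s]
      x <- x + rexp(1, rate)              # sojourn time in state s
      states <- c(setdiff(seq_len(m), s), 0)
      idx <- sample.int(length(states), 1, prob = c(Tm[s, -s], t0[s]))
      s <- states[idx]
    }
    x
  }
  replicate(n, one())
}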
11.3.6
Applications
NEED REFERENCE
11.4
Exponential family
11.4.1
Characterization
Clark & Thayer (2004) define the exponential family by the following density or mass probability function
f(x) = e^{d(θ)e(x) + g(θ) + h(x)},
where d, e, g and h are known functions and θ the vector of parameters. Let us note that the support of the distribution can be R, R+ or N. This form for the exponential family is called the natural form.
When we deal with generalized linear models, we use the natural form of the exponential family, which is
f(x) = e^{(xθ − b(θ))/a(φ) + c(x, φ)},
where a, b, c are known functions and θ, φ denote the parameters. This form is derived from the previous one by setting d(θ) = θ, e(x) = x and adding a dispersion parameter φ.
Let μ be the mean of the variable of an exponential family distribution. We have μ = τ(θ) since φ is only a dispersion parameter, where τ denotes the mean function. The mean value form of the exponential family is
f(x) = e^{(xτ^{−1}(μ) − b(τ^{−1}(μ)))/a(φ) + c(x, φ)}.
11.4.2 Properties
For the exponential family, we have E(X) = μ = b′(θ) and Var(X) = a(φ)V(μ) = a(φ)b′′(θ), where V is the unit variance function. The skewness is given by
γ3(X) = dV/dμ(μ) √(a(φ)/V(μ)) = b^{(3)}(θ) a(φ)² / Var(X)^{3/2},
while the kurtosis is
γ4(X) = 3 + (d²V/dμ²(μ) V(μ) + (dV/dμ(μ))²) a(φ)/V(μ) = 3 + b^{(4)}(θ) a(φ)³ / Var(X)².
The property of uniqueness is the fact that the variance function V uniquely identifies the distribution.
11.4.3
Special cases
The exponential family of distributions in fact contains the most frequently used distributions.
Here are the corresponding parameters, listed in a table:
Law                      Distribution                                θ                 Expectation μ        Variance function V(μ)
Normal N(μ, σ²)          e^{−(x−μ)²/(2σ²)}/(σ√(2π))                 μ                 θ                    1
Gamma G(α, β)            β^α x^{α−1} e^{−βx}/Γ(α)                   −β/α              −1/θ                 μ²
Inverse normal I(μ, λ)   √(λ/(2πx³)) e^{−λ(x−μ)²/(2μ²x)}            −1/(2μ²)          (−2θ)^{−1/2}         μ³
Bernoulli B(μ)           μ^x (1 − μ)^{1−x}                          log(μ/(1 − μ))    e^θ/(1 + e^θ)        μ(1 − μ)
Poisson P(λ)             λ^x e^{−λ}/x!                              log(λ)            e^θ                  μ
11.4.4 Estimation
The maximum likelihood estimator θ̂ of θ solves
(1/n) Σ_{i=1}^{n} Xi = b′(θ̂),
while the estimator φ̂ of the dispersion parameter solves
(1/n) Σ_{i=1}^{n} (Xi θ̂ − b(θ̂)) a′(φ̂)/a²(φ̂) = (1/n) Σ_{i=1}^{n} ∂c/∂φ(Xi, φ̂).
11.4.5
Random generation
NEED REFERENCE
11.4.6
Applications
11.5
Elliptical distribution
11.5.1
Characterization
TODO
11.5.2
Properties
TODO
11.5.3
Special cases
11.5.4
Estimation
TODO
11.5.5
Random generation
TODO
11.5.6
Applications
Chapter 12
Multivariate distributions
12.1
Multinomial
12.2
Multivariate normal
12.3
Multivariate elliptical
12.4
Multivariate uniform
12.5
Multivariate student
12.6
Kent distribution
12.7
Dirichlet distribution
12.7.1
Characterization
TODO
12.7.2
Properties
TODO
12.7.3
Estimation
TODO
12.7.4
Random generation
TODO
12.7.5
Applications
TODO
12.8
12.9
Evens
Chapter 13
Misc
13.1
MBBEFD distribution
The MBBEFD distribution comes from actuarial science, due to Bernegger (1997). MBBEFD stands for Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac distribution.
13.1.1 Characterization
The MBBEFD distribution MBBEFD(a, b) is defined by the distribution function
F(x) = 1 − (a + 1)b^x/(a + b^x) if 0 ≤ x < 1, and F(x) = 1 if x ≥ 1,
so that there is a probability mass
p = (a + 1)b/(a + b)
at x = 1. The parameters (a, b) are defined on a wide set of intervals, which are not trivial: ]−1, 0[×]1, +∞[ and (]−∞, −1[ ∪ ]0, +∞[)×]0, 1[. The shape of the distribution function F has the following properties:
for (a, b) ∈ I1 = ]−1, 0[×]1, +∞[, F is concave,
for (a, b) ∈ I2 = ]−∞, −1[×]0, 1[, F is concave,
for (a, b) ∈ I3 = ]0, b[×]0, 1[, F is concave,
for (a, b) ∈ I4 = [b, 1[×]0, 1[, F is convex then concave,
for (a, b) ∈ I5 = [1, +∞[×]0, 1[, F is convex.
There is no usual density, but if we use the Dirac function δ, we can define a function f such that
f(x) = −a(a + 1) b^x ln(b)/(a + b^x)² 1_{]0,1[}(x) + p δ_1(x),
which is a mix between a probability mass function and a density function.
13.1.2
Special cases
TODO
13.1.3
Properties
TODO
13.1.4
Estimation
TODO
13.1.5
Random generation
TODO
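In the meantime, a minimal inversion sketch in R (function name ours), assuming the distribution function given in 13.1.1:

# MBBEFD sampler by inversion; draws equal 1 with probability
# (a+1)*b/(a+b), the total-loss mass at x = 1.
rmbbefd <- function(n, a, b) {
  u <- runif(n)
  x <- log(a * (1 - u) / (a + u)) / log(b)
  ifelse(u >= 1 - (a + 1) * b / (a + b), 1, x)
}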
13.1.6
Applications
13.2
Cantor distribution
TODO
13.3 Tweedie distribution
TODO
Bibliography
Arnold, B. C. (1983), Pareto Distributions, International Co-operative Publishing House.
Asmussen, S., Nerman, O. & Olsson, M. (1996), 'Fitting phase-type distributions via the EM algorithm', Scandinavian Journal of Statistics 23(4), 419-441.
Barndorff-Nielsen, O. E. (1977), 'Exponentially decreasing distributions for the logarithm of particle size', Proceedings of the Royal Society of London, Series A 353(1674), 401-419.
Barndorff-Nielsen, O. E. (1997a), 'Normal inverse Gaussian distributions and stochastic volatility modelling', Scandinavian Journal of Statistics 24(1), 1-13.
Barndorff-Nielsen, O. E. (1997b), 'Processes of normal inverse Gaussian type', Finance and Stochastics 2(1), 41-68.
Barndorff-Nielsen, O. E. & Halgreen, O. (1977), 'Infinite divisibility of the hyperbolic and generalized inverse Gaussian distribution', Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 38(4), 309-311.
Barndorff-Nielsen, O. E. & Prause, K. (2001), 'Apparent scaling', Finance and Stochastics 5(1), 103-113.
Bernegger, S. (1997), 'The Swiss Re exposure curves and the MBBEFD distribution class', ASTIN Bulletin 27(1), 99-111.
Bibby, B. M. & Sorensen, M. (1997), 'A hyperbolic diffusion model for stock prices', Finance and Stochastics 2, 25-41.
Black, F. & Scholes, M. (1973), 'The pricing of options and corporate liabilities', Journal of Political Economy 81(3).
Bobbio, A., Horváth, A., Scarpa, M. & Telek, M. (2003), 'Acyclic discrete phase type distributions: properties and a parameter estimation algorithm', Performance Evaluation 54, 1-32.
Breymann, W., Dias, A. & Embrechts, P. (2003), 'Dependence structures for multivariate high-frequency data in finance', Quantitative Finance 3(1), 1-14.
Breymann, W. & Lüthi, D. (2008), ghyp: A package on generalized hyperbolic distributions, Institute of Data Analysis and Process Design.
Brigo, D., Mercurio, F., Rapisarda, F. & Scotti, R. (2002), 'Approximated moment-matching dynamics for basket-options simulation', Product and Business Development Group, Banca IMI, SanPaolo IMI Group.
Cacoullos, T. & Charalambides, C. (1975), 'On minimum variance unbiased estimation for truncated binomial and negative binomial distributions', Annals of the Institute of Statistical Mathematics 27(1).
Carrasco, J. M. F., Ortega, E. M. M. & Cordeiro, G. M. (2008), 'A generalized modified Weibull distribution for lifetime modeling', Computational Statistics and Data Analysis 53, 450-462.
Chambers, J. M., Mallows, C. L. & Stuck, B. W. (1976), 'A method for simulating stable random variables', Journal of the American Statistical Association.
Chen, Y., Härdle, W. & Jeong, S.-O. (2005), Nonparametric Risk Management with Generalized Hyperbolic Distributions, Vol. 1063 of Preprint / Weierstraß-Institut für Angewandte Analysis und Stochastik, WIAS, Berlin.
Clark, D. R. & Thayer, C. A. (2004), 'A primer on the exponential family of distributions', 2004 call paper program on generalized linear models.
Cont, R. & Tankov, P. (2003), Financial Modelling with Jump Processes, Chapman & Hall/CRC Financial Mathematics Series.
Dempster, A. P., Laird, N. M. & Rubin, D. B. (1977), 'Maximum likelihood from incomplete data via the EM algorithm', Journal of the Royal Statistical Society 39(1), 1-38.
Dutang, C. (2008), randtoolbox: Generating and Testing Random Numbers.
Eberlein, E. & Keller, U. (1995), 'Hyperbolic distributions in finance', Bernoulli 1, 281-299.
Eberlein, E., Keller, U. & Prause, K. (1998), 'New insights into smile, mispricing and value at risk measures', Journal of Business 71, 371-405.
Embrechts, P., Klüppelberg, C. & Mikosch, T. (1997), Modelling Extremal Events, Springer.
Feldmann, A. & Whitt, W. (1996), 'Fitting mixtures of exponentials to long-tail distributions to analyze network performance models', AT&T Laboratory Research.
Fergusson, K. & Platen, E. (2006), 'On the distributional characterization of log-returns of a world stock index', Applied Mathematical Finance 13(1), 19-38.
Froot, K. A. & O'Connell, P. G. J. (2008), 'On the pricing of intermediated risks: Theory and application to catastrophe reinsurance', Journal of Banking & Finance 32, 69-85.
Gomes, O., Combes, C. & Dussauchoy, A. (2008), 'Parameter estimation of the generalized gamma distribution', Mathematics and Computers in Simulation 79, 955-963.
Haahtela, T. (2005), 'Extended binomial tree valuation when the underlying asset distribution is shifted lognormal with higher moments', Helsinki University of Technology.
Haddow, J. E., Palomaki, G. E., Knight, G. J., Cunningham, G. C., Lustig, L. S. & Boyd, P. A. (1994), 'Reducing the need for amniocentesis in women 35 years of age or older with serum markers for screening', New England Journal of Medicine 330(16), 1114-1118.
Johnson, N. L., Kotz, S. & Balakrishnan, N. (1994), Continuous Univariate Distributions, John Wiley.
Jones, M. C. (2009), 'Kumaraswamy's distribution: A beta-type distribution with some tractability advantages', Statistical Methodology 6, 70.
Kassberger, S. (2007), 'Efficient portfolio optimization in the multivariate generalized hyperbolic framework', SSRN eLibrary.
Kassberger, S. & Kiesel, R. (2006), 'A fully parametric approach to return modelling and risk management of hedge funds', Financial Markets and Portfolio Management 20(4), 472-491.
Klugman, S. A., Panjer, H. H. & Willmot, G. (2004), Loss Models: From Data to Decisions, 2nd edn, Wiley, New York.
Knuth, D. E. (2002), The Art of Computer Programming: Seminumerical Algorithms, Vol. 2, 3rd edn, Addison-Wesley, Massachusetts.
Lee, S. C. & Lin, X. S. (2008), 'Modeling and evaluating insurance losses via mixtures of Erlang', North American Actuarial Journal.
Li, Q. & Yu, K. (2008), 'Inference of non-centrality parameter of a truncated non-central chi-squared distribution', Journal of Statistical Planning and Inference.
Limpert, E., Stahel, W. A. & Abbt, M. (2001), 'Log-normal distributions across the sciences: Keys and clues', Bioscience 51(5).
Matsumoto, M. & Nishimura, T. (1998), 'Mersenne twister: A 623-dimensionally equidistributed uniform pseudorandom number generator', ACM Transactions on Modelling and Computer Simulation 8(1), 3-30.
Mačutek, J. (2008), 'A generalization of the geometric distribution and its application in quantitative linguistics', Romanian Reports in Physics 60(3), 501-509.
McNeil, A. J., Frey, R. & Embrechts, P. (2005a), Quantitative Risk Management: Concepts, Techniques and Tools, Princeton University Press, Princeton.
McNeil, A. J., Frey, R. & Embrechts, P. (2005b), Quantitative Risk Management: Concepts, Techniques and Tools, Princeton University Press, Princeton.
Meng, X.-L. & Rubin, D. B. (1993), 'Maximum likelihood estimation via the ECM algorithm: A general framework', Biometrika 80(2), 267-278.
Moler, C. & Van Loan, C. (2003), 'Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later', SIAM Review 45(1).
Nadarajah, S. & Kotz, S. (2003), 'A generalized beta distribution II', Statistics on the Internet.
Neuts, M. F. (1981), 'Generating random variates from a distribution of phase-type', in Winter Simulation Conference.
Nolan, J. P. (2009), Stable Distributions - Models for Heavy Tailed Data, Birkhäuser, Boston. In progress, Chapter 1 online at academic2.american.edu/jpnolan.
Appendix A
Mathematical tools
A.1
TODO
A.1.1 Characterising functions
For a discrete distribution, one may use the probability generating function to characterize the distribution, if it exists, or equivalently the moment generating function. For a continuous distribution, we generally use only the moment generating function. The moment generating function is linked to the Laplace transform of a distribution. When dealing with continuous distributions, we also use the characteristic function, which is related to the Fourier transform of a distribution; see the table below for details.
Probability generating function: G_X(z) = E(z^X)
Moment generating function: M_X(t) = E(e^{tX})  <=>  Laplace transform: L_X(s) = E(e^{−sX})
Characteristic function: φ_X(t) = E(e^{itX})  <=>  Fourier transform: E(e^{−itX})
From these functions, we can derive factorial moments, elementary probabilities and raw moments:
E(X(X − 1) ... (X − k + 1)) = d^k G_X(t)/dt^k |_{t=1},
P(X = k) = (1/k!) d^k G_X(t)/dt^k |_{t=0},
E(X^k) = d^k M_X(t)/dt^k |_{t=0}.
A.2
In this section, we recall the common mathematical quantities used in all this guide. By definition, we have:
A.2.1 Integral functions
the gamma function: Γ(a) = ∫_0^{+∞} x^{a−1} e^{−x} dx, a > 0,
the incomplete gamma function: γ(a, x) = ∫_0^x t^{a−1} e^{−t} dt,
the beta function: β(a, b) = Γ(a)Γ(b)/Γ(a + b) = ∫_0^1 x^{a−1}(1 − x)^{b−1} dx,
the incomplete beta function: β(a, b, u) = ∫_0^u x^{a−1}(1 − x)^{b−1} dx,
the digamma function: ψ(x) = Γ′(x)/Γ(x), and the trigamma function ψ′(x).
A.2.2 Factorial functions
factorial: n ∈ N, n! = n (n − 1) ... 2 · 1,
rising factorial: (n, m) ∈ N², m^{(n)} = m (m + 1) ... (m + n − 2)(m + n − 1) = Γ(n + m)/Γ(m),
falling factorial: (n, m) ∈ N², (m)_n = m (m − 1) ... (m − n + 1) = Γ(m)/Γ(m − n),
combination number: C_n^p = n!/(p!(n − p)!),
arrangement number: A_n^p = n!/(n − p)!,
Stirling numbers of the first kind: the coefficients 1S_n^k of the expansion (x)_n = Σ_{k=0}^{n} 1S_n^k x^k, or defined by the recurrence 1S_n^k = −(n − 1) 1S_{n−1}^k + 1S_{n−1}^{k−1} with 1S_n^0 = δ_{n0} and 1S_0^1 = 0,
Stirling numbers of the second kind: the coefficients 2S_n^k of the expansion Σ_{k=0}^{n} 2S_n^k (x)_k = x^n, or defined by the recurrence 2S_n^k = 2S_{n−1}^{k−1} + k 2S_{n−1}^k with 2S_n^1 = 2S_n^n = 1.
A.2.3 Series functions
the Riemann zeta function: ζ(s) = Σ_{n=1}^{+∞} 1/n^s,
the polylogarithm (Jonquière's) function: Li_s(z) = Σ_{n=1}^{+∞} z^n/n^s,
the hypergeometric functions: 1F1(a, b, z) = Σ_{n=0}^{+∞} (a^{(n)}/b^{(n)}) z^n/n!, 2F1(a, b, c, z) = Σ_{n=0}^{+∞} (a^{(n)} b^{(n)}/c^{(n)}) z^n/n! and 3F2(a, b, c, d, e, z) = Σ_{n=0}^{+∞} (a^{(n)} b^{(n)} c^{(n)}/(d^{(n)} e^{(n)})) z^n/n!,
the Bessel functions, which verify the following ODE: x² y′′ + x y′ + (x² − ν²) y = 0. We define the Bessel function of the first kind by
J_ν(x) = Σ_{n=0}^{+∞} ((−1)^n/(n! Γ(n + ν + 1))) (x/2)^{2n+ν},
and of the second kind by
Y_ν(x) = (J_ν(x) cos(νπ) − J_{−ν}(x))/sin(νπ),
the (generalized) Laguerre polynomials:
L_n^{(ν)}(x) = (e^x x^{−ν}/n!) d^n/dx^n (e^{−x} x^{n+ν}) = Σ_{i=0}^{n} (−1)^i C_{n+ν}^{n−i} x^i/i!.
A.2.4 Miscellaneous
the Dirac function: δ_{x0}(x) = +∞ if x = x0, and 0 otherwise,
the Heaviside function: H_{x0}(x) = 0 if x < x0, 1/2 if x = x0, and 1 otherwise,
the Cantor function: for x ∈ [0, 1],
F_n(x) = x if n = 0,
F_n(x) = (1/2) F_{n−1}(3x) if n ≠ 0 and 0 ≤ x ≤ 1/3,
F_n(x) = 1/2 if n ≠ 0 and 1/3 ≤ x ≤ 2/3,
F_n(x) = 1/2 + (1/2) F_{n−1}(3(x − 2/3)) if n ≠ 0 and 2/3 ≤ x ≤ 1.
A.3 Matrix exponential
The exponential of a matrix Q is defined as
e^{Qu} = Σ_{n=0}^{+∞} Q^n u^n / n!.
There are various methods to compute the matrix exponential; Moler & Van Loan (2003) make a deep analysis of the efficiency of different methods. In our case, we choose a decomposition method. We diagonalize the n × n matrix Q and use the identity
e^{Qu} = P e^{Du} P^{−1},
where D is a diagonal matrix with the eigenvalues on its diagonal and P the matrix of eigenvectors. We compute
e^{Qu} = Σ_{l=1}^{m} e^{λ_l u} P M_l P^{−1}, with C_l = P M_l P^{−1},
where the λ_l stand for the eigenvalues of Q, P for the eigenvectors and M_l = (δ_{il} δ_{lj})_{ij} (δ_{ij} is the Kronecker symbol, i.e. equal to zero except when i = j). As the matrix M_l is a sparse matrix with just a 1 on the lth term of its diagonal, the constant C_l can be simplified. Indeed, if we denote by X_l the lth column of the matrix P (i.e. the eigenvector associated to the eigenvalue λ_l) and by Y_l the lth row of the matrix P^{−1}, then we have
C_l = P M_l P^{−1} = X_l Y_l.
Even though Q is not necessarily diagonalizable, this procedure will often work, since Q may have complex eigenvalues (say λ_i). In this case, C_i is complex, but as e^{Qu} is real, we are ensured that there is j ∈ {1, ..., m} such that λ_j is the conjugate of λ_i. Thus, we get
e^{λ_i u} C_i + e^{λ_j u} C_j = 2 cos(ℑ(λ_i)u) e^{ℜ(λ_i)u} ℜ(X_i Y_i) − 2 sin(ℑ(λ_i)u) e^{ℜ(λ_i)u} ℑ(X_i Y_i) ∈ R,
where ℜ and ℑ stand respectively for the real and the imaginary part.
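A compact R sketch of this decomposition approach (function name ours; it assumes Q is diagonalizable and takes the real part to handle conjugate complex eigenvalue pairs, as discussed above):

# exp(Q*u) via eigendecomposition: P %*% diag(exp(lambda*u)) %*% P^{-1}.
expm_eigen <- function(Q, u) {
  e <- eigen(Q)
  Re(e$vectors %*% diag(exp(e$values * u)) %*% solve(e$vectors))
}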
A.4
Contents
Introduction
Discrete distributions
1.1
1.2
1.3
1.1.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.4
Random generation
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Bernoulli/Binomial distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2.4
Random generation
1.2.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
145
146
CONTENTS
1.4
1.5
1.6
1.7
1.3.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.4
Random generation
1.3.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Quasi-binomial distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.4
Random generation
1.4.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Poisson distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.4
Random generation
1.5.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6.4
Random generation
1.6.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Quasi-Poisson distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.7.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
CONTENTS
147
1.7.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.4
Random generation
1.7.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.8
1.9
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Geometric distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.8.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.8.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.8.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.8.4
Random generation
1.8.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.9.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.9.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.9.4
Random generation
1.9.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.10.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.11 Zero-truncated or zero-modified negative binomial distribution . . . . . . . . . . . . 24
148
CONTENTS
1.11.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.11.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.11.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.11.4 Random generation
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.11.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.12 Pascal distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.12.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.12.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.12.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.12.4 Random generation
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.12.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.13 Hypergeometric distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.13.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.13.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.13.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.13.4 Random generation
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.13.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
27
Conway-Maxwell-Poisson distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.4
Random generation
2.1.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
CONTENTS
2.2
2.3
2.4
2.5
149
Delaporte distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.4
Random generation
2.2.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Engen distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.1
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.2
Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.3
Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.4
Random generation
2.3.5
Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4 Logarithmic distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Sichel distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6 Zipf distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7
2.7.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.7.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8 Rademacher distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.8.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.8.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9 Skellam distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.9.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.10
2.10.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.10.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11 Zeta distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.11.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
II Continuous distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3
3.1 Uniform distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2 Triangular distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3
3.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.2 Special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.5 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.3.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4
3.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5
3.5.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.5.2 Special cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.5.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5.4 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.5.5 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.5.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.6 Kumaraswamy distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.6.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.6.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.6.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.6.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4
4.1
4.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.2
4.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3
4.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4
4.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5
4.5.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.5.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.5.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5
5.1 Exponential distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.2 Shifted exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3 Inverse exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4 Gamma distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.5
5.5.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.5.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.5.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.5.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.6 Chi-squared distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.7 Inverse Gamma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.7.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.7.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.7.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.7.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.7.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.8
5.8.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.8.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.8.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.8.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.8.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
5.9
5.9.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.9.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.9.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.9.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.9.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.10
5.10.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.10.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.11 Weibull distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.11.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.11.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.11.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.11.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.11.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.12 Inverse Weibull distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.12.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.12.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.12.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.12.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.12.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.13 Laplace or double exponential distribution . . . . . . . . . . . . . . . . . . . . . . . . 72
5.13.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.13.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.13.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.13.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.13.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
6
6.1 Chi-squared distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
6.2 Chi distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.3
6.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.3.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.4
6.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.5
6.5.1 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.5.2 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.5.3 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.5.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.6
6.6.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.6.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.6.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.6.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
7
7.1 Student t distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
7.2 Cauchy distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.2.2 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.2.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.2.4 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.2.5 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.2.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.3 Fisher-Snedecor distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.3.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
8 Pareto family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
8.1 Pareto distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
8.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
8.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
8.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
8.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
8.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
8.2 Feller-Pareto distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.2.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
8.3 Inverse Pareto . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.3.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.4
8.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
8.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
8.5
8.5.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
8.5.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
8.5.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.5.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
8.6
8.6.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
8.6.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
8.6.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.6.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.7
8.7.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
8.7.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
8.7.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
8.7.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.7.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
9
9.1
9.1.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
9.1.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
9.1.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
9.1.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
9.1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.2
9.2.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.2.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.2.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.2.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.2.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.3
9.3.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.3.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.3.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.3.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.4
9.4.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.4.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.4.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.4.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.5
9.5.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.5.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.5.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.5.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.5.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.6
9.6.1 Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.6.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.6.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.6.4 Random generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
9.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
10
III
11
12 Multivariate distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
13 Misc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
A Mathematical tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141