Joint Distribution: Several RVs


Joint Distribution

• We may be interested in probability statements of several RVs.
• Example: Two people A and B each flip a coin twice.
X: number of heads obtained by A. Y: number of
heads obtained by B. Find P(X > Y). (Both examples
on this page are enumerated in the sketch below.)
• Discrete case:
Joint probability mass function: p(x, y) = P (X =
x, Y = y).
– Two coins, one fair, the other two-headed. A randomly
chooses one and B takes the other. Define
X = 1 if A gets a head, 0 if A gets a tail;
Y = 1 if B gets a head, 0 if B gets a tail.
Find P(X ≥ Y).
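
Both examples can be checked by direct enumeration. The following is a minimal Python sketch (added for illustration, not part of the original notes; variable names are illustrative):

    from fractions import Fraction
    from itertools import product

    # Example 1: A and B each flip a fair coin twice.
    # X = A's number of heads, Y = B's number of heads; compute P(X > Y).
    p_gt = Fraction(0)
    for a1, a2, b1, b2 in product([0, 1], repeat=4):   # each outcome has probability 1/16
        if a1 + a2 > b1 + b2:
            p_gt += Fraction(1, 16)
    print("P(X > Y) =", p_gt)                          # 5/16

    # Example 2: one fair coin, one two-headed coin; A picks one at random, B takes the other.
    # X = 1 if A gets a head, Y = 1 if B gets a head; compute P(X >= Y).
    p_ge = Fraction(0)
    for a_has_fair in (True, False):                   # A's choice, probability 1/2 each
        for fair_flip in (0, 1):                       # outcome of the fair coin, probability 1/2 each
            x = fair_flip if a_has_fair else 1         # the two-headed coin always shows a head
            y = 1 if a_has_fair else fair_flip
            if x >= y:
                p_ge += Fraction(1, 2) * Fraction(1, 2)
    print("P(X >= Y) =", p_ge)                         # 3/4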

• Marginal probability mass function of X can be obtained
from the joint probability mass function, p(x, y):
pX(x) = Σ_{y: p(x,y)>0} p(x, y)
Similarly:
pY(y) = Σ_{x: p(x,y)>0} p(x, y)
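
As a concrete illustration (added, not part of the original notes), the joint pmf of the fair / two-headed coin example can be stored as a table and its marginals obtained by summing out the other variable:

    from collections import defaultdict
    from fractions import Fraction

    # Joint pmf p(x, y) of the fair / two-headed coin example above.
    p = {(0, 1): Fraction(1, 4), (1, 1): Fraction(1, 2), (1, 0): Fraction(1, 4)}

    pX, pY = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), prob in p.items():
        pX[x] += prob          # marginal of X: sum over y
        pY[y] += prob          # marginal of Y: sum over x

    print(dict(pX))            # P(X=0) = 1/4, P(X=1) = 3/4
    print(dict(pY))            # P(Y=0) = 1/4, P(Y=1) = 3/4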

• Continuous case:
Joint probability density function f (x, y):
P{(X, Y) ∈ R} = ∫∫_R f(x, y) dx dy

• Marginal pdf:
fX(x) = ∫_{−∞}^{∞} f(x, y) dy
fY(y) = ∫_{−∞}^{∞} f(x, y) dx
• Joint cumulative probability distribution function of
X and Y
F(a, b) = P{X ≤ a, Y ≤ b},  −∞ < a, b < ∞

• Marginal cdf:
FX (a) = F (a, ∞)
FY (b) = F (∞, b)
• Expectation E[g(X, Y)]:
E[g(X, Y)] = Σ_y Σ_x g(x, y) p(x, y) in the discrete case
E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy in the continuous case
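
A crude numerical check of the continuous formulas (added sketch, not part of the notes; the density, grid size, and helper f_X are illustrative), using f(x, y) = x + y on the unit square, for which fX(x) = x + 1/2 and E[XY] = 1/3:

    # Joint density f(x, y) = x + y on [0, 1] x [0, 1], zero elsewhere.
    def f(x, y):
        return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

    n = 400                                   # midpoint Riemann sum with n points per axis
    h = 1.0 / n
    grid = [(i + 0.5) * h for i in range(n)]

    # Marginal density fX(x) = integral of f(x, y) over y; compare with x + 1/2.
    def f_X(x):
        return sum(f(x, y) for y in grid) * h

    print(f_X(0.3))                           # ~ 0.8

    # E[g(X, Y)] with g(x, y) = x*y: double integral of x*y*f(x, y).
    E_XY = sum(x * y * f(x, y) for x in grid for y in grid) * h * h
    print(E_XY)                               # ~ 1/3 = 0.3333...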

• Based on the joint distribution, we can derive
E[aX + bY] = aE[X] + bE[Y]
Extension:
E[a1 X1 + a2 X2 + · · · + an Xn] = a1 E[X1] + a2 E[X2] + · · · + an E[Xn]

• Example: E[X], where X is binomial with parameters n and p. Define
Xi = 1 if the ith flip is a head, 0 if the ith flip is a tail.
Then X = Σ_{i=1}^{n} Xi, so E[X] = Σ_{i=1}^{n} E[Xi] = np.
• Assume there are n students in a class. What is the
expected number of months in which at least one student
was born? (Assume equal chance of being born in any month.)
Solution: Let X be the number of months in which at least
one student was born, and let Xi be the indicator RV of the
event that some student was born in the ith month. Then
X = Σ_{i=1}^{12} Xi. Hence,
E(X) = 12 E(X1) = 12 P(X1 = 1) = 12 · [1 − (11/12)^n].
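
A quick Monte Carlo check of this formula (added sketch, not part of the notes; the helper expected_months and the choice n = 20 are illustrative):

    import random

    def expected_months(n, trials=100_000):
        # Simulated expected number of distinct birth months among n students.
        total = 0
        for _ in range(trials):
            total += len({random.randrange(12) for _ in range(n)})
        return total / trials

    n = 20
    print(expected_months(n))                 # simulation, ~ 9.89
    print(12 * (1 - (11 / 12) ** n))          # formula 12[1 - (11/12)^n] = 9.89...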

Independent Random Variables

• X and Y are independent if
P(X ≤ a, Y ≤ b) = P(X ≤ a)P(Y ≤ b)

• Equivalently: F (a, b) = FX (a)FY (b).


• Discrete: p(x, y) = pX (x)pY (y).
• Continuous: f (x, y) = fX (x)fY (y).
• Proposition 2.3: If X and Y are independent, then for
any functions g and h, E[g(X)h(Y)] = E[g(X)]E[h(Y)].
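
Proposition 2.3 can be verified numerically for a small independent pair (added sketch, not part of the notes; the choice of distributions and of g and h is illustrative):

    from fractions import Fraction

    # Independent X ~ Bernoulli(1/3) and Y ~ uniform on {1, ..., 6}.
    pX = {0: Fraction(2, 3), 1: Fraction(1, 3)}
    pY = {k: Fraction(1, 6) for k in range(1, 7)}

    g = lambda x: 2 * x + 1                   # arbitrary functions g and h
    h = lambda y: y * y

    # E[g(X)h(Y)] under the product (independent) joint pmf p(x, y) = pX(x) pY(y).
    lhs = sum(g(x) * h(y) * px * py for x, px in pX.items() for y, py in pY.items())
    # E[g(X)] * E[h(Y)].
    rhs = sum(g(x) * px for x, px in pX.items()) * sum(h(y) * py for y, py in pY.items())

    print(lhs, rhs, lhs == rhs)               # 455/18 455/18 True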

Covariance

• Definition: Covariance of X and Y
Cov(X, Y) = E[(X − E(X))(Y − E(Y))]

• Cov(X, X) = E[(X − E(X))²] = Var(X).


• Cov(X, Y ) = E[XY ] − E[X]E[Y ].
• If X and Y are independent, Cov(X, Y ) = 0.
• Properties:
1. Cov(X, X) = Var(X)
2. Cov(X, Y) = Cov(Y, X)
3. Cov(cX, Y) = c Cov(X, Y)
4. Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z)
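
For the fair / two-headed coin example, X and Y are dependent and the covariance is negative (added sketch, not part of the notes):

    from fractions import Fraction

    # Joint pmf of the fair / two-headed coin example.
    p = {(0, 1): Fraction(1, 4), (1, 1): Fraction(1, 2), (1, 0): Fraction(1, 4)}

    E_X  = sum(x * prob for (x, y), prob in p.items())
    E_Y  = sum(y * prob for (x, y), prob in p.items())
    E_XY = sum(x * y * prob for (x, y), prob in p.items())

    print(E_XY - E_X * E_Y)   # Cov(X, Y) = 1/2 - (3/4)(3/4) = -1/16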

Sum of Random Variables

• If the Xi's are independent, i = 1, 2, ..., n, then
Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi)
Var(Σ_{i=1}^{n} ai Xi) = Σ_{i=1}^{n} ai² Var(Xi)
• Example: Variance of a binomial RV, a sum of independent
Bernoulli RVs: Var(X) = np(1 − p).
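
A short simulation check of Var(X) = np(1 − p) for a binomial built as a sum of independent Bernoulli RVs (added sketch, not part of the notes; n = 30 and p = 0.4 are illustrative):

    import random

    n, p, trials = 30, 0.4, 100_000
    # Each sample of X is a sum of n independent Bernoulli(p) indicators.
    samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials

    print(var)                # simulated variance, ~ 7.2
    print(n * p * (1 - p))    # np(1 - p) = 7.2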

Moment Generating Functions

• Moment generating function of a RV X is φ(t):
φ(t) = E[e^{tX}]
     = Σ_{x: p(x)>0} e^{tx} p(x)   if X is discrete
     = ∫_{−∞}^{∞} e^{tx} f(x) dx   if X is continuous

• Moment of X: the nth moment of X is E[X^n].


• E[X^n] = φ^(n)(0), where φ^(n)(t) is the nth-order derivative of φ(t).
• Examples:
1. Bernoulli with parameter p: φ(t) = p e^t + (1 − p), for any t.
2. Poisson with parameter λ: φ(t) = e^{λ(e^t − 1)}, for any t.
• Property 1: the moment generating function of a sum of
independent RVs. If Xi, i = 1, ..., n, are independent and
Z = X1 + X2 + · · · + Xn, then
φZ(t) = Π_{i=1}^{n} φXi(t)
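
Property 1 can be checked numerically: for n independent Bernoulli(p) RVs the product of the individual MGFs is (p e^t + 1 − p)^n, which should match a Monte Carlo estimate of E[e^{tZ}] (added sketch, not part of the notes; n, p, and t are illustrative):

    import math, random

    n, p, t, trials = 10, 0.3, 0.5, 100_000

    # Monte Carlo estimate of phi_Z(t) = E[exp(t * Z)], Z = X1 + ... + Xn.
    est = sum(math.exp(t * sum(random.random() < p for _ in range(n)))
              for _ in range(trials)) / trials

    # Product of the n individual Bernoulli MGFs.
    exact = (p * math.exp(t) + 1 - p) ** n

    print(est, exact)         # both ~ 5.9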

• Property 2: The moment generating function uniquely
determines the distribution.
• Examples:
1. Sum of independent Binomial RVs
2. Sum of independent Poisson RVs
3. Joint distribution of the sample mean and sample
variance from a normal population.

Important Inequalities

• Markov Inequality: If X is a RV that takes only
nonnegative values, then for any a > 0,
P(X ≥ a) ≤ E[X]/a.
• Chebyshev's Inequality: If X is a RV with mean µ
and variance σ², then for any value k > 0,
P{|X − µ| ≥ k} ≤ σ²/k².
• Examples: obtaining bounds on probabilities.
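
For example, for X ~ Bin(100, 1/2) (µ = 50, σ² = 25), Chebyshev gives P(|X − 50| ≥ 10) ≤ 25/100 = 0.25, while the exact probability is much smaller (added sketch, not part of the notes):

    from math import comb

    n, mu, var, k = 100, 50, 25, 10

    # Exact P(|X - mu| >= k) for X ~ Bin(100, 1/2).
    exact = sum(comb(n, j) for j in range(n + 1) if abs(j - mu) >= k) / 2 ** n

    print("Chebyshev bound:", var / k ** 2)   # 0.25
    print("exact probability:", exact)        # ~ 0.057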

Strong Law of Large Numbers

• Theorem 2.1 (Strong Law of Large Numbers): Let
X1, X2, ... be a sequence of independent random variables
having a common distribution, and let E[Xi] = µ. Then,
with probability 1,
(X1 + X2 + · · · + Xn)/n → µ as n → ∞.
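
A running-average simulation of die rolls (µ = 3.5) illustrating the theorem (added sketch, not part of the notes):

    import random

    rolls = [random.randint(1, 6) for _ in range(1_000_000)]   # E[Xi] = 3.5

    for n in (10, 1_000, 100_000, 1_000_000):
        print(n, sum(rolls[:n]) / n)          # sample mean approaches 3.5 as n grows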

Central Limit Theorem

• Theorem 2.2 (Central Limit Theorem): Let X1, X2, ... be
a sequence of independent random variables having a common
distribution, with E[Xi] = µ and Var[Xi] = σ². Then the
distribution of
(X1 + X2 + · · · + Xn − nµ)/(σ√n)
tends to the standard normal as n → ∞. That is,
P{(X1 + X2 + · · · + Xn − nµ)/(σ√n) ≤ z}
→ (1/√(2π)) ∫_{−∞}^{z} e^{−x²/2} dx = Φ(z).
• Examples: estimating probabilities (worked out in the sketch below).
1. Let X be the number of times that a fair coin flipped
40 times lands heads. Find P (X = 20).
2. Suppose that orders at a restaurant are iid random
variables with mean µ = 8 dollars and standard
deviation σ = 2 dollars. Estimate the probability
that the first 100 customers spend a total of more
than $840. Estimate the probability that the first
100 customers spend a total of between $780 and
$820.
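
Both examples can be worked out with the normal approximation, using Φ(z) = (1 + erf(z/√2))/2 and a continuity correction for P(X = 20) (added sketch, not part of the notes; the helper Phi is illustrative):

    from math import erf, sqrt, comb

    def Phi(z):                                # standard normal cdf
        return (1 + erf(z / sqrt(2))) / 2

    # 1. X ~ Bin(40, 1/2): mu = 20, sigma = sqrt(10); continuity correction for P(X = 20).
    mu, sigma = 20, sqrt(10)
    approx = Phi((20.5 - mu) / sigma) - Phi((19.5 - mu) / sigma)
    print(approx, comb(40, 20) / 2 ** 40)      # ~ 0.1256 vs exact ~ 0.1254

    # 2. Total of 100 orders: mean 100 * 8 = 800, sd sqrt(100) * 2 = 20.
    print(1 - Phi((840 - 800) / 20))                        # P(total > 840)       ~ 0.0228
    print(Phi((820 - 800) / 20) - Phi((780 - 800) / 20))    # P(780 < total < 820) ~ 0.6827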

