Fundamentals of Probability
Wooldridge, Appendix B
B.1 Random Variables and Their Probability Distributions
A random variable (rv) is one that takes on numerical values and has an outcome that is determined by an experiment.
Examples of an rv:
Flip a coin and let X be the outcome: head (X = 1) or tail (X = 0). This X is a Bernoulli (binary) rv.
Flip two coins and let X be the number of heads.
Discrete Random Variable
Defn. (Discrete rv). A discrete rv is one that takes only a finite or countably infinite number of values, i.e., one with a space that is either finite or countable.
Defn. (Probability mass function). Let X be a discrete rv. The probability mass function (pmf) of X is given by
$p_j = P(X = x_j), \quad j = 1, 2, \dots, k$
Discrete Random Variable: Properties
Properties of a pmf:
$0 \le p_j \le 1$ for all $j$
$p_1 + p_2 + \cdots + p_k = 1$
Example. Flip 2 coins and let X be the number of heads. Then the pmf of X is

x_j :   0     1     2
p_j :  1/4   1/2   1/4
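The two-coin pmf above can be reproduced by brute-force enumeration. A minimal Python sketch (not from the slides; exact arithmetic via `fractions`):

```python
from fractions import Fraction
from itertools import product

# Enumerate all four equally likely outcomes of two fair coin flips
# (1 = head, 0 = tail) and tally the pmf of X = number of heads.
pmf = {}
for flips in product([0, 1], repeat=2):
    x = sum(flips)  # value taken by X for this outcome
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 4)

for x in sorted(pmf):
    print(x, pmf[x])  # 0 1/4, then 1 1/2, then 2 1/4
```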
Continuous Random Variable
Defn. (Continuous Random Variable). We say a random variable is a continuous random variable if its cumulative distribution function $F_X(x)$ is a continuous function for all $x \in \mathbb{R}$. For a continuous rv there are no points of discrete mass; that is, if X is continuous then $P(X = x) = 0$ for all $x \in \mathbb{R}$.
Continuous Random Variable
For continuous rv's,
$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
for some function $f_X(t)$. The function $f_X(x)$ is called a probability density function (pdf) of X. If $f_X(x)$ is also continuous, then the Fundamental Theorem of Calculus implies that
$\frac{d}{dx} F_X(x) = f_X(x)$
Continuous Random Variable
If X is a continuous rv, then probabilities can be obtained by integration, i.e.,
$P(a < X \le b) = F_X(b) - F_X(a) = \int_{a}^{b} f_X(t)\,dt$
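The identity $P(a < X \le b) = F_X(b) - F_X(a) = \int_a^b f_X(t)\,dt$ can be checked numerically for the standard normal. A hypothetical illustration using only the standard library (`std_normal_pdf` and `std_normal_cdf` are helper names introduced here, not from the slides):

```python
import math

def std_normal_pdf(t):
    # f(t) = exp(-t^2/2) / sqrt(2*pi)
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def std_normal_cdf(x):
    # Standard normal cdf expressed through the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# P(a < Z <= b) two ways: CDF difference vs. midpoint Riemann sum of the pdf
a, b = -1.0, 1.0
by_cdf = std_normal_cdf(b) - std_normal_cdf(a)
n = 10_000
h = (b - a) / n
by_integral = sum(std_normal_pdf(a + (i + 0.5) * h) for i in range(n)) * h
print(by_cdf, by_integral)  # both approximately 0.6827
```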
Expectations of a Random Variable
Defn. (Expectation). If X is a continuous rv with pdf $f(x)$ and
$\int_{-\infty}^{\infty} |x| f(x)\,dx < \infty$,
then the expectation (expected value) of X is
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
If X is a discrete rv with pmf $p(x)$ and
$\sum_x |x|\, p(x) < \infty$,
then the expectation (expected value) of X is
$E(X) = \sum_x x\, p(x)$
Expectations: Properties
For any constant c, $E(c) = c$.
For any constants a and b, $E(aX + b) = aE(X) + b$.
Given constants $a_1, a_2, \dots, a_n$ and random variables $X_1, X_2, \dots, X_n$,
$E\left(\sum_{i=1}^{n} a_i X_i\right) = E(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n) = a_1 E(X_1) + a_2 E(X_2) + \cdots + a_n E(X_n) = \sum_{i=1}^{n} a_i E(X_i)$
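The linearity property can be checked exactly on a small discrete example. A minimal Python sketch using a fair die (a hypothetical example, not from the slides):

```python
from fractions import Fraction

# Fair six-sided die: pmf p(x) = 1/6 for x = 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def E(g):
    """Expectation of g(X) for a discrete rv with the pmf above."""
    return sum(g(x) * p for x, p in pmf.items())

a, b = 3, 2                      # arbitrary constants for the illustration
lhs = E(lambda x: a * x + b)     # E(aX + b), computed directly from the pmf
rhs = a * E(lambda x: x) + b     # aE(X) + b, using E(X) = 7/2
print(lhs, rhs)                  # both 25/2
```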
Special Expectation: Variance and Standard Deviation (Measures of Variability)
The variance of a random variable (discrete or continuous), denoted $\mathrm{Var}(X)$ or $\sigma^2$, is defined as
$\mathrm{Var}(X) \equiv \sigma^2 = E[(X - \mu)^2]$
which is equivalent to
$\sigma^2 = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + E(\mu^2) = E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2$
$\Leftrightarrow E(X^2) = \sigma^2 + \mu^2$
(used in the proof of unbiasedness of the sample variance $s^2 = \sum (x_i - \bar{x})^2/(n-1)$; see Simple_OLS_inference.pdf, Appendix B, Lemma 6)
Variance & Standard Deviation: Examples
Consider the Bernoulli distribution $X \sim \mathrm{Bernoulli}(\theta)$, with pmf
$P(x) = \theta^x (1 - \theta)^{1-x}, \quad x = 0, 1$
Then,
$E(X) = \mu \equiv \sum_x x\, p(x) = (0)\theta^0(1-\theta)^{1-0} + (1)\theta^1(1-\theta)^{1-1} = \theta$
$E(X^2) = E(X) = \theta$
$\therefore \sigma^2 = E(X^2) - \mu^2 = \theta - \theta^2 = \theta(1 - \theta)$
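The Bernoulli moments can be verified exactly from the pmf. A sketch with an arbitrary choice of θ = 3/10 (the value is illustrative, not from the slides):

```python
from fractions import Fraction

theta = Fraction(3, 10)  # arbitrary success probability for the illustration

# pmf of X ~ Bernoulli(theta): P(X = x) = theta^x (1 - theta)^(1-x), x in {0, 1}
pmf = {x: theta**x * (1 - theta)**(1 - x) for x in (0, 1)}

EX = sum(x * p for x, p in pmf.items())      # = theta
EX2 = sum(x**2 * p for x, p in pmf.items())  # = theta, since 0^2 = 0 and 1^2 = 1
var = EX2 - EX**2                            # = theta * (1 - theta)
print(EX, var)
```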
Variance: Properties
Variance of a constant c: $\mathrm{Var}(c) = 0$
For any constants a and b: $\mathrm{Var}(aX + b) = a^2 \mathrm{Var}(X)$
Variance: Properties
Variance of a linear combination:
$\mathrm{Var}\left(\sum_{i=1}^{n} a_i X_i\right) = \mathrm{Var}(a_1 X_1 + a_2 X_2 + \cdots + a_n X_n)$
$= a_1^2 \mathrm{Var}(X_1) + a_2^2 \mathrm{Var}(X_2) + \cdots + a_n^2 \mathrm{Var}(X_n) + \sum_{i > j} 2 a_i a_j \mathrm{Cov}(X_i, X_j)$
$= \sum_{i=1}^{n} a_i^2 \mathrm{Var}(X_i)$ if the $X_i$'s are uncorrelated, i.e., if $\mathrm{Cov}(X_i, X_j) = 0$ for all $i \ne j$
Standard Deviation
The standard deviation of a random variable is the square root of its variance:
$\sigma = \sqrt{\sigma^2}$
Standardizing a Random Variable
Property. If $X \sim (\mu, \sigma^2)$, then the standardized random variable $Z \equiv (X - \mu)/\sigma \sim (0, 1)$.
Proof.
$E(Z) = E\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma} E(X - \mu) = \frac{1}{\sigma}[E(X) - E(\mu)] = \frac{1}{\sigma}(\mu - \mu) = 0$
$\mathrm{Var}(Z) = \mathrm{Var}\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma^2}\mathrm{Var}(X - \mu) = \frac{1}{\sigma^2}\mathrm{Var}(X) = \frac{\sigma^2}{\sigma^2} = 1$
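A quick simulation check of the standardization property (the values of μ and σ are arbitrary choices for the illustration):

```python
import random
import statistics

random.seed(0)

mu, sigma = 18.0, 2.0  # arbitrary mean and standard deviation
xs = [random.gauss(mu, sigma) for _ in range(100_000)]

# Standardize: Z = (X - mu) / sigma should have mean ~0 and variance ~1
zs = [(x - mu) / sigma for x in xs]
print(round(statistics.mean(zs), 3), round(statistics.pvariance(zs), 3))
```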
B.4 Joint and Conditional Distributions
Covariance
$\mathrm{Cov}(X, Y) \equiv E[(X - \mu_X)(Y - \mu_Y)] = E[(X - \mu_X)Y] = E[X(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$
Properties
1. If X and Y are independent, then $\mathrm{Cov}(X, Y) = 0$.
2. For any constants $a_1, b_1, a_2$, and $b_2$,
$\mathrm{Cov}(a_1 X + b_1, a_2 Y + b_2) = a_1 a_2 \mathrm{Cov}(X, Y)$
B.4 Joint and Conditional Distributions
3. Cauchy-Schwarz inequality. For any rv's X, Y,
3.1 $[E(XY)]^2 \le E(X^2)\, E(Y^2)$
3.2 $\{E[X - E(X)][Y - E(Y)]\}^2 \le E[X - E(X)]^2\, E[Y - E(Y)]^2$
3.3 $|\mathrm{Cov}(X, Y)| \le \mathrm{sd}(X)\,\mathrm{sd}(Y)$
[Note that 3.3 is equivalent to 3.2; why?]
B.4 Joint and Conditional Distributions
Correlation
$\mathrm{Corr}(X, Y) = \rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$
Properties (correlation)
1. $-1 \le \mathrm{Corr}(X, Y) \le 1$ (follows from the C-S inequality; why?)
2. For constants $a_1, b_1, a_2$, and $b_2$,
$\mathrm{Corr}(a_1 X + b_1, a_2 Y + b_2) = \mathrm{Corr}(X, Y)$ if $a_1 a_2 > 0$
$\mathrm{Corr}(a_1 X + b_1, a_2 Y + b_2) = -\mathrm{Corr}(X, Y)$ if $a_1 a_2 < 0$
Properties (variance)
3. For constants a and b,
$\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$
B.5 The Normal and Related Distributions (Chi-square, F, t)
Defn. (Normal (Gaussian) Distribution). The pdf for the normal random variable $X \sim N(\mu, \sigma^2)$ is
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$
where $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$.
Defn. (Standard Normal Distribution). The standard normal distribution is a special case of the normal distribution when $\mu = 0$ and $\sigma = 1$. The pdf for the standard normal random variable, denoted $Z \sim N(0, 1)$, is
$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty$
B.5 The Normal and Related Distributions (Chi-square, F, t)
Draw the graphs to show:
$P(Z > z) = 1 - \Phi(z)$
$P(Z < -z) = P(Z > z)$
$P(a \le Z \le b) = \Phi(b) - \Phi(a)$
$P(|Z| > c) = P(Z > c \text{ or } Z < -c) = P(Z > c) + P(Z < -c) = 2P(Z > c) = 2[1 - \Phi(c)]$
Using Excel's function:
$P(Z \le 1.96) = \mathrm{NORMSDIST}(1.96) = 0.975$
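The same check can be done without Excel: Φ can be written in terms of the error function, $\Phi(z) = \tfrac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$. A minimal sketch:

```python
import math

def Phi(z):
    # Standard normal cdf via the error function:
    # Phi(z) = (1/2) * [1 + erf(z / sqrt(2))]
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# The slide's Excel check, P(Z <= 1.96) ~ 0.975, reproduced in Python:
print(round(Phi(1.96), 4))  # 0.975

# One of the tail identities: P(|Z| > c) = 2 * [1 - Phi(c)]
c = 1.5
p_two_tail = (1 - Phi(c)) + Phi(-c)  # P(Z > c) + P(Z < -c)
print(abs(p_two_tail - 2 * (1 - Phi(c))) < 1e-12)  # True
```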
Normal Distribution: Properties
Property (standardizing the normal random variable).
If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$
Exercises
Let $X \sim N(18, 4)$. Find
1. $P(X \le 16)$
2. $P(X \le 20)$
3. $P(X \ge 16)$
4. $P(X \ge 20)$
5. $P(16 \le X \le 20)$
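For checking answers against the standard normal table, the five probabilities can be computed with the error-function form of the standard normal cdf. A sketch, reading N(18, 4) with the second argument as the variance (the slides' N(μ, σ²) convention), so sd = 2:

```python
import math

def Phi(z):
    # Standard normal cdf via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, var = 18, 4          # N(18, 4): second argument read as the variance
sd = math.sqrt(var)      # sd = 2

def cdf(x):
    # Standardize, then use the standard normal cdf
    return Phi((x - mu) / sd)

answers = {
    "P(X <= 16)": cdf(16),
    "P(X <= 20)": cdf(20),
    "P(X >= 16)": 1 - cdf(16),
    "P(X >= 20)": 1 - cdf(20),
    "P(16 <= X <= 20)": cdf(20) - cdf(16),
}
for k, v in answers.items():
    print(k, round(v, 4))
```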
Standard Normal Table
Normal Distribution: Additional Properties
1. If $X \sim N(\mu, \sigma^2)$, then $aX + b \sim N(a\mu + b, a^2\sigma^2)$
2. If X and Y are jointly normally distributed, then they are independent if, and only if, $\mathrm{Cov}(X, Y) = 0$.
3. If $Y_1, Y_2, \dots, Y_n$ are independent random variables and each is distributed as $N(\mu, \sigma^2)$ (i.e., an independent random sample of size n is drawn from $N(\mu, \sigma^2)$), then the statistic
$\bar{Y} \sim N(\mu, \sigma^2/n)$
The Chi-Square Distribution
Defn. (Chi-square Statistic/Distribution). Let $Z_i$, $i = 1, 2, \dots, n$, be independent random variables, each distributed as the standard normal, that is,
$Z_i \sim N(0, 1)$
Then,
$X = \sum_{i=1}^{n} Z_i^2 \sim \chi_n^2$
is Chi-square distributed with n degrees of freedom.
Moments of the Chi-square distribution:
$E(X) = n, \quad \mathrm{Var}(X) = 2n$
The Student's t Distribution
Defn. (Student's t Statistic/Distribution). Let Z have a standard normal distribution and X a Chi-square distribution with n degrees of freedom. That is,
$Z \sim N(0, 1), \quad X \sim \chi_n^2$
Further, assume Z and X are independent. Then, the random variable
$t = \frac{Z}{\sqrt{X/n}} \sim t_n$
is distributed as Student's t with n degrees of freedom.
Moments of the t distribution:
$E(t) = 0$ for $n > 1$
$\mathrm{Var}(t) = n/(n - 2)$ for $n > 2$
The F Distribution
Defn. (F Statistic/Distribution). Let
$X_1 \sim \chi_{k_1}^2, \quad X_2 \sim \chi_{k_2}^2$
and assume that $X_1$ and $X_2$ are independent. Then, the random variable
$F = \frac{X_1/k_1}{X_2/k_2} \sim F_{k_1, k_2}$
is F-distributed with $k_1$ and $k_2$ degrees of freedom.
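The three constructions (chi-square, t, and F) can be sanity-checked by simulating from standard normals. A Monte Carlo sketch (sample sizes and degrees of freedom are arbitrary choices, and the checks are only approximate):

```python
import math
import random
import statistics

random.seed(2)

def chi2_draw(n):
    # Chi-square draw as a sum of n squared independent standard normals
    return sum(random.gauss(0, 1) ** 2 for _ in range(n))

N = 20_000  # number of Monte Carlo replications
n = 5       # chi-square degrees of freedom

chi2 = [chi2_draw(n) for _ in range(N)]
print(round(statistics.mean(chi2), 2))      # approximately n = 5
print(round(statistics.variance(chi2), 2))  # approximately 2n = 10

# t statistic: Z / sqrt(X/n), with Z and X drawn independently
tdraws = [random.gauss(0, 1) / math.sqrt(chi2_draw(n) / n) for _ in range(N)]
print(round(statistics.mean(tdraws), 2))    # approximately 0 (n > 1)

# F statistic: (X1/k1) / (X2/k2), with X1 and X2 drawn independently
k1, k2 = 4, 8
fdraws = [(chi2_draw(k1) / k1) / (chi2_draw(k2) / k2) for _ in range(N)]
print(round(statistics.mean(fdraws), 2))    # approximately k2/(k2 - 2) = 4/3
```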