Download Slides for this session - Notes 6: Bivariate Random Variables.

Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

Part 6 – Bivariate Random Variables and Correlation

Probabilities for two Events, A,B

Marginal Probability = The probability of an event not considering any other events. P(A)
Joint Probability = The probability that two events happen at the same time. P(A,B)
Conditional Probability = The probability that one event happens given that another event has happened. P(A|B)

Probabilities: Inherited Color Blindness

Inherited color blindness has different incidence rates in men and women. Women usually carry the defective gene and men usually inherit it.
Pick an individual at random from the population.
CB   = has inherited color blindness
MALE = gender
Marginal:     P(CB)            = 2.75%
Conditional:  P(CB|MALE)       = 5.0%   (1 in 20 men)
              P(CB|FEMALE)     = 0.5%   (1 in 200 women)
Joint:        P(CB and MALE)   = 2.5%
              P(CB and FEMALE) = 0.25%

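The three kinds of probability for the color blindness example fit together arithmetically. The following is a minimal sketch, assuming the population is split evenly between men and women (the P(Male) = .500 figure used in these notes); the variable names are illustrative only.

```python
# Conditional rates quoted for the color blindness example.
p_cb_given_male = 0.05
p_cb_given_female = 0.005

# Assumption: half of the population is male, half female.
p_male = 0.5
p_female = 1 - p_male

# Joint probability = conditional probability x marginal probability.
p_cb_and_male = p_cb_given_male * p_male
p_cb_and_female = p_cb_given_female * p_female

# Marginal probability = sum of the joint probabilities over gender.
p_cb = p_cb_and_male + p_cb_and_female

print(f"P(CB and MALE)   = {p_cb_and_male:.4f}")
print(f"P(CB and FEMALE) = {p_cb_and_female:.4f}")
print(f"P(CB)            = {p_cb:.4f}")
```
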
Independent Random Variables

One card is drawn randomly from a deck of 52 cards.

                     Ace
Heart      Yes=1     No=0      Total
Yes=1      1/52      12/52     13/52
No=0       3/52      36/52     39/52
Total      4/52      48/52     52/52

P(Ace|Heart)  = 1/13
P(Ace|~Heart) = 3/39 = 1/13
P(Ace)        = 4/52 = 1/13
P(Ace) does not depend on whether the card is a heart or not.

P(Heart|Ace)  = 1/4
P(Heart|~Ace) = 12/48 = 1/4
P(Heart)      = 13/52 = 1/4
P(Heart) does not depend on whether the card is an ace or not.

Independence

Random variables are independent if the occurrence of one does not affect the probability distribution of the other.
If P(Y|X) does not change when X changes, then the variables are independent.

Equivalent Definition of Independence

Random variables X and Y are independent if
P_XY(X,Y) = P_X(X) P_Y(Y).
“The joint probability equals the product of the marginal probabilities.”

Independent Events

                     Ace
Heart      Yes=1           No=0      Total
Yes=1      1/52            12/52     13/52 = 1/4
No=0       3/52            36/52     39/52
Total      4/52 = 1/13     48/52     52/52

P(Ace,Heart) = 1/52
P(Ace)       = 1/13
P(Heart)     = 1/4
P(Ace) x P(Heart) = (1/13)(1/4) = 1/52.
Ace and Heart are independent.

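The product rule can be checked for every cell of the card table, not only the (Ace, Heart) cell. This is a small sketch using exact fractions; the helper name `marginal` is made up for illustration.

```python
from fractions import Fraction

# Joint distribution for one card drawn from 52; keys are (heart, ace), 1 = yes, 0 = no.
joint = {
    (1, 1): Fraction(1, 52),   # heart and ace
    (1, 0): Fraction(12, 52),  # heart, not ace
    (0, 1): Fraction(3, 52),   # ace, not heart
    (0, 0): Fraction(36, 52),  # neither
}

def marginal(joint, index):
    """Sum the joint probabilities over the other variable."""
    totals = {}
    for outcome, p in joint.items():
        totals[outcome[index]] = totals.get(outcome[index], 0) + p
    return totals

p_heart = marginal(joint, 0)
p_ace = marginal(joint, 1)

# The variables are independent when every cell equals the product of its marginals.
independent = all(p == p_heart[h] * p_ace[a] for (h, a), p in joint.items())
print("P(Heart) =", p_heart[1], " P(Ace) =", p_ace[1], " independent:", independent)
```
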
Not Independent Events

                  Color Blind
Gender     No        Yes       Total
Male       .475      .025      0.50
Female     .4975     .0025     0.50
Total      .9725     .0275     1.00

P(Color blind, Male) = .025
P(Male) = .500, P(Color blind) = .0275
P(Color blind) x P(Male) = .500 x .0275 = .01375
.01375 is not equal to .025
Gender and color blindness are not independent.

Two Important Math Results

For two random variables,
P(X,Y) = P(X|Y) P(Y)
P(Color blind, Male) = P(Color blind|Male)P(Male)
= .05 x .5 = .025

For two independent random variables,
P(X,Y) = P(X) P(Y)
P(Ace,Heart) = P(Ace) x P(Heart).
(This does not work if they are not independent.)
Conditional Probability

Prob(A | B) = P(A,B) / P(B)
Prob(Color Blind | Male) = Prob(Color Blind, Male) / P(Male)
                         = .025 / .50
                         = .05

                  Color Blind
Gender     No        Yes       Total
Male       .475      .025      0.50
Female     .4975     .0025     0.50
Total      .9725     .0275     1.00

What is P(Male | Color Blind)?

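The question can be worked with the same definition, now dividing by the marginal probability of being color blind instead of the marginal probability of being male. A short sketch with the table values:

```python
# Joint and marginal probabilities taken from the color blindness table.
p_cb_and_male = 0.025
p_cb = 0.0275

# Conditional probability: P(Male | Color Blind) = P(Color Blind, Male) / P(Color Blind).
p_male_given_cb = p_cb_and_male / p_cb
print(f"P(Male | Color Blind) = {p_male_given_cb:.3f}")  # about 0.909
```

So roughly 10 of every 11 color blind individuals are male, even though only 1 in 20 men is color blind.
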
Conditional Distributions

Overall Distribution
  Color Blind           Not Color Blind
  .0275                 .9725
Distribution Among Men (Conditioned on Male)
  Color Blind|Male      Not Color Blind|Male
  .05                   .95
Distribution Among Women (Conditioned on Female)
  Color Blind|Female    Not Color Blind|Female
  .005                  .995
The distribution changes given gender.

Application – Legal Case Mix: Two kinds of cases show up each month, real estate (R) and financial (F) (sometimes together, usually separately).

Joint Distribution
R = Real estate cases
F = Financial cases

                        Real Estate
Financial     0      1      2      3      P(F)
0                                         .20
1                                         .33
2                                         .47
P(R)          .09    .16    .22    .53    1.00

The P(F) column is the Marginal Distribution for Financial Cases.
The P(R) row is the Marginal Distribution for Real Estate Cases.

Legal Services Case Mix

Joint Discrete Distribution
R = Real estate cases
F = Financial cases

                        Real Estate (R)
Financial (F)   0      1      2      3      P(F)
0               .02    .05    .05    .08    .20
1               .03    .05    .08    .17    .33
2               .04    .06    .09    .28    .47
P(R)            .09    .16    .22    .53    1.00

Joint probabilities are Prob(F=f and R=r).
Note that marginal probabilities are obtained by summing across or down.

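Summing across and down can be written in a few lines. This is a minimal sketch; the nested-list layout (rows for F, columns for R) is just one convenient way to hold the table.

```python
# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]

# Sum across each row for P(F); sum down each column for P(R).
p_f = [sum(row) for row in joint]
p_r = [sum(col) for col in zip(*joint)]

print("P(F):", [round(p, 2) for p in p_f])
print("P(R):", [round(p, 2) for p in p_r])
print("Total:", round(sum(p_f), 2))
```
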
Legal Services Case Mix

Joint Discrete Distribution
R = Real estate cases
F = Financial cases

                                    Real Estate (R)
Financial (F)   0              1              2              3              P(f)
0               .02/.20=.10    .05/.20=.25    .05/.20=.25    .08/.20=.40    .20
1               .03/.33=.10    .05/.33=.15    .08/.33=.24    .17/.33=.51    .33
2               .04/.47=.09    .06/.47=.13    .09/.47=.19    .28/.47=.59    .47

Conditional Distributions
Read across the rows.
Probabilities for R given the value of F
Conditional probabilities are Prob(R=r and F=f)/P(F=f)

Conditional Distributions

The probability distribution of Real estate cases (R) given Financial cases (F) varies with the number of Financial cases.
The probability that (R=3)|F goes up as F increases from 0 to 2.
This means that the variables are not independent.

Conditional Probabilities for Real Estate Cases
R|F            0      1      2      3      Total
Financial=0    .10    .25    .25    .40    1.00
Financial=1    .10    .15    .24    .51    1.00
Financial=2    .09    .13    .19    .59    1.00

Covariation

Pick 10,325 people at random from the population. Predict how
many will be color blind: 10,325 x .0275 = 284

Pick 10,325 MEN at random from the population. Predict how
many will be color blind: 10,325 x .05 = 516

Pick 10,325 WOMEN at random from the population. Predict how
many will be color blind: 10,325 x .005 = 52

The expected number of color blind people, given gender,
depends on gender.
Color Blindness covaries with Gender

Covariation in legal services

                   Real Estate Cases
               0      1      2      3
Financial=0    .10    .25    .25    .40
Financial=1    .10    .15    .24    .51
Financial=2    .09    .13    .19    .59

These are the conditional distributions P(R|F).
How many real estate cases should the office expect if it knows (or predicts) the number of financial cases?
E[R if F=0] = 0(.10) + 1(.25) + 2(.25) + 3(.40) = 1.95 (less than 2)
E[R if F=1] = 0(.10) + 1(.15) + 2(.24) + 3(.51) = 2.16 (more than 2)
E[R if F=2] = 0(.09) + 1(.13) + 2(.19) + 3(.59) = 2.28 (more than 2)
This is how R and F covary.

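The conditional means can also be computed straight from the joint table. The sketch below uses unrounded conditional probabilities, so its results for F=1 and F=2 (about 2.18 and 2.30) differ slightly from the 2.16 and 2.28 obtained with the two-decimal conditional probabilities; the pattern of increase is the same.

```python
# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]
r_values = [0, 1, 2, 3]

cond_means = []
for f, row in enumerate(joint):
    p_f = sum(row)                         # marginal P(F=f)
    cond = [p / p_f for p in row]          # conditional P(R=r | F=f)
    mean = sum(r * p for r, p in zip(r_values, cond))
    cond_means.append(mean)
    print(f"F={f}: P(R|F) = {[round(p, 3) for p in cond]}, E[R|F] = {mean:.3f}")
```
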
Covariation and Regression

[Figure: Expected Number of Real Estate Cases Given Number of Financial Cases (the “regression of R on F”). Horizontal axis: Financial Cases (0, 1, 2). Vertical axis: expected number of real estate cases (1.9 to 2.4).]

Measuring How Variables Move Together: Covariance

$$\mathrm{Cov}(X,Y) = \sum_{\text{values of } X} \; \sum_{\text{values of } Y} P(x,y)\,(x-\mu_X)(y-\mu_Y)$$

Covariance can be positive or negative.
The measure will be positive if it is likely that Y is above its mean when X is above its mean.
It is usually denoted σXY.

Legal Services Case Mix Covariance

                      Real Estate
Financial     0      1      2      3      P(F)
0             .02    .05    .05    .08    .20
1             .03    .05    .08    .17    .33
2             .04    .06    .09    .28    .47
P(R)          .09    .16    .22    .53    1.00

The two means are
μR = 0(.09)+1(.16)+2(.22)+3(.53) = 2.19
μF = 0(.20)+1(.33)+2(.47) = 1.27

Compute the Covariance
ΣFΣR (F-1.27)(R-2.19)P(F,R) =
(0-1.27)(0-2.19).02 = +.055626
(0-1.27)(1-2.19).05 = +.075565
(0-1.27)(2-2.19).05 = +.012065
(0-1.27)(3-2.19).08 = -.082296
(1-1.27)(0-2.19).03 = +.017739
(1-1.27)(1-2.19).05 = +.016065
(1-1.27)(2-2.19).08 = +.004104
(1-1.27)(3-2.19).17 = -.037179
(2-1.27)(0-2.19).04 = -.063948
(2-1.27)(1-2.19).06 = -.052122
(2-1.27)(2-2.19).09 = -.012483
(2-1.27)(3-2.19).28 = +.165564
Sum                 = +0.09870

Covariance and Scaling

μR = 0(.09)+1(.16)+2(.22)+3(.53) = 2.19
μF = 0(.20)+1(.33)+2(.47) = 1.27
Cov(R,F) = +0.09870

What does the covariance mean?
Suppose each real estate case requires 2 lawyers and each financial case requires 3 lawyers. Then the number of lawyers is NR = 2R and NF = 3F.
The covariance of NR and NF will be 3(2)(.0987) = 0.5922.
But, the “relationship” is the same.

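The scaling effect can be seen by computing the covariance twice, once in cases and once in lawyers. This is a sketch; the function name `covariance` and its argument layout are choices made for this illustration.

```python
# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]

def covariance(joint, f_values, r_values):
    """Sum of P(f, r) * (f - mean_f) * (r - mean_r) over all cells of the table."""
    cells = [(f, r, joint[i][j])
             for i, f in enumerate(f_values)
             for j, r in enumerate(r_values)]
    mean_f = sum(p * f for f, r, p in cells)
    mean_r = sum(p * r for f, r, p in cells)
    return sum(p * (f - mean_f) * (r - mean_r) for f, r, p in cells)

# Measured in cases.
cov_cases = covariance(joint, [0, 1, 2], [0, 1, 2, 3])
# Measured in lawyers: NF = 3F and NR = 2R, same probabilities.
cov_lawyers = covariance(joint, [0, 3, 6], [0, 2, 4, 6])

print(f"Cov(R, F)   = {cov_cases:.4f}")
print(f"Cov(NR, NF) = {cov_lawyers:.4f}")
```

The probabilities are untouched; only the units change, and the covariance changes with them by the factor 3 x 2 = 6.
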
Independent Random Variables Have Zero Covariance

One card drawn randomly from a deck of 52 cards
A=Ace, H=Heart, Yes=1, No=0

                     Ace
Heart      Yes=1     No=0      Total
Yes=1      1/52      12/52     13/52
No=0       3/52      36/52     39/52
Total      4/52      48/52     52/52

E[H] = 1(13/52)+0(39/52) = 1/4
E[A] = 1(4/52)+0(48/52) = 1/13

Covariance = ΣHΣA (H-μH)(A-μA)P(H,A)
(1 – 1/4)(1 – 1/13)(1/52)  = +36/52²
(1 – 1/4)(0 – 1/13)(12/52) = -36/52²
(0 – 1/4)(0 – 1/13)(36/52) = +36/52²
(0 – 1/4)(1 – 1/13)(3/52)  = -36/52²
SUM = 0 !!

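The four terms can be reproduced with exact fractions, which avoids any rounding in the check that they cancel. A minimal sketch:

```python
from fractions import Fraction

# Joint distribution for one card; keys are (H, A) with 1 = yes, 0 = no.
joint = {
    (1, 1): Fraction(1, 52),
    (1, 0): Fraction(12, 52),
    (0, 0): Fraction(36, 52),
    (0, 1): Fraction(3, 52),
}

# Means of the two indicator variables.
mean_h = sum(h * p for (h, a), p in joint.items())
mean_a = sum(a * p for (h, a), p in joint.items())

# One covariance term per cell: (H - mean_h)(A - mean_a) P(H, A).
terms = {cell: (cell[0] - mean_h) * (cell[1] - mean_a) * p
         for cell, p in joint.items()}
cov = sum(terms.values())

for cell, term in terms.items():
    print(cell, term)   # each term is +36/52^2 or -36/52^2, printed in lowest terms
print("Covariance =", cov)
```
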
A Shortcut for Covariance

$$\mathrm{Cov}(X,Y) = \sum_{\text{values of } X} \; \sum_{\text{values of } Y} P(x,y)\,(x-\mu_X)(y-\mu_Y) = \left[\sum_{\text{values of } X} \; \sum_{\text{values of } Y} P(x,y)\,x\,y\right] - \mu_X \mu_Y$$

Computing the Covariance Using the Shortcut

Compute the Covariance
ΣFΣR [(F-1.27)(R-2.19) * P(F,R)] =
(0-1.27)(0-2.19).02 = +.055626
(0-1.27)(1-2.19).05 = +.075565
(0-1.27)(2-2.19).05 = +.012065
(0-1.27)(3-2.19).08 = -.082296
(1-1.27)(0-2.19).03 = +.017739
(1-1.27)(1-2.19).05 = +.016065
(1-1.27)(2-2.19).08 = +.004104
(1-1.27)(3-2.19).17 = -.037179
(2-1.27)(0-2.19).04 = -.063948
(2-1.27)(1-2.19).06 = -.052122
(2-1.27)(2-2.19).09 = -.012483
(2-1.27)(3-2.19).28 = +.165564
Sum                 = +0.09870

Compute the Covariance
[ΣFΣR FR * P(F,R)] – [μF μR]
(0)(0).02 = 0
(0)(1).05 = 0
(0)(2).05 = 0
(0)(3).08 = 0
(1)(0).03 = 0
(1)(1).05 = .05
(1)(2).08 = .16
(1)(3).17 = .51
(2)(0).04 = 0
(2)(1).06 = .12
(2)(2).09 = .36
(2)(3).28 = 1.68
Sum       = 2.88
2.88 – (1.27)(2.19) = 0.09870

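Both routes to the covariance can be run side by side. The sketch below follows the two computations on the legal services table; variable names are illustrative.

```python
# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]
# Flatten the table into (f, r, probability) triples.
cells = [(f, r, p) for f, row in enumerate(joint) for r, p in enumerate(row)]

mean_f = sum(p * f for f, r, p in cells)
mean_r = sum(p * r for f, r, p in cells)

# Definition: probability-weighted products of deviations from the means.
cov_definition = sum(p * (f - mean_f) * (r - mean_r) for f, r, p in cells)

# Shortcut: E[F R] minus the product of the means.
e_fr = sum(p * f * r for f, r, p in cells)
cov_shortcut = e_fr - mean_f * mean_r

print(f"mean F = {mean_f:.2f}, mean R = {mean_r:.2f}, E[FR] = {e_fr:.2f}")
print(f"definition: {cov_definition:.4f}   shortcut: {cov_shortcut:.4f}")
```
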
Covariance and Units of Measurement

Covariance takes the units of (units of X) times (units of Y).
Consider Cov($Price of X,$Price of Y).
Now, measure both prices in GBP (roughly $1.60 per £). The prices are divided by 1.60, and the covariance is divided by 1.60².
This is an unattractive result.

Correlation is Units Free

Correlation Coefficient

$$\rho_{XY} = \frac{\text{Covariance}(x,y)}{\text{Standard deviation}(x)\;\text{Standard deviation}(y)}$$

$$-1.00 \le \rho_{XY} \le +1.00$$

Correlation

μR = 2.19, μF = 1.27
Covariance = +0.09870

Var(F) = 0²(.20)+1²(.33)+2²(.47) – 1.27² = 2.21 – 1.6129 = 0.5971
Standard deviation = .77272
Var(R) = 0²(.09)+1²(.16)+2²(.22)+3²(.53) – 2.19² = 5.81 – 4.7961 = 1.0139
Standard deviation = 1.006926

Correlation = .0987 / (.77272 x 1.006926) = 0.12685

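The correlation can be computed from the joint table in one pass, and the same routine shows that it does not change when the variables are rescaled. This is a sketch; the function `correlation` and the rescaling to lawyers (NF = 3F, NR = 2R) are used here only as an illustration.

```python
import math

# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]

def correlation(joint, f_values, r_values):
    """Return (covariance, sd_f, sd_r, correlation) for a discrete joint table."""
    cells = [(f, r, joint[i][j])
             for i, f in enumerate(f_values)
             for j, r in enumerate(r_values)]
    mean_f = sum(p * f for f, r, p in cells)
    mean_r = sum(p * r for f, r, p in cells)
    var_f = sum(p * f * f for f, r, p in cells) - mean_f ** 2
    var_r = sum(p * r * r for f, r, p in cells) - mean_r ** 2
    cov = sum(p * f * r for f, r, p in cells) - mean_f * mean_r
    sd_f, sd_r = math.sqrt(var_f), math.sqrt(var_r)
    return cov, sd_f, sd_r, cov / (sd_f * sd_r)

# Measured in cases.
cov, sd_f, sd_r, rho = correlation(joint, [0, 1, 2], [0, 1, 2, 3])
# Measured in lawyers: the covariance changes, the correlation does not.
cov_n, sd_nf, sd_nr, rho_n = correlation(joint, [0, 3, 6], [0, 2, 4, 6])

print(f"cases:   cov = {cov:.4f}, sd_F = {sd_f:.5f}, sd_R = {sd_r:.5f}, rho = {rho:.5f}")
print(f"lawyers: cov = {cov_n:.4f}, rho = {rho_n:.5f}")
```
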
Aspect of Correlation

Independence implies zero correlation. If the variables are independent, then the numerator of the correlation coefficient is 0.

Sums of Two Random Variables

Example 1: Total number of cases = F+R
Example 2: Personnel needed = 3F+2R
Find for Sums
  Expected Value
  Variance and Standard Deviation
  Application from Finance: Portfolio

Math Facts 1 – Mean of a Sum

Mean of a sum
Mean of X+Y = E[X+Y] = E[X]+E[Y]

Mean of a weighted sum
Mean of aX + bY = E[aX] + E[bY] = aE[X] + bE[Y]

Mean of a Sum

μR = 2.19
μF = 1.27
What is the mean (expected) number of cases each month, R+F?
E[R + F] = E[R] + E[F] = 2.19 + 1.27 = 3.46

Mean of a Weighted Sum

Suppose each Real Estate case requires 2 lawyers and each Financial case requires 3 lawyers. Then NR = 2R and NF = 3F.
μR = 2.19
μF = 1.27
If NR = 2R and NF = 3F, then the mean number of lawyers is the mean of 2R+3F.
E[2R + 3F] = 2E[R] + 3E[F] = 2(2.19) + 3(1.27) = 8.19 lawyers required.

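The weighted-sum rule can be confirmed by averaging 2R + 3F over every cell of the joint distribution and comparing with 2E[R] + 3E[F]. A short sketch, using the legal services table:

```python
# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]
cells = [(f, r, p) for f, row in enumerate(joint) for r, p in enumerate(row)]

mean_f = sum(p * f for f, r, p in cells)
mean_r = sum(p * r for f, r, p in cells)

# Direct route: expected value of 2R + 3F taken over the joint distribution.
direct = sum(p * (2 * r + 3 * f) for f, r, p in cells)
# Linearity route: 2 E[R] + 3 E[F].
by_linearity = 2 * mean_r + 3 * mean_f

print(f"direct = {direct:.2f}, by linearity = {by_linearity:.2f}")
```

The two routes agree whether or not R and F are independent; the rule for means does not involve the covariance.
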
Math Facts 2 – Variance of a Sum

Variance of a Sum
Var[x+y] = Var[x] + Var[y] + 2Cov(x,y)
Variance of a sum equals the sum of the variances only if the variables are uncorrelated.

Standard deviation of a sum
The standard deviation of x+y is not equal to the sum of the standard deviations.

$$\sigma_{x+y} = \sqrt{\sigma_x^2 + \sigma_y^2 + 2\sigma_{xy}}$$

Variance of a Sum

μR = 2.19, σR² = 1.0139
μF = 1.27, σF² = 0.5971
σRF = 0.0987
What is the variance of the total number of cases that occur each month?
This is the variance of F+R = (1.0139 + 0.5971 + 2(.0987)) = 1.8084.
The standard deviation is 1.34477.

Math Facts 3 – Variance of a Weighted Sum

Var[ax+by] = Var[ax] + Var[by] + 2Cov(ax,by)
           = a²Var[x] + b²Var[y] + 2ab Cov(x,y).
Also, Cov(x,y) is the numerator in ρxy, so Cov(x,y) = ρxy σx σy.

$$\sigma_{ax+by} = \sqrt{a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\rho_{xy}\,\sigma_x\sigma_y}$$

Variance of a Weighted Sum

μR = 2.19, σR² = 1.0139
μF = 1.27, σF² = 0.5971
σRF = 0.0987, ρRF = .12685
Suppose each real estate case requires 2 lawyers and each financial case requires 3 lawyers. Then NR = 2R and NF = 3F.
What is the variance of the total number of lawyers needed each month? What is the standard deviation?
This is the variance of 2R+3F
= 2²(1.0139) + 3²(0.5971) + 2(2)(3)(.12685)(1.006926)(0.77272) = 10.6139
The standard deviation is the square root, 3.25790

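The weighted-sum variance formula can be cross-checked against a variance computed directly from the joint distribution of aR + bF. The sketch below does this for the total number of cases (a = b = 1) and for the number of lawyers (a = 2, b = 3); the helper names `expect`, `var_weighted_sum` and `var_direct` are made up for this illustration.

```python
import math

# Joint distribution P(F=f, R=r): rows are F = 0, 1, 2; columns are R = 0, 1, 2, 3.
joint = [
    [0.02, 0.05, 0.05, 0.08],
    [0.03, 0.05, 0.08, 0.17],
    [0.04, 0.06, 0.09, 0.28],
]
cells = [(f, r, p) for f, row in enumerate(joint) for r, p in enumerate(row)]

def expect(g):
    """Expected value of g(F, R) under the joint distribution."""
    return sum(p * g(f, r) for f, r, p in cells)

mean_f, mean_r = expect(lambda f, r: f), expect(lambda f, r: r)
var_f = expect(lambda f, r: (f - mean_f) ** 2)
var_r = expect(lambda f, r: (r - mean_r) ** 2)
cov_rf = expect(lambda f, r: (f - mean_f) * (r - mean_r))

def var_weighted_sum(a, b):
    """Var[aR + bF] = a^2 Var[R] + b^2 Var[F] + 2ab Cov(R, F)."""
    return a * a * var_r + b * b * var_f + 2 * a * b * cov_rf

def var_direct(a, b):
    """Variance of aR + bF computed straight from the joint table."""
    mean = expect(lambda f, r: a * r + b * f)
    return expect(lambda f, r: (a * r + b * f - mean) ** 2)

for a, b in [(1, 1), (2, 3)]:
    v = var_weighted_sum(a, b)
    print(f"a={a}, b={b}: variance = {v:.4f}, sd = {math.sqrt(v):.4f}, "
          f"direct = {var_direct(a, b):.4f}")
```
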
Application - Portfolio

You have $1000 to allocate between assets A and B. The yearly returns on the two assets are random variables rA and rB.
The means of the two returns are E[rA] = μA and E[rB] = μB
The standard deviations (risks) of the returns are σA and σB.
The correlation of the two returns is ρAB
The two returns are positively correlated.

Portfolio

You have $1000 to allocate to A and B.
You will allocate proportions w of your $1000 to A and (1-w) to B.

Return and Risk

Your expected return on each dollar is
E[wrA + (1-w)rB] = wμA + (1-w)μB
The variance of your return on each dollar is
Var[wrA + (1-w)rB] = w²σA² + (1-w)²σB² + 2w(1-w)ρABσAσB
The standard deviation is the square root.

Risk and Return: Example

Suppose you know μA, μB, ρAB, σA, and σB (You have watched these stocks for over 6 years.)
The mean and standard deviation are then just functions of w.
I will then compute the mean and standard deviation for different values of w.
For our Microsoft and Walmart example,
μA = .050071, μB = .021906
σA = .114264, σB = .086035, ρAB = .248634
E[return]  = w(.050071) + (1-w)(.021906)
           = .021906 + .028165w
SD[return] = sqrt[w²(.114²) + (1-w)²(.086²) + 2w(1-w)(.249)(.114)(.086)]
           = sqrt[.013w² + .0074(1-w)² + .00488w(1-w)]

[Figure: risk and return of the portfolio for different values of w, with the endpoints W=0 and W=1 marked.]
For different values of w,
risk = sqrt[.013w² + .0074(1-w)² + .00488w(1-w)] is on the horizontal axis
return = .021906 + .028165w is on the vertical axis.

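The risk and return pairs for different values of w can be tabulated with a few lines of code. This is a sketch using the means, standard deviations and correlation quoted for the two-asset example; the function name `portfolio` and the grid of eleven weights are choices made here for illustration.

```python
import math

# Inputs quoted for the two-asset example.
mu_a, mu_b = 0.050071, 0.021906
sd_a, sd_b = 0.114264, 0.086035
rho_ab = 0.248634

def portfolio(w):
    """Return (expected return, standard deviation) with share w in A and 1-w in B."""
    mean = w * mu_a + (1 - w) * mu_b
    var = (w ** 2 * sd_a ** 2
           + (1 - w) ** 2 * sd_b ** 2
           + 2 * w * (1 - w) * rho_ab * sd_a * sd_b)
    return mean, math.sqrt(var)

# Trace the risk-return pairs for w = 0.0, 0.1, ..., 1.0.
for i in range(11):
    w = i / 10
    mean, risk = portfolio(w)
    print(f"w = {w:.1f}   risk = {risk:.5f}   return = {mean:.5f}")
```

At w = 0 the portfolio is all B and at w = 1 it is all A, so the endpoints reproduce the individual assets' means and standard deviations.
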
Summary

Random Variables – Independent
Conditional probabilities change with the values of dependent variables.
Covariation and the covariance as a measure. (The regression)
Correlation as a units free measure of covariation
Math results
  Mean of a weighted sum
  Variance of a weighted sum
Application to a portfolio problem.