Chapter 5 Joint Probability Distributions
• If X and Y are two random variables, the probability distribution that defines their simultaneous behavior is called a joint probability distribution.
5-1 Two discrete random variables
5-1.1 Joint Probability Distributions
Example 5-1
• In the development of a new receiver for the transmission of digital information, each received bit is rated as acceptable, suspect, or unacceptable, with probabilities 0.9, 0.08, and 0.02, respectively.
• Assume that the ratings of each bit are independent.
• In the first four bits transmitted:
  X: the number of acceptable bits
  Y: the number of suspect bits
• The distribution of X is binomial with n = 4 and p = 0.9, and the distribution of Y is binomial with n = 4 and p = 0.08.
• The probability of each of the points shown in Fig. 5-1 can be computed. For example, four unacceptable bits:
  P(X = 0, Y = 0) = (0.02)⁴ = (2 × 10⁻²)⁴ = 1.6 × 10⁻⁷
If X and Y are discrete random variables, the joint probability distribution of X and Y is a description of the set of points (x, y) in the range of (X, Y) along with the probability of each point.
• The joint probability distribution of two random variables is sometimes referred to as the bivariate probability distribution or bivariate distribution of the random variables.
• The joint probability mass function P(X = x and Y = y) is usually written as P(X = x, Y = y).
Definition
The joint probability mass function of the discrete random variables X and Y, denoted as f_XY(x, y), satisfies
  (1) f_XY(x, y) ≥ 0
  (2) ∑_x ∑_y f_XY(x, y) = 1
  (3) f_XY(x, y) = P(X = x, Y = y)    (5-1)
Example 5-2
• See Fig. 5-1.
• For example, P(X = 2, Y = 1) is the probability that exactly two acceptable bits and exactly one suspect bit are received among the four bits transferred.
• a: acceptable (p = 0.9)
  s: suspect (p = 0.08)
  u: unacceptable (p = 0.02)
• By the assumption of independence:
  P(aasu) = 0.9(0.9)(0.08)(0.02) = 0.0013
  f_XY(2, 1) = P(X = 2, Y = 1) = (4 choose 2, 1, 1) × 0.0013 = 4!/(2! 1! 1!) × 0.0013 = 12 × 0.0013 = 0.0156
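The calculation above can be checked numerically. A minimal Python sketch (the helper name `joint_pmf` is illustrative, not from the text), assuming the trinomial structure of Example 5-1 in which the n − x − y remaining bits are unacceptable:

```python
from math import factorial

def joint_pmf(x, y, n=4, pa=0.9, ps=0.08, pu=0.02):
    """Joint pmf of (X, Y) from Example 5-1: X acceptable bits, Y suspect
    bits among n transmitted bits; the other n - x - y are unacceptable."""
    u = n - x - y
    if x < 0 or y < 0 or u < 0:
        return 0.0
    coeff = factorial(n) // (factorial(x) * factorial(y) * factorial(u))
    return coeff * pa**x * ps**y * pu**u

# f_XY(2, 1) = 12 * 0.9^2 * 0.08 * 0.02, about 0.0156 as in Example 5-2
```

Summing `joint_pmf` over all feasible (x, y) returns 1, which is property (2) of Equation 5-1.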
5-1.2 Marginal Probability Distributions
• The individual probability distribution of a random variable is referred to as its marginal probability distribution.
• To determine P(X = x), we sum P(X = x, Y = y) over all points in the range of (X, Y) for which X = x.
Example 5-3
• See Fig. 5-1.
• Find the marginal probability distribution of X.
• For example:
  P(X = 0) = 0.0001
  P(X = 1) = 0.0036
  P(X = 2) = 0.0486
  P(X = 3) = P(X = 3, Y = 0) + P(X = 3, Y = 1) = 0.0583 + 0.2333 = 0.2916 = (4 choose 3) 0.9³ 0.1¹
  P(X = 4) = 0.6561
Figure 5-2 Marginal probability distributions of X and Y from Fig. 5-1
Definition
If X and Y are discrete random variables with joint probability mass function f_XY(x, y), then the marginal probability mass functions of X and Y are
  f_X(x) = P(X = x) = ∑_{R_x} f_XY(x, y)
  f_Y(y) = P(Y = y) = ∑_{R_y} f_XY(x, y)    (5-2)
where R_x denotes the set of all points in the range of (X, Y) for which X = x and R_y denotes the set of all points in the range of (X, Y) for which Y = y.
Definition - Mean and Variance from Joint Distribution
If the marginal probability distribution of X has the probability mass function f_X(x), then
  E(X) = µ_X = ∑_x x f_X(x) = ∑_x x ∑_{R_x} f_XY(x, y) = ∑_x ∑_{R_x} x f_XY(x, y) = ∑_R x f_XY(x, y)    (5-3)
and
  V(X) = σ²_X = ∑_x (x − µ_X)² f_X(x) = ∑_x (x − µ_X)² ∑_{R_x} f_XY(x, y) = ∑_R (x − µ_X)² f_XY(x, y)
where R_x denotes the set of all points in the range of (X, Y) for which X = x and R denotes the set of all points in the range of (X, Y).
Example 5-4
• In Example 5-1,
  E(X) = 0[f_XY(0, 0) + f_XY(0, 1) + f_XY(0, 2) + f_XY(0, 3) + f_XY(0, 4)]
       + 1[f_XY(1, 0) + f_XY(1, 1) + f_XY(1, 2) + f_XY(1, 3)]
       + 2[f_XY(2, 0) + f_XY(2, 1) + f_XY(2, 2)]
       + 3[f_XY(3, 0) + f_XY(3, 1)]
       + 4[f_XY(4, 0)]
       = 0[0.0001] + 1[0.0036] + 2[0.0486] + 3[0.2916] + 4[0.6561] = 3.6
• Alternatively, because the marginal probability distribution of X is binomial,
  E(X) = np = 4(0.9) = 3.6
  V(X) = np(1 − p) = 4(0.9)(1 − 0.9) = 0.36
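Equation 5-2 and the mean and variance of Equation 5-3 can be verified against the binomial shortcut. A sketch (function names are hypothetical), reusing the joint pmf of Example 5-1:

```python
from math import factorial

def f_xy(x, y, n=4, pa=0.9, ps=0.08, pu=0.02):
    # joint pmf of (acceptable, suspect) counts, as in Example 5-1
    u = n - x - y
    if min(x, y, u) < 0:
        return 0.0
    return (factorial(n) // (factorial(x) * factorial(y) * factorial(u))
            * pa**x * ps**y * pu**u)

def f_x(x, n=4):
    # marginal pmf: sum the joint pmf over all y with f_XY(x, y) > 0 (Eq. 5-2)
    return sum(f_xy(x, y) for y in range(n - x + 1))

mean_x = sum(x * f_x(x) for x in range(5))               # Eq. 5-3
var_x = sum((x - mean_x) ** 2 * f_x(x) for x in range(5))
```

Both results agree with the binomial formulas E(X) = np = 3.6 and V(X) = np(1 − p) = 0.36.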
5-1.3 Conditional Probability Distributions
Example 5-5
• See Example 5-1 (Fig. 5-1).
• The probability that Y = 0 given that X = 3
  (Note: P(A | B) = P(A ∩ B)/P(B))
  P(Y = 0 | X = 3) = P(X = 3, Y = 0)/P(X = 3) = f_XY(3, 0)/f_X(3) = 0.05832/0.2916 = 0.200
• The probability that Y = 1 given that X = 3
  P(Y = 1 | X = 3) = P(X = 3, Y = 1)/P(X = 3) = f_XY(3, 1)/f_X(3) = 0.2333/0.2916 = 0.800
• Notice P(Y = 0 | X = 3) + P(Y = 1 | X = 3) = 1.
Definition
Given discrete random variables X and Y with joint probability mass function f_XY(x, y), the conditional probability mass function of Y given X = x is
  f_{Y|x}(y) = P(Y = y | X = x) = P(X = x, Y = y)/P(X = x) = f_XY(x, y)/f_X(x)  for f_X(x) > 0    (5-4)
• The function f_{Y|x}(y) is used to find the probabilities of the possible values for Y given that X = x.
Because a conditional probability mass function f_{Y|x}(y) is a probability mass function for all y in R_x, the following properties are satisfied:
  (1) f_{Y|x}(y) ≥ 0
  (2) ∑_{R_x} f_{Y|x}(y) = 1
  (3) P(Y = y | X = x) = f_{Y|x}(y)    (5-5)
Example 5-6
• For the joint probability distribution in Fig. 5-1,
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) = P(X = x, Y = y)/P(X = x)
• The function f_{Y|x}(y) is shown in Fig. 5-3. For example,
  f_{Y|0}(0) = P(Y = 0 | X = 0) = P(X = 0, Y = 0)/P(X = 0) = 1.6 × 10⁻⁷/0.0001 = 0.0016
Definition
Let R_x denote the set of all points in the range of (X, Y) for which X = x. The conditional mean of Y given X = x, denoted as E(Y | x) or µ_{Y|x}, is
  E(Y | x) = ∑_{R_x} y f_{Y|x}(y)    (5-6)
and the conditional variance of Y given X = x, denoted as V(Y | x) or σ²_{Y|x}, is
  V(Y | x) = ∑_{R_x} (y − µ_{Y|x})² f_{Y|x}(y) = ∑_{R_x} y² f_{Y|x}(y) − µ²_{Y|x}
Example 5-7
• The conditional mean of Y given X = 2 is obtained from the conditional distribution in Fig. 5-3:
  E(Y | 2) = µ_{Y|2} = 0(0.040) + 1(0.320) + 2(0.640) = 1.6
• The conditional variance of Y given X = 2:
  V(Y | 2) = (0 − µ_{Y|2})²(0.040) + (1 − µ_{Y|2})²(0.320) + (2 − µ_{Y|2})²(0.640) = 0.32
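The conditional pmf of Equation 5-4 and the conditional mean and variance of Example 5-7 can be reproduced numerically; a sketch under the same Example 5-1 model (helper names are illustrative):

```python
from math import factorial

def f_xy(x, y, n=4, pa=0.9, ps=0.08, pu=0.02):
    # joint pmf of (acceptable, suspect) counts from Example 5-1
    u = n - x - y
    if min(x, y, u) < 0:
        return 0.0
    return (factorial(n) // (factorial(x) * factorial(y) * factorial(u))
            * pa**x * ps**y * pu**u)

def f_y_given_x(y, x, n=4):
    # f_{Y|x}(y) = f_XY(x, y) / f_X(x)  (Eq. 5-4)
    fx = sum(f_xy(x, yy) for yy in range(n - x + 1))
    return f_xy(x, y) / fx

cond_mean = sum(y * f_y_given_x(y, 2) for y in range(3))               # E(Y | 2)
cond_var = sum(y**2 * f_y_given_x(y, 2) for y in range(3)) - cond_mean**2
```

Given X = 2, each of the two remaining bits is suspect with probability 0.08/0.10 = 0.8, so Y | X = 2 is binomial(2, 0.8) with mean 1.6 and variance 0.32, matching Example 5-7.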
5-1.4 Independence
Example 5-8
• In a plastic molding operation, each part is classified as to whether it conforms to color and length specifications.
• Define the random variables X and Y:
  X = 1 if the part conforms to color specifications, 0 otherwise
  Y = 1 if the part conforms to length specifications, 0 otherwise
• The joint probability distribution of X and Y is defined by f_XY(x, y) in Fig. 5-4(a).
• Notice that for any x, f_{Y|x}(y) = f_Y(y).
If two random variables are independent, then
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) = f_X(x) f_Y(y)/f_X(x) = f_Y(y)
For example, from Figure 5-4:
  f_{Y|1}(1) = 0.9702/0.99 = 0.98
  f_{Y|0}(0) = 0.0002/0.01 = 0.02
For discrete random variables X and Y, if any one of the following properties is true, the others are also true, and X and Y are independent.
  (1) f_XY(x, y) = f_X(x) f_Y(y) for all x and y
  (2) f_{Y|x}(y) = f_Y(y) for all x and y with f_X(x) > 0
  (3) f_{X|y}(x) = f_X(x) for all x and y with f_Y(y) > 0
  (4) P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) for any sets A and B in the range of X and Y, respectively.    (5-7)
Rectangular Range for (X, Y)!
• If the set of points in two-dimensional space that receive positive probability under f_XY(x, y) does not form a rectangle, X and Y are not independent, because knowledge of X can restrict the range of values of Y that receive positive probability.
Example 5-9
• In a large shipment of parts, 1% of the parts do not conform to specifications.
• The supplier inspects a random sample of 30 parts.
  X: the number of parts in the sample that do not conform to specifications.
• The purchaser inspects another random sample of 20 parts.
  Y: the number of parts in this sample that do not conform to specifications.
• What is the probability that X ≤ 1 and Y ≤ 1?
• Assume the sampling is with replacement, so that X and Y are independent.
• The marginal probability distribution of X is binomial with n = 30 and p = 0.01.
• The marginal probability distribution of Y is binomial with n = 20 and p = 0.01.
• If independence between X and Y were not assumed,
  P(X ≤ 1, Y ≤ 1) = P(X = 0, Y = 0) + P(X = 1, Y = 0) + P(X = 0, Y = 1) + P(X = 1, Y = 1)
                  = f_XY(0, 0) + f_XY(1, 0) + f_XY(0, 1) + f_XY(1, 1)
• However, with independence,
  P(X ≤ 1, Y ≤ 1) = P(X ≤ 1) P(Y ≤ 1)
  P(X ≤ 1) = P(X = 0) + P(X = 1)
           = (30 choose 0) 0.01⁰ × 0.99³⁰ + (30 choose 1) 0.01¹ × 0.99²⁹
           = 0.7397 + 0.2242 = 0.9639
  P(Y ≤ 1) = P(Y = 0) + P(Y = 1)
           = (20 choose 0) 0.01⁰ × 0.99²⁰ + (20 choose 1) 0.01¹ × 0.99¹⁹ = 0.9831
• Therefore, P(X ≤ 1, Y ≤ 1) = 0.9639 × 0.9831 = 0.948.
• Suppose the supplier and the purchaser change their policies so that the shipment is acceptable only if zero nonconforming parts are found in the sample. The probability that the shipment is accepted for production is still quite high:
  P(X = 0, Y = 0) = P(X = 0) P(Y = 0) = 0.605
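With independence, the joint probability factors into a product of binomial CDFs, as the example shows. A sketch in Python (names are illustrative):

```python
from math import comb

def binom_cdf(k, n, p):
    # P(X <= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

p_x = binom_cdf(1, 30, 0.01)    # supplier sample: P(X <= 1)
p_y = binom_cdf(1, 20, 0.01)    # purchaser sample: P(Y <= 1)
p_both = p_x * p_y              # independence: P(X <= 1, Y <= 1)
p_zero = 0.99**30 * 0.99**20    # P(X = 0, Y = 0) = 0.99**50
```

The products reproduce the values 0.948 and 0.605 from Example 5-9.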
Exercise 5-1: 5-1, 5-3, 5-5, 5-7, 5-9, 5-13, 5-15
5-2 Multiple discrete random variables
5-2.1 Joint Probability Distributions
• Given discrete random variables X1, X2, …, Xp, the joint probability distribution of X1, X2, …, Xp is a description of the set of points (x1, x2, …, xp) in the range of (X1, X2, …, Xp), along with the probability of each point.
Definition
The joint probability mass function of X1, X2, …, Xp is
  f_{X1 X2 … Xp}(x1, x2, …, xp) = P(X1 = x1, X2 = x2, …, Xp = xp)    (5-8)
for all points (x1, x2, …, xp) in the range of X1, X2, …, Xp.
Definition
If X1, X2, …, Xp are discrete random variables with joint probability mass function f_{X1 X2 … Xp}(x1, x2, …, xp), the marginal probability mass function of any Xi is
  f_{Xi}(xi) = P(Xi = xi) = ∑_{R_{xi}} f_{X1 X2 … Xp}(x1, x2, …, xp)    (5-9)
where R_{xi} denotes the set of points in the range of (X1, X2, …, Xp) for which Xi = xi.
Example 5-11
• The joint probability distribution of three random variables X1, X2, X3 is shown in Fig. 5-5.
• x1 + x2 + x3 = 3
• The marginal probability distribution of X2 is found as follows:
  P(X2 = 0) = f_{X1X2X3}(3, 0, 0) + f_{X1X2X3}(0, 0, 3) + f_{X1X2X3}(1, 0, 2) + f_{X1X2X3}(2, 0, 1)
  P(X2 = 1) = f_{X1X2X3}(2, 1, 0) + f_{X1X2X3}(0, 1, 2) + f_{X1X2X3}(1, 1, 1)
  P(X2 = 2) = f_{X1X2X3}(1, 2, 0) + f_{X1X2X3}(0, 2, 1)
  P(X2 = 3) = f_{X1X2X3}(0, 3, 0)
Figure 5-5 Joint probability distribution of X1, X2, and X3
Mean and variance from Joint distribution
  E(Xi) = ∑_R xi f_{X1 X2 … Xp}(x1, x2, …, xp)
and
  V(Xi) = ∑_R (xi − µ_{Xi})² f_{X1 X2 … Xp}(x1, x2, …, xp)    (5-10)
where R is the set of all points in the range of X1, X2, …, Xp.
Distribution of a Subset of Random Variables
If X1, X2, …, Xp are discrete random variables with joint probability mass function f_{X1 X2 … Xp}(x1, x2, …, xp), the joint probability mass function of X1, X2, …, Xk, k < p, is
  f_{X1 X2 … Xk}(x1, x2, …, xk) = P(X1 = x1, X2 = x2, …, Xk = xk)
                                = ∑_{R_{x1 x2 … xk}} f_{X1 X2 … Xp}(x1, x2, …, xp)    (5-11)
where R_{x1 x2 … xk} denotes the set of all points in the range of X1, X2, …, Xp for which X1 = x1, X2 = x2, …, Xk = xk.
• That is, P(X1 = x1, X2 = x2, …, Xk = xk) is the sum of the probabilities over all points in the range of X1, X2, …, Xp for which X1 = x1, X2 = x2, …, and Xk = xk.
Conditional Probability Distributions
• Conditional probability distributions can be developed for multiple discrete random variables by an extension of the ideas used for two discrete random variables.
• For example, the conditional joint probability mass function of X1, X2, X3 given X4, X5 is
  f_{X1 X2 X3 | x4 x5}(x1, x2, x3) = f_{X1 X2 X3 X4 X5}(x1, x2, x3, x4, x5) / f_{X4 X5}(x4, x5)
for f_{X4 X5}(x4, x5) > 0.
Definition
Discrete random variables X1, X2, …, Xp are independent if and only if
  f_{X1 X2 … Xp}(x1, x2, …, xp) = f_{X1}(x1) f_{X2}(x2) … f_{Xp}(xp)    (5-12)
for all x1, x2, …, xp.
• It can be shown that if X1, X2, …, Xp are independent,
  P(X1 ∈ A1, X2 ∈ A2, …, Xp ∈ Ap) = P(X1 ∈ A1) P(X2 ∈ A2) … P(Xp ∈ Ap)
for any sets A1, A2, …, Ap.
5-2.2 Multinomial Probability Distribution
Example 5-12
• Of the 20 bits received, what is the probability that 14 are excellent, 3 are good, 2 are fair, and 1 is poor?
• The probabilities of E, G, F, and P are 0.6, 0.3, 0.08, and 0.02, respectively.
• One sequence of 20 bits: EEEEEEEEEEEEEE GGG FF P (14 E's, 3 G's, 2 F's, 1 P)
  P(EEEEEEEEEEEEEEGGGFFP) = 0.6¹⁴ 0.3³ 0.08² 0.02¹ = 2.708 × 10⁻⁹
• The number of such sequences is
  20!/(14! 3! 2! 1!) = 2,325,600
• The requested probability is
  P(14 E's, three G's, two F's, and one P) = 2,325,600 × (2.708 × 10⁻⁹) = 0.0063
Multinomial distribution
Suppose a random experiment consists of a series of n trials. Assume that
(1) The result of each trial is classified into one of k classes.
(2) The probability of a trial generating a result in class 1, class 2, …, class k is constant over the trials and equal to p1, p2, …, pk, respectively.
(3) The trials are independent.
The random variables X1, X2, …, Xk that denote the number of trials that result in class 1, class 2, …, class k, respectively, have a multinomial distribution, and the joint probability mass function is
  P(X1 = x1, X2 = x2, …, Xk = xk) = n!/(x1! x2! … xk!) · p1^x1 p2^x2 … pk^xk    (5-13)
for x1 + x2 + … + xk = n and p1 + p2 + … + pk = 1.
Example 5-13
In Example 5-12, let the random variables X1, X2, X3, and X4 denote the number of bits that are E, G, F, and P, respectively, in a transmission of 20 bits.
The probability that 12 of the bits received are E, 6 are G, 2 are F, and 0 are P is
  P(X1 = 12, X2 = 6, X3 = 2, X4 = 0) = 20!/(12! 6! 2! 0!) · 0.6¹² 0.3⁶ 0.08² 0.02⁰ = 0.0358
If X1, X2, …, Xk have a multinomial distribution, the marginal probability distribution of Xi is binomial with
  E(Xi) = n pi and V(Xi) = n pi(1 − pi)    (5-14)
Example 5-14
• In Example 5-13, the marginal probability distribution of X2 is binomial with n = 20 and p = 0.3.
• The joint marginal probability distribution of X2 and X3: P(X2 = x2, X3 = x3) is the probability that exactly x2 trials result in G and that x3 result in F.
• The remaining n − x2 − x3 trials must result in either E or P.
• The three classes are {G}, {F}, and {E, P}, with probabilities 0.3, 0.08, and 0.6 + 0.02 = 0.62, so
  f_{X2 X3}(x2, x3) = P(X2 = x2, X3 = x3) = n!/(x2! x3! (n − x2 − x3)!) · (0.3)^x2 (0.08)^x3 (0.62)^(n−x2−x3)
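Equation 5-13 translates directly into code. A sketch (the function name is illustrative) that reproduces the probabilities of Examples 5-12 and 5-13:

```python
from math import factorial

def multinomial_pmf(counts, probs):
    # Eq. 5-13: n!/(x1! ... xk!) * p1^x1 * ... * pk^xk
    n = sum(counts)
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    pmf = float(coeff)
    for x, p in zip(counts, probs):
        pmf *= p**x
    return pmf

p_513 = multinomial_pmf([12, 6, 2, 0], [0.6, 0.3, 0.08, 0.02])  # Example 5-13
p_512 = multinomial_pmf([14, 3, 2, 1], [0.6, 0.3, 0.08, 0.02])  # Example 5-12
```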
Exercise 5-2 : 5-17, 5-19, 5-23, 5-25, 5-27, 5-29
5-3 Two continuous random variables
5-3.1 Joint probability distributions
• A joint probability density function can be defined over two-dimensional space. The double integral of f_XY(x, y) over a region R provides the probability that (X, Y) assumes a value in R.
• The integral can be interpreted as the volume under the surface f_XY(x, y) over the region R.
Definition
A joint probability density function for the continuous random variables X and Y, denoted as f_XY(x, y), satisfies the following properties:
  (1) f_XY(x, y) ≥ 0 for all x, y
  (2) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_XY(x, y) dx dy = 1
  (3) For any region R of two-dimensional space,
      P([X, Y] ∈ R) = ∫∫_R f_XY(x, y) dx dy    (5-15)
Example 5-15
• The random variable X: the time until a computer server connects to your machine (in milliseconds).
• Y: the time until the server authorizes you as a valid user (in milliseconds).
• X < Y
• The joint probability density function for X and Y:
  f_XY(x, y) = 6 × 10⁻⁶ exp(−0.001x − 0.002y) for x < y
• The region with nonzero probability is shaded in Fig. 5-8.
Figure 5-8 The joint probability density function of X and Y is nonzero over the shaded region
The density integrates to 1:
  ∫_{−∞}^{∞} ∫_{−∞}^{∞} f_XY(x, y) dy dx = 6 × 10⁻⁶ ∫_0^∞ (∫_x^∞ e^{−0.002y} dy) e^{−0.001x} dx
  = 6 × 10⁻⁶ ∫_0^∞ (e^{−0.002x}/0.002) e^{−0.001x} dx
  = 0.003 ∫_0^∞ e^{−0.003x} dx = 0.003 (1/0.003) = 1
• The probability that X < 1000 and Y < 2000:
  P(X ≤ 1000, Y ≤ 2000) = ∫_0^{1000} ∫_x^{2000} f_XY(x, y) dy dx
  = 6 × 10⁻⁶ ∫_0^{1000} (∫_x^{2000} e^{−0.002y} dy) e^{−0.001x} dx
  = 6 × 10⁻⁶ ∫_0^{1000} ((e^{−0.002x} − e⁻⁴)/0.002) e^{−0.001x} dx
  = 0.003 ∫_0^{1000} (e^{−0.003x} − e⁻⁴ e^{−0.001x}) dx
  = 0.003 [(1 − e⁻³)/0.003 − e⁻⁴ (1 − e⁻¹)/0.001]
  = 0.003 (316.738 − 11.578) = 0.915
Figure 5-9 The region of integration for the probability that X < 1000 and Y < 2000 is darkly shaded.
5-3.2 Marginal probability distributions
Definition
If the joint probability density function of continuous random variables X and Y is f_XY(x, y), the marginal probability density functions of X and Y are
  f_X(x) = ∫_{R_x} f_XY(x, y) dy and f_Y(y) = ∫_{R_y} f_XY(x, y) dx    (5-16)
where R_x denotes the set of all points in the range of (X, Y) for which X = x and R_y denotes the set of all points in the range of (X, Y) for which Y = y.
• P(a < X < b) = ∫_a^b ∫_{R_x} f_XY(x, y) dy dx = ∫_a^b (∫_{R_x} f_XY(x, y) dy) dx = ∫_a^b f_X(x) dx
Mean and Variance from Joint Distribution
  E(X) = µ_X = ∫_{−∞}^{∞} x f_X(x) dx = ∫_{−∞}^{∞} x [∫_{R_x} f_XY(x, y) dy] dx = ∫∫_R x f_XY(x, y) dx dy    (5-17)
and
  V(X) = σ²_X = ∫_{−∞}^{∞} (x − µ_X)² f_X(x) dx = ∫_{−∞}^{∞} (x − µ_X)² [∫_{R_x} f_XY(x, y) dy] dx = ∫∫_R (x − µ_X)² f_XY(x, y) dx dy
where R_x denotes the set of all points in the range of (X, Y) for which X = x.
Example 5-16
• For the random variables that denote times in Example 5-15, calculate the probability that Y exceeds 2000 milliseconds.
  P(Y > 2000) = ∫_0^{2000} (∫_{2000}^∞ 6 × 10⁻⁶ e^{−0.001x−0.002y} dy) dx + ∫_{2000}^∞ (∫_x^∞ 6 × 10⁻⁶ e^{−0.001x−0.002y} dy) dx
  = 0.0475 + 0.0025 = 0.05
• Alternatively, the probability can be calculated from the marginal probability distribution of Y:
  f_Y(y) = ∫_0^y 6 × 10⁻⁶ e^{−0.001x−0.002y} dx = 6 × 10⁻³ e^{−0.002y}(1 − e^{−0.001y}) for y > 0
  P(Y > 2000) = ∫_{2000}^∞ f_Y(y) dy = 6 × 10⁻³ ∫_{2000}^∞ e^{−0.002y}(1 − e^{−0.001y}) dy = 0.05
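The marginal density of Y and the tail probability P(Y > 2000) can be checked numerically. A sketch, assuming the marginal density derived in Example 5-16 and a midpoint rule with the tail truncated at y = 20000 (the truncation point and step count are arbitrary choices, picked so the truncation error is negligible):

```python
from math import exp

def f_y(y):
    # marginal density of Y from Example 5-16
    return 6e-3 * exp(-0.002 * y) * (1 - exp(-0.001 * y)) if y > 0 else 0.0

def prob_y_greater(c, upper=20000.0, steps=200000):
    # midpoint-rule approximation of P(Y > c); the tail beyond `upper`
    # decays like e^(-0.002y) and contributes essentially nothing
    h = (upper - c) / steps
    return sum(f_y(c + (i + 0.5) * h) for i in range(steps)) * h
```

`prob_y_greater(2000)` returns approximately 0.05, and `prob_y_greater(0)` returns approximately 1, confirming that f_Y is a valid density.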
5-3.3 Conditional Probability Distributions
Definition
Given continuous random variables X and Y with joint probability density function f_XY(x, y), the conditional probability density function of Y given X = x is
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) for f_X(x) > 0    (5-18)
• The function f_{Y|x}(y) is used to find the probabilities of the possible values for Y given that X = x.
• Let R_x denote the set of all points in the range of (X, Y) for which X = x. The conditional probability density function provides the conditional probabilities for the values of Y in the set R_x.
Because the conditional probability density function f_{Y|x}(y) is a probability density function for all y in R_x, the following properties are satisfied:
  (1) f_{Y|x}(y) ≥ 0
  (2) ∫_{R_x} f_{Y|x}(y) dy = 1
  (3) P(Y ∈ B | X = x) = ∫_B f_{Y|x}(y) dy for any set B in the range of Y    (5-19)
Example 5-17
• For the random variables that denote times in Example 5-15, determine the conditional probability density function for Y given that X = x.
• The marginal density function of X is
  f_X(x) = ∫_x^∞ 6 × 10⁻⁶ e^{−0.001x−0.002y} dy = 0.003 e^{−0.003x} for x > 0
• This is an exponential distribution with λ = 0.003.
• For 0 < x and x < y, the conditional probability density function is
  f_{Y|x}(y) = f_XY(x, y)/f_X(x) = 6 × 10⁻⁶ e^{−0.001x−0.002y} / (0.003 e^{−0.003x}) = 0.002 e^{0.002x−0.002y}
• Determine the probability that Y exceeds 2000, given that x = 1500:
  P(Y > 2000 | x = 1500) = ∫_{2000}^∞ f_{Y|1500}(y) dy = ∫_{2000}^∞ 0.002 e^{0.002(1500)−0.002y} dy = 0.368
Definition
Let R_x denote the set of all points in the range of (X, Y) for which X = x. The conditional mean of Y given X = x, denoted as E(Y | x) or µ_{Y|x}, is
  E(Y | x) = ∫_{R_x} y f_{Y|x}(y) dy    (5-20)
and the conditional variance of Y given X = x, denoted as V(Y | x) or σ²_{Y|x}, is
  V(Y | x) = ∫_{R_x} (y − µ_{Y|x})² f_{Y|x}(y) dy = ∫_{R_x} y² f_{Y|x}(y) dy − µ²_{Y|x}
Example 5-18
• For the random variables that denote times in Example 5-15, determine the conditional mean for Y given that x = 1500.
• The conditional probability density function for Y was determined in Example 5-17:
  f_{Y|x}(y) = 0.002 e^{0.002x−0.002y} for 0 < x and x < y
  E(Y | x = 1500) = ∫_{1500}^∞ y (0.002 e^{0.002(1500)−0.002y}) dy = 0.002 e³ ∫_{1500}^∞ y e^{−0.002y} dy = 2000
5-3.4 Independence
• If f_XY(x, y) = f_X(x) f_Y(y) for all x and y, X and Y are independent.
Definition
For continuous random variables X and Y, if any one of the following properties is true, the others are also true, and X and Y are said to be independent.
  (1) f_XY(x, y) = f_X(x) f_Y(y) for all x and y
  (2) f_{Y|x}(y) = f_Y(y) for all x and y with f_X(x) > 0
  (3) f_{X|y}(x) = f_X(x) for all x and y with f_Y(y) > 0
  (4) P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) for any sets A and B in the range of X and Y, respectively    (5-21)
Example 5-19
• For the joint distribution of times in Example 5-15:
• From Example 5-16, P(Y > 2000) = 0.05.
• From Example 5-17, P(Y > 2000 | x = 1500) = 0.368.
• Because P(Y > 2000) ≠ P(Y > 2000 | x = 1500), these variables are not independent.
Example 5-20
• Example 5-15 is modified so that the joint probability density function of X and Y is
  f_XY(x, y) = 2 × 10⁻⁶ e^{−0.001x−0.002y} for x ≥ 0 and y ≥ 0
• Show that X and Y are independent, and determine P(X > 1000, Y < 1000).
• The marginal probability density function of X:
  f_X(x) = ∫_0^∞ 2 × 10⁻⁶ e^{−0.001x−0.002y} dy = 0.001 e^{−0.001x} for x > 0
• The marginal probability density function of Y:
  f_Y(y) = ∫_0^∞ 2 × 10⁻⁶ e^{−0.001x−0.002y} dx = 0.002 e^{−0.002y} for y > 0
• Therefore, f_XY(x, y) = f_X(x) f_Y(y) for all x and y, and X and Y are independent.
  P(X > 1000, Y < 1000) = P(X > 1000) P(Y < 1000) = e⁻¹(1 − e⁻²) = 0.318
Example 5-21
• The random variables X and Y: the lengths of two dimensions of a machined part.
• Assume that X and Y are independent random variables.
• The distribution of X is normal with mean 10.5 millimeters and variance 0.0025 (millimeter)².
• The distribution of Y is normal with mean 3.2 millimeters and variance 0.0036 (millimeter)².
• Determine the probability that 10.4 < X < 10.6 and 3.15 < Y < 3.25.
• Because X and Y are independent,
  P(10.4 < X < 10.6, 3.15 < Y < 3.25) = P(10.4 < X < 10.6) P(3.15 < Y < 3.25)
  = P((10.4 − 10.5)/0.05 < Z < (10.6 − 10.5)/0.05) × P((3.15 − 3.2)/0.06 < Z < (3.25 − 3.2)/0.06)
  = P(−2 < Z < 2) P(−0.833 < Z < 0.833) = 0.566
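Because X and Y are independent, the joint probability is the product of two normal probabilities. A sketch using the standard normal CDF expressed through the error function (helper names are illustrative):

```python
from math import erf, sqrt

def phi(z):
    # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

def p_between(lo, hi, mu, sigma):
    # P(lo < X < hi) for X ~ N(mu, sigma^2)
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)

# independence: the joint probability is the product of the marginals
p = p_between(10.4, 10.6, 10.5, 0.05) * p_between(3.15, 3.25, 3.2, 0.06)
```

The result agrees with Example 5-21 (the slight difference from 0.566 comes from table rounding in the text).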
Exercise: 5-35, 5-37, 5-39, 5-49, 5-51, 5-53
5-5 Covariance and Correlation
• The covariance is a measure that describes the relationship between two random variables.
Definition
  E[h(X, Y)] = ∑∑_R h(x, y) f_XY(x, y) for X, Y discrete
  E[h(X, Y)] = ∫∫_R h(x, y) f_XY(x, y) dx dy for X, Y continuous    (5-27)
Example 5-27
• For the joint probability distribution of the two random variables in Fig. 5-12, calculate E[(X − µ_X)(Y − µ_Y)].
  µ_X = 1 × 0.3 + 3 × 0.7 = 2.4
  µ_Y = 1 × 0.3 + 2 × 0.4 + 3 × 0.3 = 2.0
  E[(X − µ_X)(Y − µ_Y)] = (1 − 2.4)(1 − 2.0) × 0.1 + (1 − 2.4)(2 − 2.0) × 0.2 + (3 − 2.4)(1 − 2.0) × 0.2 + (3 − 2.4)(2 − 2.0) × 0.2 + (3 − 2.4)(3 − 2.0) × 0.3 = 0.2
Definition
The covariance between the random variables X and Y, denoted as cov(X, Y) or σ_XY, is
  σ_XY = E[(X − µ_X)(Y − µ_Y)] = E(XY) − µ_X µ_Y    (5-28)
• If the points in the joint probability distribution of X and Y that receive positive probability tend to fall along a line of positive (or negative) slope, σ_XY is positive (or negative).
• Covariance is a measure of linear relationship between the random variables.
(Figure panels: (a) positive covariance, (b) zero covariance, (c) negative covariance, (d) zero covariance with all points of equal probability.)
• The equality of the two expressions for covariance in Equation 5-28 is shown for continuous random variables as follows:
  σ_XY = E[(X − µ_X)(Y − µ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µ_X)(y − µ_Y) f_XY(x, y) dx dy
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} [xy − µ_X y − x µ_Y + µ_X µ_Y] f_XY(x, y) dx dy
Because, for example, ∫∫ µ_X y f_XY(x, y) dx dy = µ_X [∫∫ y f_XY(x, y) dx dy] = µ_X µ_Y,
  E[(X − µ_X)(Y − µ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_XY(x, y) dx dy − µ_X µ_Y − µ_X µ_Y + µ_X µ_Y
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f_XY(x, y) dx dy − µ_X µ_Y = E(XY) − µ_X µ_Y
Definition
The correlation between random variables X and Y, denoted as ρ_XY, is
  ρ_XY = cov(X, Y)/√(V(X)V(Y)) = σ_XY/(σ_X σ_Y)    (5-29)
For any two random variables X and Y,
  −1 ≤ ρ_XY ≤ +1    (5-30)
• Two random variables with nonzero correlation are said to be correlated.
• The correlation is also a measure of the linear relationship between random variables.
Example 5-29
• For the discrete random variables X and Y with the joint distribution shown in Fig. 5-14, determine σ_XY and ρ_XY.
• The calculations for E(XY), E(X), and V(X):
  E(XY) = 0 × 0 × 0.2 + 1 × 1 × 0.1 + 1 × 2 × 0.1 + 2 × 1 × 0.1 + 2 × 2 × 0.1 + 3 × 3 × 0.4 = 4.5
  E(X) = 0 × 0.2 + 1 × 0.2 + 2 × 0.2 + 3 × 0.4 = 1.8
  V(X) = (0 − 1.8)² × 0.2 + (1 − 1.8)² × 0.2 + (2 − 1.8)² × 0.2 + (3 − 1.8)² × 0.4 = 1.36
• By the same calculations, E(Y) = 1.8 and V(Y) = 1.36.
  σ_XY = E(XY) − E(X)E(Y) = 4.5 − (1.8)(1.8) = 1.26
  ρ_XY = σ_XY/(σ_X σ_Y) = 1.26/(√1.36 √1.36) = 0.926
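The covariance and correlation of Example 5-29 follow mechanically from Equations 5-27 through 5-29. A sketch, with the joint pmf read off the values used in the example's calculation:

```python
# joint pmf of Example 5-29 (Fig. 5-14), read from the E(XY) calculation
pmf = {(0, 0): 0.2, (1, 1): 0.1, (1, 2): 0.1,
       (2, 1): 0.1, (2, 2): 0.1, (3, 3): 0.4}

def expect(h):
    # E[h(X, Y)] per Eq. 5-27 (discrete case)
    return sum(h(x, y) * p for (x, y), p in pmf.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - mu_x * mu_y            # Eq. 5-28
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
rho = cov / (var_x ** 0.5 * var_y ** 0.5)                 # Eq. 5-29
```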
If X and Y are independent random variables,
  σ_XY = ρ_XY = 0    (5-31)
Proof:
  σ_XY = E(XY) − µ_X µ_Y = E(X)E(Y) − µ_X µ_Y = 0
Example 5-31
• For the two random variables in Fig. 5-16, show that σ_XY = 0, with f_XY(x, y) = xy/16 over 0 < x < 2 and 0 < y < 4 (from Fig. 5-16).
  E(XY) = ∫_0^4 ∫_0^2 xy f_XY(x, y) dx dy = (1/16) ∫_0^4 y² [∫_0^2 x² dx] dy = 32/9
  E(X) = ∫_0^4 ∫_0^2 x f_XY(x, y) dx dy = (1/16) ∫_0^4 y [∫_0^2 x² dx] dy = 4/3
  E(Y) = ∫_0^4 ∫_0^2 y f_XY(x, y) dx dy = (1/16) ∫_0^4 y² [∫_0^2 x dx] dy = 8/3
• σ_XY = E(XY) − E(X)E(Y) = 32/9 − (4/3)(8/3) = 0
Exercise 5-5: 5-67, 5-71, 5-73, 5-75
5-6 BIVARIATE NORMAL DISTRIBUTION
• Bivariate normal distribution: an extension of the normal distribution to two random variables.
Definition
The probability density function of a bivariate normal distribution is
  f_XY(x, y; σ_X, σ_Y, µ_X, µ_Y, ρ) = 1/(2π σ_X σ_Y √(1 − ρ²)) × exp{ −1/(2(1 − ρ²)) [ (x − µ_X)²/σ_X² − 2ρ(x − µ_X)(y − µ_Y)/(σ_X σ_Y) + (y − µ_Y)²/σ_Y² ] }    (5-32)
for −∞ < x < ∞ and −∞ < y < ∞, with parameters σ_X > 0, σ_Y > 0, −∞ < µ_X < ∞, −∞ < µ_Y < ∞, and −1 < ρ < 1.
EXAMPLE 5-33
• The joint probability density function
  f_XY(x, y) = (1/2π) e^{−0.5(x² + y²)}
  is a special case of a bivariate normal distribution with σ_X = 1, σ_Y = 1, µ_X = 0, µ_Y = 0, and ρ = 0.
• See Fig. 5-18.
Figure 5-18 Bivariate normal probability density function with σ_X = 1, σ_Y = 1, ρ = 0, µ_X = 0, and µ_Y = 0.
Marginal Distributions of Bivariate Normal Random Variables
If X and Y have a bivariate normal distribution with joint probability density f_XY(x, y; σ_X, σ_Y, µ_X, µ_Y, ρ), the marginal probability distributions of X and Y are normal with means µ_X and µ_Y and standard deviations σ_X and σ_Y, respectively.    (5-33)
If X and Y have a bivariate normal distribution with joint probability density function f_XY(x, y; σ_X, σ_Y, µ_X, µ_Y, ρ), the correlation between X and Y is ρ.    (5-34)
If X and Y have a bivariate normal distribution with ρ = 0, X and Y are independent.    (5-35)
Exercise 5-6 : 5-81
5-7 LINEAR COMBINATIONS OF RANDOM VARIABLES
Definition
Given random variables X1, X2, …, Xp and constants c1, c2, …, cp,
  Y = c1 X1 + c2 X2 + … + cp Xp    (5-36)
is a linear combination of X1, X2, …, Xp.
Mean of a Linear Combination
If Y = c1 X1 + c2 X2 + … + cp Xp,
  E(Y) = c1 E(X1) + c2 E(X2) + … + cp E(Xp)    (5-37)
Variance of a Linear Combination
If X1, X2, …, Xp are random variables and Y = c1 X1 + c2 X2 + … + cp Xp, then in general
  V(Y) = c1² V(X1) + c2² V(X2) + … + cp² V(Xp) + 2 ∑∑_{i<j} ci cj cov(Xi, Xj)    (5-38)
If X1, X2, …, Xp are independent,
  V(Y) = c1² V(X1) + c2² V(X2) + … + cp² V(Xp)    (5-39)
Proof:
  V(Y) = V(c1 X1 + c2 X2 + … + cp Xp)
  = E[(c1 X1 + c2 X2 + … + cp Xp − c1 µ1 − c2 µ2 − … − cp µp)²]
  = E[(c1(X1 − µ1) + c2(X2 − µ2) + … + cp(Xp − µp))²]
  = E[c1²(X1 − µ1)² + c2²(X2 − µ2)² + … + cp²(Xp − µp)² + 2c1c2(X1 − µ1)(X2 − µ2) + 2c2c3(X2 − µ2)(X3 − µ3) + …]
  = c1² V(X1) + … + cp² V(Xp) + 2 ∑∑_{i<j} ci cj cov(Xi, Xj)
EXAMPLE 5-36
• Suppose the random variables X1 and X2 denote the length and width, respectively, of a manufactured part.
• E(X1) = 2 centimeters with standard deviation 0.1 centimeter.
• E(X2) = 5 centimeters with standard deviation 0.2 centimeter.
• The covariance between X1 and X2 is −0.005.
• Y = 2X1 + 2X2 is a random variable that represents the perimeter of the part.
• E(Y) = 2(2) + 2(5) = 14 centimeters
• V(Y) = 2²(0.1²) + 2²(0.2²) + 2 × 2 × 2 × (−0.005) = 0.16 centimeters squared
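Equations 5-37 and 5-38 for the perimeter Y = 2X1 + 2X2 of Example 5-36 can be sketched as follows (function names are illustrative; the variance helper covers only the two-variable case, where the double sum in Eq. 5-38 reduces to a single cross term):

```python
def lin_mean(coeffs, means):
    # E(Y) = sum of c_i * E(X_i)  (Eq. 5-37)
    return sum(c * m for c, m in zip(coeffs, means))

def lin_var2(coeffs, variances, cov=0.0):
    # Eq. 5-38 for two variables: c1^2 V1 + c2^2 V2 + 2 c1 c2 cov(X1, X2)
    c1, c2 = coeffs
    v1, v2 = variances
    return c1**2 * v1 + c2**2 * v2 + 2 * c1 * c2 * cov

# perimeter Y = 2 X1 + 2 X2 from Example 5-36
mean_y = lin_mean([2, 2], [2, 5])
var_y = lin_var2([2, 2], [0.1**2, 0.2**2], cov=-0.005)
```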
Mean and Variance of an Average
If X̄ = (X1 + X2 + … + Xp)/p with E(Xi) = µ for i = 1, 2, …, p,
  E(X̄) = µ    (5-40a)
If X1, X2, …, Xp are also independent with V(Xi) = σ² for i = 1, 2, …, p,
  V(X̄) = σ²/p    (5-40b)
Proof:
  V(X̄) = (1/p)² σ² + … + (1/p)² σ² (p terms) = σ²/p
Reproductive Property of the Normal Distribution
If X1, X2, …, Xp are independent, normal random variables with E(Xi) = µi and V(Xi) = σi², for i = 1, 2, …, p, then
  Y = c1 X1 + c2 X2 + … + cp Xp
is a normal random variable with
  E(Y) = c1 µ1 + c2 µ2 + … + cp µp
and
  V(Y) = c1² σ1² + c2² σ2² + … + cp² σp²    (5-41)
EXAMPLE 5-37
• Let the random variables X1 and X2 denote the length and width, respectively, of a manufactured part.
• X1 is normal with E(X1) = 2 centimeters and standard deviation 0.1 centimeter.
• X2 is normal with E(X2) = 5 centimeters and standard deviation 0.2 centimeter.
• Assume that X1 and X2 are independent. Determine the probability that the perimeter exceeds 14.5 centimeters.
Sol:
• Y = 2X1 + 2X2 is a normal random variable that represents the perimeter of the part.
• E(Y) = 2 × 2 + 2 × 5 = 14
  V(Y) = 4 × 0.1² + 4 × 0.2² = 0.2
• (Standardize first, then look up the normal table.)
• P(Y > 14.5) = P[(Y − µ_Y)/σ_Y > (14.5 − 14)/√0.2] = P(Z > 1.12) = 0.13
EXAMPLE 5-38
• Soft-drink cans are filled by an automated filling machine.
• The mean fill volume is 12.1 fluid ounces.
• The standard deviation is 0.1 fluid ounce.
• Assume that the fill volumes of the cans are independent, normal random variables.
• What is the probability that the average volume of 10 cans selected from this process is less than 12 fluid ounces?
Sol:
• Let X1, X2, …, X10 denote the fill volumes of the 10 cans.
• E(X̄) = 12.1 and V(X̄) = 0.1²/10 = 0.001
• P(X̄ < 12) = P[(X̄ − µ_X̄)/σ_X̄ < (12 − 12.1)/√0.001] = P(Z < −3.16) = 0.00079
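The calculation in Example 5-38 combines Equation 5-40b with the reproductive property. A sketch (the `phi` helper is an illustrative standard normal CDF built from the error function):

```python
from math import erf, sqrt

def phi(z):
    # standard normal CDF
    return 0.5 * (1 + erf(z / sqrt(2)))

# Example 5-38: average of 10 independent N(12.1, 0.1^2) fill volumes
mu, sigma, n = 12.1, 0.1, 10
var_xbar = sigma**2 / n                      # Eq. 5-40b
p = phi((12 - mu) / sqrt(var_xbar))          # P(Xbar < 12)
```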
Exercise 5-7 : 5-87,5-89,5-91,5-93,5-95