Normal distributions.
• Normal distributions are commonly used because distributions of averages of IID random variables can often be approximated well by normal
distributions.
• The standard normal distribution (標準常態分布) N (0, 1). Suppose that
X has the following PDF:
f_X(x) = (1/√(2π)) e^(−x²/2),   −∞ < x < ∞.
Then the distribution of X is called the standard normal distribution,
denoted by N (0, 1).
• Normal distribution N (µ, σ 2 ). Suppose that µ and σ > 0 are constants,
and X is a random variable such that
(X − µ)/σ ∼ N(0, 1).
The distribution of X is called the normal distribution with mean µ
and variance σ² (see the "Mean and variance for N(µ, σ²)" part later in
this handout), denoted by N(µ, σ²). X has the following PDF:
f_X(x) = (1/√(2πσ²)) e^(−(x − µ)²/(2σ²)),   −∞ < x < ∞.
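As a side check, R's built-in density dnorm(x, mean, sd) implements this PDF, so the formula can be verified numerically. A small sketch (the values µ = 1, σ = 2 and the grid of x values are arbitrary illustration choices, not from the handout):
mu <- 1; sigma <- 2                                    # arbitrary illustration values
x <- c(-3, 0, 1, 2.5)
f <- exp(-(x - mu)^2/(2*sigma^2))/sqrt(2*pi*sigma^2)   # the formula above
all.equal(f, dnorm(x, mean = mu, sd = sigma))          # TRUE: matches the built-in density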
• P(a < N(µ, σ²) < b) is P((a − µ)/σ < N(0, 1) < (b − µ)/σ). The notation
P(a < N(µ, σ²) < b) means P(a < X < b), where X ∼ N(µ, σ²).
• P (a < N (0, 1) < b) can be found using software that computes
∫_a^b (1/√(2π)) e^(−x²/2) dx
or using normal tables.
• R code for computing P(0 < N(0, 1) < 1.5).
– Use integrate. integrate(f, 0, 1.5) computes ∫_0^1.5 f(x) dx.
f <- function(x){ exp(-x^2/2)/sqrt(2*pi) }
integrate(f, 0, 1.5)
– Use pnorm. pnorm(a) gives P (N (0, 1) ≤ a).
pnorm(1.5)-pnorm(0)
gives P (0 < N (0, 1) ≤ 1.5) = P (0 < N (0, 1) < 1.5).
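Note that pnorm also accepts mean and sd arguments, so probabilities for a general N(µ, σ²) can be computed without standardizing by hand. A small sketch (the values µ = 2, σ = 3 and the interval (−1, 5) are arbitrary illustration choices, not from the handout):
pnorm(5, mean = 2, sd = 3) - pnorm(-1, mean = 2, sd = 3)   # P(-1 < N(2, 3^2) < 5)
pnorm(1) - pnorm(-1)                                       # same value via standardizing, about 0.6827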
• Example 1. Use the table “Normal probabilities” at
http://www3.nccu.edu.tw/~tmhuang/teaching/statistics/tables/normal.pdf
to find P (0 < N (0, 1) < 1.5) and P (0 < N (0, 1) < 1.51).
     z     0.00     0.01     0.02    ···    0.09
    0.0   0.0000   0.0040   0.0080   ···   0.0359
    0.1   0.0398   0.0438   0.0478   ···   0.0753
    ...
    1.5   0.4332   0.4345   0.4357   ···   0.4441
    ...
    3.0   0.4987   0.4987   0.4987   ···   0.4990
– P(0 < N(0, 1) < 1.5) ≈ 0.4332.
– P(0 < N(0, 1) < 1.51) = P(0 < N(0, 1) < (1.5 + 0.01)) ≈ 0.4345.
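These table lookups can be double-checked in R with pnorm (a quick check, not part of the original exercise):
pnorm(1.5) - pnorm(0)    # about 0.4332
pnorm(1.51) - pnorm(0)   # about 0.4345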
• The N (0, 1) PDF is symmetric about 0. Therefore, for a > 0,
P (−a < N (0, 1) < 0) = P (0 < N (0, 1) < a)
and
P (N (0, 1) < −a) = P (N (0, 1) > a).
Figure 1 (panels (a) and (b); figure not reproduced here): P(−1.5 < N(0, 1) < 0) = P(0 < N(0, 1) < 1.5) ≈ 0.4332.
• Example 2. Find P (−1.5 < N (0, 1) < 1.3).
Sol.
P(−1.5 < N(0, 1) < 1.3)
  = P(−1.5 < N(0, 1) < 0) + P(0 ≤ N(0, 1) < 1.3)
  = P(0 < N(0, 1) < 1.5) + P(0 < N(0, 1) < 1.3)
  ≈ 0.4332 + 0.4032 = 0.8364.
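The same value can be obtained in R without splitting the interval at 0 (a quick check, not part of the handout):
pnorm(1.3) - pnorm(-1.5)   # about 0.8364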
• Example 3. Find P (N (0, 1) > 1.3) and P (N (0, 1) < −1.5).
Sol.
P(N(0, 1) > 1.3)
  = P(N(0, 1) > 0) − P(0 < N(0, 1) ≤ 1.3)
  = P(N(0, 1) > 0) − P(0 < N(0, 1) < 1.3)
  ≈ 0.5 − 0.4032 = 0.0968,
and
P(N(0, 1) < −1.5)
  = P(N(0, 1) > 1.5)
  = P(N(0, 1) > 0) − P(0 < N(0, 1) ≤ 1.5)
  = P(N(0, 1) > 0) − P(0 < N(0, 1) < 1.5)
  ≈ 0.5 − 0.4332 = 0.0668.
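Both tail probabilities can be computed directly in R (a quick check, not part of the handout):
1 - pnorm(1.3)   # P(N(0,1) > 1.3), about 0.0968
pnorm(-1.5)      # P(N(0,1) < -1.5), about 0.0668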
• Example 4. Suppose that X ∼ N(1000, 100²).
(a) Find P (1000 < X < 1100).
(b) Find P (X < 790).
Sol.
(a)
P(1000 < X < 1100)
  = P((1000 − 1000)/100 < (X − 1000)/100 < (1100 − 1000)/100)
  = P(0 < N(0, 1) < 1) ≈ 0.3413.
(b)
P(X < 790)
  = P((X − 1000)/100 < (790 − 1000)/100)
  = P(N(0, 1) < −2.1)
  = P(N(0, 1) > 2.1)
  = P(N(0, 1) > 0) − P(0 < N(0, 1) < 2.1)
  ≈ 0.5 − 0.4821 = 0.0179.
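Both parts can be checked in R using pnorm with its mean and sd arguments (a quick check, not part of the handout):
pnorm(1100, mean = 1000, sd = 100) - pnorm(1000, mean = 1000, sd = 100)   # (a), about 0.3413
pnorm(790, mean = 1000, sd = 100)                                         # (b), about 0.0179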
• Mean and variance for N (µ, σ 2 ). Suppose that X ∼ N (µ, σ 2 ). Then
E(X) = µ and Var(X) = σ².   (1)
One may derive (1) using the following facts.
1. (X − µ)/σ ∼ N (0, 1).
2. The mean and variance for N (0, 1) are 0 and 1 respectively.
3. For any constants a and b,
E(a + bX) = a + bE(X) and Var(a + bX) = b² Var(X).   (2)
The proof of the second equality in (2) is left as an exercise (Problem
14).
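For instance, writing Z = (X − µ)/σ, fact 1 gives Z ∼ N(0, 1) and X = µ + σZ, so by facts 2 and 3, E(X) = µ + σE(Z) = µ + σ·0 = µ and Var(X) = σ² Var(Z) = σ²·1 = σ², which is (1).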
• The mean and variance for N (0, 1) can be computed using R and the
following formulas:
E(X) = ∫_{−∞}^{∞} x f(x) dx
and
Var(X) = ∫_{−∞}^{∞} (x − E(X))² f(x) dx,
where f is the N (0, 1) PDF.
• R code
– Define g(x) = x f(x), where f is the N(0, 1) PDF, and then compute ∫_{−∞}^{∞} g(x) dx, which is the mean for N(0, 1).
g <- function(x){ x*exp(-x^2/2)/sqrt(2*pi) }
integrate(g, -Inf, Inf)
The result shows that the mean for N (0, 1) is 0.
– Define g(x) = x² f(x), where f is the N(0, 1) PDF, and then compute ∫_{−∞}^{∞} g(x) dx, which is the variance for N(0, 1) (since the mean of N(0, 1) is 0, the variance equals E(X²)).
g <- function(x){ (x^2)*exp(-x^2/2)/sqrt(2*pi) }
integrate(g, -Inf, Inf)
The result shows that the variance for N (0, 1) is 1.
• Shift foremen income example (Page 231 in the 14th Ed or Page 235 in
the 15th Ed). The weekly incomes of shift foremen in the glass industry
follow the normal probability distribution with a mean of $1,000 and a
standard deviation of $100.
(a) What is the probability of selecting a shift foreman in the glass
industry whose income is between $1,000 and $1,100?
(b) What is the probability of selecting a shift foreman in the glass
industry whose income is less than $790?
Sol. See the solution to Example 4.
• The empirical rule. Suppose that X ∼ N(µ, σ²), σ > 0. Then
  P(µ − kσ < X < µ + kσ) ≈ 0.68  if k = 1;
                         ≈ 0.95  if k = 2;
                         ≈ 1     if k = 3.
• Recall that for a random variable X such that E(X) = µ and √Var(X) = σ,
  Chebyshev's theorem states that
  P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k²,   (3)
  where 1 − 1/k² equals 0 if k = 1; 0.75 if k = 2; and ≈ 0.89 if k = 3.
If X ∼ N(µ, σ²), then by (1), (3) holds. However, the guaranteed coverage
probabilities from (3) are not very close to the exact coverage probabilities
(compare with the empirical rule above; a numerical comparison is sketched below).
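The comparison can be carried out in R (a small sketch, not part of the handout):
k <- 1:3
exact <- pnorm(k) - pnorm(-k)   # exact normal coverage: about 0.683, 0.954, 0.997
bound <- 1 - 1/k^2              # Chebyshev lower bound: 0, 0.75, about 0.889
rbind(exact, bound)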
• Example in the text (Page 228 in the 14th Ed or Page 232 in the 15th Ed).
As part of its quality assurance program, the Autolite Battery Company
conducts tests on battery life. For a particular D-cell alkaline battery,
the mean life is 19 hours. The useful life of the battery follows a normal
distribution with a standard deviation of 1.2 hours. Answer the following
questions.
(a) About 95 percent of the batteries failed between what two values?
(b) Virtually all of the batteries failed between what two values?
Ans. About 95 percent of the batteries failed between 16.6 hours and 21.4
hours (19 ± 2(1.2)). Virtually all of the batteries failed between 15.4 hours
and 22.6 hours (19 ± 3(1.2)).
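A quick check of the arithmetic in R (not part of the text):
mu <- 19; sigma <- 1.2
mu + c(-2, 2)*sigma   # about 95 percent of lifetimes: 16.6 to 21.4 hours
mu + c(-3, 3)*sigma   # virtually all lifetimes: 15.4 to 22.6 hours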
• The six-sigma problem. Suppose that in the manufacturing process of
some product, the measurement of a quality characteristic can be viewed
as a random variable with mean µ and standard deviation σ. Then we can
use µ ± kσ to indicate a reasonable range for the measurement. Suppose
that k = 6. What can we say about the coverage probability for the
range µ ± 6σ? What can we say about the coverage probability if the
measurement is normally distributed?
Ans. The coverage probability is at least 1 − 1/36 ≈ 0.972. If the measurement is normally distributed, the coverage probability is P (−6 <
N (0, 1) < 6), which is very close to 1.
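Both numbers can be evaluated in R (a quick check, not part of the handout):
1 - 1/36              # Chebyshev lower bound, about 0.972
pnorm(6) - pnorm(-6)  # exact normal coverage, about 0.999999998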
• Suppose that X ∼ N(µ, σ²); then for any constants a and b,
  a + bX ∼ N(E(a + bX), Var(a + bX)).
• Example 5. Suppose that X ∼ N(1, 4) and Y = 5 − 2X. Find P(3 < Y < 9).
Sol. E(Y) = 5 − 2E(X) = 3 and Var(Y) = 4 Var(X) = 16, so
P(3 < Y < 9) = P(0 < (Y − 3)/√16 < 1.5) = P(0 < N(0, 1) < 1.5) ≈ 0.4332.
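The answer can be checked in R with pnorm, or by simulation (a small sketch; the sample size 10^6 is an arbitrary choice, not from the handout):
pnorm(9, mean = 3, sd = 4) - pnorm(3, mean = 3, sd = 4)   # about 0.4332
x <- rnorm(1e6, mean = 1, sd = 2)   # X ~ N(1, 4), i.e. standard deviation 2
y <- 5 - 2*x
mean(y > 3 & y < 9)                 # close to 0.4332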