Chapter 7 – Sampling Distributions
• Sampling error
o Sampling error is the distance between the point estimate and its
target parameter.

Characteristic       Sample statistic   Population parameter   Sampling error
Mean                 $\bar{x}$          $\mu$                  $|\bar{x} - \mu|$
Standard deviation   $s$                $\sigma$               $|s - \sigma|$
Proportion           $\hat{p}$          $p$                    $|\hat{p} - p|$
• Sampling distribution of the sample mean $\bar{x}$
o The sampling distribution of the sample mean $\bar{x}$ for a given sample size n consists of
the collection of the sample means of all possible samples of size n from the population.
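To make this definition concrete, the sketch below enumerates every possible sample of size n from a tiny population and collects the sample means. The population values and n = 2 are hypothetical, chosen only for illustration, and the samples are assumed to be ordered and drawn with replacement. Under that assumption the mean of the collected sample means equals μ and their standard deviation equals σ/√n.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

# Hypothetical tiny population, chosen only for illustration.
population = [2, 4, 6, 8, 10]
n = 2  # sample size

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

# Every possible ordered sample of size n, drawn with replacement (assumption).
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)   # compare with mu
sd_of_means = pstdev(sample_means)   # compare with sigma / sqrt(n)

print("number of samples:", len(sample_means))
print("mean of sample means:", mean_of_means, "population mean:", mu)
print("sd of sample means:", sd_of_means, "sigma/sqrt(n):", sigma / sqrt(n))
```

If samples were instead drawn without replacement from such a small population, the standard deviation of the sample means would be smaller than σ/√n, so the with-replacement assumption matters here.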
Fact 1: Mean of the Sampling Distribution of the Sample Mean $\bar{x}$
The mean of the sampling distribution of the sample mean $\bar{x}$ is the value of the
population mean $\mu$. It can be denoted as $\mu_{\bar{x}} = \mu$ and read as “the mean of the
sampling distribution of $\bar{x}$ is $\mu$.”
Fact 2: Standard deviation of the Sampling Distribution of the Sample Mean $\bar{x}$
The standard deviation of the sampling distribution of the sample mean $\bar{x}$ is
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$, where $\sigma$ is the population standard deviation and n is the sample size.
Fact 3: Sampling Distribution of the Sample Mean for a Normal Population
The sampling distribution of the sample mean for a normal population is itself
normal, regardless of sample size.
Fact 4: Sampling Distribution of the Sample Mean for a Normal Population
For a normal population, the sampling distribution of the sample mean $\bar{x}$ is
distributed as normal$(\mu, \sigma / \sqrt{n})$, where $\mu$ is the population mean and $\sigma$ is the
population standard deviation.
Fact 5: Standardizing a Normal Sampling Distribution for Means
When the sampling distribution of $\bar{x}$ is normal, we may standardize to produce the
standard normal random variable Z as follows:
$$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where $\mu$ is the population mean, $\sigma$ is the population standard deviation, and n is the
sample size.
• Central Limit Theorem for Means
o Given a population with mean $\mu$ and standard deviation $\sigma$, the
sampling distribution of the sample mean $\bar{x}$ becomes approximately
normal$(\mu, \sigma / \sqrt{n})$ as the sample size gets larger, regardless of the
shape of the population.
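The theorem can be explored with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with rate 1 (strongly right-skewed, with μ = 1 and σ = 1), the sample size is 36, and the seed and number of repetitions are arbitrary. If the sample means are approximately normal, about 68% of them should fall within one σ/√n of μ.

```python
import random
from math import sqrt
from statistics import pstdev

random.seed(7)  # arbitrary fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
MU, SIGMA = 1.0, 1.0
n = 36          # sample size
reps = 20_000   # number of simulated samples

def sample_mean(size):
    """Mean of one random sample of the given size from the assumed population."""
    return sum(random.expovariate(1.0) for _ in range(size)) / size

sample_means = [sample_mean(n) for _ in range(reps)]

standard_error = SIGMA / sqrt(n)      # sigma / sqrt(n)
center = sum(sample_means) / reps     # should be close to MU
spread = pstdev(sample_means)         # should be close to standard_error

# Share of sample means within one standard error of MU (about 0.68 if normal).
within_one_se = sum(abs(m - MU) <= standard_error for m in sample_means) / reps

print("mean of sample means:", round(center, 4))
print("sd of sample means:", round(spread, 4), "vs", round(standard_error, 4))
print("share within one standard error:", round(within_one_se, 3))
```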
• Rule of Thumb for When Central Limit Theorem for Means Takes Effect
o We consider n ≥ 30 as large enough to apply the Central Limit
Theorem for any population.
• Three Cases for the Sampling Distribution of the Sample Mean $\bar{x}$
o Case 1: The population is normal. Then the sampling distribution of $\bar{x}$
is normal.
o Case 2: The population is either non-normal or of unknown distribution
and the sample size is at least 30. Then the sampling distribution of $\bar{x}$
is approximately normal (Central Limit Theorem for Means).
o Case 3: The population is either non-normal or of unknown
distribution and the sample size is less than 30. Then we have
insufficient information to conclude that the sampling distribution of
the sample mean $\bar{x}$ is either normal or approximately normal.
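The three cases amount to a short decision rule, sketched below. The function name and the returned labels are hypothetical, introduced only for illustration; the n ≥ 30 threshold is the rule of thumb stated above.

```python
def sampling_distribution_case(population_is_normal: bool, n: int) -> str:
    """Classify the sampling distribution of the sample mean (illustrative helper)."""
    if population_is_normal:
        # Case 1: a normal population gives a normal sampling distribution.
        return "Case 1: normal"
    if n >= 30:
        # Case 2: rule of thumb for the Central Limit Theorem for Means.
        return "Case 2: approximately normal"
    # Case 3: non-normal or unknown population with a small sample.
    return "Case 3: insufficient information"

print(sampling_distribution_case(False, 10))
print(sampling_distribution_case(False, 36))
```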
Example 7.1
Q1. The U.S. Small Business Administration (SBA) provides information on the number of small
businesses for each metropolitan area in the United States. The mean is μ = 12,485 and the standard
deviation is σ = 21,973. Find the probability that a random sample of size n = 10 cities will have a
mean number of small businesses greater than 17,000.
A1. First we try to apply Case 1. The standard deviation (21,973) is larger than the mean (12,485)
and the number of small businesses cannot be negative, so the population must be right-skewed
rather than normal. Case 1 does not apply.
We cannot apply Case 2, since the sample size n = 10 is too small. We must default to Case 3.
The population is skewed and the sample size is small. Therefore, we have insufficient
information to conclude that the sampling distribution of the sample mean $\bar{x}$ is either normal or
approximately normal. We cannot find the probability that a random sample of size n = 10 cities
will have a mean number of small businesses greater than 17,000.
Q2. Now, try again to find the probability that a random sample of size n = 36 cities will have a mean
number of small businesses greater than 17,000.
A2. We try to apply Case 2. Since the sample size n = 36 is large enough, the Central Limit
Theorem applies.
$$\mu_{\bar{x}} = \mu = 12{,}485 \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{21{,}973}{\sqrt{36}} \approx 3662.1667$$
$$Z = \frac{17{,}000 - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{17{,}000 - 12{,}485}{3662.1667} \approx 1.23$$
Thus, P($\bar{x}$ > 17,000) ≈ P(Z > 1.23) = 1 – 0.8907 = 0.1093
There is a 10.93% probability that a random sample of 36 cities will have a mean number of small
businesses greater than 17,000.
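The arithmetic in this example can be reproduced with Python's standard library, as in the sketch below. The variable names are illustrative. Rounding Z to two decimals mimics a table lookup; the unrounded Z gives a slightly different probability, which is expected.

```python
from math import sqrt
from statistics import NormalDist

mu = 12_485      # population mean
sigma = 21_973   # population standard deviation
n = 36           # sample size

standard_error = sigma / sqrt(n)      # sigma / sqrt(n)
z = (17_000 - mu) / standard_error    # standardized cutoff

standard_normal = NormalDist()        # mean 0, standard deviation 1
p_table = 1 - standard_normal.cdf(round(z, 2))   # Z rounded as in a table lookup
p_unrounded = 1 - standard_normal.cdf(z)         # no intermediate rounding

print("standard error:", round(standard_error, 4))
print("Z:", round(z, 2))
print("P(xbar > 17,000), table-style:", round(p_table, 4))
print("P(xbar > 17,000), unrounded Z:", round(p_unrounded, 4))
```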
• Sampling distribution of the sample proportion $\hat{p}$
o The sampling distribution of the sample proportion $\hat{p}$ for a given
sample size n consists of the collection of the sample proportions of
all possible samples of size n from the population.
Fact 6: Mean of the Sampling Distribution of the Sample Proportion $\hat{p}$
The mean of the sampling distribution of the sample proportion $\hat{p}$ is the value of the
population proportion p. This may be denoted as $\mu_{\hat{p}} = p$ and read as “the mean of the
sampling distribution of $\hat{p}$ is p.”
Fact 7: Standard deviation of the Sampling Distribution of the Sample Proportion $\hat{p}$
The standard deviation of the sampling distribution of the sample proportion $\hat{p}$ is
$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$, where p is the population proportion and n is the sample size.
Fact 8: Conditions for Approximate Normality for the Sampling Distribution of the
Sample Proportion $\hat{p}$
The sampling distribution of the sample proportion $\hat{p}$ may be considered
approximately normal only if both the following conditions hold:
(1) np ≥ 5
and
(2) n(1 – p) ≥ 5.
The minimum sample size required to produce approximate normality in the sampling
distribution of $\hat{p}$ is the larger of either
$n_1 = \frac{5}{p}$
or
$n_2 = \frac{5}{1 - p}$
• Central Limit Theorem for Proportions
o The sampling distribution of the sample proportion $\hat{p}$ follows an
approximately normal distribution with mean $\mu_{\hat{p}} = p$ and standard
deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$ when both the following conditions are
satisfied: (1) np ≥ 5 and (2) n(1 – p) ≥ 5.
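Solving np ≥ 5 and n(1 – p) ≥ 5 for n gives the smallest sample size at which both conditions hold. The sketch below is one way to compute it; the function name is hypothetical, and rounding up to a whole number is an assumption made here because a sample size must be an integer.

```python
from math import ceil

def min_sample_size(p: float, threshold: float = 5) -> int:
    """Smallest integer n with n*p >= threshold and n*(1 - p) >= threshold."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    n1 = ceil(threshold / p)         # from n*p >= threshold
    n2 = ceil(threshold / (1 - p))   # from n*(1 - p) >= threshold
    return max(n1, n2)               # both conditions must hold

print(min_sample_size(0.08))
print(min_sample_size(0.043))
```

For proportions far from 0.5, the condition involving the smaller of p and 1 – p is the one that determines the answer.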
Example 7.2
Q1. We learned that color blindness linked to the X chromosome afflicts 8% of men. Describe the
sampling distribution of $\hat{p}$, the proportion of men who have color blindness linked to the X chromosome,
for samples of size (a) 50 and (b) 100.
a. p = 0.08 and n = 50.
np = 50*0.08 = 4 and n(1 – p) = 50*(0.92) = 46. Since 4 is not ≥ 5, the first condition is not
satisfied. The Central Limit Theorem for Proportions cannot be used. We cannot conclude that
the sampling distribution of $\hat{p}$ is approximately normal.
b. p = 0.08 and n = 100.
np = 100*0.08 = 8 and n(1 – p) = 100*(0.92) = 92. Since both 8 and 92 are ≥ 5, both conditions
are satisfied. The Central Limit Theorem for Proportions takes effect, and we can conclude that
the sampling distribution of $\hat{p}$ is approximately normal.
$$\mu_{\hat{p}} = p = 0.08 \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.08(1 - 0.08)}{100}} \approx 0.02713$$
The sampling distribution of $\hat{p}$ is approximately normal with $\mu_{\hat{p}} = 0.08$ and $\sigma_{\hat{p}} \approx 0.02713$.
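The two parts of this question follow the same steps, which can be sketched as a small helper. The function name and its convention of returning None when the conditions fail are assumptions made for illustration.

```python
from math import sqrt

def describe_p_hat(p: float, n: int):
    """Return (mean, standard deviation) of p-hat if both conditions hold, else None."""
    if n * p < 5 or n * (1 - p) < 5:
        # At least one condition fails, so approximate normality is not justified.
        return None
    return p, sqrt(p * (1 - p) / n)

print(describe_p_hat(0.08, 50))    # n*p = 4, so the first condition fails
print(describe_p_hat(0.08, 100))   # both conditions hold
```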
Q2. Let p = 0.043 represent the population proportion of unemployed workers in Texas. Find the
probability that a sample of Texas workers will have a proportion unemployed greater than 9% for
samples of size (a) 30 respondents and (b) 117 respondents.
a. np = 30*0.043 = 1.29, which is less than 5. The sample size n = 30 does not meet the minimum
sample size required for the sampling distribution of $\hat{p}$ to be approximately normal (the larger of
5/0.043 ≈ 116.3 and 5/0.957 ≈ 5.2, rounded up to 117). So we cannot conclude that the sampling
distribution of $\hat{p}$ is approximately normal. Thus, we cannot solve this problem.
b. np = 117*0.043 ≈ 5.03 and n(1 – p) = 117*(0.957) ≈ 111.97. Both are ≥ 5, so both conditions are
satisfied.
$$\mu_{\hat{p}} = p = 0.043 \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.043(1 - 0.043)}{117}} \approx 0.01875$$
The sampling distribution of $\hat{p}$ is approximately normal with mean $\mu_{\hat{p}} = 0.043$ and standard
deviation $\sigma_{\hat{p}} \approx 0.01875$.
$$Z = \frac{0.09 - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.09 - 0.043}{0.01875} \approx 2.51$$
P($\hat{p}$ > 0.09) ≈ P(Z > 2.51) = 1 – 0.9940 = 0.0060
So the probability that the sample proportion of unemployed Texas workers will exceed 0.09 is
0.60%.
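As a check on this calculation, the sketch below computes the same tail probability with Python's standard library. The function name is hypothetical, and it returns None when the normality conditions fail, mirroring part (a). Because it does not round Z to two decimals, its result for part (b) differs slightly from the table-based 0.0060.

```python
from math import sqrt
from statistics import NormalDist

def prob_p_hat_exceeds(p: float, n: int, cutoff: float):
    """Approximate P(p-hat > cutoff), or None if np >= 5 and n(1 - p) >= 5 do not both hold."""
    if n * p < 5 or n * (1 - p) < 5:
        return None
    standard_error = sqrt(p * (1 - p) / n)   # standard deviation of p-hat
    # Normal approximation with mean p and standard deviation standard_error.
    return 1 - NormalDist(mu=p, sigma=standard_error).cdf(cutoff)

print(prob_p_hat_exceeds(0.043, 30, 0.09))    # conditions fail for n = 30
print(prob_p_hat_exceeds(0.043, 117, 0.09))   # close to the table-based answer
```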