Chapter 7 – Sampling Distributions

Sampling error
o Sampling error is the distance between the point estimate and its target parameter.

Characteristic       Sample statistic   Population parameter   Sampling error
Mean                 $\bar{x}$          $\mu$                  $|\bar{x} - \mu|$
Standard deviation   $s$                $\sigma$               $|s - \sigma|$
Proportion           $\hat{p}$          $p$                    $|\hat{p} - p|$

Sampling distribution of the sample mean $\bar{x}$
o The sampling distribution of the sample mean $\bar{x}$ for a given sample size n consists of the collection of the sample means of all possible samples of size n from the population.

Fact 1: Mean of the Sampling Distribution of the Sample Mean $\bar{x}$
The mean of the sampling distribution of the sample mean $\bar{x}$ is the value of the population mean $\mu$. It can be denoted as $\mu_{\bar{x}} = \mu$ and read as “the mean of the sampling distribution of $\bar{x}$ is $\mu$.”

Fact 2: Standard Deviation of the Sampling Distribution of the Sample Mean $\bar{x}$
The standard deviation of the sampling distribution of the sample mean $\bar{x}$ is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$, where $\sigma$ is the population standard deviation and n is the sample size.

Fact 3: Sampling Distribution of the Sample Mean for a Normal Population
The sampling distribution of the sample mean for a normal population is itself normal, regardless of sample size.

Fact 4: Sampling Distribution of the Sample Mean for a Normal Population
For a normal population, the sampling distribution of the sample mean $\bar{x}$ is distributed as Normal($\mu$, $\sigma / \sqrt{n}$), where $\mu$ is the population mean and $\sigma$ is the population standard deviation.

Fact 5: Standardizing a Normal Sampling Distribution for Means
When the sampling distribution of $\bar{x}$ is normal, we may standardize to produce the standard normal random variable Z as follows:
$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
where $\mu$ is the population mean, $\sigma$ is the population standard deviation, and n is the sample size.

Central Limit Theorem for Means
o Given a population with mean $\mu$ and standard deviation $\sigma$, the sampling distribution of the sample mean $\bar{x}$ becomes approximately Normal($\mu$, $\sigma / \sqrt{n}$) as the sample size gets larger, regardless of the shape of the population.
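Facts 1 and 2 and the Central Limit Theorem for Means can be checked empirically. The following is a minimal sketch and not part of the original notes. It assumes an Exponential(1) population, chosen only because it is skewed and has mean 1 and standard deviation 1, and it approximates the sampling distribution with many random samples rather than all possible samples. The function name simulate_sample_means is hypothetical.

```python
import math
import random
import statistics


def simulate_sample_means(n, num_samples=10_000, seed=0):
    """Return the sample means of num_samples random samples of size n.

    Assumption for illustration: the population is Exponential(1),
    a right-skewed distribution with mu = 1 and sigma = 1.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


if __name__ == "__main__":
    mu, sigma, n = 1.0, 1.0, 36
    means = simulate_sample_means(n)

    # Fact 1: the mean of the sample means should be close to mu.
    print("mean of sample means:", statistics.fmean(means))
    print("population mean mu:  ", mu)

    # Fact 2: their standard deviation should be close to sigma / sqrt(n).
    print("sd of sample means:  ", statistics.stdev(means))
    print("sigma / sqrt(n):     ", sigma / math.sqrt(n))
```

Plotting a histogram of the returned means for increasing n would show the shape moving toward a normal curve, even though the assumed population is strongly skewed.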
Rule of Thumb for When Central Limit Theorem for Means Takes Effect
o We consider n ≥ 30 as large enough to apply the Central Limit Theorem for any population.

Three Cases for the Sampling Distribution of the Sample Mean $\bar{x}$
o Case 1: The population is normal. Then the sampling distribution of $\bar{x}$ is normal.
o Case 2: The population is either non-normal or of unknown distribution and the sample size is at least 30. Then the sampling distribution of $\bar{x}$ is approximately normal (Central Limit Theorem for Means).
o Case 3: The population is either non-normal or of unknown distribution and the sample size is less than 30. Then we have insufficient information to conclude that the sampling distribution of the sample mean $\bar{x}$ is either normal or approximately normal.

Example 7.1
Q1. The U.S. Small Business Administration (SBA) provides information on the number of small businesses for each metropolitan area in the United States. The mean is μ = 12,485 and the standard deviation is σ = 21,973. Find the probability that a random sample of size n = 10 cities will have a mean number of small businesses greater than 17,000.
A1. First we try to apply Case 1. The population is not normal: the number of small businesses cannot be negative, yet the standard deviation is larger than the mean, so the distribution must be right-skewed. Case 1 does not apply. We cannot apply Case 2, since the sample size n = 10 is too small. We must default to Case 3. The population is skewed and the sample size is small. Therefore, we have insufficient information to conclude that the sampling distribution of the sample mean $\bar{x}$ is either normal or approximately normal. We cannot find the probability that a random sample of size n = 10 cities will have a mean number of small businesses greater than 17,000.
Q2. Now, try again to find the probability that a random sample of size n = 36 cities will have a mean number of small businesses greater than 17,000.
A2. We try to apply Case 2. Since the sample size n = 36 is at least 30, the Central Limit Theorem applies.
$\mu_{\bar{x}} = \mu = 12{,}485$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{21{,}973}{\sqrt{36}} \approx 3662.1667$
$Z = \frac{17{,}000 - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{17{,}000 - 12{,}485}{3662.1667} \approx 1.23$
Thus, $P(\bar{x} > 17{,}000) \approx P(Z > 1.23) = 1 - 0.8907 = 0.1093$.
There is a 10.93% probability that a random sample of 36 cities will have a mean number of small businesses greater than 17,000.

Sampling distribution of the sample proportion $\hat{p}$
o The sampling distribution of the sample proportion $\hat{p}$ for a given sample size n consists of the collection of the sample proportions of all possible samples of size n from the population.

Fact 6: Mean of the Sampling Distribution of the Sample Proportion $\hat{p}$
The mean of the sampling distribution of the sample proportion $\hat{p}$ is the value of the population proportion p. This may be denoted as $\mu_{\hat{p}} = p$ and read as “the mean of the sampling distribution of $\hat{p}$ is p.”

Fact 7: Standard Deviation of the Sampling Distribution of the Sample Proportion $\hat{p}$
The standard deviation of the sampling distribution of the sample proportion $\hat{p}$ is $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$, where p is the population proportion and n is the sample size.

Fact 8: Conditions for Approximate Normality for the Sampling Distribution of the Sample Proportion $\hat{p}$
The sampling distribution of the sample proportion $\hat{p}$ may be considered approximately normal only if both of the following conditions hold: (1) np ≥ 5 and (2) n(1 – p) ≥ 5. The minimum sample size required to produce approximate normality in the sampling distribution of $\hat{p}$ is the larger of $n_1 = 5/p$ and $n_2 = 5/(1 - p)$.

Central Limit Theorem for Proportions
o The sampling distribution of the sample proportion $\hat{p}$ follows an approximately normal distribution with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$ when both of the following conditions are satisfied: (1) np ≥ 5 and (2) n(1 – p) ≥ 5.

Example 7.2
Q1. We learned that color blindness linked to the X chromosome afflicts 8% of men. Describe the sampling distribution of $\hat{p}$, the proportion of men who have color blindness linked to the X chromosome, for samples of size (a) 50 and (b) 100.
a. p = 0.08 and n = 50.
np = 50(0.08) = 4 and n(1 – p) = 50(0.92) = 46. Since 4 is not ≥ 5, the first condition is not satisfied. The Central Limit Theorem for Proportions cannot be used. We cannot conclude that the sampling distribution of $\hat{p}$ is approximately normal.
b. p = 0.08 and n = 100.
np = 100(0.08) = 8 and n(1 – p) = 100(0.92) = 92. Since both 8 and 92 are ≥ 5, both conditions are satisfied. The Central Limit Theorem for Proportions takes effect, and we can conclude that the sampling distribution of $\hat{p}$ is approximately normal.
$\mu_{\hat{p}} = p = 0.08$ and $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.08(1 - 0.08)}{100}} \approx 0.02713$
The sampling distribution of $\hat{p}$ is approximately normal with $\mu_{\hat{p}} = 0.08$ and $\sigma_{\hat{p}} \approx 0.02713$.

Q2. Let p = 0.043 represent the population proportion of unemployed workers in Texas. Find the probability that a sample of Texas workers will have a proportion unemployed greater than 9% for samples of size (a) 30 respondents and (b) 117 respondents.
a. np = 30(0.043) = 1.29, which is not ≥ 5. The sample size n = 30 does not meet the minimum sample size required for the sampling distribution of $\hat{p}$ to be approximately normal, which is the larger of 5/0.043 ≈ 116.3 and 5/0.957 ≈ 5.2, that is, n = 117. So we cannot conclude that the sampling distribution of $\hat{p}$ is approximately normal. Thus, we cannot solve this problem.
b. np = 117(0.043) ≈ 5.03 and n(1 – p) = 117(0.957) ≈ 111.97. Both are ≥ 5, so both conditions are satisfied.
$\mu_{\hat{p}} = p = 0.043$ and $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.043(1 - 0.043)}{117}} \approx 0.01875$
The sampling distribution of $\hat{p}$ is approximately normal with mean $\mu_{\hat{p}} = 0.043$ and standard deviation $\sigma_{\hat{p}} \approx 0.01875$.
$Z = \frac{0.09 - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.09 - 0.043}{0.01875} \approx 2.51$
$P(\hat{p} > 0.09) \approx P(Z > 2.51) = 1 - 0.9940 = 0.0060$
So the probability that the sample proportion of unemployed Texas workers will exceed 0.09 is 0.60%.
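
The calculations in Examples 7.1 and 7.2 can be reproduced numerically. What follows is a minimal sketch, not part of the original notes, using `NormalDist` from Python's standard-library `statistics` module. The function names are hypothetical. Because the code uses unrounded Z values instead of a table lookup, its results may differ from the table-based answers in the third or fourth decimal place.

```python
import math
from statistics import NormalDist


def prob_mean_greater(x, mu, sigma, n):
    """P(sample mean > x), assuming the sampling distribution of the
    sample mean is (approximately) Normal(mu, sigma / sqrt(n)).

    The caller is responsible for confirming that Case 1 or Case 2 applies.
    """
    std_error = sigma / math.sqrt(n)
    return 1.0 - NormalDist(mu, std_error).cdf(x)


def min_sample_size(p):
    """Smallest whole n satisfying n*p >= 5 and n*(1 - p) >= 5 (Fact 8)."""
    return math.ceil(max(5 / p, 5 / (1 - p)))


def prob_proportion_greater(x, p, n):
    """P(sample proportion > x), or None when the conditions
    n*p >= 5 and n*(1 - p) >= 5 are not both satisfied."""
    if n * p < 5 or n * (1 - p) < 5:
        return None
    std_error = math.sqrt(p * (1 - p) / n)
    return 1.0 - NormalDist(p, std_error).cdf(x)


if __name__ == "__main__":
    # Example 7.1, Q2: n = 36 cities, mean greater than 17,000.
    print(prob_mean_greater(17_000, 12_485, 21_973, 36))

    # Example 7.2, Q1(a): conditions fail for n = 50, so None is returned.
    print(prob_proportion_greater(0.09, 0.08, 50))

    # Example 7.2, Q2: minimum sample size, then n = 30 and n = 117.
    print(min_sample_size(0.043))
    print(prob_proportion_greater(0.09, 0.043, 30))
    print(prob_proportion_greater(0.09, 0.043, 117))
```

Returning None when the normality conditions fail mirrors the notes' conclusion that such problems cannot be solved with the normal approximation, rather than silently producing a misleading probability.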