Chapters 14 & 16: Introduction to Statistical Inferences

[Title figure: level of confidence $1 - \alpha$, maximum error $E = z(\alpha/2)\,\sigma/\sqrt{n}$, sample size $n$]

Chapter Goals
• Learn the basic concepts of estimation
• Consider questions about a population mean using two methods that assume the population standard deviation is known
• Consider: what value or interval of values can we use to estimate a population mean?

The Nature of Estimation
• Discuss estimation more precisely
• What makes a statistic good?
• Assume the population standard deviation, $\sigma$, is known throughout this chapter
• Concentrate on learning the procedures for making statistical inferences about a population mean $\mu$

Point Estimate for a Parameter
Point Estimate for a Parameter: The value of the corresponding statistic
Example: $\bar{x} = 14.7$ is a point estimate (single number value) for the mean $\mu$ of the sampled population
How good is the point estimate? Is it high? Or low? Would another sample yield the same result?
Note: The quality of an estimation procedure is enhanced if the sample statistic is both less variable and unbiased

Unbiased Statistic
Unbiased Statistic: A sample statistic whose sampling distribution has a mean value equal to the value of the population parameter being estimated. A statistic that is not unbiased is a biased statistic.
Example: The figures on the next slide illustrate the concept of being unbiased and the effect of variability on a point estimate. Assume A is the parameter being estimated.

Illustrations
[Figure: estimates scattered around the parameter A. Panels: negative bias (underestimate), unbiased (on-target estimate), positive bias (overestimate); high variability versus low variability]

Notes
1. The sample mean, $\bar{x}$, is an unbiased statistic because the mean value of its sampling distribution is equal to the population mean: $\mu_{\bar{x}} = \mu$
2. Sample means vary from sample to sample. We don't expect the sample mean to be exactly equal to the population mean $\mu$.
3. We do expect the sample mean to be close to the population mean
4. Since closeness is measured in standard deviations, we expect the sample mean to be within 2 standard deviations of the sampling distribution ($2\sigma_{\bar{x}}$) of the population mean

Important Definitions
Interval Estimate: An interval bounded by two values and used to estimate the value of a population parameter. The values that bound this interval are statistics calculated from the sample that is being used as the basis for the estimation.
Level of Confidence $1 - \alpha$: The probability that the sample to be selected yields an interval that includes the parameter being estimated
Confidence Interval: An interval estimate with a specified level of confidence

Summary
• To construct a confidence interval for a population mean $\mu$, use the CLT
• Use the point estimate $\bar{x}$ as the central value of an interval
• Since the sample mean ought to be within 2 standard deviations ($2\sigma_{\bar{x}}$) of the population mean (95% of the time), we can find the bounds to an interval centered at $\bar{x}$: $\bar{x} - 2\sigma_{\bar{x}}$ to $\bar{x} + 2\sigma_{\bar{x}}$
• The level of confidence for the resulting interval is approximately 95%, or 0.95
• We can be more accurate in determining the level of confidence

Illustration
[Figure: distribution of $\bar{x}$ centered at $\mu$, with the interval from $\bar{x} - 2\sigma_{\bar{x}}$ to $\bar{x} + 2\sigma_{\bar{x}}$ marked around an observed $\bar{x}$]
• The interval $\bar{x} - 2\sigma_{\bar{x}}$ to $\bar{x} + 2\sigma_{\bar{x}}$ is an approximate 95% confidence interval for the population mean $\mu$ based on this $\bar{x}$

Estimation of Mean $\mu$ ($\sigma$ Known)
• Formalize the interval estimation process as it applies to estimating the population mean $\mu$ based on a random sample
• Assume the population standard deviation $\sigma$ is known
• The assumptions are the conditions that need to exist in order to correctly apply a statistical procedure

The Assumption...
The assumption for estimating the mean $\mu$ using a known $\sigma$: The sampling distribution of $\bar{x}$ has a normal distribution
Assumption satisfied by:
1. Knowing that the sampled population is normally distributed, or
2. Using a large enough random sample (CLT)
Note: The CLT may be applied to smaller samples (for example n = 15) when there is evidence to suggest a unimodal distribution that is approximately symmetric. If there is evidence of skewness, the sample size needs to be much larger.

The $1 - \alpha$ Confidence Interval of $\mu$
• A $1 - \alpha$ confidence interval for $\mu$ is found by
  $\bar{x} - z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$
Notes:
1. $\bar{x}$ is the point estimate and the center point of the confidence interval
2. $z(\alpha/2)$: confidence coefficient, the number of multiples of the standard error needed to construct an interval estimate of the correct width to have a level of confidence $1 - \alpha$
[Figure: standard normal curve with central area $1 - \alpha$ between $-z(\alpha/2)$ and $z(\alpha/2)$, and area $\alpha/2$ in each tail]

Notes Continued
3. $\sigma/\sqrt{n}$: standard error of the mean. The standard deviation of the distribution of $\bar{x}$
4. $z(\alpha/2)\,(\sigma/\sqrt{n})$: maximum error of estimate $E$. One-half the width of the confidence interval (the product of the confidence coefficient and the standard error)
5. $\bar{x} - z(\alpha/2)\,(\sigma/\sqrt{n})$: lower confidence limit (LCL)
   $\bar{x} + z(\alpha/2)\,(\sigma/\sqrt{n})$: upper confidence limit (UCL)

The Confidence Interval
A Five-Step Model:
1. Describe the population parameter of concern
2. Specify the confidence interval criteria
   a. Check the assumptions
   b. Identify the probability distribution and the formula to be used
   c. Determine the level of confidence, $1 - \alpha$
3. Collect and present sample information
4. Determine the confidence interval
   a. Determine the confidence coefficient
   b. Find the maximum error of estimate
   c. Find the lower and upper confidence limits
5. State the confidence interval

Example
Example: The weights of full boxes of a certain kind of cereal are normally distributed with a standard deviation of 0.27 oz. A sample of 18 randomly selected boxes produced a mean weight of 9.87 oz. Find a 95% confidence interval for the true mean weight of a box of this cereal.
Solution:
1. Describe the population parameter of concern
   The mean weight, $\mu$, of all boxes of this cereal
2. Specify the confidence interval criteria
   a. Check the assumptions
      The weights are normally distributed, so the distribution of $\bar{x}$ is normal
   b. Identify the probability distribution and formula to be used
      Use the standard normal variable $z$ with $\sigma = 0.27$
   c. Determine the level of confidence, $1 - \alpha$
      The question asks for 95% confidence: $1 - \alpha = 0.95$

Solution Continued
3. Collect and present information
   The sample information is given in the statement of the problem
   Given: $n = 18$; $\bar{x} = 9.87$
4. Determine the confidence interval
   a. Determine the confidence coefficient
      The confidence coefficient is found using Table A or C:

      z(α/2)    1 - α
      1.15      0.75
      1.28      0.80
      1.65      0.90
      1.96      0.95
      2.33      0.98
      2.58      0.99

      For 95% confidence, $z(\alpha/2) = z(0.025) = 1.96$

Solution Continued
   b. Find the maximum error of estimate
      Use the maximum error part of the formula for a CI:
      $E = z(\alpha/2)\,\frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.27}{\sqrt{18}} = 0.1247$
   c. Find the lower and upper confidence limits
      Use the sample mean and the maximum error:
      $\bar{x} - z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$
      $9.87 - 0.1247$ to $9.87 + 0.1247$
      $9.7453$ to $9.9947$
      $9.75$ to $9.99$
5. State the confidence interval
   9.75 to 9.99 is a 95% confidence interval for the true mean weight, $\mu$, of cereal boxes

Example
Example: A random sample of the test scores of 100 applicants for clerk-typist positions at a large insurance company showed a mean score of 72.6. Determine a 99% confidence interval for the mean score of all applicants at the insurance company. Assume the standard deviation of test scores is 10.5.
Solution:
1. Parameter of concern
   The mean test score, $\mu$, of all applicants at the insurance company
2. Confidence interval criteria
   a. Assumptions: The distribution of the variable, test score, is not known. However, the sample size is large enough (n = 100) so that the CLT applies
   b. Probability distribution: standard normal variable $z$ with $\sigma = 10.5$
   c. The level of confidence: 99%, or $1 - \alpha = 0.99$

Solution Continued
3. Sample information
   Given: $n = 100$ and $\bar{x} = 72.6$
4. The confidence interval
   a. Confidence coefficient: $z(\alpha/2) = z(0.005) = 2.58$
   b. Maximum error: $E = z(\alpha/2)\,(\sigma/\sqrt{n}) = (2.58)(10.5/\sqrt{100}) = 2.709$
   c. The lower and upper limits: $72.6 - 2.709 = 69.891$ to $72.6 + 2.709 = 75.309$
5. Confidence interval
   With 99% confidence we say, "The mean test score is between 69.9 and 75.3", or "69.9 to 75.3 is a 99% confidence interval for the true mean test score"
Note: The confidence is in the process. 99% confidence means: if we conduct the experiment over and over, and construct lots of confidence intervals, then about 99% of the confidence intervals will contain the true mean value $\mu$.

Sample Size
• Problem: Find the sample size necessary in order to obtain a specified maximum error and level of confidence (assume the standard deviation is known)
  $E = z(\alpha/2)\,\frac{\sigma}{\sqrt{n}}$
  Solve this expression for $n$:
  $n = \left[\frac{z(\alpha/2)\,\sigma}{E}\right]^2$

Example
Example: Find the sample size necessary to estimate a population mean to within 0.5 with 95% confidence if the standard deviation is 6.2
Solution:
  $n = \left[\frac{z(\alpha/2)\,\sigma}{E}\right]^2 = \left[\frac{(1.96)(6.2)}{0.5}\right]^2 = [24.304]^2 = 590.684$
Therefore, $n = 591$
Note: When solving for sample size $n$, always round up to the next larger integer (Why?)
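
Checking the Examples Numerically
The two worked examples and the sample-size formula above can be checked with a few lines of Python. This is a minimal sketch and not part of the original slides: the function names (confidence_coefficient, z_interval, sample_size) and the optional z argument are illustrative assumptions. It relies only on the standard library's statistics.NormalDist to find the confidence coefficient, and it lets you pass a rounded table value such as 1.96 or 2.58 so the results match the hand calculations.

```python
import math
from statistics import NormalDist


def confidence_coefficient(confidence):
    """Return z(alpha/2) for a level of confidence 1 - alpha."""
    alpha = 1.0 - confidence
    # z(alpha/2) leaves area alpha/2 in the upper tail of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval(x_bar, sigma, n, confidence=0.95, z=None):
    """Return (E, LCL, UCL) for a population mean when sigma is known.

    If z is given (for example a rounded table value), it is used
    instead of the coefficient computed from `confidence`.
    """
    if z is None:
        z = confidence_coefficient(confidence)
    max_error = z * sigma / math.sqrt(n)  # E = z(alpha/2) * sigma / sqrt(n)
    return max_error, x_bar - max_error, x_bar + max_error


def sample_size(sigma, max_error, confidence=0.95, z=None):
    """Return the smallest n with z(alpha/2) * sigma / sqrt(n) <= max_error."""
    if z is None:
        z = confidence_coefficient(confidence)
    # Round up: rounding down would give a maximum error larger than requested.
    return math.ceil((z * sigma / max_error) ** 2)


if __name__ == "__main__":
    # Cereal boxes: n = 18, x_bar = 9.87, sigma = 0.27, table value z = 1.96
    print(z_interval(9.87, 0.27, 18, z=1.96))
    # Test scores: n = 100, x_bar = 72.6, sigma = 10.5, table value z = 2.58
    print(z_interval(72.6, 10.5, 100, z=2.58))
    # Sample size: sigma = 6.2, E = 0.5, table value z = 1.96
    print(sample_size(6.2, 0.5, z=1.96))
```

Leaving z unset uses the unrounded coefficient (about 1.95996 for 95% and 2.57583 for 99%), so the limits can differ from the table-based answers in the third or fourth decimal place.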