Confidence intervals
Cécile Ané
Stat 371, Spring 2006

Outline
1. Building a confidence interval
2. Planning a study: how much should I sample?
3. Conditions for validity
4. Confidence intervals for proportions

Example (problem 6.10)
Part of a study on the development of the thymus gland: weights of the thymus gland from 5 chick embryos after 14 days of incubation:
29.6  21.5  28.0  34.6  44.9 (mg)
We want to know $\mu$, the mean weight of thymus glands in the entire population of chick embryos after 14 days of incubation (in the same incubator).
$\bar{y} = 31.72$ is our best estimate for $\mu$.
How good is this estimate? How far is $\mu$ from 31.72?

Standard error of the mean
We know the standard deviation of $\bar{Y}$ is $\sigma/\sqrt{n}$. But we don't know $\sigma$. Hopefully, the standard deviation of the data, $s$, is close to $\sigma$.
$SE_{\bar{y}} = s/\sqrt{n}$ is the standard error of the mean. It is an estimate of the standard deviation of $\bar{Y}$.
The deviation of $\bar{Y}$ from its mean $\mu$ is an error we will make. $SE_{\bar{y}}$ gives us an idea of how far $\bar{y}$ is from $\mu$.
What happens to $s$ when the sample size increases?
What happens to $SE_{\bar{y}}$ when the sample size increases?

Mechanics of a confidence interval
1. Choose a confidence level. Typically, 95%. Polls use 90% or 95%.
2. Find the value $t$ such that $P\{-t \le T \le t\}$ = confidence level. It also means $P\{T \ge t\}$ = (1 − confidence level)/2. Refer to the Student distribution in Table 4 (back cover), and use degrees of freedom df = n − 1.
3. Construct the interval: $\bar{y} \pm t\,SE_{\bar{y}}$, i.e. $(\bar{y} - t\,SE_{\bar{y}},\ \bar{y} + t\,SE_{\bar{y}})$.
4. Conclude: We are 90% confident that the mean thymus gland weight of all chick embryos at the age of 14 days of incubation (in the conditions of the experiment) is between 23.41 and 40.03 mg.

Mechanics of a confidence interval: Example
1. Confidence level. We will do both 90% and 95%.
2. Find the value $t$ such that $P\{T \ge t\} = .05$ for level 90% and $.025$ for level 95%. Degrees of freedom: df = 5 − 1 = 4. The t-table gives $t = 2.13$ for 90% confidence and $t = 2.77$ for 95% confidence.
With R:
    > qt(.975, df=4)
    [1] 2.776445
3. Interval: We had $\bar{y} = 31.72$ and $s = 8.73$, so $SE_{\bar{y}} = 8.73/\sqrt{5} = 3.90$. Radius of interval (bull's eye): $t \times SE_{\bar{y}} = 8.31$ (90% confidence) and 10.81 (95% confidence). The interval is 31.72 ± 8.31 or 31.72 ± 10.81, i.e.
(23.41, 40.03) for 90% confidence
(20.91, 42.53) for 95% confidence
4. Conclude.

The Student distribution
[Figure: density curves of the Student distribution compared with the standard normal curve; vertical axis: density; the value 1.64 is marked.]

Degree of freedom: n − 1
Recall that $s^2 = \frac{1}{n-1}\left[(y_1 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2\right]$.
Suppose $n = 3$ and $y_1 - \bar{y}$ = ____, $y_2 - \bar{y}$ = ____. What is $y_3 - \bar{y}$?
If we specify all but one of the deviations, we can compute the last one, because the deviations always sum to zero.
The variance is completely specified by n − 1 deviations, or n − 1 pieces of information. Here the variance is: ____
df = # pieces of information needed for computing $s^2$.
Imagine a sample with a single observation.

Back to the milk example
n = 14 cows, $\bar{y} = 36.2$ lbs, $s = 9.76$ lbs. Find a 95% confidence interval for the population mean.
We want an area of .025 above $t$, and df = 13: $t = t_{.025,13} = 2.16$.
$SE = 9.76/\sqrt{14} = 2.61$ lbs, and the multiplier is $t = 2.16$. The radius is then $2.16 \times 2.61 = 5.6$, and the interval is 36.2 ± 5.6, i.e. (30.6, 41.8).
Conclusion: We are 95% confident that the average daily milk yield of a cow in the herd the cows were sampled from is between 30.6 and 41.8 lbs.

Confidence interval with R
You need to have the raw data.
    > milk
     [1] 19 23 26 30 32 34 37 37 39 41 44 44 46 55
    > t.test(milk)

            One sample t-test

    data:  milk
    t = 13.8833, df = 13, p-value = 3.571e-09
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
     30.57901 41.84956
    sample estimates:
    mean of x
     36.21429

    > t.test(milk, conf.level = .80)
    ...
    80 percent confidence interval:
     32.69239 39.73618

True or False?
95% CI for the mean daily milk yield: (30.6, 41.8).
With the same data, a 99% confidence interval would be larger.
In a second sample of the same size (14 cows), there is a 95% chance that the sample mean will be in (30.6, 41.8).
The probability is 95% that the sample mean is in (30.6, 41.8).
The probability is 95% that the population mean is in (30.6, 41.8).
The confidence is 95% that the population mean is in (30.6, 41.8).
In the population, 95% of all daily milk yields are in (30.6, 41.8).
In the sample, 95% of all daily milk yields are in (30.6, 41.8).

Planning a study: how big should n be?
When planning a study, this is a question we always ask. How many people am I going to interview? How many blood samples do I need? How many plants do I need to grow?
Trade-off between accuracy and cost. We want just the right number n to reach the conclusion.
We need to set a goal. Ex:
Polls: "margin of error" at least as small as 1%.
Chick thymus gland weight: we will repeat the experiment, but with different incubation conditions. We want the interval radius ≤ 1.5 mg. Or: we want the SE at least as small as a given size: SE ≤ 0.75 mg.

Planning a study: how big should n be?
Solving this problem requires a guess for the population SD. It usually involves preliminary data.
Chicks: the guess is that SD = s = 8.73 mg. Aim: SE ≤ 0.75 mg.
Then we solve $SE = SD/\sqrt{n}$ for $n$:
$n = \left(\frac{\text{guessed SD}}{\text{desired SE}}\right)^2$
$n = (8.73/0.75)^2 = 135.4896$ (no unit!)
We would sample 136 embryos.

Conditions for validity
1. Most importantly: the sampling process needs to be like random sampling. Independence of observations, sampled from the target population. At the end, we should draw conclusions about the appropriate population. If the sampling process is biased, toward Wisconsin farms or toward large farms for instance, the confidence interval will greatly overstate the confidence we should have.
2. The observations $Y_1, \ldots, Y_n$ should be from a normal distribution if n is small, so that $\bar{Y}$ is approximately normal.

Detecting non-normality - Normal probability plot
Section 4.4 in the textbook.
How do we know whether condition 2 is met? How can we tell that $Y_1, \ldots, Y_n$ are normally distributed?
Compare the spacing among observations with the spacing expected from a normal distribution.
We order the data. Milk yield, n = 14:
19 23 26 30 32 34 37 37 39 41 44 44 46 55
If $Y_1, \ldots, Y_{14}$ come from $N(0, 1)$, we could also order them, and computers can calculate the expected value of the lowest, the second lowest, ..., the largest. We get the "z-scores":
−1.8 −1.24 −.92 −.67 −.46 −.27 −.09 .09 .27 .46 .67 .92 1.24 1.8
Does the spacing of the data look like the "normal" spacing? We plot the y's vs. the z-scores.

Detecting non-normality - Normal probability plot
[Figure: two Normal Q-Q plots, Sample Quantiles (about 20 to 55) against Theoretical Quantiles.]
If the points are close to a line, then we can treat the data as approximately normally distributed.
It is hard to tell with small sample sizes.
R demo and examples.

Confidence intervals for proportions
Example: What is the probability of getting the flu if one has gotten the shot during the Fall, and is in contact with the virus in the winter?
Experiment: Randomly sample n = 30 persons, give them the shot in the Fall. Expose them to the virus in December.
Y = # of persons in the experiment who get the disease (the shot didn't give them enough protection). We observe y = 5.
$p$ = true value in the population: proportion or probability.
$\hat{p} = Y/n$ = observed value. Ex: $\hat{p} = 5/30 = 0.17$.
Goal: 95% confidence interval for $p$.

Confidence intervals for proportions
Distribution of $\hat{p}$
Sampling distribution of $\hat{p}$: same shape as the binomial, but on values 0, 1/n, 2/n, ..., (n − 1)/n, 1.
Mean of $\hat{p}$: $E(\hat{p}) = p$; standard deviation: $\sqrt{p(1-p)/n}$.
If n is large enough for $np \ge 5$ and $n(1-p) \ge 5$, then the distribution of $\hat{p}$ is approximately $N\left(p, \sqrt{p(1-p)/n}\right)$.
$\hat{p}$ lies in $p \pm 1.96\sqrt{p(1-p)/n}$ in 95% of the experiments, i.e.
$p$ lies in $\hat{p} \pm 1.96\sqrt{p(1-p)/n}$ in 95% of the experiments.

Confidence intervals for proportions
First idea: plug in $\hat{p}$ in place of $p$ and use
$\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$
as a 95% confidence interval.
Ex: n = 30, y = 5 flu cases, so that $\hat{p} = 5/30 = .17$ and $\sqrt{.17(1-.17)/30} = .07$. We get the interval $.17 \pm 1.96 \times .07 = .17 \pm .13$, i.e. (.033, .300).
BUT: this does not work very well. $p$ lies in $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$ in
84% of the experiments if n = 10 and p = .3,
65% of the experiments if n = 10 and p = .1.

Confidence intervals for proportions
Instead: We pretend we have 4 more observations (i.e. the sample size is n + 4) and that out of those 4 extra observations, there are 2 successes and 2 failures (i.e. # successes is Y + 2).
$\tilde{p} = \frac{y+2}{n+4}$ and $SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}}$
A 95% confidence interval for $p$ is $\tilde{p} \pm 1.96\,SE_{\tilde{p}}$.
$p$ lies in $\tilde{p} \pm 1.96\sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$ in
95.2% of the experiments if n = 10 and p = .3,
93% of the experiments if n = 10 and p = .1.
We are no longer overestimating our confidence!

Confidence intervals for proportions
Example: n = 30, y = 5 flu cases.
We get $\tilde{p} = (5+2)/(30+4) = .21$ and $SE_{\tilde{p}} = \sqrt{.21 \times .79/34} = .07$.
Our 95% confidence interval is (.070, .342).

Proportions: How big should n be?
How many people should I sample so that my margin of error is at most 1%?
Margin of error = $1.96 \times SE$, so it means SE at most 0.5%, i.e. $SE_{\tilde{p}} \le 0.005$.
But $SE_{\tilde{p}}$ is
$SE_{\tilde{p}} = \sqrt{\frac{\tilde{p}(1-\tilde{p})}{n+4}} \le \sqrt{\frac{1/4}{n+4}}$
We then need (safe choice)
$n = \frac{1}{4(\text{Desired SE})^2} - 4$
Example: for SE at most 0.005, we need n ≥ 10,000 − 4. That's why polls are usually done on several thousand people.
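
The proportion slides can be checked numerically with a short R sketch. It assumes the binomial model used above and the multiplier 1.96. The names plug_in_ci, plus_four_ci, coverage and n_for_se are hypothetical helpers written for this illustration, not base R functions; only sqrt, cbind, sum and dbinom come from R itself.

```r
# Hypothetical helpers for illustration only (not part of base R).

# Plug-in interval: p-hat +/- z * sqrt(p-hat * (1 - p-hat) / n)
plug_in_ci <- function(y, n, z = 1.96) {
  p_hat <- y / n
  half <- z * sqrt(p_hat * (1 - p_hat) / n)
  cbind(lower = p_hat - half, upper = p_hat + half)
}

# "Pretend 4 more observations" interval: 2 extra successes, 2 extra failures
plus_four_ci <- function(y, n, z = 1.96) {
  p_tilde <- (y + 2) / (n + 4)
  half <- z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
  cbind(lower = p_tilde - half, upper = p_tilde + half)
}

# Exact coverage: add up the binomial probabilities of every outcome
# y = 0, ..., n whose interval contains the true p
coverage <- function(ci_fun, n, p) {
  y <- 0:n
  ci <- ci_fun(y, n)
  covered <- ci[, "lower"] <= p & p <= ci[, "upper"]
  sum(dbinom(y[covered], size = n, prob = p))
}

# Safe sample size from sqrt((1/4) / (n + 4)) <= desired SE (round up in practice)
n_for_se <- function(desired_se) {
  1 / (4 * desired_se^2) - 4
}

# Flu example: n = 30, y = 5
print(round(plus_four_ci(5, 30), 3))

# Coverage of both intervals for n = 10 and p = .3, .1
print(coverage(plug_in_ci, n = 10, p = 0.3))
print(coverage(plug_in_ci, n = 10, p = 0.1))
print(coverage(plus_four_ci, n = 10, p = 0.3))
print(coverage(plus_four_ci, n = 10, p = 0.1))

# Sample size for SE at most 0.005
print(n_for_se(0.005))
```

Run as written, the interval for the flu example should round to (0.070, 0.342), and the four coverage values should come out near 0.84, 0.65, 0.95 and 0.93, in line with the percentages quoted on the slides. The coverage is computed exactly by summing binomial probabilities rather than by simulation, which keeps the sketch deterministic.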