Download Lecture 17-18. - Columbia Statistics

Review - Week 9

Read: Chapters 18-21

Review: The purpose of a confidence interval is to estimate an unknown parameter and give an indication of how accurate the estimate is and of how confident we are that the result is correct.

Any confidence interval consists of two parts:
(a) An interval computed from the data.
(b) A confidence level.

The interval usually has the form:

Estimate ± margin of error

The confidence level states the probability that the method will give the correct answer. A level C confidence interval for a parameter is an interval computed from the data in such a way that C% of all random samples yield intervals containing the true value of the parameter.

Suppose a SRS of size n is drawn from a large population with unknown proportion p of successes. A confidence interval for p is

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.$$

$z^*$ is called the critical value and is the number of standard deviations away from the mean that corresponds to the specified level of confidence.

The sample size required to produce a confidence interval with a given margin of error m at a given confidence level is

$$n = \left(\frac{z^*}{m}\right)^2 \hat{p}(1-\hat{p}),$$

where $z^*$ is the critical value for the confidence level you specified. The margin of error is greatest when $\hat{p} = 1/2$, so when we want to be conservative we can use:

$$n = \left(\frac{z^*}{2m}\right)^2.$$

Exercise 1: A simple random sample of size n = 182 yielded $\hat{p} = 0.73$.
(a) What is the standard error of $\hat{p}$?
(b) Find a 99% confidence interval for p.
(c) Find a 95% confidence interval for p.
(d) Find a 90% confidence interval for p.
(e) How does the margin of error change as the confidence level decreases?

Exercise 2: In a clinical trial of 760 patients who received a daily dose of a certain drug, 43 reported a headache as a side effect. Construct a 90% confidence interval for the proportion of patients receiving the drug who will experience headache as a side effect.
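For checking hand calculations on the confidence-interval exercises, the interval and sample-size formulas from the review can be sketched in Python. This is a minimal sketch and not part of the original handout: the function names `critical_value`, `proportion_ci`, and `sample_size` are hypothetical, the input numbers are made up for illustration, and rounding the sample size up to the next whole number is an assumed convention. It relies only on the standard library's `statistics.NormalDist`.

```python
from math import ceil, sqrt
from statistics import NormalDist


def critical_value(level):
    """Return z* so that the central `level` area of N(0,1) lies within +/- z*."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def proportion_ci(p_hat, n, level):
    """Large-sample interval: p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    margin = critical_value(level) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size(margin, level, p_guess=0.5):
    """n = (z*/m)^2 * p(1 - p), rounded up (assumed convention).

    The default p_guess=0.5 gives the conservative n = (z*/(2m))^2.
    """
    z = critical_value(level)
    return ceil((z / margin) ** 2 * p_guess * (1 - p_guess))


# Made-up illustration values (not taken from the exercises).
low, high = proportion_ci(p_hat=0.40, n=500, level=0.95)
print(f"95% CI: ({low:.3f}, {high:.3f})")
print("n for m = 0.03 at 95%:", sample_size(0.03, 0.95))
```

Supplying a prior guess such as `p_guess=0.3` gives a smaller required n than the conservative default, which matches the review's remark that the margin of error is greatest at 1/2.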
Exercise 3: Assuming p is near 0.3, find the sample size required to construct a 95% confidence interval for p with margin of error 0.01. Repeat the calculations, this time assuming p is 0.6.

Exercise 4: A politician wishes to measure her approval rating. What sample size is needed if she wishes the estimate to be within 4 percentage points with 90% confidence if
(a) past estimates show her approval rating to be around 0.65?
(b) she has no prior information about her approval rating?

Review: Tests of significance are used to assess the evidence provided by the data against some statement about the population, called the null hypothesis $H_0$, in favor of an alternative hypothesis $H_a$. The hypotheses are stated in terms of the population parameters.

A test is based on the statistic that estimates the parameter. A test statistic measures compatibility between the null hypothesis and the data.

The probability, computed assuming $H_0$ is true, that the test statistic would take a value as extreme as or more extreme than that actually observed is called the P-value of the test. The smaller the P-value, the stronger the evidence against $H_0$ provided by the data. If the P-value is as small as or smaller than some value α, we say that the data are statistically significant at significance level α.

A significance test for the statement $H_0: p = p_0$ is based on the one-sample z statistic:

$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}},$$

with P-values calculated from the N(0,1) distribution.

For one-sided tests, only values that differ in a specific direction from the null value count against the null hypothesis. For two-sided tests, values that differ in either direction from the null value count against the null hypothesis.

We can use significance tests to reject a certain hypothesis. But if the test does not give sufficient information to reject a hypothesis, that does not mean that we accept it, only that we do not have information to justify rejecting it.
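The one-sample z statistic and its P-value, as described in the review of significance tests, can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the original handout: the function name `one_sample_z_test` and its `alternative` labels ("greater", "less", "two-sided") are hypothetical, and the input numbers are made up for illustration. It uses only the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(p_hat, p0, n, alternative="two-sided"):
    """Return (z, P-value) for H0: p = p0, using the N(0,1) distribution."""
    # The standard deviation uses p0, because the P-value assumes H0 is true.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Made-up illustration values (not taken from the exercises).
z, p_value = one_sample_z_test(p_hat=0.56, p0=0.50, n=400, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

Comparing the returned P-value with a chosen α (for example 0.05 or 0.01) gives the significance decision; the two-sided P-value is twice the one-sided value in the direction of the observed difference.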
Exercise 1: In 1998 a report showed that 42.1% of households in the US owned a personal computer. Set up the null and alternative hypotheses for testing whether the percentage of households that own a personal computer has
(a) changed since 1998.
(b) increased since 1998.
(c) decreased since 1998.

Exercise 2: Suppose we want to estimate the proportion of women, p, in a certain population. A SRS of 100 people is selected from the population and we obtain $\hat{p} = 0.65$. Test $H_0: p = 0.55$ against $H_a: p > 0.55$.
(a) Calculate the P-value.
(b) Would you reject $H_0$ at the 5% level of significance?
(c) Would you reject $H_0$ at the 1% level of significance?
(d) Redo (a) - (c) using $H_a: p \neq 0.55$.

Exercise 3: A poll was conducted in which a simple random sample of adult Americans were asked if they were for or against a certain proposal up for debate in Congress. Suppose 800 people were asked their opinion and 360 replied that they supported the proposal, while the rest opposed it. Is there significant evidence at the 5% level that less than a majority of the population agrees with the proposal?
