Chapter 17 Inference about a Population Mean
5/24/2017 Inference about µ

σ not known
• In practice, we do not usually know the population standard deviation σ
• Therefore, we cannot calculate $\sigma_{\bar{x}}$
• Instead, we calculate the standard error of the mean:
$$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$$

t Procedures
• Because σ is not known, we do NOT use the z statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
• Instead, we use the t statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
• t procedures are based on Student's t distribution

Student's t Distributions
• A "family" of distributions
• Each family member has different degrees of freedom (df)
• More area in their tails than Normal distributions (fatter tails)
• As df increases, s becomes a better estimate of σ and the t distributions become more Normal
• A t distribution with more than 30 df is very similar to z

t Distributions
[Figure: t distributions]

Table C "t Table"
• Table entries = t* critical values
• Rows = df; columns = probability levels
• Familiarize yourself with the t table in the "Tables and Formulas for Moore" handout

Using Table C
Question: What t critical value should I use for 95% confidence when df = 7?
Answer: t* = 2.365

Confidence Interval for μ
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$$
t* is the critical value with df = n − 1 and confidence level C (look it up in Table C)

Example
Statement: What is the population mean µ birth weight of the SIDS population?
Data: We take an SRS of n = 10 from the population of SIDS babies and retrieve their birth certificates. These were their birth weights (grams):
2998, 3740, 2031, 2804, 2454, 2780, 2203, 3803, 3948, 2144
Plan: We will calculate the sample mean and standard deviation. We will then calculate and interpret the 95% CI for µ.
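
Sketch: carrying out the plan in Python
The plan above can be sketched in a few lines of Python using only the standard library. The function name t_confidence_interval and the variable names are illustrative assumptions, not part of the slides. The critical value t* = 2.262 is the Table C entry for df = 9 at 95% confidence and is typed in by hand rather than computed. Because the code keeps s unrounded, its output may differ slightly from a hand calculation with rounded values.

```python
import math
import statistics

# Birth weights (grams) of the n = 10 SIDS babies from the example
birth_weights = [2998, 3740, 2031, 2804, 2454, 2780, 2203, 3803, 3948, 2144]


def t_confidence_interval(data, t_star):
    """Return (x_bar, s, margin, lower, upper) for x_bar +/- t* * s / sqrt(n).

    t_star must be looked up by the caller (e.g. in Table C) for
    df = n - 1 and the desired confidence level.
    """
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / math.sqrt(n)
    return x_bar, s, margin, x_bar - margin, x_bar + margin


# Assumption: t* = 2.262 is the Table C value for df = 9, 95% confidence
x_bar, s, margin, lower, upper = t_confidence_interval(birth_weights, t_star=2.262)
print(f"x-bar = {x_bar:.1f} g, s = {s:.1f} g")
print(f"95% CI: {x_bar:.1f} +/- {margin:.1f} = ({lower:.0f} to {upper:.0f}) g")
```

Passing t* in as an argument keeps the sketch free of third-party libraries; a statistics package could compute the critical value instead of a table lookup.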
Example (Solution)
$\bar{x} = 2890.5$ grams, $s = 720$ grams
$df = n - 1 = 10 - 1 = 9$; for 95% confidence, $t^* = 2.262$ (Table C)
95% CI for µ:
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 2890.5 \pm 2.262 \cdot \frac{720}{\sqrt{10}} = 2890.5 \pm 515.1 = (2375 \text{ to } 3406) \text{ grams}$$
We are 95% confident that the population mean µ is between 2375 and 3406 grams.

One-Sample t Test (Hypotheses)
• Draw a simple random sample of size n from a large population having unknown mean µ
• Test the null hypothesis H0: μ = μ0, where μ0 ≡ stated value for the population mean
  – μ0 changes from problem to problem
  – μ0 is NOT based on the data
  – μ0 IS based on the research question
• The alternative hypothesis is:
  – Ha: μ > μ0 (one-sided, looking for a larger value) OR
  – Ha: μ < μ0 (one-sided, looking for a smaller value) OR
  – Ha: μ ≠ μ0 (two-sided)

One-Sample t Test
One-sample t statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \quad \text{with } df = n - 1$$
P-value = area in the tail beyond $t_{stat}$ (use Table C)

P-value: Interpretation
• Smaller and smaller P-values indicate stronger and stronger evidence against H0
• Conventions:
  .10 < P < 1.0   evidence against H0 not significant
  .05 < P ≤ .10   evidence against H0 marginally significant
  .01 < P ≤ .05   evidence against H0 significant
  P ≤ .01         evidence against H0 highly significant

Example: "Weight Gain"
Statement: We want to know whether there is good evidence for weight change in a particular population. We take an SRS of n = 10 from this population and find the following changes in weight (lbs):
2.0, 0.4, 0.7, 2.0, −0.4, 2.2, −1.3, 1.2, 1.1, 2.3
Calculate: $\bar{x} = 1.020$ lbs; $s = 1.196$ lbs
Do the data provide significant evidence for a weight change?

Example "Weight Gain" (Hypotheses)
• Under the null hypothesis, there is no weight gain in the population:
  H0: μ = 0
  Note: µ0 = 0 in this particular example
• One-sided alternative (weight gain in population):
  Ha: μ > 0
• Two-sided alternative (weight change in population):
  Ha: μ ≠ 0

Example (Test Statistic)
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{1.020 - 0}{1.196 / \sqrt{10}} = 2.70$$
$df = 10 - 1 = 9$

Example (P-value)
• Table C, row for 9 df
• The t statistic (2.70) is between t* = 2.398 (P = 0.02) and t* = 2.821 (P = 0.01)
• The one-sided P-value is between .01 and .02: .01 < P < .02

Two-tailed P-value
• For a two-sided Ha, P-value = 2 × one-sided P
• In our example, the one-tailed P-value was between .01 and .02
• Thus, the two-tailed P-value is between .02 and .04

Interpretation
• Interpret the P-value in the context of the claim made by H0
• In our example, H0: µ = 0 (no weight gain)
• Two-tailed P-value between .02 and .04
• Conclude: significant evidence against H0

Paired Samples
• Responses in matched pairs
• Parameter μ now represents the population mean difference

Example: Matched Pairs
• Pollution levels in two regions (A and B) on 8 successive days
• Do the regions differ significantly?
• Subtract B from A (last column)
• Analyze the differences

Day    A       B       A − B
1      2.92    1.84    1.08
2      1.88    0.95    0.93
3      5.35    4.26    1.09
4      3.81    3.18    0.63
5      4.69    3.44    1.25
6      4.86    3.69    1.17
7      5.81    4.95    0.86
8      5.55    4.47    1.08

$\bar{x} = 1.0113$ and $s = 0.1960$

Example: Matched Pairs
Hypotheses:
  H0: μ = 0 (note: µ0 = 0, representing no mean difference)
  Ha: μ > 0 (one-sided)
  Ha: μ ≠ 0 (two-sided)
Test statistic:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{1.0113 - 0}{0.1960 / \sqrt{8}} = 14.59$$
$df = n - 1 = 8 - 1 = 7$

Illustration (cont.)
P-value:
• Table C, 7 df row
• The t statistic is greater than the largest value in the table: t* = 5.408 (upper-tail P = 0.0005).
• Thus, one-tailed P < 0.0005
• Two-tailed P = 2 × one-tailed P-value: P < 0.001
• Conclude: highly significant evidence against H0

95% Confidence Interval for µ
Air pollution data: $n = 8$, $\bar{x} = 1.0113$, $s = 0.1960$
$df = 8 - 1 = 7$
For 95% confidence, use t* = 2.365 (Table C)
$$\bar{x} \pm t^* \frac{s}{\sqrt{n}} = 1.0113 \pm 2.365 \cdot \frac{0.1960}{\sqrt{8}} = 1.0113 \pm 0.1639 = 0.8474 \text{ to } 1.1752$$
We are 95% confident that the population mean difference µ is between 0.847 and 1.175.

Interpreting the Confidence Interval
• The confidence interval seeks the population mean difference µ (IMPORTANT)
• Recall the meaning of "confidence," i.e., the ability of the interval to capture µ upon repetition
• Recall from the prior chapter that the confidence interval can be used to address a null hypothesis

Normality Assumption
• t procedures require Normality, but they are robust when n is "large"
• Sample size less than 15: Use t procedures if the data are symmetric and have a single peak with no outliers. If the data are highly skewed, avoid t.
• Sample size at least 15: Use t procedures except in the presence of strong skewness.
• Large samples: Use t procedures even for skewed distributions when the sample is large (roughly n ≥ 40)

Can we use a t procedure?
Moderately sized dataset (n = 20) with strong skew. t procedures cannot be trusted.

Word lengths in Shakespeare's plays (n ≈ 1000)
The data have a strong positive skew, but since the sample is large, we can use t procedures.

Can we use t?
The distribution has no clear violations of Normality. Therefore, we trust the t procedure.
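
Sketch: checking the worked examples in Python
As a rough check on the hand calculations in this chapter, the one-sample t statistic and the t confidence interval can be computed from summary statistics with the Python standard library. The function names one_sample_t and t_interval are illustrative assumptions, not part of the slides. The critical values (2.365 and 5.408 for df = 7) are the Table C entries quoted in the slides and are typed in by hand; the sketch only brackets the P-value against the table and does not compute it exactly.

```python
import math


def one_sample_t(x_bar, s, n, mu0=0.0):
    """Return (t, df) for the one-sample t statistic (x_bar - mu0) / (s / sqrt(n))."""
    standard_error = s / math.sqrt(n)
    return (x_bar - mu0) / standard_error, n - 1


def t_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar +/- t* * s / sqrt(n).

    t_star is supplied by the caller from Table C (df = n - 1).
    """
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Matched-pairs air pollution example: summary statistics as given in the slides
t_pairs, df_pairs = one_sample_t(x_bar=1.0113, s=0.1960, n=8, mu0=0.0)
print(f"matched pairs: t = {t_pairs:.2f}, df = {df_pairs}")

# Table C, df = 7: the largest tabled value is t* = 5.408 (upper-tail P = 0.0005)
if t_pairs > 5.408:
    print("one-tailed P < 0.0005, two-tailed P < 0.001")

# 95% CI with t* = 2.365 (Table C, df = 7)
lower, upper = t_interval(x_bar=1.0113, s=0.1960, n=8, t_star=2.365)
print(f"95% CI for the mean difference: {lower:.4f} to {upper:.4f}")

# "Weight gain" example: same statistic, different summary values
t_gain, df_gain = one_sample_t(x_bar=1.020, s=1.196, n=10, mu0=0.0)
print(f"weight gain: t = {t_gain:.2f}, df = {df_gain}")
```

Working from summary statistics mirrors the slides, which report the mean and standard deviation rather than recomputing them from the raw data.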