Confidence Interval for a Population Mean: Student's t-Statistic

Small Sample with σ Known
Use the standard normal statistic

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

Solution to Problem 2 (Small Sample with σ Unknown)
Instead of using the standard normal statistic

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

use the t-statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

in which the sample standard deviation, s, replaces the population standard deviation, σ.

Conditions Required for a Valid Small-Sample Confidence Interval for µ
• A random sample is selected from the target population.
• The population has a relative frequency distribution that is approximately normal.

Student's t-Statistic
The t-statistic has a sampling distribution very much like that of the z-statistic: mound-shaped, symmetric, with mean 0. The primary difference between the sampling distributions of t and z is that the t-statistic is more variable than the z-statistic.

Degrees of Freedom
The actual amount of variability in the sampling distribution of t depends on the sample size n. A convenient way of expressing this dependence is to say that the t-statistic has (n − 1) degrees of freedom (df).

Student's t Distribution
[Figure: the standard normal curve (z) compared with t curves for df = 13 and df = 5. All are bell-shaped and symmetric about 0; the t curves have "fatter" tails.]
The smaller the degrees of freedom for the t-statistic, the more variable its sampling distribution will be.

• We have a random sample of 15 cars of the same model. Assume that the gas mileage for the population is normally distributed with a standard deviation of 5.2 miles per gallon.
• A) Identify the bounds for a 90% confidence interval for the mean given a sample mean of 22.8 miles per gallon.
• B) The car manufacturer of this particular model claims that the average gas mileage is 26 miles per gallon. Discuss the validity of this claim using the 90% confidence interval calculated in A.

Exercise 23.18
• In 1882 Michelson measured the speed of light. His values are in km/sec, with 299,000 subtracted from them.
He reported the results of 23 trials with a mean of 756.22 and a standard deviation of 107.12.
• Find a 95% confidence interval for the true speed of light from these statistics.
• Interpret your result.

Thinking Challenge
• We have a random sample of customer order totals with an average of $78.25 and a population standard deviation of $22.50.
• A) Calculate a 90% confidence interval for the mean given a sample size of 40 orders.
• B) Calculate a 90% confidence interval for the mean given a sample size of 75 orders.
• C) Explain the difference in the 90% confidence intervals calculated in A and B.
• D) Calculate the minimum sample size needed to identify a 90% confidence interval for the mean assuming a $5 margin of error.

What's a Statistical Hypothesis?
A statistical hypothesis is a statement about the numerical value of a population parameter.
Example: "I believe the mean GPA of this class is 3.5!"

Hypothesis Testing for Population Mean
[Figure: a population with the hypothesis "I believe the population mean age is 50". A random sample gives a sample mean of X̄ = 20, which is not close to 50, so the hypothesis is rejected.]

Null Hypothesis
• The null hypothesis, denoted H0, represents the hypothesis that will be "retained" unless the data provide convincing evidence that it is false. This usually represents the "status quo" or some claim about the population parameter that the researcher wants to test.
• You may think of the null hypothesis as the "favored" hypothesis; we reject it in favor of the alternative hypothesis Ha if and only if the evidence provided by the sample data is strong against H0 and in favor of Ha.
• "Retain H0" is commonly referred to as "do not reject H0".
• Stated in one of the following forms:
  H0: µ = (some value) (the book uses this version)
  H0: µ ≤ (some value)
  H0: µ ≥ (some value)

Alternative Hypothesis
1. Opposite of the null hypothesis
2. The hypothesis that will be accepted only if the data provide convincing evidence of its truth
3. Designated Ha
4.
Stated in one of the following forms:
  Ha: µ ≠ (some value)
  Ha: µ < (some value)
  Ha: µ > (some value)

Identifying Hypotheses
Example 1: If the hypothesis of a researcher is that the population mean is not 3, set up the hypotheses to be tested.
Steps:
• State the question statistically: µ ≠ 3
• State the opposite statistically: µ = 3
• State the null hypothesis statistically: H0: µ = 3
• State the alternative hypothesis statistically: Ha: µ ≠ 3

Identifying Hypotheses
Example 2: If the hypothesis of a researcher is that the population mean is greater than 3, set up the hypotheses to be tested.
Steps:
• State the question statistically: µ > 3
• State the opposite statistically: µ ≤ 3
• State the null hypothesis statistically: H0: µ ≤ 3 or H0: µ = 3
• State the alternative hypothesis statistically: Ha: µ > 3

Identifying Hypotheses
Example 3: Is the population average amount of TV viewing 12 hours?
• State the question statistically: µ = 12
• State the opposite statistically: µ ≠ 12
• Select the alternative hypothesis: Ha: µ ≠ 12
• State the null hypothesis: H0: µ = 12

Identifying Hypotheses
Example 4: Is the population average amount of TV viewing different from 12 hours?
• State the question statistically: µ ≠ 12
• State the opposite statistically: µ = 12
• Select the alternative hypothesis: Ha: µ ≠ 12
• State the null hypothesis: H0: µ = 12

Identifying Hypotheses
Example 5: Is the average cost per hat less than or equal to $20?
• State the question statistically: µ ≤ 20
• State the opposite statistically: µ > 20
• Select the alternative hypothesis: Ha: µ > 20
• State the null hypothesis: H0: µ ≤ 20 or H0: µ = 20

Identifying Hypotheses
Example 6: Is the average amount spent in the bookstore greater than $25?
• State the question statistically: µ > 25
• State the opposite statistically: µ ≤ 25
• Select the alternative hypothesis: Ha: µ > 25
• State the null hypothesis: H0: µ ≤ 25 or H0: µ = 25

Test Statistic
The test statistic is a sample statistic, computed from information provided in the sample, that the researcher uses to decide between the null and alternative hypotheses.

Determining the Test Statistic
Is the population standard deviation known?
• Yes: use the statistic below, even if you have a small sample (n < 30).
  Test statistic: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
• No:
  If n ≥ 30, test statistic: $z = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
  If n < 30 and the population has a normal distribution, test statistic: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

Type I Error
• A Type I error occurs if the researcher rejects the null hypothesis in favor of the alternative hypothesis when, in fact, H0 is true.
• The probability of committing a Type I error is denoted by α.
• α is also called the "level of significance".

Type II Error
A Type II error occurs if the researcher retains the null hypothesis when, in fact, H0 is false. The probability of committing a Type II error is denoted by β.

Conclusions and Consequences for a Test of Hypothesis

                                      True State of Nature
Conclusion                            H0 True                         Ha True
Do not reject H0 (assume H0 true)     Correct decision                Type II error (probability β)
Reject H0 (assume Ha true)            Type I error (probability α)    Correct decision

• How will we decide if we reject the null hypothesis?

Example
• Let's assume we would like to test H0: µ = 2400 (or H0: µ ≥ 2400) against Ha: µ < 2400.
• We have a large sample.
• By the Central Limit Theorem, the sample mean will follow an approximately normal distribution.
• So we will reject the null hypothesis if our sample mean takes a value that is far below 2400.
• Let's assume the sample mean is 2000.

Basic Idea
Sampling distribution: if P(sample mean < 2000) is very small, then we reject H0: µ = 2400. It is unlikely that we would get a sample mean of this value ...
if in fact this were the population mean.
[Figure: sampling distribution of sample means under H0, centered at µ = 2400; the shaded area to the left of 2000 is P(sample mean < 2000).]

Basic Idea
Sampling distribution for the z-statistic: if P(Z < z) is very small, then we reject H0: µ = 2400. It is unlikely that we would get a sample mean of this value if in fact this were the population mean.
[Figure: sampling distribution of the z-statistic under H0, centered at 0; the shaded area to the left of the observed test statistic z is P(Z < z) = p-value.]

p-Value
• The probability of obtaining a test statistic at least as extreme (≤ or ≥) as the actual sample value, given that H0 is true.
• Can be thought of as a measure of the "credibility" of the null hypothesis H0.
• α is the nominal level of significance. This value is chosen by the analyst.
• The p-value is the smallest α at which H0 would be rejected for the observed data, so it is called the "observed level of significance".
• If p-value ≥ α, do not reject H0.
• If p-value < α, reject H0.
• The p-value indicates how strong the sample evidence against the null hypothesis is.
• If this value is smaller than α, then a result at least this extreme would occur, when H0 is true, with a probability smaller than the maximum tolerated error probability.
• So we can conclude that the null hypothesis can be rejected in favor of the alternative hypothesis.
• The smaller the p-value is, the more confident we are in our decision to reject H0.

Steps for Calculating the p-Value for a Test of Hypothesis when σ is known or n ≥ 30
1. Determine the value of the test statistic z corresponding to the result of the sampling experiment.
2a. If the test is one-tailed, the p-value is equal to the tail area beyond z in the same direction as the alternative hypothesis. Thus, if the alternative hypothesis is of the form >, the p-value is the area to the right of, or above, the observed z-value. Conversely, if the alternative is of the form <, the p-value is the area to the left of, or below, the observed z-value.
2b.
If the test is two-tailed, the p-value is equal to twice the tail area beyond the observed z-value in the direction of the sign of z. That is, if z is positive, the p-value is twice the area to the right of, or above, the observed z-value. Conversely, if z is negative, the p-value is twice the area to the left of, or below, the observed z-value.

Reporting Test Results as p-Values: How to Decide Whether to Reject H0
1. Choose the maximum value of α that you are willing to tolerate.
2. If the observed significance level (p-value) of the test is less than the chosen value of α, reject the null hypothesis. Otherwise, do not reject the null hypothesis.
3. Typical values for α are 0.01, 0.05, 0.10.

Two-Tailed z Test p-Value Example
Does an average box of cereal contain 368 grams of cereal? A random sample of 25 boxes showed x̄ = 372.5. The company has specified σ to be 15 grams. Find the p-value. How does it compare to α = .05?

Two-Tailed z Test p-Value Solution
H0: µ = 368
Ha: µ ≠ 368

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{372.5 - 368}{15/\sqrt{25}} = +1.50$$

The p-value is P(z ≤ −1.50 or z ≥ 1.50). From the z table, the area between 0 and 1.50 is .4332, so each tail area (half of the p-value) is .5000 − .4332 = .0668.
p-value = 2(.0668) = .1336
Since p-value = .1336 ≥ α = .05, do not reject H0.

One-Tailed z Test p-Value Example
Does an average box of cereal contain more than 368 grams of cereal? A random sample of 25 boxes showed x̄ = 372.5. The company has specified σ to be 15 grams. Find the p-value. How does it compare to α = .05?
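The p-value steps for a z test can also be checked numerically. The following is only a sketch and is not part of the original slides: the function name `z_test_p_value` and its `tail` argument are hypothetical, and the normal tail areas come from `NormalDist` in Python's standard `statistics` module rather than from a printed z table. It is applied to the cereal data (x̄ = 372.5, µ = 368, σ = 15, n = 25) for both the two-tailed and the upper-tailed alternatives.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(x_bar, mu0, sigma, n, tail):
    """Return (z, p_value) for a z test of H0: mu = mu0.

    tail is "lower" (Ha: mu < mu0), "upper" (Ha: mu > mu0),
    or "two" (Ha: mu != mu0). The names here are illustrative only.
    """
    # Step 1: test statistic for the observed sample mean.
    z = (x_bar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # mean 0, standard deviation 1

    # Step 2a / 2b: tail area in the direction of the alternative.
    if tail == "lower":
        p = std_normal.cdf(z)
    elif tail == "upper":
        p = 1.0 - std_normal.cdf(z)
    elif tail == "two":
        p = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z, p


# Cereal example: x_bar = 372.5, mu0 = 368, sigma = 15, n = 25.
z, p_two = z_test_p_value(372.5, 368, 15, 25, "two")
_, p_upper = z_test_p_value(372.5, 368, 15, 25, "upper")

print(f"z = {z:.2f}")                             # z = 1.50
print(f"two-tailed p-value = {p_two:.4f}")        # 0.1336
print(f"upper-tailed p-value = {p_upper:.4f}")    # 0.0668
```

Both p-values exceed α = .05, so under either alternative the sample does not give enough evidence to reject H0.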
One-Tailed z Test p-Value Solution
H0: µ = 368
Ha: µ > 368
Use the alternative hypothesis to find the direction: the p-value is P(z ≥ 1.50). From the z table, the area between 0 and 1.50 is .4332, so the tail area is .5000 − .4332 = .0668.
Since p-value = .0668 > α = .05, do not reject H0.

p-Value Thinking Challenge
You're an analyst for Ford. You want to find out if the average miles per gallon of Escorts is less than 32 mpg. You take a sample of 60 Escorts and compute a sample mean of 30.7 mpg and a sample standard deviation of 3.8 mpg. What is the p-value? How does it compare to α = .01?

p-Value Solution
• H0: µ = 32 mpg
• Ha: µ < 32 mpg
• We have a large sample, so the CLT applies.
• Hence we will use the z-statistic.

$$z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{30.7 - 32}{3.8/\sqrt{60}} \approx -2.65$$

Use the alternative hypothesis to find the direction: the p-value is P(z ≤ −2.65). From the z table, the area between 0 and 2.65 is .4960, so the tail area is .5000 − .4960 = .0040.
Since p-value = .004 < α = .01, reject H0.

Calculating the p-Value for a Test of Hypothesis when σ is unknown and n < 30
The following conditions must be satisfied:
1. A random sample is selected from the target population.
2. The population from which the sample is selected has a distribution that is approximately normal.

• We use the t-statistic with n − 1 degrees of freedom:

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

• Lower-tailed test (Ha: µ < µ0): p-value = P(t_{n−1} < test statistic)
• Upper-tailed test (Ha: µ > µ0): p-value = P(t_{n−1} > test statistic)
• Two-tailed test (Ha: µ ≠ µ0): p-value = 2P(t_{n−1} > |test statistic|)

Example
Is the average capacity of batteries less than 140 ampere-hours? A random sample of 20 batteries had a mean of 138.47 and a standard deviation of 2.66. Assume a normal distribution. Test at the .05 level of significance.
One-Tailed t Test Solution
• H0: µ = 140
• Ha: µ < 140
• α = .05
• df = 20 − 1 = 19

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{138.47 - 140}{2.66/\sqrt{20}} = -2.57$$

p-value = P(t19 < −2.57) = P(t19 > 2.57), so 0.005 < p-value < 0.01.
Since p-value < α = .05, reject H0. There is evidence that the population average is less than 140.

Thinking Challenge
Does an average box of cereal contain 368 grams of cereal? A random sample of 25 boxes had a mean of 372.5 and a standard deviation of 12 grams. Test at the .05 level of significance.

Thinking Challenge
You work for the FTC. A manufacturer of detergent claims that the mean weight of detergent is 3.25 lb. You take a random sample of 16 containers. You calculate the sample average to be 3.238 lb. with a standard deviation of .117 lb. At the .01 level of significance, is the manufacturer correct?
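The battery example (one-tailed t test, n = 20) can be reproduced numerically as a check on the table lookup. The sketch below is an illustration under stated assumptions, not the slides' method: the helper names `t_pdf`, `t_upper_tail`, and `t_test_p_value` are hypothetical, and because Python's standard library has no t-distribution CDF, the tail area is approximated by integrating the t density with Simpson's rule instead of reading a t table.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t_abs, df, steps=2000):
    """Approximate P(T > t_abs) for t_abs >= 0.

    Integrates the density over [0, t_abs] with Simpson's rule
    (steps must be even) and subtracts the result from 0.5.
    """
    h = t_abs / steps
    total = t_pdf(0.0, df) + t_pdf(t_abs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_test_p_value(x_bar, mu0, s, n, tail):
    """Return (t, p_value) for a t test of H0: mu = mu0 with n - 1 df."""
    t = (x_bar - mu0) / (s / sqrt(n))
    upper = t_upper_tail(abs(t), n - 1)  # P(T > |t|)
    if tail == "two":
        p = 2 * upper
    elif tail == "lower":
        p = upper if t < 0 else 1 - upper
    elif tail == "upper":
        p = upper if t > 0 else 1 - upper
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return t, p


# Battery example: x_bar = 138.47, mu0 = 140, s = 2.66, n = 20, Ha: mu < 140.
t_stat, p = t_test_p_value(138.47, 140, 2.66, 20, "lower")
print(f"t = {t_stat:.2f}, df = 19")
print(f"p-value = {p:.4f}")
print("reject H0 at alpha = .05" if p < 0.05 else "do not reject H0 at alpha = .05")
```

The computed p-value should fall between 0.005 and 0.01, in agreement with the bounds read from the t table, so H0 is rejected at α = .05.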