Hypothesis testing in a Nutshell
Summary by Pamela Peterson Drake

Introduction

The purpose of this reading is to discuss another aspect of statistical inference, hypothesis testing. A hypothesis is a statement about the value of a population parameter developed for the purpose of testing. Hypothesis testing is a procedure based on evidence from samples and probability theory to determine whether a hypothesis is a reasonable statement and should not be rejected, or is an unreasonable statement and should be rejected. Hypothesis testing involves a four-step procedure.

Testing a hypothesis

Step 1: Specify the null and the alternative hypotheses

The null hypothesis (H0) is a statement about the value of a population parameter. The alternative hypothesis (Ha) is the statement that will be accepted if the sample data provide sufficient evidence against the null hypothesis.

You should remember that it is usually the alternative hypothesis that you are really trying to support. Why? You can never really prove anything with statistics, but you can discredit the null hypothesis, which lends support to the alternative. In other words, you either:

reject the null hypothesis, or
fail to reject the null hypothesis.

The level of significance is defined as the probability of rejecting the null hypothesis when it is actually true. The significance level is also called the level of risk, or alpha (α), and is commonly set at 5% or 1%.

A Type I error is the error of rejecting the null hypothesis, H0, when it is in fact true. The probability of committing a Type I error is the risk level, or alpha risk, that the researcher specifies. A Type II error is the error of failing to reject the null hypothesis when it is actually false. The chance of making a Type II error is called beta risk.
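The meaning of the significance level as a Type I error probability can be illustrated with a small simulation. The following is a minimal sketch, not part of the original reading: it assumes a normally distributed population with a known standard deviation of 1, a true null hypothesis (population mean of zero), and a two-tailed z-test. The function name, sample size, number of trials, and seed are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, fmean


def type_one_error_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Share of samples that reject a TRUE null hypothesis (mean = 0)."""
    rng = random.Random(seed)
    # Two-tailed critical value for the chosen significance level.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw from a population where H0 really holds: mean 0, sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = (sample mean - hypothesized mean) / (sigma / sqrt(n)), sigma = 1.
        z = (fmean(sample) - 0.0) / (1.0 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: H0 is true but was rejected
    return rejections / trials


rate = type_one_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

Under these assumptions the observed rejection rate should be close to the chosen α of 5%, because every rejection in this simulation is, by construction, a Type I error.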
Truth                        | Decision: fail to reject the null hypothesis | Decision: reject the null hypothesis in favor of Ha
-----------------------------|----------------------------------------------|-----------------------------------------------------
The null hypothesis is true  | Correct decision                             | Type I error
The null hypothesis is false | Type II error                                | Correct decision

The level of risk that the researcher wishes to assume in the analysis is represented by the probability of Type I and Type II errors. Given a specific test design (sample size, hypothesis), the researcher specifies the level of Type I risk, referred to as the significance level. The complement of the significance level is the confidence level.

A test with one rejection region is called a one-tailed test and a test with two rejection regions is called a two-tailed test. In general, a test is one-tailed when the alternative hypothesis states a direction, such as greater than or less than. The alternative hypothesis of a two-tailed test is usually stated as not being equal to some value.

Examples of hypotheses

A one-tailed test:
H0: Financial analysts' starting salaries are equal to or greater than $65,000.
Ha: Financial analysts' starting salaries are less than $65,000.

A two-tailed test:
H0: The return on the portfolio is 12%.
Ha: The return on the portfolio does not equal 12%.

Step 2: Calculate the test statistic

The test of the hypothesis is based on the distribution of the test statistic. In general, the test statistic is:

\[ \text{Test statistic} = \frac{\text{Sample statistic} - \text{Hypothesized value of the population parameter}}{\text{Standard error of the sample statistic}} \]

For example, suppose you want to test whether the population mean is $100. And suppose you draw a sample of 40 observations, with a mean of $102 and a sample standard deviation of $25.

If we do not know the population variance (and this is likely the case in most examples), and either

the sample is large, or
the sample is small but the population distribution is normal,

then the test statistic is t-distributed:

\[ t_{n-1} = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]

where μ is the hypothesized population mean; X̄ is the sample mean; s is the sample standard deviation; and n is the sample size.
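The t-statistic formula above can be evaluated directly for the sample described in the text (40 observations, sample mean of $102, sample standard deviation of $25, hypothesized mean of $100). The sketch below is illustrative only; the function name `t_statistic` and its argument names are hypothetical. Note that the standard error divides the sample standard deviation by the square root of n, not by n itself.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Values taken from the example in the text.
n = 40
standard_error = 25.0 / math.sqrt(n)
t = t_statistic(sample_mean=102.0, hypothesized_mean=100.0, sample_sd=25.0, n=n)
df = n - 1  # degrees of freedom for a test of a single mean

print(f"standard error = {standard_error:.3f}")
print(f"t = {t:.3f} with {df} degrees of freedom")
```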
The t-distribution, or Student t-distribution, is a symmetric, bell-shaped curve, centered on a mean of zero. With a sufficiently large sample, the t-distribution is similar to the normal or z-distribution. A primary difference between the t- and the z-distributions is that the shape of the t-distribution depends on the number of degrees of freedom.

For our test of a population mean, the standard error is $25 / √40 = $3.953, so the calculated test statistic is:

\[ t_{39} = \frac{\$102 - \$100}{\$25 / \sqrt{40}} = \frac{\$2}{\$3.953} = 0.506 \]

Step 3: Establish the decision rule

The decision rule states when to reject the null hypothesis. The decision rule is based on the distribution of the test statistic. For example, a normally distributed test statistic would involve the z distribution, that is, the normal distribution scaled to a zero mean and a standard deviation of 1.0.

[Figure: the standard normal (z) distribution, with the horizontal axis running from -4 to 4]

For a two-tailed test we would specify two rejection regions based on the amount of Type I error we are willing to take on. If the Type I error is 5 percent, in a two-tailed test this would mean that we would reject the null hypothesis if the calculated test statistic is either below -1.96 or above +1.96.

[Figure: two-tailed test at the 5% level. The central 95% of the distribution, between -1.96 and +1.96, is the "fail to reject the null hypothesis" region; each tail holds 2.5% and is a "reject the null hypothesis" region]

If the calculated test statistic falls in either of the two rejection regions, we reject the null hypothesis.

If the test is a one-tailed test, there is only one rejection region. If the alternative is "less than" (e.g., Ha: μ < 5), the rejection region is on the left-hand side; if the alternative is "greater than" (e.g., Ha: μ > 5), the rejection region is on the right-hand side. For example, a rejection region appropriate for a "greater than" alternative at the 5% level is the area to the right of 1.645.

[Figure: one-tailed test at the 5% level. The 95% of the distribution below 1.645 is the "fail to reject the null hypothesis" region; the 5% above 1.645 is the "reject the null hypothesis" region]

For our example of the sample of 40 observations with a mean of $102, the critical t-values for a two-tailed test at the 5% level with 40 - 1 = 39 degrees of freedom are ±2.023 [1].
[1] We obtain this critical value from a t-table. We could have also used Microsoft Excel's TINV function.

For distributions other than the z distribution, the shape of the distribution depends on the degrees of freedom. If you are testing, for example, the difference of variances between two samples, you are testing this using an F-distributed test statistic, and the shape of the F-distribution (and hence the selected critical values) depends on two degrees of freedom (in the case of the test of variances, n1 - 1 and n2 - 1).

Step 4: Make the decision

The final step is to decide whether or not to reject the null hypothesis. The decision depends on the level of significance, which established the rejection region, and the calculated test statistic. If the calculated test statistic falls into the rejection region, we conclude that the sample statistic (e.g., the mean) is sufficiently far away from the hypothesized value. Therefore, we reject the null hypothesis in favor of the alternative hypothesis when the calculated test statistic exceeds the bounds created by the critical value (one-sided) or values (two-sided).

Remember, you cannot actually accept the null hypothesis: you can only reject or fail to reject the null hypothesis.

In hypothesis testing you may want to report the probability, assuming the null hypothesis is true, of getting a test statistic value at least as extreme as the one just calculated. This probability is called the p-value and is compared with the significance level. If the p-value is less than the significance level of the hypothesis test, H0 is rejected. If it is greater, then we fail to reject H0.

Using Microsoft Excel to determine the p-value

You can make a decision using either the comparison of critical and calculated test statistics, or the comparison of the p-value associated with the test statistic with the level of Type I error, α.
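Outside a spreadsheet, the two decision rules just described can be sketched in Python for a z-distributed test statistic. This is a minimal illustration, not the reading's own method: it assumes the test statistic follows the standard normal distribution, and the function name `z_test_decision`, the `alternative` labels, and the example z-values of 2.10 and 1.80 are hypothetical.

```python
from statistics import NormalDist


def z_test_decision(z, alpha=0.05, alternative="two-sided"):
    """Return (p_value, reject_by_critical_value, reject_by_p_value)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        # Two rejection regions, each holding alpha / 2.
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
        reject_by_critical = abs(z) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    elif alternative == "greater":
        # One rejection region on the right-hand side.
        p_value = 1.0 - std_normal.cdf(z)
        reject_by_critical = z > std_normal.inv_cdf(1.0 - alpha)
    elif alternative == "less":
        # One rejection region on the left-hand side.
        p_value = std_normal.cdf(z)
        reject_by_critical = z < std_normal.inv_cdf(alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return p_value, reject_by_critical, p_value < alpha


for z, alt in [(2.10, "two-sided"), (1.80, "two-sided"), (1.80, "greater")]:
    p, by_critical, by_p = z_test_decision(z, alternative=alt)
    print(f"z={z:.2f} ({alt}): p-value={p:.4f}, "
          f"reject by critical value={by_critical}, reject by p-value={by_p}")
```

The two rules always agree, because the critical value is simply the point at which the p-value equals α. The last two cases show that the same calculated statistic can lead to different decisions depending on whether the test is one-tailed or two-tailed.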
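Microsoft Excel has functions that will return the p-value for a given calculated statistic:

Test statistic | Excel function
---------------|------------------------------------------------------------------
z              | 1 - NORMDIST(x,0,1,TRUE)
t              | TDIST(x,df,tails), with tails = 1 for one-tailed, 2 for two-tailed
χ2             | CHIDIST(x,df)
F              | FDIST(x,df1,df2)

For the z-statistic, the last argument of NORMDIST must be TRUE: NORMDIST(x,0,1,TRUE) returns the cumulative probability to the left of x, so the right-tail probability is one minus that value. With FALSE, the function returns the height of the density curve, which is not a p-value. TDIST, CHIDIST, and FDIST return right-tail probabilities directly.

Decision                               | Using the critical values                                        | Using the p-value
---------------------------------------|------------------------------------------------------------------|------------------------------------------------------------------
Reject the null hypothesis, H0         | Calculated t-statistic falls into the "Reject" region.           | The p-value associated with the calculated test statistic is less than the chosen level of Type I error, α.
Fail to reject the null hypothesis, H0 | Calculated t-statistic falls into the "Fail to reject" region.   | The p-value associated with the calculated test statistic is greater than the chosen level of Type I error, α.

Tests and test statistics

Test of a mean when the population is normally distributed and the population variance is known
Test statistic: \( z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \)
Distribution: z

Test of a mean when the population variance is unknown and the sample size is greater than 30
Test statistic: \( t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \)
Distribution: Student's t, df = n - 1

Test of the mean of differences (a.k.a. paired comparison test)
Test statistic: \( t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \), where \( \bar{d} \) is the mean of the differences
Distribution: Student's t, df = n - 1

Test of the difference between two population means: normally distributed populations, unknown population variances, population variances assumed equal
Test statistic:
\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
Distribution: Student's t, df = n1 + n2 - 2

Test of the difference between two population means: normally distributed populations, unknown population variances, unequal population variances
Test statistic:
\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]
Distribution: Student's t, with
\[ df = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2 / n_1)^2}{n_1} + \frac{(s_2^2 / n_2)^2}{n_2}} \]

Test of the value of the population variance, when the population is normally distributed or the sample is drawn randomly
Test statistic: \( \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} \), where \( \sigma_0^2 \) is the hypothesized variance
Distribution: Chi-squared, df = n - 1

Tests concerning the equality of two variances
Test statistic: \( F = \frac{s_1^2}{s_2^2} \)
Distribution: F (a.k.a. Fisher-Snedecor distribution), df = n1 - 1 and n2 - 1. One rejection region (right-hand side); reject when the calculated F-statistic is greater than the critical F-value.

Notation:
df = degrees of freedom
X̄ = sample mean
μ = population mean
d̄ = mean of the differences
σ2 = population variance
s = sample standard deviation
n = sample size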
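Several of the statistics listed above can be computed in a few lines of code. The following sketch covers three of them: the pooled-variance t-statistic (the formula that uses sp2), the chi-squared statistic for a hypothesized variance, and the F-ratio of two sample variances. It is an illustration under stated assumptions rather than a definitive implementation: the function names and the two small data sets are hypothetical, and critical values would still have to come from a table or a spreadsheet function.

```python
import math
from statistics import fmean, variance


def pooled_t(sample1, sample2, hypothesized_diff=0.0):
    """Pooled-variance t-statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)  # sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    standard_error = math.sqrt(sp_sq / n1 + sp_sq / n2)
    t = (fmean(sample1) - fmean(sample2) - hypothesized_diff) / standard_error
    return t, n1 + n2 - 2


def chi_squared_variance(sample, hypothesized_variance):
    """Chi-squared statistic for a test of the population variance."""
    n = len(sample)
    return (n - 1) * variance(sample) / hypothesized_variance, n - 1


def f_ratio(sample1, sample2):
    """F-statistic s1^2 / s2^2 with df = (n1 - 1, n2 - 1).

    Assumption: the caller passes the sample with the larger variance
    first, so that the single rejection region lies on the right.
    """
    f = variance(sample1) / variance(sample2)
    return f, (len(sample1) - 1, len(sample2) - 1)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
a = [10, 12, 14, 16, 18]  # mean 14, sample variance 10
b = [9, 10, 11, 12, 13]   # mean 11, sample variance 2.5

t, t_df = pooled_t(a, b)
chi2, chi2_df = chi_squared_variance(a, hypothesized_variance=5.0)
f, f_df = f_ratio(a, b)

print(f"pooled t = {t:.4f}, df = {t_df}")
print(f"chi-squared = {chi2:.1f}, df = {chi2_df}")
print(f"F = {f:.1f}, df = {f_df}")
```

Each function returns the calculated statistic together with its degrees of freedom, which are the two inputs needed to look up a critical value or a p-value.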