Hypothesis testing in a Nutshell
Summary by Pamela Peterson Drake
Introduction
The purpose of this reading is to discuss another aspect of statistical inference, hypothesis testing. A
hypothesis is a statement about the value of a population parameter developed for the purpose of
testing. Hypothesis testing is a procedure based on evidence from samples and probability theory to
determine whether a hypothesis is a reasonable statement and should not be rejected, or is an
unreasonable statement and should be rejected.
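Hypothesis testing involves a four-step procedure: specify the null and alternative hypotheses, calculate the test statistic, establish the decision rule, and make the decision.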
Testing a hypothesis
Step 1: Specify the null and the alternative hypotheses
The null hypothesis (H0) is a statement about the value of a population parameter. The alternative
hypothesis (Ha) is the statement that will be accepted if the sample data provides sufficient evidence
repudiating the null hypothesis. You should remember that it is usually the alternative hypothesis that you
are really trying to support. Why? You can never really prove anything with statistics, but you can
discredit the null hypothesis -- implying that the alternative is valid. In other words, you either:
reject the null hypothesis, or
fail to reject the null hypothesis.
The level of significance is defined as the probability of rejecting the null hypothesis when it is actually
true. The significance level is also called the level of risk, or α, and is commonly set at 5% or 1%.
A Type I error is the error of rejecting the null hypothesis, H0, when it is in fact true. The probability of
committing a Type I error is the risk level or alpha risk that the researcher specifies. A Type II error is the
error of failing to reject the null hypothesis when it is actually false. The chance of making a Type II error
is called beta risk.
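The link between the significance level and Type I error can be checked with a small simulation. The sketch below is illustrative only: the function name, the population values (mean 100, standard deviation 25), the sample size and the seed are all assumptions, and the population standard deviation is treated as known so that a z-test applies. Because every sample is drawn from a population in which the null hypothesis is true, each rejection is a Type I error, and the share of rejections should land close to alpha.

```python
import math
import random
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a true null hypothesis is rejected.

    Samples come from a normal population whose mean really is mu0,
    so every rejection counted here is a Type I error.
    """
    rng = random.Random(seed)
    mu0, sigma = 100.0, 25.0                         # assumed population values
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed critical value
    std_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (sample_mean - mu0) / std_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```

Lowering alpha (for example to 0.01) reduces the share of false rejections, at the cost of making a Type II error more likely for a given sample size.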
                                     Truth
Decision                             The null hypothesis is true    The null hypothesis is false
Fail to reject the null hypothesis   Correct decision               Type II error
Reject the null hypothesis           Type I error                   Correct decision
in favor of Ha

The level of risk that the researcher wishes to assume in the analysis is represented by the probability of Type
I and Type II errors. Given a specific test design (sample size, hypothesis), the researcher specifies the
level of Type I risk, referred to as the significance level. The complement of the significance level is the
confidence level.
A test with one rejection region is called a one-tailed test and a test with two rejection regions is called a
two-tailed test. In general, a test is one-tailed when the alternative hypothesis states a direction, like greater
than or less than. The alternative hypothesis of a two-tailed test is usually stated as not being equal to some value.
Examples of hypotheses
A one-tailed test:
H0: Financial analysts’ starting salaries are equal to or greater than $65,000.
Ha: Financial analysts’ starting salaries are less than $65,000.
A two-tailed test:
H0: The return on the portfolio is 12%.
Ha: The return on the portfolio does not equal 12%.
Step 2: Calculate the test statistic
The test of the hypothesis is based on the distribution of
the test statistic. In general, the test statistic is:
\[ \text{Test statistic} = \frac{\text{Sample statistic} - \text{Hypothesized value of the population parameter}}{\text{Standard error of the sample statistic}} \]
For example, suppose you want to test whether the population mean is $100, and suppose you draw a
sample of 40 observations, with a mean of $102 and a sample standard deviation of $25.
If we do not know the population variance (and this is likely the case in most examples), and either:
the sample is large; or
the sample is small, but the population distribution is normal,
then the test statistic is t-distributed:
\[ t_{n-1} = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]
where
\(\mu\) is the hypothesized population mean;
\(\bar{X}\) is the sample mean;
\(s\) is the sample standard deviation; and
\(n\) is the sample size.
The t-distribution, or Student t-distribution, is a symmetric, bell-shaped curve, centered on a mean of zero.
With a sufficiently large sample, the t-distribution is similar to the normal or z-distribution. A primary
difference between the t- and z-distributions is that the shape of the t-distribution depends on the number of
degrees of freedom.
For our test of a population mean, the calculated test statistic is:
\[ t_{39} = \frac{\$102 - \$100}{\$25 / \sqrt{40}} = \frac{\$2}{\$3.953} = 0.506 \]
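The arithmetic for the example can be reproduced in a few lines. This is a minimal sketch using the figures from the example (sample of 40, sample mean $102, sample standard deviation $25, hypothesized mean $100); the function name `t_statistic` is an illustrative choice, not something defined in the reading. The standard error of the mean is s/√n, which here is 25/√40, or about 3.953.

```python
import math


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (t, standard_error) for a one-sample test of a mean."""
    std_error = sample_sd / math.sqrt(n)     # s / sqrt(n), not s / n
    t = (sample_mean - hypothesized_mean) / std_error
    return t, std_error


t, se = t_statistic(sample_mean=102, hypothesized_mean=100, sample_sd=25, n=40)
print(f"standard error = {se:.3f}, t = {t:.3f}, df = {40 - 1}")
```

With these inputs the standard error is roughly 3.953 and the test statistic roughly 0.506, with 39 degrees of freedom.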
Step 3: Establish the decision rule
The decision rule states when to reject the null hypothesis. The decision rule is based on the distribution
of the test statistic. For example, a normally distributed test statistic would involve the z distribution, that
is, the normal distribution scaled to a zero mean and a standard deviation of 1.0.
[Figure: the standard normal (z) distribution, with the horizontal axis running from -4 to +4.]
For a two-tailed test we would specify two rejection regions based on the amount of Type I error we are
willing to take on. If the Type I error is 5 percent, in a two-tailed test this would mean that we would reject
the null hypothesis if the calculated test statistic is either below -1.96 or above +1.96:
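[Figure: standard normal distribution for a two-tailed test at the 5% level. The central 95%, between -1.96 and +1.96, is the "fail to reject the null hypothesis" region; the 2.5% in each tail, below -1.96 and above +1.96, are the "reject the null hypothesis" regions.]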
If the calculated test statistic falls in either of the two rejection regions, we reject the null hypothesis.
If the test is a one-tailed test, there is only one rejection region. If the alternative is “less than” (e.g., Ha: μ < 5),
the rejection region is on the left-hand side; if the alternative is “greater than” (e.g., Ha: μ > 5), the
rejection region is on the right-hand side. For example, a rejection region appropriate for a “greater than”
alternative is:
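[Figure: standard normal distribution for a one-tailed "greater than" test at the 5% level. The lower 95%, below 1.64, is the "fail to reject the null hypothesis" region; the upper 5%, above 1.64, is the "reject the null hypothesis" region.]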
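The one-tailed and two-tailed decision rules for a z-distributed test statistic can be written as a single function. The following is a sketch under stated assumptions: the function name `reject_null` and the labels "two-sided", "less" and "greater" are illustrative, and the critical values come from Python's `statistics.NormalDist` rather than from a printed table.

```python
from statistics import NormalDist


def reject_null(z, alpha=0.05, alternative="two-sided"):
    """Decide whether a z-distributed test statistic falls in a rejection region.

    alternative: "two-sided" (Ha: not equal), "less" (Ha: <) or "greater" (Ha: >).
    """
    std_normal = NormalDist()                          # mean 0, standard deviation 1
    if alternative == "two-sided":
        critical = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 at the 5% level
        return abs(z) > critical
    critical = std_normal.inv_cdf(1 - alpha)           # about 1.64 at the 5% level
    if alternative == "less":
        return z < -critical
    if alternative == "greater":
        return z > critical
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


print(reject_null(1.80))                          # two-tailed test
print(reject_null(1.80, alternative="greater"))   # rejection region on the right
print(reject_null(1.80, alternative="less"))      # rejection region on the left
```

A statistic of 1.80 lies inside ±1.96, so the two-tailed test fails to reject, while the same value exceeds 1.64 and is rejected by the one-tailed "greater than" test.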
For our example of the sample of 40 observations with a mean of $102, the critical t-values for 40 - 1 = 39
degrees of freedom are ±2.023 for a two-tailed test at the 5% level of significance. We obtain these values
from a t-table; we could also have used Microsoft Excel’s TINV function.
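The critical value can also be approximated numerically without a t-table. The sketch below is one possible approach, not the method used in the reading: it integrates the Student t density with Simpson's rule and then searches for the critical value by bisection. The helper names (`t_pdf`, `t_cdf`, `t_critical`), the step count and the search bounds are all assumptions made for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density from 0 to |x| with Simpson's rule."""
    upper = abs(x)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(alpha, df, tails=2):
    """Positive critical value, found by bisection on the CDF."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.05, 39), 3))     # two-tailed test, 5% level, 39 df
```

For 39 degrees of freedom and a two-tailed 5% test this agrees with the tabulated 2.023 to three decimal places.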
For distributions other than the z distribution, the shape of the distribution depends on the degrees of
freedom. If you are testing, for example, the difference of variances between two samples, you are
testing this using an F-distributed test statistic, and the shape of the F-distribution (and hence the
selected critical values) depends on two degrees of freedom (in the case of the test of variances, n1 - 1
and n2 - 1).
Step 4: Make the decision
The final step is to decide whether or not to reject the null hypothesis. The decision depends on the test's level
of significance, which established the rejection region, and on the calculated test statistic. If the calculated
test statistic falls into the rejection region, we conclude that the sample statistic (e.g., mean) is sufficiently
far away from the hypothesized value. Therefore, we reject the null hypothesis in favor of the alternative
hypothesis when the calculated test statistic exceeds the bounds created by the critical value (one-sided) or
values (two-sided).
Remember, you cannot actually accept the null hypothesis -- you can only reject or fail to reject the
null hypothesis.
In hypothesis testing you may want to report the probability, assuming the null hypothesis is true, of getting a
test statistic value at least as extreme as the one just calculated. This probability is called the p-value and is
compared with the significance level. If the p-value is less than the significance level of the hypothesis test,
H0 is rejected. If it is greater, H0 is not rejected.
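For a z-distributed test statistic, the p-value comparison can be sketched as follows. The function name `z_p_value` and the two illustrative statistics (1.50 and 2.50) are assumptions; the one-tailed case in this sketch presumes the statistic lies on the side named by the alternative hypothesis.

```python
from statistics import NormalDist


def z_p_value(z, tails=2):
    """Probability, under H0, of a z statistic at least as extreme as z.

    For tails=1 the statistic is assumed to lie in the direction of Ha.
    """
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return tails * upper_tail


alpha = 0.05
for z in (1.50, 2.50):
    p = z_p_value(z)
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"z = {z:.2f}: p-value = {p:.4f} -> {verdict}")
```

The first statistic has a p-value above 5% and the second a p-value below it, so only the second leads to rejecting the null hypothesis at the 5% level.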
Using Microsoft Excel to determine the p-value
You can make a decision using either the comparison of critical and calculated test statistics, or comparing
the p-value associated with the test statistic with the level of Type I error, α.
Microsoft Excel has functions that will return the p-value for a given calculated statistic:
Distribution    Excel function
z               1 - NORMDIST(x,0,1,TRUE) (upper-tail probability)
t               TDIST(x,df,tails), where tails = 1 for one-tailed, 2 for two-tailed
χ2              CHIDIST(x,df)
F               FDIST(x,df1,df2)
Decision: Reject the null hypothesis, H0
Using the critical values: Calculated t-statistic falls into the “Reject” region.
Using the p-value: p-value associated with the calculated test statistic is less than the chosen level of Type I error, α.

Decision: Fail to reject the null hypothesis, H0
Using the critical values: Calculated t-statistic falls into the “Fail to reject” region.
Using the p-value: p-value associated with the calculated test statistic is greater than the chosen level of Type I error, α.
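Both routes to the decision can be compared on the example of 40 observations (sample mean $102, sample standard deviation $25, hypothesized mean $100), using the critical value of 2.023 quoted in the text. The sketch below is an assumption-laden illustration: `t_tail_probability` is a hypothetical helper that plays a role similar to Excel's TDIST(x,df,tails), implemented here by Simpson's-rule integration of the t density rather than by any library call.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_tail_probability(x, df, tails=2, steps=2000):
    """Rough analogue of TDIST(x, df, tails): tails * P(T >= |x|)."""
    upper = abs(x)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # Simpson's rule for P(0 <= T <= |x|)
    return tails * (0.5 - central_half)


t = (102 - 100) / (25 / math.sqrt(40))    # calculated test statistic
df, alpha, t_crit = 39, 0.05, 2.023        # critical value as quoted in the text

by_critical_value = abs(t) > t_crit        # does t fall in the "Reject" region?
p_value = t_tail_probability(t, df)        # two-tailed p-value
by_p_value = p_value < alpha

print(f"t = {t:.3f}, p-value = {p_value:.3f}")
print("reject H0" if by_p_value else "fail to reject H0")
```

The two rules always agree: the statistic lies outside the rejection region exactly when its p-value exceeds α. In this example both point to failing to reject the null hypothesis.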
Tests and test statistics

Test: Test of a mean when:
the population is normally distributed, and
the population variance is known.
Test statistic:
\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
Distribution: z

Test: Test of a mean when:
the population variance is unknown, and
the sample size is greater than 30.
Test statistic:
\[ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} \]
Distribution: Student’s t, df = n - 1

Test: Test of mean of differences (a.k.a. paired comparison test)
Test statistic:
\[ t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} \]
where \(\bar{d}\) is the mean of the differences
Distribution: Student’s t, df = n - 1

Test: Test the difference between two population means:
normally distributed populations,
unknown population variances,
population variances assumed equal.
Test statistic:
\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
Distribution: Student’s t, df = n1 + n2 - 2

Test: Test the difference between two population means:
normally distributed populations,
unknown population variances,
unequal population variances.
Test statistic:
\[ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]
Distribution: Student’s t, with
\[ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2 / n_1)^2}{n_1} + \dfrac{(s_2^2 / n_2)^2}{n_2}} \]

Test: Test of the value of the population variance:
of a normally distributed population, or
sample is drawn randomly.
Test statistic:
\[ \chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} \]
where \(\sigma_0^2\) is the hypothesized variance
Distribution: Chi-squared, df = n - 1

Test: Tests concerning the equality of two variances.
Test statistic:
\[ F = \frac{s_1^2}{s_2^2} \]
Distribution: F (a.k.a. Fisher-Snedecor distribution), df = n1 - 1 and n2 - 1
One rejection region (right-hand side); reject when the calculated F-statistic is greater than the critical F-value.
Notation:
df = degrees of freedom
\(\bar{X}\) = sample mean
\(\mu\) = population mean
\(\bar{d}\) = mean of the differences
\(\sigma^2\) = population variance
s = sample standard deviation
n = sample size
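Several of the two-sample statistics in the table can be sketched with Python's standard library. Everything named here is an assumption for illustration: the function names, and the two small made-up samples `a` and `b`. The pooled version assumes equal population variances; the other version uses separate sample variances, and its degrees of freedom follow the table's expression as far as it can be read (dividing by n1 and n2), whereas many references divide by n1 - 1 and n2 - 1 instead.

```python
import math
from statistics import mean, variance


def pooled_t(x1, x2, hypothesized_diff=0.0):
    """Equal-variance test: pooled variance, df = n1 + n2 - 2."""
    n1, n2 = len(x1), len(x2)
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    t = (mean(x1) - mean(x2) - hypothesized_diff) / math.sqrt(sp2 / n1 + sp2 / n2)
    return t, n1 + n2 - 2


def unequal_variance_t(x1, x2, hypothesized_diff=0.0):
    """Unequal-variance test with approximate degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = variance(x1) / n1, variance(x2) / n2     # s1^2/n1 and s2^2/n2
    t = (mean(x1) - mean(x2) - hypothesized_diff) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / n1 + v2 ** 2 / n2)
    return t, df


def f_statistic(x1, x2):
    """Ratio of sample variances, with df = (n1 - 1, n2 - 1)."""
    return variance(x1) / variance(x2), (len(x1) - 1, len(x2) - 1)


a = [12, 15, 11, 14, 13]      # made-up sample 1
b = [10, 8, 13, 7, 12]        # made-up sample 2

t_pooled, df_pooled = pooled_t(a, b)
t_unequal, df_unequal = unequal_variance_t(a, b)
f_value, f_df = f_statistic(b, a)     # larger sample variance in the numerator

print(f"pooled:  t = {t_pooled:.3f}, df = {df_pooled}")
print(f"unequal: t = {t_unequal:.3f}, df = {df_unequal:.2f}")
print(f"F = {f_value:.2f}, df = {f_df}")
```

With equal sample sizes the two t statistics coincide and only the degrees of freedom differ. Each statistic would then be compared with a critical value from the corresponding t- or F-table.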