Two-Sample Hypothesis Testing
Suppose you want to know if two populations have the same mean or, equivalently, if the difference between the population means is zero.
The population variances are known to be $\sigma_1^2$ and $\sigma_2^2$.
You have independent samples from the two populations. Their sizes are $n_1$ and $n_2$.
The sample mean for the sample from the first population is $\bar{X}_1$. The mean of $\bar{X}_1$ equals the mean of the first population, $\mu_1$, and the variance of $\bar{X}_1$ is $\sigma_1^2/n_1$.
Similarly, the sample mean for the sample from the second population is $\bar{X}_2$. The mean of $\bar{X}_2$ equals the mean of the second population, $\mu_2$, and the variance of $\bar{X}_2$ is $\sigma_2^2/n_2$.
$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$.
Since $\bar{X}_1$ and $\bar{X}_2$ are independent,
$$ V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}, $$
and the standard deviation of $\bar{X}_1 - \bar{X}_2$ is
$$ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}. $$
Also, $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed. So we have a standard normal statistic
$$ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}. $$
We'll use this formula to test whether the population means are equal.
Example
Suppose from a large class, we sample 4 grades: 64, 66, 89, 77.
From another large class, we sample 3 grades: 56, 71, 53.
We assume that the class grades are normally distributed, and that
the population variances for the two classes are both 96.
Test at the 5% level $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$.
Averaging the grades in the first sample, we have $\bar{X}_1 = 74$.
Averaging the grades in the second sample, we have $\bar{X}_2 = 60$.
Then
$$ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(74 - 60) - 0}{\sqrt{\dfrac{96}{4} + \dfrac{96}{3}}} = \frac{14}{\sqrt{24 + 32}} = \frac{14}{7.4833} \approx 1.87. $$
As we’ve found before, the critical Z-values for a two-tailed 5% test are 1.96 and -1.96.
[Sketch of the standard normal density: critical regions beyond $\pm 1.96$, each with area .025; acceptance region between, with area .95.]
Since our Z-statistic, 1.87, is in the acceptance region, we accept $H_0: \mu_1 - \mu_2 = 0$, concluding that the population means are equal.
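The arithmetic above can be checked with a few lines of code; the helper name `two_sample_z` is my own, but the formula is exactly the one just derived.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, var1, var2, n1, n2):
    """Two-sample Z statistic for H0: mu1 - mu2 = 0 with known population variances."""
    return (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)

z = two_sample_z(74, 60, 96, 96, 4, 3)
print(round(z, 2))  # 1.87, inside the acceptance region (-1.96, 1.96)
```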
What do you do if you don’t
know the population variances
in this formula?
Replace the population variances
with the sample variances and the Z
distribution with the t distribution.
The number of degrees of freedom is
the integer part of this very messy
formula:
$$ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \quad\longrightarrow\quad t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$

$$ \text{dof} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} $$
Example
Consider the same example as the last one but without the information on the population variances. Again test at the 5% level $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$.
We need to determine the sample means and sample variances. As before, the sample means are 74 and 60.

    Class 1 (X1)    Class 2 (X2)
         64              56
         66              71
         89              53
         77
    sum 296         sum 180

$$ \bar{X}_1 = \frac{296}{4} = 74 \qquad \bar{X}_2 = \frac{180}{3} = 60 $$
Remember: the sample variance is
$$ s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}. $$
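This formula is easy to check numerically; a minimal sketch applying it to the two grade samples (the function and variable names are mine):

```python
def sample_variance(xs):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

class1 = [64, 66, 89, 77]
class2 = [56, 71, 53]
print(round(sample_variance(class1), 2))  # 132.67
print(round(sample_variance(class2), 2))  # 93.0
```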
So we subtract the sample mean from each of the grades. Then we square those differences and add them up. Then we divide each sum by $n - 1$ to get the sample variance.

    Class 1                             Class 2
    X1    X1 - X̄1   (X1 - X̄1)²         X2    X2 - X̄2   (X2 - X̄2)²
    64      -10        100              56      -4         16
    66       -8         64              71      11        121
    89       15        225              53      -7         49
    77        3          9
    296                398              180                186

$$ s_1^2 = \frac{398}{3} = 132.67 \qquad s_2^2 = \frac{186}{2} = 93.0 $$
What are the dof & critical t value?
Since we have $s_1^2 = 132.67$, $s_2^2 = 93.0$, $n_1 = 4$, and $n_2 = 3$, our very messy dof formula yields
$$ \frac{\left(\dfrac{132.67}{4} + \dfrac{93.0}{3}\right)^2}{\dfrac{(132.67/4)^2}{3} + \dfrac{(93.0/3)^2}{2}} = 4.860. $$
So the degrees of freedom is the integer part of 4.86, or 4.
For a 5% two-tailed test & 4 dof, the critical t-value is 2.7764.
[Sketch of the $t_4$ density: critical regions beyond $\pm 2.7764$, each with area .025; acceptance region between, with area .95.]
Next we need to compute our test statistic. With $n_1 = 4$, $\bar{X}_1 = 74$, $s_1^2 = 132.67$, $n_2 = 3$, $\bar{X}_2 = 60$, and $s_2^2 = 93.0$,
$$ t_4 = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(74 - 60) - 0}{\sqrt{\dfrac{132.67}{4} + \dfrac{93}{3}}} \approx 1.748. $$
Since our t-value, 1.748, is in the acceptance region, we accept $H_0: \mu_1 = \mu_2$.
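When the raw grades are available, this unequal-variances test (often called Welch's t-test) can be reproduced directly; a sketch assuming SciPy is installed.

```python
from scipy import stats

class1 = [64, 66, 89, 77]
class2 = [56, 71, 53]

# Unequal-variances (Welch) t-test: equal_var=False uses s1^2/n1 + s2^2/n2
# in the denominator and the messy dof formula, as in the text.
t_stat, p_value = stats.ttest_ind(class1, class2, equal_var=False)
print(round(t_stat, 3))  # ~1.748
```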
Sometimes we don’t know the population variances, but we
believe that they are equal.
So we need to compute an estimate of the common variance,
which we do by pooling our information from the two samples.
We denote the pooled sample variance by $s_p^2$.
$s_p^2$ is a weighted average of the two sample variances, with more weight put on the sample variance that was based on the larger sample.
If the two samples are the same size, $s_p^2$ is just the sum of the two sample variances, divided by two.
In general,
$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}. $$
Let’s return for a moment to the statistic that we used to compare population means when the population variances were known:
$$ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}. $$
Since the population variances $\sigma_1^2$ and $\sigma_2^2$ are believed to be equal, let's denote them both by $\sigma^2$. Then we can factor out the $\sigma^2$:
$$ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}. $$
Replacing the $\sigma^2$ by $s_p^2$ and the $Z$ by $t$ gives
$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}. $$
The number of degrees of freedom is $n_1 + n_2 - 2$.
Let’s do the previous example again, but this time assume that the unknown population variances are believed to be equal. We had $n_1 = 4$, $\bar{X}_1 = 74$, $s_1^2 = 132.67$, and $n_2 = 3$, $\bar{X}_2 = 60$, $s_2^2 = 93.0$. So
$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(3)(132.67) + (2)(93.0)}{4 + 3 - 2} = 116.8 $$
and
$$ t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(74 - 60) - 0}{\sqrt{116.8\left(\dfrac{1}{4} + \dfrac{1}{3}\right)}} \approx 1.70. $$
The number of degrees of freedom is $n_1 + n_2 - 2 = 5$, and we are doing a 2-tailed test at the 5% level, so the critical t-value is 2.571.
[Sketch of the $t_5$ density: critical regions beyond $\pm 2.571$, each with area .025; acceptance region between.]
Since our t-statistic, 1.70, is in the acceptance region, we accept $H_0: \mu_1 = \mu_2$.
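SciPy can also run this pooled test straight from the summary statistics in the example; a sketch assuming SciPy is installed (note it takes standard deviations, not variances).

```python
from math import sqrt
from scipy import stats

# Pooled (equal-variance) two-sample t-test from summary statistics.
t_stat, p_value = stats.ttest_ind_from_stats(
    mean1=74, std1=sqrt(132.67), nobs1=4,
    mean2=60, std2=sqrt(93.0),  nobs2=3,
    equal_var=True,  # pool the two sample variances into s_p^2
)
print(round(t_stat, 2))  # ~1.70
```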
In the previous three hypothesis tests, we tested whether 2 populations have the same mean, when we had 2 independent samples.
We can’t use those tests, however, if the 2 samples
are not independent.
For example, suppose you are looking at the weights
of people, before and after a fitness program.
Since the weights are for the same group of people,
the before and after weights are not independent of
each other.
In this type of situation, we can use a hypothesis test
based on matched-pairs samples.
The hypotheses are $H_0: \mu_D = 0$ and $H_1: \mu_D \neq 0$, where $\mu_D$ is the population mean difference.
The test statistic is
$$ t_{n-1} = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}} $$
where $\bar{D}$ is the sample mean difference, $s_D$ is the sample standard deviation of the differences, and $n$ is the number of pairs of observations.
Example
Before and after a fitness program, the following sample of weights is observed. Test at the 5% level whether the program causes a weight change, i.e.: $H_0: \mu_D = 0$ versus $H_1: \mu_D \neq 0$.
First we calculate the weight differences, $D = A - B$. Then we add up the differences and determine the mean. Next we subtract the mean difference from each of the D values, square the values in that column, and add up the squares.

    person   Before   After   D = A-B   D - D̄   (D - D̄)²
       1       168     160      -8        -4        16
       2       195     197       2         6        36
       3       155     150      -5        -1         1
       4       183     180      -3         1         1
       5       169     163      -6        -2         4
                       sums:   -20                  58

The differences sum to -20, so
$$ \bar{D} = \frac{-20}{5} = -4. $$
The sample standard deviation of the differences is
$$ s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{n - 1}}, $$
so we divide the sum of squares by $n - 1 = 4$ and take the square root:
$$ s_D = \sqrt{\frac{58}{4}} = \sqrt{14.5} = 3.81. $$
Next we assemble our statistic:
$$ t_{n-1} = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n}} = \frac{-4 - 0}{3.81 / \sqrt{5}} \approx -2.35. $$
Since we had 5 people and 5 pairs of weights, n=5, and
the number of degrees of freedom is n-1 = 4.
We’re doing a 2-tailed t-test at the 5% level, so the critical t-value is 2.776.
[Sketch of the $t_4$ density: critical regions beyond $\pm 2.776$, each with area .025; acceptance region between.]
Since our t-statistic, -2.35, is in the acceptance region, we accept
the null hypothesis that the program would cause no average
weight change for the population as a whole.
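The whole matched-pairs calculation collapses to one call if the raw weights are available; a sketch assuming SciPy is installed.

```python
from scipy import stats

before = [168, 195, 155, 183, 169]
after = [160, 197, 150, 180, 163]

# Paired (matched-pairs) t-test on the differences D = After - Before
t_stat, p_value = stats.ttest_rel(after, before)
print(round(t_stat, 2))  # ~ -2.35, inside the acceptance region (-2.776, 2.776)
```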
Hypothesis tests on the difference between 2 population proportions, using independent samples: $H_0: \pi_1 - \pi_2 = 0$ versus $H_1: \pi_1 \neq \pi_2$.
If you look at the statistics we have used in our hypothesis tests, you will notice that they have a common form:
$$ \frac{\text{(point estimate)} - \text{(mean of the point estimate)}}{\text{std dev, or estimate of the std dev, of the point estimate}} $$
In our hypothesis tests on the difference between 2 population proportions, we are going to use that same form.
The point estimate is $p_1 - p_2$, the difference in the sample proportions. The mean of the point estimate is $\pi_1 - \pi_2$, the difference in the population proportions.
We still need to determine the standard deviation, or an estimate of the standard deviation, of our point estimate.
We start with $V(p_1 - p_2)$. Under our assumption that the samples are independent,
$$ V(p_1 - p_2) = V(p_1) + V(p_2) = \frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}. $$
According to the null hypothesis, $\pi_1 = \pi_2$, so we'll call them both $\pi$. So
$$ V(p_1 - p_2) = \frac{\pi(1 - \pi)}{n_1} + \frac{\pi(1 - \pi)}{n_2} = \pi(1 - \pi)\left(\frac{1}{n_1} + \frac{1}{n_2}\right), $$
but we don't know what $\pi$ is.
but we don't know what  is.
We need to estimate the hypothetically common value of  .
Let X1 be the number of "successes" in the 1st sample, which is of size n1 ,
and X 2 be the number of "successes" in the 2nd sample, which is of size n 2 .
Our estimate of the common value for  will be the proportion of successes
in the combined sample or p 
X1  X 2
.
n1  n 2
 1
1 
So our estimate of V(p1 - p2 ) is p(1-p)    , and our estimate
 n1 n 2 
of the standard deviation of p1 - p2 is the square root of that
expression.
Assembling the pieces, we have
Z 
(p1 - p 2 ) - (1 - 2 )
 1
1 
p(1-p)   
 n1 n 2 
where
X1  X 2
p
n1  n 2
.
Suppose the proportions of Democrats in samples of 100 and 225 from 2 states are 33% and 20%. Test at the 5% level the hypothesis that the proportions of Democrats in the populations of the 2 states are equal: $H_0: \pi_1 - \pi_2 = 0$ versus $H_1: \pi_1 - \pi_2 \neq 0$.
The number of Democrats in the first sample is $(.33)(100) = 33$, and the number in the second sample is $(.20)(225) = 45$. So the proportion in the combined sample is
$$ \bar{p} = \frac{33 + 45}{100 + 225} = \frac{78}{325} = 0.24, \qquad 1 - \bar{p} = 0.76, $$
and
$$ Z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(.33 - .20) - 0}{\sqrt{(.24)(.76)\left(\dfrac{1}{100} + \dfrac{1}{225}\right)}} \approx 2.53. $$
We’re doing a 2-tailed Z-test at the 5% level, so the critical region lies beyond $\pm 1.96$.
[Sketch of the standard normal density: critical regions beyond $\pm 1.96$, each with area .025; acceptance region between.]
Since our Z-statistic, 2.53, is in the critical region, we reject the
null hypothesis and accept the alternative that the proportions of
Democrats in the 2 states are different.
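The pooled two-proportion Z statistic is a few lines of code; the helper name `two_proportion_z` is my own.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: pi1 - pi2 = 0, using the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # proportion of successes in the combined sample
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

z = two_proportion_z(33, 100, 45, 225)
print(round(z, 2))  # 2.53, beyond the critical value 1.96, so reject H0
```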
Sometimes you want to test whether two
independent samples have the same variance.
If the populations are normally distributed,
we can use the F-statistic to perform the test.
The F-statistic is
$$ F_{n_1 - 1,\, n_2 - 1} = \frac{s_1^2}{s_2^2} $$
where
$$ s_1^2 = \frac{\sum_{i=1}^{n_1} (X_{1i} - \bar{X}_1)^2}{n_1 - 1} $$
is the sample variance for the first sample,
$$ s_2^2 = \frac{\sum_{i=1}^{n_2} (X_{2i} - \bar{X}_2)^2}{n_2 - 1} $$
is the sample variance for the second sample, and $s_1^2$ is the larger of the two sample variances.
Notice that, because $s_1^2$ is larger than $s_2^2$, this F-statistic will always be greater than 1.
So our critical region will always just be the upper tail.
This F-statistic has n1-1 degrees of freedom for the numerator, and
n2-1 degrees of freedom for the denominator.
The distribution of our F-statistic, $F_{n_1 - 1,\, n_2 - 1} = s_1^2 / s_2^2$, with the tail for the critical region looks like this:
[Sketch of the F density $f(F)$: acceptance region on the left, critical region in the upper tail.]
Two-sided versus one-sided tests
for equality of variance
While you are always using the upper tail of the
F-test on tests of equality of variance, the size of
the critical region you sketch varies with whether
you have a two-sided or a one-sided test.
Let’s see why this is true.
For a two-sided test, we have: $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \neq \sigma_2^2$.
While, for our samples, the sample variance from the first group was greater, $s_1^2 > s_2^2$, our alternative hypothesis indicates that we think that the population variance could have been larger or smaller for the first population: $\sigma_1^2 > \sigma_2^2$ or $\sigma_1^2 < \sigma_2^2$.
Our sketch of the critical region is based on the situation in which the
variance is greater for the first group, but we admit that, if we had
information for the entire population, we might find that the situation is
reversed.
So there is an implicit second sketch of an F-statistic in which the sample
variance of the second group is in the numerator.
Thus, for each of the sketches, the sketch we draw and the implicit sketch,
the area of the critical region is α/2, half of the test level α.
So, for example, if you are doing a two-sided test at the 5% level, your
sketch will show a tail area of 0.025.
What if we are performing a one-sided test?
$H_0: \sigma_1^2 \le \sigma_2^2$ versus $H_1: \sigma_1^2 > \sigma_2^2$
Now we are looking at a situation in which the sample variance is again
larger for the first group. This time however, we want to know if, in
fact, the population variance is really larger for the first group. So we
have the one-sided alternative shown above.
Keep in mind that, as usual with one-sided tests, the null hypothesis is
the devil’s advocate view. Here the devil’s advocate is saying: nah,
the population variance for the first group isn’t really any larger than
for the second group.
For a one-sided test with level α, your critical region will have area α.
For example, if you are performing a one-sided test at the 5% level, the
critical region will have area 0.05.
Example: You are looking at test results for two groups of students.
There are 25 students in the first group, for which you have calculated
the sample variance to be 15. There are 30 students in the second
group, for which you have calculated the sample variance to be 10.
Test at the 10% level whether the population variances are the same.
$$ F_{24,\,29} = \frac{s_1^2}{s_2^2} = \frac{15}{10} = 1.5 $$
There are 25-1 = 24 degrees of freedom in the numerator and 30-1 = 29 degrees of freedom in the denominator.
This is a two-sided test, so the critical region has area 0.05; the critical value is 1.90.
[Sketch of the F density $f(F)$: acceptance region below 1.90, critical region with area 0.05 beyond it.]
Because 1.5 is in the acceptance region, you cannot reject the null hypothesis and you conclude that the variances of the two populations are the same.
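The critical value 1.90 comes from an F table; it can be reproduced with SciPy's F distribution (a sketch assuming SciPy is installed).

```python
from scipy import stats

n1, n2 = 25, 30
s1_sq, s2_sq = 15, 10  # the larger sample variance goes in the numerator

f_stat = s1_sq / s2_sq
# Two-sided 10% test => upper-tail area 0.05 with (24, 29) degrees of freedom
crit = stats.f.ppf(1 - 0.05, n1 - 1, n2 - 1)
print(round(f_stat, 2), round(crit, 2))  # 1.5 and roughly 1.90: cannot reject H0
```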
In the two sections we have just completed, we did 9
different types of hypothesis tests.
1. population mean - 1 sample - known population variance
2. population mean - 1 sample - unknown population variance
3. population proportion - 1 sample
4. difference in population means - 2 independent samples - known population variances
5. difference in population means - 2 independent samples - unknown population variances
6. difference in population means - 2 independent samples - unknown population variances that are believed to be equal
7. difference in population means - 2 dependent samples
8. difference in population proportions - 2 independent samples
9. difference in population variances - 2 independent samples
The statistics for these tests are compiled on a summary sheet
which is available at my web site.