Hypothesis Testing
Next to interval estimates, test of hypothesis is the other leg
of statistical inference. With hypothesis testing we start with
a hypothesis or a claim about the population parameter and
use the sample data to test for the validity of the hypothesis.
The purpose of the test is to see if the evidence provided by
the sample statistic is significantly different from (or
significantly contradicts) the hypothesized or claimed value
for the population parameter. Any hypothesis test thus
results in one of two mutually exclusive outcomes: you either
reject the hypothesis and conclude that the difference between
the sample statistic and the hypothesized value for the
population parameter is significant, or you do not reject the
hypothesis and conclude that the difference is not significant.
Since there are two mutually exclusive outcomes for a
hypothesis test, there are two attributes to any hypothesis
test statement:
1. The hypothesis that the population mean is a certain
value. This is called the NULL HYPOTHESIS and is
denoted by H0.
2. The mutually exclusive attribute that the population
mean is not that claimed value. This is called the
ALTERNATIVE HYPOTHESIS and denoted by H1.
Test of Hypothesis for the Population Mean
Example 1
We want to test the claim or hypothesis that the average
textbook expense per semester by IUPUI undergraduate
students is (equal to) $500.
The first step in a hypothesis test is to write the null and
alternative hypotheses. Since the claim is that the population
mean is $500, the null hypothesis is µ = $500. The mutually
exclusive attribute, the alternative hypothesis, is that the
mean is not equal to $500:
H₀: µ = $500
H₁: µ ≠ $500
A sample of n = 110 students yielded the following textbook
expenditure data (in dollars). Using Excel, compute the
sample mean x̅ and standard deviation s.
649, 243, 852, 243, 805, 425, 223, 658, 354, 738, 207
840, 304, 434, 532, 335, 529, 626, 309, 431, 880, 366
674, 206, 839, 319, 753, 418, 372, 276, 442, 742, 578
755, 340, 525, 821, 282, 283, 299, 501, 856, 771, 326
588, 674, 666, 803, 684, 610, 685, 222, 235, 667, 220
603, 718, 780, 240, 832, 824, 541, 478, 666, 395, 569
264, 607, 747, 449, 839, 217, 410, 878, 207, 678, 687
329, 700, 260, 355, 229, 725, 477, 217, 675, 459, 780
841, 207, 397, 651, 855, 340, 348, 574, 575, 593, 847
573, 873, 504, 744, 689, 791, 386, 878, 831, 789, 670
x̅ = $547.33
s = $217.02
The sample mean x̅ = $547.33 differs (deviates) from the
hypothesized mean µ₀ = $500 by
x̅ − µ₀ = $47.33
Is this deviation significant?
If we prove this deviation is significant, then we reject the
null hypothesis
H₀: µ = $500
and opt for the alternative hypothesis
H₁: µ ≠ $500
That is, we conclude that the population mean is different
from $500.
How do we determine if the deviation x̅ − µ₀ = $47.33
is significant?
We need a benchmark value to compare this deviation to.
That benchmark value is the margin of sampling error
(MOE).
We know from the theory of sampling distributions that, if the
population mean were µ = $500, then 95% of the sample
means from samples of size n = 110 would fall within
x̅L, x̅U = µ ± MOE = µ ± z0.025 × s ∕ √n
x̅L, x̅U = 500 ± 1.96 × 217.02 ∕ √110 = 500 ± 40.56
x̅L, x̅U = ($459.44, $540.56)
That is, 95% of x̅ values would fall within MOE = ±$40.56
of µ = $500:
P(µ − 40.56 < x̅ < µ + 40.56) = 0.95
Where is x̅ = 547.33 located relative to this interval? This x̅
falls outside the 95% interval. That is, the deviation x̅ − µ₀ =
$47.33 exceeds the benchmark MOE = $40.56. Thus, we
conclude that this x̅ value does not belong to this sampling
distribution. It belongs to a different sampling distribution
with a different center of gravity: µ ≠ 500.
Therefore, we conclude that the deviation x̅ − µ₀ = $47.33 is
significant, reject the null hypothesis H₀: µ = 500, and opt for
the alternative hypothesis H₁: µ ≠ 500.
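The benchmark comparison for Example 1 can be reproduced in a few lines of Python. This is only an illustrative sketch, not part of the original notes: it uses the summary statistics quoted above, and the variable names are assumptions chosen for readability.

```python
import math
from statistics import NormalDist

# Summary statistics quoted in Example 1 (first sample).
n, x_bar, s = 110, 547.33, 217.02
mu0 = 500.0          # hypothesized mean under H0
confidence = 0.95

# z value that cuts off the central 95% of the sampling distribution (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

se = s / math.sqrt(n)            # standard error of the sample mean
moe = z * se                     # margin of sampling error
lower, upper = mu0 - moe, mu0 + moe

deviation = x_bar - mu0
outside = not (lower < x_bar < upper)

print(f"95% interval around mu0: ({lower:.2f}, {upper:.2f})")
print(f"deviation = {deviation:.2f}, MOE = {moe:.2f}, outside interval: {outside}")
```

Because the sample mean lies outside the interval, the deviation is judged significant, which matches the conclusion reached by hand.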
Type I Error versus Type II Error
In this example we rejected H₀ because the sample mean x̅ =
547.33 was not one of the 95% of sample means that fell
within MOE. If the population mean were in fact $500, then
1 – 0.95 = 0.05 (5%) of the sample means would fall outside
the margin of error. Thus, this allows for a 5% probability of
rejecting a true null hypothesis.
In hypothesis testing, if we reject a null hypothesis that turns
out to be true, then we have committed a TYPE I Error. In
this example, we have allowed a 5% probability for Type I
Error.
Now consider another sample of n = 110 from the same
population.
483, 517, 213, 392, 277, 532, 721, 364, 715, 618, 275
444, 849, 642, 661, 659, 654, 354, 422, 293, 694, 420
699, 751, 740, 554, 603, 499, 709, 520, 663, 506, 235
264, 871, 770, 302, 677, 781, 399, 539, 529, 478, 596
726, 781, 239, 328, 739, 809, 320, 239, 625, 862, 351
632, 839, 452, 878, 386, 606, 813, 777, 297, 767, 866
755, 393, 226, 506, 223, 438, 521, 508, 562, 442, 525
376, 524, 401, 733, 508, 279, 796, 727, 306, 553, 509
328, 736, 847, 435, 708, 262, 264, 412, 352, 557, 788
665, 486, 290, 292, 875, 540, 297, 463, 345, 332, 423
x̅ = $533.85
s = $192.65
As with the previous sample, compute the 95% interval
around µ₀ = $500.
x̅L, x̅U = 500 ± 1.96 × 192.65 ∕ √110 = 500 ± 36 = (464, 536)
This time x̅ = 533.85 falls within this interval. (See the
diagram below.)
We can therefore conclude that x̅ − µ₀ = $33.85 is not
significant (x̅ − µ₀ < MOE), and not reject H₀.
Now suppose the population mean turns out to be a value
other than $500, say $550. Then we have not rejected a
false null hypothesis.
Here we have committed a TYPE II ERROR: not rejecting a
false null hypothesis.
In the diagram below, x̅ = 533.85 belongs to the sampling
distribution H₁ with a center of gravity µ = 550, even though
we have concluded (incorrectly) that it belongs to the
distribution H₀ with µ = 500.
In short, the Type I and Type II errors can be stated as:
Type I Error: Reject a true null hypothesis.
Type II Error: Not reject a false null hypothesis.
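The two error types can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original notes: the population is taken to be normal with a standard deviation of 200 (a hypothetical value, chosen to be close to the sample standard deviations above), and the function names are illustrative.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def rejects_h0(sample, mu0, alpha=0.05):
    """Two-tail z test: True when |x_bar - mu0| exceeds the margin of error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    moe = z * stdev(sample) / math.sqrt(len(sample))
    return abs(mean(sample) - mu0) > moe

def rejection_rate(true_mu, mu0=500.0, sigma=200.0, n=110, trials=1000, seed=1):
    """Share of simulated samples that reject H0 (normal population is an assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        hits += rejects_h0(sample, mu0)
    return hits / trials

# H0 true (mu = 500): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mu=500.0)
# H0 false (mu = 550): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=550.0)

print(f"Type I error rate  ~ {type_1_rate:.3f}")
print(f"Type II error rate ~ {type_2_rate:.3f}")
```

Under these assumptions the Type I rate should come out near the chosen α = 0.05, while the Type II rate depends on how far the true mean is from µ₀, on the population spread, and on the sample size.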
The following is a graphic representation of two possible outcomes of a hypothesis test (“reject” versus “not reject”) and the Type
I and Type II errors.
How to Set Up the Decision Rule Regarding H₀
Back to example 1.
Step 1: Write the null and alternative hypotheses.
H₀: µ = 500
H₁: µ ≠ 500
Step 2: Choose a probability for Type I Error (the probability of rejecting a true H₀).
The probability of Type I Error is denoted by α and is called the “level of significance” of the test. Typically, 5% is selected for α.
Now all the ingredients of the hypothesis test are available:
n = 110
x̅ = $547.33
s = $217.02
α = 0.05
DECISION RULE A
Reject H₀ if |x̅ − μ₀| > MOE
|x̅ − μ₀| = |547.33 – 500| = 47.33
MOE = zα/2 × s ∕ √n = 1.96 × 217.02 ∕ √110 = 40.56
|x̅ − μ₀| = 47.33 > MOE = 40.56
Reject H₀. Conclude the mean is different
from $500.
DECISION RULE B
Reject H₀ if test statistic (TS) > critical value (CV)
Start with Decision Rule A, where MOE = zα/2 se(x̅) and se(x̅) = s ∕ √n:
|x̅ − μ₀| > zα/2 se(x̅)
Divide both sides by se(x̅):
|x̅ − μ₀| ∕ se(x̅) > zα/2
The left-hand side of the inequality is the z-score and is called the “test statistic”.
z = (547.33 − 500) ∕ 20.692 = 2.29
The z-score corresponding to the given
tail area is called the “critical value”.
zα/2 = z0.025 = 1.96
TS = z = 2.29 > CV = z0.025 = 1.96
Reject H₀.
DECISION RULE C
Reject H₀ if prob value < α
The “probability value” is 2 × the tail area
corresponding to the test statistic.
2 × P(z > 2.29) = 2 × 0.0111 = 0.0222
prob value = 0.0222 < α = 0.05
Reject H₀.
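The three decision rules are algebraically equivalent, so they must always give the same verdict. The sketch below checks this for Example 1. It is an illustration only: the function name `two_tail_z_test` and the returned dictionary are assumptions, not something defined in the notes.

```python
import math
from statistics import NormalDist

def two_tail_z_test(x_bar, s, n, mu0, alpha=0.05):
    """Apply Decision Rules A, B and C to a two-tail test of the mean."""
    std_normal = NormalDist()
    se = s / math.sqrt(n)                      # standard error of x_bar
    cv = std_normal.inv_cdf(1 - alpha / 2)     # critical value z_(alpha/2)
    moe = cv * se                              # margin of error
    ts = abs(x_bar - mu0) / se                 # test statistic |z|
    p_value = 2 * (1 - std_normal.cdf(ts))     # area in both tails
    return {
        "rule_A": abs(x_bar - mu0) > moe,
        "rule_B": ts > cv,
        "rule_C": p_value < alpha,
        "moe": moe, "ts": ts, "cv": cv, "p_value": p_value,
    }

result = two_tail_z_test(x_bar=547.33, s=217.02, n=110, mu0=500)
for key, value in result.items():
    print(key, value)
```

The p-value is computed from the unrounded test statistic, so it can differ from a table lookup in the last digit.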
Two-Tails versus One-Tail Tests
A hypothesis test is said to be a two-tail test if H₀ states that
the parameter is equal (=) to a value and H₁ states that it is
not equal (≠) to that value. The above textbook expense
example was a two-tail test. Here is another example:
Example 2
Two-tail test
To test the hypothesis, at a 5% level of significance, that the
mean vehicle speed on a freeway is 80 mph, a random sample
of 120 vehicles was clocked.
H₀: µ = 80 mph
H₁: µ ≠ 80 mph
72, 68, 86, 78, 88, 80, 82, 66, 84, 87, 66, 86
83, 69, 75, 77, 86, 80, 89, 83, 79, 90, 86, 92
86, 77, 87, 77, 80, 65, 85, 68, 69, 67, 77, 88
86, 74, 86, 69, 76, 74, 92, 83, 66, 82, 78, 66
66, 74, 76, 70, 73, 71, 80, 91, 70, 89, 68, 90
86, 87, 77, 83, 66, 79, 68, 77, 81, 65, 74, 76
76, 84, 85, 82, 68, 67, 85, 68, 73, 90, 67, 75
92, 75, 83, 79, 77, 66, 89, 79, 89, 87, 65, 83
90, 85, 88, 73, 71, 91, 75, 85, 67, 81, 73, 89
88, 83, 75, 83, 83, 75, 84, 89, 69, 70, 82, 80
x̅ = 78.67
s = 7.98
se(x̅) = 7.98 ∕ √120 = 0.728
Decision Rule A
Reject H₀ if |x̅ − μ₀| > MOE
|x̅ − µ₀| = |78.67 – 80| = 1.33
MOE = zα/2 se(x̅) = 1.96 × 7.98 ∕ √120 = 1.43
|x̅ − µ₀| = 1.33 < MOE = 1.43
Do not reject H₀. Conclude that the mean
is not significantly different from 80 mph.
Decision Rule B
Reject H₀ if TS > CV
TS = z = |x̅ − µ₀| ∕ se(x̅) = |78.67 − 80| ∕ 0.728 = 1.83
CV = zα/2 = z0.025 = 1.96
TS = 1.83 < CV = 1.96
Do not reject H₀. Conclude that the mean
is not significantly different from 80 mph.
Decision Rule C
Reject H₀ if prob value < α
prob value = 2 × P(z > 1.83)
= 2 × 0.0336 = 0.0672
prob value = 0.0672 > α = 0.05
Do not reject H₀. Conclude that the mean
is not significantly different from 80 mph.
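As a quick numerical check of Example 2, the sketch below recomputes the margin of error, the test statistic, and the prob value from the quoted summary statistics. The variable names are illustrative assumptions, and because nothing is rounded along the way, the last digit may differ slightly from the hand calculation.

```python
import math
from statistics import NormalDist

# Summary statistics quoted in Example 2.
n, x_bar, s, mu0, alpha = 120, 78.67, 7.98, 80.0, 0.05

std_normal = NormalDist()
se = s / math.sqrt(n)                        # standard error of x_bar
cv = std_normal.inv_cdf(1 - alpha / 2)       # z_(alpha/2), about 1.96
moe = cv * se                                # Rule A benchmark
ts = abs(x_bar - mu0) / se                   # Rule B test statistic
p_value = 2 * (1 - std_normal.cdf(ts))       # Rule C: twice the tail area beyond TS

reject = p_value < alpha
print(f"MOE = {moe:.2f}, TS = {ts:.2f}, prob value = {p_value:.4f}")
print("reject H0:", reject)
```

Note that the tail area is taken beyond the test statistic itself (about 1.83 here), which is what ties Rule C to Rule B.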
Lower-Tail Test
The lower tail test applies when we want to test if the sample evidence is significantly lower (less) than the hypothesized value.
Example 3
Suppose we are planning to set up a new check-out counter design in a nationwide supermarket which is claimed to reduce the
customer waiting time to below 10 minutes. But before implementing the new plan in all the stores, the design was tested in a
random sample of 40 stores for one month. The new design will be adopted if the test provides significant proof that the mean is
less than 10 minutes. The following sample data, representing the average waiting time in each test store, were obtained.
8.0, 7.9, 9.3, 9.5, 9.5, 7.4, 11.1, 11.8, 8.8, 11.2
7.5, 11.4, 8.4, 10.4, 8.3, 11.6, 11.1, 11.8, 11.0, 7.1
11.3, 11.3, 10.1, 7.8, 11.2, 11.9, 10.8, 11.6, 12.0, 9.7
7.2, 11.4, 7.9, 8.6, 9.2, 9.4, 9.0, 8.4, 9.2, 8.9
Does the sample data provide proof that the mean waiting time is significantly less than 10 minutes? Perform the test at a 5%
level of significance.
The null and alternative hypotheses are written as follows. Note that the statement the mean waiting time is “significantly less
than” indicates that the “µ < 10” should be the alternative hypothesis. The mutually exclusive statement is “µ ≥ 10”. This should
be the null hypothesis.
H₀: µ ≥ 10
H₁: µ < 10
The ingredients of the test are:
n = 40
x̅ = 9.75
s = 1.56
se(x̅) = 1.56 ∕ √40 = 0.247
α = 0.05
Note that since this is a lower-tail test, the deviation of the sample statistic x̅ from the null mean µ₀ will be negative:
x̅ − µ₀ = 9.75 − 10 = −0.25
The “−” sign should be taken into account when writing the decision rules.
Decision Rule A
Reject H₀ if x̅ − μ₀ < −MOE
In a lower-tail test use −MOE.
x̅ − µ₀ = 9.75 – 10 = −0.25
MOE = tα, df se(x̅)
Note two changes in MOE:
1. We use t instead of z (n < 100)
2. We use α instead of α/2. We are
interested only in the lower tail of
the sampling distribution of x̅.
t0.05, 39 = 1.685
MOE = 1.685 × 0.247 = 0.42 min.
x̅ − µ₀ = −0.25 > −MOE = −0.42
Do not reject H₀. The mean is not
significantly less than 10 minutes. Do not
adopt the new design.
Decision Rule B
Reject H₀ if TS < −CV
In a lower-tail test use −CV.
TS: t = (x̅ − μ₀) ∕ se(x̅) = −0.25 ∕ 0.247 = −1.012
CV: t0.05, 39 = 1.685
Since this is a lower-tail test, we are
interested only in one tail of the t
distribution.
TS = −1.012 > −CV = −1.685
Do not reject H₀.
Decision Rule C
Reject H₀ if prob value < α
prob value = P(t < −1.012) = 0.1589
To find the prob value you must use
Excel.
Excel 2010:
=T.DIST(x, deg_freedom, cumulative)
=T.DIST(-1.012,39,1)
or,
=T.DIST.RT(x, deg_freedom)
=T.DIST.RT(1.012,39)
Older versions: =TDIST(x, deg_freedom, tails)
=TDIST(1.012,39,1)
prob value = 0.1589 > α = 0.05
Do not reject H₀.
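The lower-tail prob value of Example 3 can also be obtained without Excel. The sketch below uses only the Python standard library, which has no built-in t distribution, so it integrates the Student t density numerically with Simpson's rule. The helper names `t_pdf` and `t_cdf` are hypothetical, and this is an approximation for illustration rather than a replacement for a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

# Example 3 ingredients as quoted in the text.
n, x_bar, s, mu0, alpha = 40, 9.75, 1.56, 10.0, 0.05
se = s / math.sqrt(n)
ts = (x_bar - mu0) / se                # negative in a lower-tail test
p_value = t_cdf(ts, n - 1)             # lower-tail area, like =T.DIST(ts, 39, 1)
reject = p_value < alpha

print(f"t = {ts:.3f}, prob value = {p_value:.4f}, reject H0: {reject}")
```

The result should agree with the Excel value to about three decimal places; small differences come from using the unrounded standard error.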
Upper-Tail Test
The upper tail test applies when we want to test if the sample evidence is significantly higher (greater) than the hypothesized
value.
Example 4
A random sample of n = 32 reimbursements for office visits to physicians paid by Medicare provided the following data:
109, 102, 102, 102, 103, 105, 90, 108
120, 113, 105, 92, 101, 97, 102, 118
92, 103, 98, 96, 93, 110, 93, 102
117, 118, 118, 100, 98, 108, 116, 97
The sample mean is x̅ = $104. Does the sample provide significant evidence that the mean reimbursement is greater than $100?
Perform the test of hypothesis at a 5% level of significance.
The null and alternative hypotheses are written as follows. Note that the statement, the mean reimbursement is “greater than”
$100, indicates that “µ > $100” should be the alternative hypothesis. The mutually exclusive statement is “µ ≤ $100”. This
should be the null hypothesis.
H₀: µ ≤ 100
H₁: µ > 100
The ingredients of the test are:
n = 32
x̅ = 104
s = 8.69
se(x̅) = 8.69 ∕ √32 = 1.536
α = 0.05
Decision Rule A
Reject H₀ if x̅ − μ₀ > MOE
x̅ − µ₀ = 104 – 100 = $4.00
MOE = tα, df se(x̅)
t0.05, 31 = 1.696
MOE = 1.696 × 1.536 = $2.61
x̅ − µ₀ = $4.00 > MOE = $2.61
Reject H₀. Conclude that the mean
reimbursement is greater than $100.
Decision Rule B
Reject H₀ if test statistic > critical value
TS = t = (x̅ − μ₀) ∕ se(x̅) = 4 ∕ 1.536 = 2.604
CV: t0.05, 31 = 1.696
Since this is an upper-tail test, we are
interested only in one tail of the t
distribution.
TS = 2.604 > CV = 1.696
Reject H₀.
Decision Rule C
Reject H₀ if prob value < α
prob value = P(t > 2.604) ≈ 0.007
To find the prob value you must use
Excel, with df = n − 1 = 31.
Excel 2010:
=T.DIST(x, deg_freedom, cumulative)
=T.DIST(-2.604,31,1)
or,
=T.DIST.RT(x, deg_freedom)
=T.DIST.RT(2.604,31)
Older versions: =TDIST(x, deg_freedom, tails)
=TDIST(2.604,31,1)
prob value ≈ 0.007 < α = 0.05
Reject H₀.
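Example 4 can also be worked from the raw data rather than from the summary statistics. The following sketch is illustrative only: it computes x̅ and s with Python's `statistics` module and then applies Rules A and B, taking the critical value 1.696 from the t table as quoted in the text instead of computing it.

```python
import math
from statistics import mean, stdev

# Reimbursement data from Example 4 (n = 32).
data = [109, 102, 102, 102, 103, 105, 90, 108,
        120, 113, 105, 92, 101, 97, 102, 118,
        92, 103, 98, 96, 93, 110, 93, 102,
        117, 118, 118, 100, 98, 108, 116, 97]

mu0 = 100.0
t_crit = 1.696                      # t(0.05, 31), table value quoted in the text

n = len(data)
x_bar = mean(data)
s = stdev(data)                     # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)

moe = t_crit * se                   # Rule A benchmark
ts = (x_bar - mu0) / se             # Rule B test statistic

print(f"n = {n}, x_bar = {x_bar:.2f}, s = {s:.2f}, se = {se:.3f}")
print(f"Rule A: {x_bar - mu0:.2f} > {moe:.2f} -> {x_bar - mu0 > moe}")
print(f"Rule B: {ts:.3f} > {t_crit} -> {ts > t_crit}")
```

Both rules compare the same deviation against the same benchmark, once in dollars and once in standard-error units, so they necessarily agree.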
TWO IMPORTANT GUIDELINES YOU MUST OBSERVE IN STATING H₀ AND H₁
I. The Eleventh Commandment:
Thou Shalt Not Put the “Equal” Sign in H₁
NEVER!!!
H₀: μ ≠ 100    H₁: μ = 100
H₀: μ < 100    H₁: μ ≥ 100
H₀: μ > 100    H₁: μ ≤ 100
ALWAYS!
H₀: μ = 100    H₁: μ ≠ 100
H₀: μ ≥ 100    H₁: μ < 100
H₀: μ ≤ 100    H₁: μ > 100
II. Set H₀ and H₁ such that the sample evidence conflicts with H₀ (conforms
with H₁).
INCORRECT!
x̅ = 95
H₀: μ ≤ 100    H₁: μ > 100
CORRECT
x̅ = 95
H₀: μ ≥ 100    H₁: μ < 100
In a one-tail test, if x̅ − μ₀ < 0, then the test is a lower-tail
test.
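The two guidelines can be condensed into a small helper for one-tail tests. This is a hypothetical sketch of the rule as stated in these notes (the function name and string format are assumptions): the equal sign always stays in H₀, and the direction of H₁ follows the sign of x̅ − μ₀.

```python
def one_tail_hypotheses(x_bar, mu0):
    """Return (H0, H1) for a one-tail test so that the sample conflicts with H0."""
    if x_bar < mu0:
        # Sample mean below mu0: lower-tail test.
        return f"H0: mu >= {mu0}", f"H1: mu < {mu0}"
    if x_bar > mu0:
        # Sample mean above mu0: upper-tail test.
        return f"H0: mu <= {mu0}", f"H1: mu > {mu0}"
    raise ValueError("x_bar equals mu0: the sample gives no direction for a one-tail test")

h0, h1 = one_tail_hypotheses(x_bar=95, mu0=100)
print(h0)   # H0: mu >= 100
print(h1)   # H1: mu < 100
```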
Test of Hypothesis for the Population Proportion π
Two-Tail Test
Twenty-seven percent (27%) of the U.S. adult population has a bachelor’s degree. We want to test, at a 5% level of
significance, whether the same percentage of the Indiana adult population has a bachelor’s degree. In a random sample of
n = 1,000 Hoosiers, the sample proportion of adults with a bachelor’s degree was p̅ = 0.258.
H₀: π = 0.27
H₁: π ≠ 0.27
n = 1000
p̅ = 0.258
α = 0.05
Decision Rule A
Reject H₀ if |p̅ − π₀| > MOE
|p̅ − π₀| = |0.258 – 0.27| = 0.012
MOE = zα/2 se(p̅)
se(p̅) = √(π₀(1 − π₀) ∕ n)
se(p̅) = √(0.27(1 − 0.27) ∕ 1000) = 0.014
Note you must use the null proportion π₀
in the standard error formula, not p̅.
MOE = 1.96 × 0.014 = 0.027
|p̅ − π₀| = 0.012 < MOE = 0.027
Do not reject H₀. Conclude that the
Indiana percentage is not significantly
different from the national percentage.
Decision Rule B
Reject H₀ if TS > CV
TS = |p̅ − π₀| ∕ se(p̅) = 0.012 ∕ 0.014 = 0.86
CV = zα/2 = z0.025 = 1.96
TS = 0.86 < CV = 1.96
Do not reject H₀.
Decision Rule C
Reject H₀ if prob value < α
prob value = 2 × P(z > 0.86)
= 2 × 0.1949 = 0.3898
prob value = 0.3898 > α = 0.05
Do not reject H₀.
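The two-tail proportion test can be sketched the same way. The point to notice is that the standard error is built from the null proportion π₀, not from the sample proportion. The variable names are illustrative assumptions, and since the code does not round the standard error to 0.014, the test statistic and prob value differ slightly from the hand-rounded figures.

```python
import math
from statistics import NormalDist

pi0, p_hat, n, alpha = 0.27, 0.258, 1000, 0.05

# Standard error built from the null proportion pi0, not from p_hat.
se = math.sqrt(pi0 * (1 - pi0) / n)

std_normal = NormalDist()
cv = std_normal.inv_cdf(1 - alpha / 2)     # z_(alpha/2), about 1.96
moe = cv * se                              # Rule A benchmark
ts = abs(p_hat - pi0) / se                 # Rule B test statistic
p_value = 2 * (1 - std_normal.cdf(ts))     # Rule C prob value

print(f"se = {se:.4f}, MOE = {moe:.4f}, TS = {ts:.2f}, prob value = {p_value:.4f}")
print("reject H0:", p_value < alpha)
```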
Lower-Tail Test
"Overall, 41 percent of teachers at U.S. public schools hold a master's degree." Test the hypothesis, at a 5% level of significance,
that less than 41 percent of Indiana public school teachers hold a master's degree. In a random sample of 500 Indiana public
school teachers, the sample proportion of teachers with a master’s degree was p̅ = 0.38.
H₀: π ≥ 0.41
H₁: π < 0.41
n = 500
p̅ = 0.38
α = 0.05
Decision Rule A
Reject H₀ if p̅ − π₀ < −MOE
In a lower-tail test use −MOE.
p̅ − π₀ = 0.38 – 0.41 = −0.03
MOE = zα se(p̅)
se(p̅) = √(π₀(1 − π₀) ∕ n) = √(0.41(1 − 0.41) ∕ 500) = 0.0220
MOE = 1.64 × 0.022 = 0.036
p̅ − π₀ = −0.03 > −MOE = −0.036
Do not reject H₀. Conclude that the
Indiana percentage is not significantly
less than the national percentage.
Decision Rule B
Reject H₀ if TS < −CV
In a lower-tail test use −CV.
TS: z = (p̅ − π₀) ∕ se(p̅) = −0.03 ∕ 0.022 = −1.36
CV = zα = z0.05 = 1.64
TS = −1.36 > −CV = −1.64
Do not reject H₀.
Decision Rule C
Reject H₀ if prob value < α
prob value = P(z < −1.36)
= 0.0869
prob value = 0.0869 > α = 0.05
Do not reject H₀.
Upper-Tail Test
A 2005 report stated that vehicle speed was a factor in 30 percent of fatal crashes. Test the hypothesis, at a 5% level of
significance, that currently more than 30 percent of fatal crashes involve speed. In a random sample of 800 fatal crashes in 2010,
272 involved vehicle speed.
H₀: π ≤ 0.30
H₁: π > 0.30
n = 800
p̅ = 272 ∕ 800 = 0.34
α = 0.05
Decision Rule A
Reject H₀ if p̅ − π₀ > MOE
p̅ − π₀ = 0.34 – 0.30 = 0.04
MOE = zα se(p̅)
se(p̅) = √(π₀(1 − π₀) ∕ n) = √(0.30(1 − 0.30) ∕ 800) = 0.0162
MOE = 1.64 × 0.0162 = 0.027
p̅ − π₀ = 0.04 > MOE = 0.027
Reject H₀. Conclude that in 2010 more
than 30% of fatal crashes involved speed.
Decision Rule B
Reject H₀ if TS > CV
TS: z = (p̅ − π₀) ∕ se(p̅) = 0.04 ∕ 0.0162 = 2.47
CV: zα = z0.05 = 1.64
TS = 2.47 > CV = 1.64
Reject H₀.
Decision Rule C
Reject H₀ if prob value < α
prob value = P(z > 2.47)
= 0.0068
prob value = 0.0068 < α = 0.05
Reject H₀.
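The lower-tail and upper-tail proportion tests differ only in the direction of the comparison, so one function can cover both. This is a minimal sketch, not part of the original notes: the function name `one_tail_proportion_test` and its `tail` argument are assumptions. It uses zα rather than zα/2 because only one tail is of interest.

```python
import math
from statistics import NormalDist

def one_tail_proportion_test(p_hat, pi0, n, tail, alpha=0.05):
    """One-tail z test for a proportion; tail is 'lower' or 'upper'."""
    if tail not in ("lower", "upper"):
        raise ValueError("tail must be 'lower' or 'upper'")
    std_normal = NormalDist()
    se = math.sqrt(pi0 * (1 - pi0) / n)     # uses the null proportion
    ts = (p_hat - pi0) / se                 # signed test statistic
    cv = std_normal.inv_cdf(1 - alpha)      # z_alpha, not z_(alpha/2)
    if tail == "lower":
        p_value = std_normal.cdf(ts)        # area to the left of TS
        reject = ts < -cv
    else:
        p_value = 1 - std_normal.cdf(ts)    # area to the right of TS
        reject = ts > cv
    return ts, cv, p_value, reject

# The two one-tail examples from the text.
teachers = one_tail_proportion_test(p_hat=0.38, pi0=0.41, n=500, tail="lower")
crashes = one_tail_proportion_test(p_hat=272 / 800, pi0=0.30, n=800, tail="upper")
print("teachers (ts, cv, prob value, reject):", teachers)
print("crashes  (ts, cv, prob value, reject):", crashes)
```

The exact critical value is about 1.645, which the notes round to 1.64. The verdicts are the same either way.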