Download IE256-HypothesisTesting - Course ON-LINE

Document related concepts
no text concepts found
Transcript
IŞIKIE
Hypothesis Testing
Statistical Hypotheses
A problem often confronting the scientist or engineer is the formation of a
data-based decision procedure that can produce a conclusion about some
scientific system.
Examples
A medical researcher may decide on the basis of experimental evidence whether
coffee drinking increases the risk of cancer in humans.
An engineer might have to decide on the basis of sample data whether there is a
difference between the accuracy of two kinds of gauges.
A sociologist might wish to collect appropriate data to enable him or her to decide
whether a person’s blood type and eye color are independent variables.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 1
IŞIKIE
Hypothesis Testing
In each of these cases the scientist or engineer postulates or conjectures
something about a system.
In each case, the conjecture can be put in the form of a statistical hypothesis.
Definition 10.1. A statistical hypothesis is an assertion or conjecture concerning one
or more populations.
We take a random sample from the population of interest and use the data to
provide evidence that either supports or does not support the hypothesis.
•Evidence from the sample that is inconsistent with the stated hypothesis leads
to a rejection of the hypothesis
•Evidence supporting the hypothesis leads to its acceptance.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 2
IŞIKIE
Hypothesis Testing – The role of probability
Probability of a Wrong Conclusion
The decision procedure must be done with the awareness of the probability of a
wrong conclusion.
Example: The conjecture postulated by the engineer:
“The fraction of defective p in a certain process is 0.10.”
Suppose that 100 items are tested and 12 items are found defective. Is it likely to
have 12 defective items in a sample of 100?
The acceptance of a hypothesis implies that there is no sufficient evidence to refute
it.
Rejection implies that the sample evidence refutes it.
•Rejection means that there is small probability of obtaining the sample
information observed when, in fact, the hypothesis is true.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 3
Hypothesis Testing – The role of probability
IŞIKIE
The formal statement of a hypothesis is often influenced by the structure of the
probability of a wrong conclusion.
If the scientist is in strongly supporting a contention, he or she hopes to arrive at a
contention in the form of a rejection of a hypothesis.
•If the medical researcher wishes to show strong evidence in favor of the
contention that coffee drinking increases the risk of cancer, the hypothesis
should be of the form “there is no increase in cancer risk produced by drinking
coffee.”
•To support the claim that one kind of gauge is more accurate that another, the
engineer test the hypothesis that there is no difference in the accuracy of the
two kinds of gauges.
The formal statement of the hypothesis is very important when the data analyst
formalizes experimental evidence on the basis of hypothesis testing.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 4
IŞIKIE
Hypothesis Testing – Formal Statement
The Null and Alternative Hypotheses
H0 : the null hypothesis; any hypothesis we wish to test is denoted by H0
H1 : alternative hypothesis; the complement of H0
A null hypothesis concerning a population parameter will always be stated so as to
specify exact value of the parameter, where as the alternative hypothesis allows
for the possibility of several values.
p  0.05
If H0 is p = 0.05 H1 would be one of the following
IE256 Engineering Statistics – Summer 2010
p  0.05
p  0.05
Hypothesis Testing 5
Testing a Statistical Hypothesis
IŞIKIE
Example: A certain type of cold vaccine is known to be only 25% effective after a
period of 2 years. To determine if a new vaccine is superior in providing protection
against the same virus, suppose that 20 people are chosen at random and
inoculated. Ifmore than 8 of these receiving the new vaccine surpass the 2 year
period without contracting the virus, the new vaccine will be considered superior to
the one presently in use.
The requirement that the number exceed represents a modest gain over the 5
people that could be expected to receive protection if the 20 people had been
inoculated with the vaccine already in use.
The alternative hypothesis is that the new vaccine is in fact superior. This is
equivalent to testing th hypothesis that the binomial parameter for the probability
of a success on a given trial is p = 0.25 against the alternative that p > 0.25.
H0 : p  0.25
H1 : p  0.25
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 6
Hypothesis Testing – Test Statistic and Critical Region
IŞIKIE
The Test Statistic is the value on which we base our decision. In the previous
example, the test statistic is X, the number of individuals in the test group who
receive protections at least 2 years.
All possible scores greater than 8 constitute the critical region.
The last number that we observe in passing into the critical region is called the
critical value.
In our illustration the critical value is the number 8. Therefore
If x > 8, we reject H0 in favor of the alternative hypothesis H1
If x ≤ 8, we fail to reject H0
0
1
acceptance region
critical region
Do not reject H0
Reject H0
2
3
4
5
6
7
8
IE256 Engineering Statistics – Summer 2010
9
10
11
12
13 14 15 16 17 18 19 20
Hypothesis Testing 7
IŞIKIE
Hypothesis Testing – Type I and II Errors
Definition 10.2. Rejection of the null hypothesis when it is true is called a type I
error.
Definition 10.3. Nonrejection of the null hypothesis when it is falseis called a type II
error.
In testing any statistical hypothesis, there are four possible situations that
determine whether our decision is correct or in error.
Do not reject H0
Reject H0
IE256 Engineering Statistics – Summer 2010
H0 is true
H0 is false
Correct decision
Type II error
Type I error
Correct decision
Hypothesis Testing 8
Hypothesis Testing – Type I and II Errors
IŞIKIE
The Probability of a Type I Error
The probability of committing a type I error, also called the level of significance, is
denoted by the Greek letter a .
In the previous example, a type I error will occur when more than 8 individuals
surpass the 2-year period without contracting the virus using the new vaccine and
the effective rate is 0.25.
a  P( type I error)  P( X  8 when p  0.25)

20
 b(x ; 20 ,0.25 )
x 9
8
 1   b( x ; 20 , 0.25 )
x 0
 1  b* (8 ; 20 , 0.25 )  1  0.9591  0.0409
We say that null hypothesis, p = 0.25, is being tested at the a = 0.0409 level of
significance. A critical region of size 0.0409 is very small; it would be most unusual
for more than 8 individuals to remain immune for a 2-year period with the new
vaccine assuming it is equivalent to the one now on the market.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 9
IŞIKIE
Hypothesis Testing – Type I and II Errors
The Probability of a Type II Error
If we test the null hypothesis that p = 0.25 against the alternative hypothesis that p
= 0.50, then we are able to compute the probability of not rejecting H0, when it is
false. We simply find the probability of obtaining 8 or fewer in the group that
surpass the 2-year period when p = 0. 5:
  P( type II error)  P( X  8 when p  0.50)

8
 b(x ; 20 ,0.50 )
 0.2517
x 0
Ideally, we like to use a test procedure for which the type I and type II error
probabilities are both small.
Assume, the only time he wishes to guard against the type II error is when the true
value of p is at least 0.7. If p = 0.70, this test procedure gives
  P( type II error)  P( X  8 when p  0.70)

8
 b(x ; 20 ,0.70)
 0.0051
x 0
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 10
Hypothesis Testing – Type I and II Errors
IŞIKIE
The Role of a ,  , and Sample Size
A reduction in  is always possible by increasing the size of the critical region. What
happens if we set our critical value to 7?
a

20
 b(x ; 20 ,0.25 )  1  0.8982  0.1018
x 8
7
 b(x ; 20 ,0.50 )  0.1316
x 0
For a fixed sample size, a decrease in the probability of one error usually result
in an increase in the probability of the other error.
The probability of committing both type of error can be reduced by increasing the
sample size.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 11
IŞIKIE
Hypothesis Testing – Type I and II Errors
Consider the same problem but this time with a sample of 100 individuals. If more
than 36 of the group surpass the 2-year period we reject the null hypothesis that
p = 0.25 and accept the alternative hypothesis that p > 0.25
Acceptance region
Critical region
x
x
x  36 
x  36 
Use the normal curve approximation:
  np  (100)(0.25)  25
  npq  100(0.25 )(0.75 )  4.33
36.5  25 

4.33 

 1  P Z  2.66   1  0.9961  0.0039
a  P type I error   P X  36 when p  0.25  P  Z 
  P type II error   P X  36 when p  0.50  P  Z 
 P Z  2.7   0.0035

36.5  50 

5.00 
Obviously, the type I and type II errors will rarely occur if the experiment contsists
of 100 individuals.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 12
Hypothesis Testing – Type I and II Errors
IŞIKIE
P X  x when p  0.25
x
a  0.0039
36.5  25 

a  P X  36 when p  0.25  P Z 

4
.
33


IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 13
IŞIKIE
Hypothesis Testing – Type I and II Errors
  4.33
  5.00
  25

  50
36.5
a

36.5  25 

4.33 


36.5  50 

5.00 

a  Ptype I error   P Z 
  Ptype II error   P Z 
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 14
IŞIKIE
One and Two Tailed Tests
Illustration with a Continuous Random Variable
Consider the null hypothesis that average weight of male students in a certain
college is 68 kilograms against the alternative hypothesis that it is unequal to 68
kilograms.
H0 :   68
H1 :   68
The alternative hypothesis allows for the possibility that  < 68 or  > 68
A sample mean that falls close to the hypothesized value of 68 would be
considered evidence in favor of H0. On the other hand, a sample mean that is
considerably less than or more than 68 would be evidence inconsistent with H0 and
therefore favoring H1.
The sample mean, X, is the text statistic in this case. A critical region for the test
statistic might arbitrarily be chosen to be the two intervals x  67 and x  69 . The
acceptance region will then be the interval 67  x  69 .
acceptance
region
critical region
...
Reject H0
63
64
65
66
67
IE256 Engineering Statistics – Summer 2010
Do not
reject H0
68
critical region
...
Reject H0
69
70
71
72
73
Hypothesis Testing 15
IŞIKIE
Hypothesis Testing
Assume that the standard deviation of the population of weights to be  = 3.6 .
For large samples we may substitute s for  if no other estimate of  is available.
Our decision statistic, based on a random sample of size n = 36, will be X, the
most efficient estimator of  . From the central limit theorem, we know that the
sampling distribution of X is approximately normal with standard deviation
 X   n  3.6 6  0.6 . The probability of committing a type I error, or the level
of significance of our test, is equal to the sume of the areas that have been shaded
red in each tail of the distribution in the following figure.
a /2
a /2
67
  68
69
x
a  P X  67 when   68  P X  69 when   68
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 16
Hypothesis Testing
IŞIKIE
a  P X  67 when   68  P X  69 when   68
67  68  
69  68 
 P Z 

P
Z

 

0.6  
0.6 

 PZ  1.67  PZ  1.67  2PZ  1.67  0.0950
Thus 9.5% of all samples of size 36 would lead us to reject  = 68kilograms when, in
fact, it is true. To reduce a , we have a choice of increasing the sample size or
widening the fail-to-reject (acceptance) region. Suppose that we increase the
sample size to n = 64. Then  X  3.6 8  0.45. Now
67  68  
69  68 

  P Z 

0.45  
0.45 

 PZ  2.22   PZ  2.22   2PZ  2.22   0.0264
a  P Z 
The reduction in a is not sufficient by itself to guarantee a good testing procedure.
We must also evaluate  for various alternative hypothesis.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 17
IŞIKIE
Hypothesis Testing
Illustration of the probability of committing a type I error for sample sizes n = 36
and n = 64.
 X  68
 X  0.6
a
2
a  0.0950
a
 0.0475
2
67
  68
 0.0475
x
69
 X  68
 X  0.45
a  0.0264
a
2
a
 0.0132
2
67
IE256 Engineering Statistics – Summer 2010
  68
69
 0.0132
x
Hypothesis Testing 18
IŞIKIE
Hypothesis Testing
If it is important to reject H0 when the true mean is some value  ≥ 70 or  ≤ 66,
then the probability of committing a type II error should be computed and
examined for the alternatives  = 66 and  = 70. Because of symmetry, it is only
necessary to consider the probability of not rejecting the null hypothesis that
 = 68 when the alternative  = 70 is true. A type II error will result when the
sample mean x̄ falls between 67 and 69 when H1 is true. Therefore (figure 10.6 p33)
69  70 
 67  70
Z

0
.
45
0
.
45


 P 6.67  Z  2.22   PZ  2.22   PZ  6.67
 0.0132  0.0000  0.0132
  P67  X  69 when   70  P
If the true value of  is the alternative  = 66 , the value of  will again be 0.0132.
For all possible values of  < 66 or  > 70 , the value of  will be even smaller when
n = 64, and consequently there would be little chance of not rejecting H0 when it is
false.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 19
IŞIKIE
Hypothesis Testing
Illustration of the probability of committing a type I error for sample size n = 64
when the alternative hypothesis is  = 70
 X  70
 X  0.45
 X  68
 X  0.45

67
68
69
70
71
72
x
  P67  X  69 when   70  0.0132
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 20
IŞIKIE
Hypothesis Testing
The probability of committing a type II error increases rapidly when the true value
of  approaches, but is not equal to, the hypothesized value. Of course, this is
usually the situation where we do not mind making a type II error.
For example, if the alternative hypothesis  = 68.5 is true, we do not mind
committing a type II error by concluding that the true answer is  = 68. The
probability making such an error will be high when n = 64.
69  68.5 
 67  68.5
Z

0.45 
 0.45
 P 3.33  Z  1.11  PZ  1.11  PZ  3.33
 0.8665  0.0004  0.8661
  P67  X  69 when   68.5  P
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 21
IŞIKIE
Hypothesis Testing
Illustration of the probability of committing a type I error for sample size n = 64
when the alternative hypothesis is  = 70
 X  68
 X  0.45
 X  68.5
 X  0.45

67
68 68.5 69
70
71
x
  P67  X  69 when   68.5  0.8661
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 22
Hypothesis Testing
IŞIKIE
Often different types of tests are compared by contrasting power properties.
Consider the previous illustration in which we were testing H0 :  = 68 vs H1 :  ≠ 68 .
As before, suppose we are interested in assessing the sensitivity of the test.
The test is governed by the rule that we do not reject H0 if 67 ≤  ≤ 69 . We seek the
capability of the test for properly rejecting when indeed  = 68.5 . We have seen
that the probability of a type II error is given by  = 0.8661 . Thus the power of the
test is 1 – 0.8661 = 0.1339 . In a sense, the power is a more succinct measure of how
sensitive the test is for “detecting differences” between a mean of 68 and 68.5. In
this case, if  is truly 68.5, the test would not be a good one if it is important that
the analyst have a reasonable chance of truly distinguishing between a mean of
68.0 (specified by H0) and a mean of 68.5. From the discussion so far, it should be
clear that to produce a desirable power (say, greater than 0.8 or 0.9), one must
either increase a or increase the sample size.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 23
Hypothesis Testing – Type I and II Errors
IŞIKIE
Important Properties of Test of Hypothesis
1.
The type I error and the type II are related. An decrease in the probability of
one generally results in an increase in the probability of the other.
2.
The size of the critical region, and therefore the probability of committing a
type I error, can always be reduced by adjusting the critical values
3.
An increase in the sample size n will reduce a ,  simultaneously
4.
If the null hypothesis is false,  is a maximum when the true value of a
parameter approaches the hypothesized value. The greater the distance
between the true value and the hypothesized value, the smaller  will be.
Definition 10.4. The power of a test is the probability of rejecting H0 given that a
specific alternative is true. The power of a test can be computed as 1- .
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 24
IŞIKIE
One and Two Tailed Tests
One-Tailed Tests
A test of any statistical hypothesis, where the alternative is one sided, such as
H0 :    0
H1 :    0
or perhaps
H0 :    0
H1 :    0
is called a one-tailed test.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 25
IŞIKIE
Hypothesis Testing
Two-Tailed Tests
A test of any statistical hypothesis, where the alternative is two sided, such as
H0 :    0
H1 :    0
is called a two-tailed test.
Example: Classify the following examples covered previously as one- or two-tailed
tests:
The vaccine effectiveness test
The male students’ average weight test
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 26
Hypothesis Testing
IŞIKIE
Example 10.1. A manufacturer of a certain brand of rice cereal claims that average
saturated fat content does not exceed 1.5 grams. State the null and alternative
hypotheses to be used in testing this claim and determine where (which tail or tails)
the critical region is located.
Example 10.2. A real estate agent claims that 60% of all private residences being
built today are 3-bedroom homes. To test this claim, a large sample of new
residences is inspected; the proportion of these homes with 3 bedrooms is
recorded and used as our test statistic. State the null and alternative hypotheses to
be used in this test and determine the location of the critical region.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 27
IŞIKIE
Hypothesis Testing
The Use of P-Values for Decision Making
In testing hypothesis in which the test statistic is discrete, the critical region may be
chosen arbitrarily and its size (level of significance) determined. If a is too large, it
can be reduced by making an adjustment in the critical value. It may be necessary
to increase the sample size to offset the decrease that occurs automatically in the
power of the test.
Over a number of generations of statistical analysis, it had become customary to
choose an a of 0.05 or 0.01 and select the critical region accordingly. Then, of
course, strict rejection or nonrejection of H0 would depend on that critical region.
For example, if the test is two tailed and a is set at 0.05 level of significance and
the test statistic involves, say, the standard normal distribution, then a z-value is
observed from the data and the critical region is
z  1.96 or z  1.96
where the value 1.96 is found as z0.025. A value of z in the critical region prompts the
statement, “The value of the test statistic is significant.” We can translate that into
the user’s language. For example, if the hypothesis is given by H0 :  = 10, H1 :  ≠ 10
one might say: “The mean differs significantly from the value 10.”
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 28
IŞIKIE
Hypothesis Testing
Preselection of a Significance Level
The preselection of a significance level a has its roots in the philosophy that the
maximum risk of making a type I error should be controlled. However, this
approach does not account for values of test statistics that are “close” to the
critical region.
For example, in illustration with H0 :  = 10 vs H1 :  ≠ 10 , suppose that a value of
z = 1.87 is observed; strictly speaking, with a = 0.05 the value is not significant. But
the risk of committing a type I error if one rejects H0 in this case could hardly be
considered severe. In fact, in a two-tailed scenario one can quantify this risk as
Pvalue  2P( Z  1.87 when   10)  2(0.0307)  0.0614
As a result, 0.0614 is the probability of obtaining a value of z as large or larger (in
magnitude) than 1.87 when in fact  = 10 . Although this evidence against H0 is not
as strong as that which result from a rejection at an a = 0.05 level, it is important
information to the user. Indeed, continued use of a = 0.05 or a = 0.01 is only a
result of what standards have been passed through the generations.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 29
IŞIKIE
Hypothesis Testing
The P-value approach has been adopted extensively by users in applied statistics.
The approach is designed to give the user an alternative in terms of probability to a
mere “reject” or “do not reject” conclusion. The P-value computation also gives the
user important information when the z-value falls well into the ordinary critical
region.
For example, if z is 2.73, it is informative for the user to observe that
Pvalue  2(0.0032)  0.0064
and thus the z-value is significant at a level considerably less than 0.05.
A P-value is the lowest level of significance at which the observed value of the test
statistic is significant.
The calculation of the P-value will be illustrated through the following two
examples, 10.1 on saturated fat content of cereal boxes and 10.2 on proportion of 3bedroom houses.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 30
IŞIKIE
Hypothesis Testing
Shape of the Critical Region as a Function of a
Let us reconsider Example 10.1 (p333) presented above. Assume that the standard
deviation of the saturated fat content of all boxes of rice cereal manufactured by
this manufacturer is 0.3 grams and the sample size is 40. We are testing H0 :  = 1.5,
H1 :  > 1.5 , hence the critical region has the can be expressed as { x | x > a } where a
is the critical value. The critical value, a, hence the location of the critical region
determines the level of significance, a , namely the probability of a type I error.
Conversely, when we are given a , we can find the critical value a that yields the
given level of significance. That is exactly what we shall do now:

a  1.5
 P Z 
0.3 40

P X  a when   1.5  a
Since PZ  za   a 

  a

a  1.5
 za
0.3 40
a  1.5  za
0.3
40
a  1.5  za
IE256 Engineering Statistics – Summer 2010
0.3
40
Hypothesis Testing 31
IŞIKIE
Hypothesis Testing
Therefore for different values of a we get different critical regions:
a  10%
za  1.28
a  1.561
a  10%
1.5
1.561
a  5%
za  1.64
a  1.578
a  5%
1.5
1.578
a  1%
za  2.33
a  1.610
a  1%
1.5
IE256 Engineering Statistics – Summer 2010
1.610
Hypothesis Testing 32
IŞIKIE
Hypothesis Testing
Notice how the critical region and the boundary of the critical region, namely the
critical value a, moves as the significance level a changes. In order to calculate the
P-value, we need to find the significance level a such that the observed test
statistic is right at the boundary of the critical region. Namely, the P-value is the
significance level such that the observed statistic equals the observed test statistic.
P-value is the a that satisfies this equation: x  1.5  za
0.3
40
Namely we need to look up the following za to find a : za 
x  1.5
0.3 40
Example 10.1 p333 (cont)
Assume that the standard deviation of the saturated fat content of all boxes of rice
cereal manufactured by this manufacturer is 0.3 grams. If the sample average
saturated fat content of 40 boxes of this brand of cereal is 1.545 grams, compute
the associated P-value.
1.545  1.5
x  1.545
za 
 0.95  a  17.1%
0.3 40
With such a high probability of committing a type I error, we shall not reject the null
hypothesis.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 33
IŞIKIE
Hypothesis Testing
The calculation of the P-value for a two-sided (two-tailed) test will be demonstrated
using Example 10.2 (p333) with n = 100 as follows:
a  10%
za / 2  1.64
a  0.681
b  0.519
a / 2  5%
0.519
0.60
0.681
a  5%
za / 2  1.96
a  0.696
b  0.504
a / 2  2.5%
0.504
0.60
0.696
a  1%
za / 2  2.58
a  0.726
b  0.474
0.474
IE256 Engineering Statistics – Summer 2010
a / 2  0.5%
0.60
0.726
Hypothesis Testing 34
IŞIKIE
Hypothesis Testing
The critical values, a and b, are calculated as follows:
a  0.60  za /2
0.60  0.40
100
b  0.60  za /2
0.60  0.40
100
We need to select the significance level, a , so that either a or b is equal the
observed statistic value. Use the appropriate one of the following equations to
solve for a/2 and obtain P-value = a
If p̂ > 0.60 then solve za /2 
pˆ  0.60
0.60  0.40 / 100
If p̂ < 0.60 then solve za /2 
0.60  pˆ
0.60  0.40 / 100
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 35
IŞIKIE
Hypothesis Testing
Example 10.2 p333 (cont)
Calculate the corresponding P-value if the real estate agent found 70 homes to
have 3-bedrooms out of 100 inspected. Find the probability of committing a type II
error if the null hypothesis that p = 0.60 is not rejected when the alternative
hypothesis p = 0.75 is true.
a
0.70  0.60
70
za 2 
 2.04 
 0.0206
 0.70
0.60  0.40 100
2
100
P-value = 2·0.0206 = 4.12% . With a low probability of committing a type I error we
reject the null hypothesis that p = 0.60. Since we are rejecting the null hypothesis a
type II error is not applicable. But if we were to calculate the probability of a type II
error for the stated specific alternative when for the critical region that
corresponds to the P-value (i.e. a = 4.12%) here is how:
pˆ 
The critical region when a = 4.12% is { p̂ | p̂ > 0.70 or p̂ < 0.50}
  P0.50  pˆ  0.70 when p  0.75

 P

0.50  0.75
Z
0.75  0.25 100
IE256 Engineering Statistics – Summer 2010
0.70  0.75
0.75  0.25 100

  12.41%

Hypothesis Testing 36
IŞIKIE
Hypothesis Testing – Summary 3
Hypothesis Testing with Fixed Probability of Type I Error
1. State the null and alternative hypothesis.
2. Choose a fixed significance level a .
3. Choose an appropriate test statistic and establish the critical
region based on a .
4. From the computed test statistic, reject H0 if the test statistic is in
the critical region. Otherwise, do not reject.
5. Draw scientific or engineering conclusions.
Significance Testing (P-value Approach)
1. State the null and alternative hypothesis.
2. Choose an appropriate test statistic.
3. Compute P-value based on computed value of test statistic.
4. Use judgement based on P-value and knowledge of scientific
system.
Homework #8
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 37
IŞIKIE
Hypothesis Testing
Tests on a Single Mean, Variance Known
So far we have been using the statistic X̄ in testing. We would like to standardize
our testing procedure from now on and use the statistic Z instead:
Z
X
/ n
Hence, for the following two-sided test
H0 :   0
H1 :   0
we will use the critical region given by
z  za /2
or
z  za /2
x  0
where z 
and za/2 is as defined before.
/ n
With this definition of the critical region, the probability of committing a type I error
is a .
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 38
Hypothesis Testing
IŞIKIE
Similarly for the following one-sided test
H0 :   0
H1 :   0
the critical region is given by
z  za
And finally
H0 :   0
H1 :   0
the critical region is given by
z   za
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 39
IŞIKIE
Hypothesis Testing
Example 10.3 p340
A random sample of 100 recorded deaths in the United States during the past year
showed an average life span of 71.8 years. Assuming a population standard
deviation of 8.9 years, does this seem to indicate that the mean life span today is
greater than 70 years? Use a 0.05 level of significance.
The hypothesis:
Critical region:
Computations:
H0 :   70 years
H1 :   70 years
z  z0.05
x  0
z  1.645 where z 
 n
71.8  70
z
 2.02
8.9 / 100
Decision: Reject H0 and conclude that the mean life span today is
greater than 70 years. The P-value is
P  value  PZ  2.02   0.0217
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 40
IŞIKIE
Hypothesis Testing
Example 10.4 p340
A manufacturer of sports equipment has developed a new synthetic fishing line
that he claims has a mean breaking strength of 8 kilograms with a standard
deviation of 0.5 kilogram. Test the hypothesis that  = 8kilograms against the
alternative hypothesis that  ≠ 8kilograms if a random sample of 50 lines is tested
and found to have a mean breaking strength of 7.8 kilograms. Use a 0.01 level of
significance.
The hypothesis:
H0 :   8 kilograms
H1 :   8 kilograms
Critical region:
z > z0.005 or z < –z0.005
z > 2.575 or z < –2.575
Computations:
z
x  0
where z 
/ n
7.8  8
 2.83
0.5 50
Decision: Reject H0 and conclude that the average breaking strength
is not equal to 8 but is, less than 8 kilograms. The P-value is
Pvalue  2  PZ  2.83   2  0.0023  0.0046
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 41
Hypothesis Testing
IŞIKIE
Relationship to Confidence Interval Estimation
The hypothesis testing approach to statistical inference we have covered so far is
very closely related to the confidence interval approach we have seen in the
previous chapter.
Confidence interval estimation involves computation of bounds for which it is
“reasonable” that the parameter in question is inside the bounds. On the
otherhand, hypothesis testing involves checking to see if the test statistic is in a
“reasonable” region if the null hypothesis is in fact true. Let us demonstrate this
equivalence in the case of a single population mean  with  2 known. The
structure of both hypothesis testing and confidence interval estimation is based on
the random variable
X
Z
/ n
It turns out that the testing of H0 :  = 0 against H1 :  ≠ 0 at a significance level a
is equivalent to computing a 100(1 – a )% confidence interval on  and rejecting H0
if 0 is outside the confidence interval. If 0 is inside the interval the hypothesis is
not rejected. The equivalence is very intuitive and quite simple to illustrate.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 42
IŞIKIE
Hypothesis Testing
If H0 is not rejected then –za/2 < Z < za/2
 za 2 
x  0
 za 2
/ n
which is equivalent to
x  za /2

n
 0  x  za /2

n
The confidence interval equivalence to hypothesis testing extends to difference
between two means, variances, ratios of variance, and so on. We should not
consider confidence interval estimation and hypothesis testing as separate forms
of statistical inference.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 43
IŞIKIE
Hypothesis Testing
Tests on a Single Mean, Variance Unknown
Under the assumptions we have made regarding the confidence interval for the
population mean when the variance is not known, the following T statistic has a
Student t-distribution:
X
T
S/ n
In this situation, the following critical regions will be used with the corresponding
tests:
H0 :   0
H1 :   0
H0 :   0
H1 :   0
H0 :   0
H1 :   0
t  ta /2, n1 or t  ta /2, n1
t  ta ,n1
t  ta ,n1
x  0
where t 
and ta,n-1 is as defined before.
s/ n
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 44
Hypothesis Testing
IŞIKIE
Example 10.5 p343
The Edison Electric Institute has published figures on the annual number of kilowatt
hours expended by various home appliances. It is claimed that a vacuum cleaner
expends an average of 46 kilowatt hours per year. If a random sample of 12 homes
included in a planned study indicates that vacuum cleaners expend an average of
42 kilowatt hours per year with a standard deviation of 11.9 kilowatt hours, does
this suggest at the 0.05 level of significance that vacuum cleaners expend, on the
average, less than 46 kilowatt hours annually? Assume the population of kilowatt
hours to be normal.
The hypothesis:
Critical region:
H0 :   46 kilowatt hours
t  t0.05
x  0
H1 :   46 kilowatt hours
t  1.7960
where t 
s n
42  46
t
 1.16
  11
Computations:
11.9 12
Decision: Do not reject H0 and conclude that the average number of
kilowatt hours expended annually by home vacuum
cleaners is not significantly less than 46. The P-value is
P  value  PT  1.16  0.1353
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 45
IŞIKIE
Hypothesis Testing
Tests on Two Means, Variances Known
The test statistic is
Z
( X 1  X 2 )  (  1  2 )
 12 / n1   22 / n2
The following critical regions will be used with the corresponding tests:
H0 :  1   2  d0
H 1 :  1   2  d0
z  za /2
H0 :  1   2  d0
H 1 :  1   2  d0
z  za
H0 :  1   2  d0
H 1 :  1   2  d0
z   za
where z 
( x 1  x2 )  d0
 12 / n1

 22 / n2
or z   za /2
and za is as defined before.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 46
IŞIKIE
Hypothesis Testing
Tests on Two Means, Variances Equal But Unknown
The test statistic is
( X  X 2 )  (  1  2 )
T 1
Sp 1 / n1  1 / n2
and
Sp 
S12 (n1  1)  S22 (n2  1)
n1  n2  2
The following critical regions will be used with the corresponding tests:
H0 :  1   2  d0
H 1 :  1   2  d0
t  ta / 2,n1 n2 2
H0 :  1   2  d0
H 1 :  1   2  d0
t  ta ,n1 n2 2
H0 :  1   2  d0
H 1 :  1   2  d0
t  ta ,n1 n2 2
( x 1  x 2 )  d0
t

, sp 
where
s p 1 / n1  1 / n2
or t  ta / 2,n1 n2 2
s12 (n1  1)  s22 (n2  1)
n1  n2  2
and ta ,n1+n2-2 is as
defined before.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 47
IŞIKIE
Hypothesis Testing
Example 10.6 p346
An experiment was performed to compare the abrasive wear of two different
laminated materials. Twelve pieces of material 1 were tested by exposing each
piece to a machine measuring wear. Ten pieces of material 2 were similarly tested.
In each case, the depth of wear was observed. The samples of material 1 gave an
average (coded) wear of 85 units with a sample standard deviation of 4, while the
samples of material 2 gave an average of 81 and a sample standard deviation of 5.
Can we conclude at the 0.05 level of significance that the abrasive wear of material
1 exceeds that of material 2 by more than 2 units? Assume the populations to be
approximately normal with equal variances.
H0 :  1  2  2
H1 :  1   2  2
t
Critical region:
t > 1.725
sp 
42 (11)  52 (9)
 4.478
12  10  2
( x1  x2 )  d0
(85  81)  2

 1.04
s p 1 n1  1 n2
4.478 1 12  1 10
  20
Do not reject H0; we are unable to conclude that the abrasive wear of material 1
exceeds that of material 2 by more than 2 units. The P-value is
Pvalue  PT  1.04   0.1553
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 48
IŞIKIE
Hypothesis Testing
Tests on Two Means, Unknown and Unequal Variances
The test statistic is
T 
( X 1  X 2 )  (  1  2 )
s12 / n1  s22 / n2
and


s12
n1

2

s12
n1  s22
(n1  1) 

n2
s22

2
n2

2
(n2  1)
The following critical regions will be used with the corresponding tests:
H0 :  1   2  d0
H 1 :  1   2  d0
t  ta / 2,  or t  ta / 2, 
H0 :  1   2  d0
H 1 :  1   2  d0
t  ta ,
H0 :  1   2  d0
H 1 :  1   2  d0
t  ta ,
where t 
( x 1  x 2 )  d0
s12 / n1

s22 / n2
and ta ,n is as defined before.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 49
IŞIKIE
Hypothesis Testing
Tests on Two Means, Paired Observations
The test statistic is
D  D
T
SD / n
The following critical regions will be used with the corresponding tests:
H0 :  D  d0
H 1 :  D  d0
t  ta / 2, n1 or t  ta / 2, n1
H0 :  D  d0
H 1 :  D  d0
t  ta , n1
H0 :  D  d0
H 1 :  D  d0
t  ta , n1
where t 
d  d0
and ta ,n-1 is as defined before.
sD / n
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 50
Hypothesis Testing
IŞIKIE
Example 10.7 p346
Androgen (ng/ml)
In a study conducted in the forestry and
Time of 30 Minutes after
Deer
di
Injection
wildlife department at Virginia Polytechnic
Injection
1
2.76
7.02
4.26
Institute and State University, J. A. Watson
2
5.18
3.10
-2.08
examined the influence of the drug
3
2.68
5.44
-2.76
succinylcholine on the circulation levels of
4
3.05
3.99
0.94
5
4.10
5.21
1.11
androgens in the blood. Blood samples from
6
7.05
10.26
3.21
the wild, free-ranging deer were obtained
7
6.60
13.91
7.31
via the jugular vein immediately after an
8
4.79
18.53
13.74
9
7.39
7.91
0.52
intramuscular injection of succinylcholine
10
7.30
4.85
-2.45
using darts and a capture gun. Deer were
11
11.78
11.10
-0.68
bled again approximately 30 minutes after
12
3.90
3.74
-0.16
13
26.00
94.03
68.03
the injection and then released. The levels
14
67.48
94.03
26.55
of androgens at time of capture and
15
17.04
41.70
24.66
30 minutes later, measured in nanograms
per milliliter (ng/ml), for 15 deer are given on the right. Assuming that the two
populations are normally distributed, test at the 0.05 level of significance whether
the androgen concentrations are altered after 30 minutes of restraint.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 51
IŞIKIE
Hypothesis Testing
Let 1 and 2 be the average androgen concentration at the time of injection and
30 minutes later, respectively. We proceed as follows:
H0 :  1  2 or D   1  2  0
H1 :  1  2 or D   1  2  0
Critical region:
t  2.145 or t  2.145
Computations:
d  9.848 and sd  18.474
9.848  0
t
 2.06
18.474 / 15
t
d  d0
sD / n
  14
Though the t-statistic is not significant at the 0.05 level, the P-value is
P  value  2  PT  2.06  2  0.0292  0.0584
As a result, there is some evidence that there is difference in mean circulating levels
of androgen.
Summary of Test Procedures see page 351 of the textbook for a summary of
the hypothesis testing procedures
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 52
IŞIKIE
Hypothesis Testing
Choice of Sample Size for Testing Means: A Single Mean
Suppose that we wish to test the hypothesis
H0 :   0
H1 :   0
with a significance level a when the variance  2 is known and the specific
alternative is  = 0 + d ,  is calculated as follows (see the figure on the next
page)
  P X  a when   0  d 
 X  ( 0  d ) a  ( 0  d )

 P

when   0  d 
/ n
 / n

a  0
d 
d 


 P Z 


P
Z

z




a
/ n / n 
/ n 


where a  0  za

n
. From which we conclude that  z   za 
IE256 Engineering Statistics – Summer 2010
d
/ n
Hypothesis Testing 53
IŞIKIE
Hypothesis Testing
H0 true
H1 true
 a
0
a
0d
P X  a when   0   a  a  0  za

n
  P X  a when   0  d 


d 

 P Z  za 
 n 

  z   za 
IE256 Engineering Statistics – Summer 2010
d
/ n
Hypothesis Testing 54
IŞIKIE
Hypothesis Testing
We obtain the sample size as
n
( za  z  ) 2  2
d2
In the case of a two-sided test we obtain the power 1 –  for a specified alternative
by  = 0 + d when
( za /2  z  ) 2  2
n
2
d
Example 10.8 p352
Suppose we wish to test the hypothesis H0 :  = 68kg; H1 :  > 68kg for the weights
of male students at a certain college using an a = 0.05 level of significance when it is
known that  = 5kg . Find the sample size required if the power of our test is to be
0.95 when the true mean is 69 kilograms.
Since a =  = 0.05 we have za = z = 1.645 . For the alternative  = 69, we have d = 1
and then
(1.645  1.645) 2 25
n
 270.6  271
1
In plain English, samples of size 271 are required if the test is to reject the null
hypothesis 95% of the time when, in fact,  is as large as 69 kg.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 55
IŞIKIE
Hypothesis Testing
Choice of Sample Size for Testing Means: Two Means
A similar procedure can be used to determine the sample size n = n1 = n2 required for
a specific power of the test in which two population means are being compared.
Suppose we wish to test the hypothesis
H0 :  1  2  d0
H 1 :  1   2  d0
with a significance level a when the variances 1 and 2 are known and the specific
alternative is 1 – 2 = d0 + d , the power of our test is
1    P X 1  X 2  d0  a when  1  2  d0  d 
where a  za /2 ( 12   22 ) / n
  P a  X 1  X 2  d0  a when  1  2  d0  d 

 P


 a d
( 12
  22 )

n
X 1  X 2  (d0  d )
( 12
  22 )

n
For the specific alternative 1 – 2 = d0 + d the statistic
has the standard normal distribution.
IE256 Engineering Statistics – Summer 2010


( 12   22 ) n 
a d
X 1  X 2  (d0  d )
( 12   22 ) / n
Hypothesis Testing 56
IŞIKIE
Hypothesis Testing
H0 true
a /2
d0a
H1 true
 a /2
d0
d0a
d0d
a  za /2 ( 12   22 ) / n
  P a  X 1  X 2  d0  a when  1   2  d0  d 
  P X 1  X 2  d0  a when  1   2  d0  d 
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 57
IŞIKIE
Hypothesis Testing
Solving for za/2 we get a  za /2 ( 12   22 ) / n
Then
 za /2 
a
( 12   22 ) / n


d
d


  P  za 2 
 Z  za 2 

( 12   22 ) n
( 12   22 ) n 


 

d
d




  P Z  za 2 
 P Z  za 2 

( 12   22 ) n  
( 12   22 ) n 



Approximating the very small quantity P Z  za /2  d / ( 12   22 ) / n as zero, we
obtain


d


  P Z  za /2 

( 12   22 ) / n 

From which we conclude that
 z   za /2 
d
( 12
  22 )
/n
IE256 Engineering Statistics – Summer 2010

n
( za /2  z  ) 2 ( 12   22 )
d2
Hypothesis Testing 58
IŞIKIE
Hypothesis Testing
Tests on a Single Proportion
There are three possible tests on a single proportion:
H0 : p  p0
H1 : p  p0
H0 : p  p0
H1 : p  p0
H0 : p  p0
H1 : p  p0
We can use the binomial test statistic X or we could just as well use P̂ = X / n .
Because X is a discrete binomial variable, it is unlikely that a critical region can be
established whose size is exactly equal to a prespecified value of a . For this reason
it is preferable, in dealing with small samples, to base our decisions on P-values.
Here is how we can calculate the P-values depending on the alternative hypothesis:
H0 : p  p0
H1 : p  p0
P-value  2P X  x when p  p0 
P-value  2P X  x when p  p0 
H0 : p  p0
H1 : p  p0
P-value  P X  x when p  p0 
H0 : p  p0
H1 : p  p0
P-value  P X  x when p  p0 
IE256 Engineering Statistics – Summer 2010
if x  np0
if x  np0
Hypothesis Testing 59
IŞIKIE
Hypothesis Testing
In the preceeding formulas, x is the number of successes in our sample of size n. If
the P-value is less than or equal to a , our test is significant at the a level and we
reject H0 in favor of H1.
Example 10.10 p363
A builder claims that heat pumps are installed in 70% of all homes being constructed
today in the city of Richmont, Virginia. Would you aggree with this claim if a random
survey of new homes in this city shows that 8 out of 15 had heat pumps installed?
Use a 0.10 level of significance.
The hypothesis:
H0 : p  70%
H1 : p  70%
Test statistic:
Binomial variable X with p = 0.7 and n = 15
Since x = 8 is less than np0 = (15)(0.7) = 10.5 the P-value is calculated as follows:
8
P  value  2  P X  8 when p  0.7  2   b( x;15,0.7)  0.2622  0.10
x 0
Decision: Do not reject H0. Conclude that there is insufficient reason to doubt
the builder’s claim.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 60
IŞIKIE
Hypothesis Testing
For small sample sizes (e.g. n < 20) we would use the binomial distribution to
calculate the P-values. For bigger samples we can approximate the P-value via the
normal approximation:
x  np0
z
np0q0
or
z
pˆ  p0
p0q0
n
Hence the critical regions become:
H0 : p  p0
H1 : p  p0
z  za /2
H0 : p  p0
H1 : p  p0
z  za
H0 : p  p0
H1 : p  p0
z   za
IE256 Engineering Statistics – Summer 2010
or z   za /2
Hypothesis Testing 61
IŞIKIE
Hypothesis Testing
Example 10.11 p363
A commonly prescribed drug for relieving nervous tension is believed to be only
60% effective. Experimental results with a new drug administered to a random
sample of 100 adults who were suffering from nervous tension show that 70
received relief. Is this sufficient evidence to conclude that the new drug is superior
to the one commonly described? Use a 0.05 level of significance.
The hypothesis:
H0 : p  60%
H1 : p  60%
Critical region:
z  z0.05
z  1.645
where z 
pˆ  p0
p0q0 / n
Computations:
pˆ 
x
70

 0.7
n
100
z
0.7  0.6
 2.04
(0.6)(0.4) / 100
Pvalue  PZ  2.04   0.0207
Decision: Reject H0 and conclude that the new drug is superior.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 62
IŞIKIE
Hypothesis Testing
Tests on Two Proportions
Although we could conceive of more types of hypotheses, we will focus on the
following three:
H0 : p1  p2
H1 : p1  p2
H0 : p1  p2
H1 : p1  p2
H0 : p1  p2
H1 : p1  p2
We use the random variable P̂1 – P̂2 as our test statistic. As discussed before, if the
sizes of the two samples from which P̂1 and P̂2 are computed are sufficiently
large, then the difference of these two point estimators is approximately normally
distributed with mean p1 – p2 and variance p1q1/n1 + p2q2/n2 . Therefore, our critical
region can be established by using the standard normal variable
(Pˆ1  Pˆ2 )  ( p1  p2 )
Z
p1q1 / n1  p2 q2 / n2
When H0 is true, we can substitute p1 = p2 = p and q1 = q2 = q (where p and q are the
common values)
Pˆ1  Pˆ2
Z
pq(1 / n1  1 / n2 )
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 63
IŞIKIE
Hypothesis Testing
However, we must estimate the parameters p and q that appear in the above
statistic. When the null hypothesis is assumed true, we basically assume that both
samples come from populations with equal proportion values. Therefore we can
pool the data from both samples to estimate the unknown parameter:
pˆ 1  pˆ 2
z
pˆ qˆ (1 / n1  1 / n2 )
where
pˆ 1 
x1
n1
x
pˆ 2  2
n2
As before, the critical regions are:
H0 : p1  p2
z  za /2
H1 : p1  p2
H0 :
H1 :
p1  p2
p1  p2
z  za
H0 :
H1 :
p1  p2
p1  p2
z   za
IE256 Engineering Statistics – Summer 2010
x x
pooled estimate of
pˆ  1 2 
the proportion p
n1  n2
or z   za /2
Hypothesis Testing 64
Hypothesis Testing
IŞIKIE
Example 10.12 p365
A vote is to be taken among the residents of a town and the surrounding county to
determine whether a proposed chemical plant should be constructed. The
construction site is within the town limits, and for this reason many voters in the
county feel that the proposal will pass because the large proportion of town voters
who favor the construction. To determine if there is a significant difference in the
proportion of town voters and county voters favoring the proposal, a poll is taken.
If 120 of 200 town voters favor the proposal and 240 of 500 county residents favor
it, would you agree that the proportion of town voters favoring the proposal is
higher than the proportion of county voters? Use an a = 0.05 level of significance.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 65
IŞIKIE
Hypothesis Testing
Let p1 and p2 be the true proportion of voters in the town and county, respectively,
favoring the proposal.
The hypothesis: H0 : p1  p2
H1 : p1  p2
Critical region:
z  z0.05  z  1.645
Computations:
pˆ 1 
z
120
 0.60
200
pˆ 2 
240
 0.48
500
pˆ 
120  240
 0.51
200  500
0.60  0.48
 2.90
(0.51)(0.49)(1 / 200  1 / 500)
P  value  PZ  2.90   0.0019
Decision: Reject H0 and agree that the proportion of town voters favoring
the proposal is higher than the proportion of county voters.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 66
IŞIKIE
Hypothesis Testing
Tests on a Variance
When testing a population variance we need to consider the following three
hypothesis:
H0 :  2   02
H0 :  2   02
H0 :  2   02
H1 :  2   02
H1 :  2   02
H1 :  2   02
We base our rejection/nonrejection decision on the sample variance information
obtained from the data. Assuming the population being sampled has normal
distribution, the chi-squared value for testing  2   02 is given by  2  (n  1) s 2 /  02
where s2 is the sample variance, and  02 is the value of  2 given by the null
hypothesis.
H0 :  2   02
2
2
2
2



or



a /2
1a /2
H1 :  2   02
H0 :  2   02
H1 :  2   02
 2  a2
H0 :  2   02
H1 :  2   02
 2  12a
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 67
IŞIKIE
Hypothesis Testing
Example 10.13 p367
A manufacturer of car batteries claims that the life of his batteries is approximately
normally distributed with a standard deviation equal to 0.9 year. If a random
sample of 10 of these batteries has a standard deviation of 1.2 years, do you think
that  > 0.9 year? Use a 0.05 level of significance.
The hypothesis:
H0 :  2  0.81
Critical region:
 2   02.05
H1 :   0.81
2
Computations:
2 
  16.919
2
(10  1)(1.2) 2
P  value  P
(0.9)
2
where  
2
(n  1) s 2
 02
 16.0
 2  16.0  0.07
Decision: The 2-statistic is not significant at the 0.05 level. However,
based on the P-value 0.07, there is evidence that  > 0.9 year.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 68
IŞIKIE
Hypothesis Testing
Testing the Equality of Two Variances
When testing the equality of two variances the usual alternatives are:
H0 :  12   22
H1 :  12   22
H0 :  12   22
H1 :  12   22
H0 :  12   22
H1 :  12   22
For independent samples of size n1 and n2, respectively, we use the f-value f  s12 / s22
for testing  12   22 where s12 and s22 are the variances computed from the two
samples. If the two populations are approximately normally distributed and the null
hypothesis is true, the ratio f  s12 / s22 is a value of the F-distribution with n1 = n1 – 1
and n2 = n2 – 1 degrees of freedom. Therefore the critical regions are:
H0 :  12   22
H1 :  12   22
f  fa /2( 1 , 2 )
H0 :  12   22
H1 :  12   22
f  fa ( 1 , 2 )
H0 :  12   22
H1 :  12   22
f  f1a ( 1 , 2 )
IE256 Engineering Statistics – Summer 2010
or f  f1a /2( 1 , 2 )
Hypothesis Testing 69
IŞIKIE
Hypothesis Testing
Example 10.14 p369
In testing for the difference in abrasive wear of the two materials in Example 10.6,
we assumed that the two unknown population variances are equal. Were we
justified in making this assumption? Use a 0.10 level of significance.
The hypothesis:
H0 :  12   22
Critical region:
f  f0.05 (11,9) or
f  3.11
H1 :  12   22
or
f  f0.95 (11,9)
f  0.34
where f  s 12 / s22
Computations:
f
16
 0.64
25
Decision: Do not reject H0. Conclude that there is insufficient evidence that
the variances differ (i.e. are no equal).
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 70
Review Questions
IŞIKIE
Exercise 10.14. A manufacturer has developed a new fishing line, which he claims
has a mean breaking strength of 15 kilograms with a standard deviation of 0.5
kilograms. To test the hypothesis that mu=15 kilograms against the alternative that
mu<15 kilograms, a random sample of 50 lines were tested. The critical region was
defined to be x_bar<14.9.
(a) Find the probability of type I error
(b) Evaluate type II error for the alternatives mu=14.8 and mu=14.9 kilograms
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 71
Review Questions
IŞIKIE
10.36. A large automobile manufacturing company is trying to decide whether to
purchase brand A or brand B tires for its new models. To help arrive at a decision,
an experiment is conducted using 12 of each brand. The tires are run until they wear
out. The results are as shown on the table.
Test the hypothesis that there is no difference in the average wear of 2 brands of
tires. Assume the populations are approximately normally distributed with equal
variances. Use a P-value.
IE256 Engineering Statistics – Summer 2010
Hypothesis Testing 72
Related documents