Hypothesis Testing – Population Mean
Z-test About One Mean
The Z-test about a population mean is applied in the following three cases:
a. The population distribution is normal and the population standard deviation σ is
known. Here the sample size is irrelevant, i.e., we can use this test with large or
small samples.
b. The sample size is large and the population standard deviation σ is known.
In this case we rely on the Central Limit Theorem.
c. The sample size is large and the population distribution is unknown. If we do not
know the population standard deviation σ, we can use the sample standard deviation s
instead. In this case we also rely on the Central Limit Theorem.
The null hypothesis is made about the mean of the population as
$$H_0: \mu = \mu_0 \tag{2.1}$$
and the alternative hypothesis $H_1$ is one of the following:
$$\mu > \mu_0, \qquad \mu < \mu_0, \qquad \mu \neq \mu_0. \tag{2.2}$$
As before we shall call the test “right-tail,” “left-tail,” or “two-tail” test depending on the
choice of alternative hypothesis in (2.2).
In all three cases above the test statistic
$$\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}} \tag{2.3}$$
is approximately standard normal. In fact, in case (a) the variable (2.3) is exactly
standard normal.
The calculation needed to warrant rejection of the null hypothesis, at level of
significance α, is summarized in the following table.
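Test            H0               H1                 Rejection Criterion
Right-Tailed    $\mu = \mu_0$    $\mu > \mu_0$      $\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}} > z_{\alpha}$
Left-Tailed     $\mu = \mu_0$    $\mu < \mu_0$      $\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}} < -z_{\alpha}$
Two-Tailed      $\mu = \mu_0$    $\mu \neq \mu_0$   $\left|\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}}\right| > z_{\alpha/2}$

In the two-tailed test the rejection criterion
$$\left|\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}}\right| > z_{\alpha/2}$$
simply means that
$$\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}} > z_{\alpha/2} \quad \text{or that} \quad \frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}} < -z_{\alpha/2}.$$
This means that the test statistic is in one of the tails of the rejection region.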
Fig 1. Left-tailed, right-tailed, and two-tailed tests.
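The decision rule for the three kinds of Z-test can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name `z_test`, its parameters, and the sample numbers are made up, and the critical values come from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sd, n, alpha, tail):
    """Return (z, critical value, reject?) for a Z-test about one mean.

    tail is "right", "left", or "two"; sd is sigma (or s for a large sample).
    """
    z = (xbar - mu0) / (sd / sqrt(n))             # test statistic (2.3)
    std_normal = NormalDist()                     # mean 0, standard deviation 1
    if tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)      # z_alpha
        reject = z > crit
    elif tail == "left":
        crit = std_normal.inv_cdf(1 - alpha)      # z_alpha, used as -z_alpha
        reject = z < -crit
    elif tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
        reject = abs(z) > crit
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z, crit, reject


# Made-up numbers: sample mean 52 from n = 100, mu0 = 50, sigma = 10.
z, crit, reject = z_test(52, 50, 10, 100, alpha=0.05, tail="two")
print(round(z, 3), round(crit, 3), reject)
```

The returned critical value is always positive; the left-tailed branch compares the statistic against its negative, matching the table above.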
Example 1.
Farmer Bill has grown tomatoes for many years and he used to have tomatoes with the
average weight of 150 gm. A random sample of a new batch of 45 tomatoes has average
weight of 167 gm. The population standard deviation is known to be 48 gm. Can we
conclude, with the level of significance of 10%, that his tomatoes are bigger this year?
Answer:
In our test classification this is case (b) for the Z-test.
We assume the tomatoes are as big this year as in the past. The alternative
hypothesis is that they are bigger:
$$H_0: \mu = 150, \qquad H_1: \mu > 150$$
$$\frac{\bar{X} - \mu_0}{\sqrt{\sigma^2/n}} = \frac{167 - 150}{48/\sqrt{45}} = 2.3758 > 1.282 = z_{0.10}.$$
Fig 2.
The test statistic exceeds the critical value, so we can reject the null hypothesis:
the answer is yes, the tomatoes are bigger this year.
The p-value of the test is fairly small, only 0.0088 (see Fig 3). This means we would
be able to reject the null hypothesis even if the level of significance were 1%.
Fig 3.
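The numbers in Example 1 can be checked with a short Python sketch. The variable names are illustrative, and `statistics.NormalDist` from the standard library stands in for the normal table; the results should agree with the values quoted above up to rounding.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 1 (right-tailed test, sigma known).
xbar, mu0, sigma, n, alpha = 167, 150, 48, 45, 0.10

z = (xbar - mu0) / (sigma / sqrt(n))       # observed test statistic
z_alpha = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value z_0.10
p_value = 1 - NormalDist().cdf(z)          # area to the right of z

print(f"z = {z:.4f}, z_alpha = {z_alpha:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z > z_alpha else "do not reject H0")
```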
Example 2.
A sample of 120 men frequenting a certain local bar shows that their average education
level is 12.3 with a standard deviation of 2.6. The average education level in the US is
13. Can we conclude that the men in our local bar are less educated than the population
on average, using a 1% significance level?
Answer:
This is case (c) for the Z-test. Since we do not know the population standard deviation,
we shall use s instead of σ.
We assume they are as educated as the general population. The alternative
hypothesis is that they are less educated:
$$H_0: \mu = 13, \qquad H_1: \mu < 13$$
$$\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} = \frac{12.3 - 13}{2.6/\sqrt{120}} = -2.95 < -2.33 = -z_{0.01}.$$
They are less educated (we reject the null hypothesis).
The p-value of the test is 0.0016.

Example 3.
An insurance company is reviewing its current policy rates. When originally setting the
rates they believed that the average claim amount was $3,500. They are concerned that
the true mean is actually higher than this, because they could potentially lose a lot of
money. They randomly selected 80 claims, and calculated the sample mean of $3,720.
Assuming that the standard deviation of claims is $1,450, use significance level of 5% to
see if the insurance company should be concerned about the rates.
Answer:
Once again we are dealing with case (c) for the Z-test of the population mean.
$$H_0: \mu = 3500, \qquad H_1: \mu > 3500$$
$$\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} = \frac{3720 - 3500}{1450/\sqrt{80}} = 1.357 < 1.645 = z_{0.05}.$$
Fig 4.
Based on this we cannot claim, at the 5% significance level, that the average claim
amount is higher than $3,500.
The p-value of the test is 0.0874, which means we would be able to reject the null
hypothesis at the 10% significance level.
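The remark about the 10% level relies on the fact that rejecting at level α is the same as having a p-value below α. A minimal Python sketch of that comparison for Example 3 follows; the variable names and the set of levels tried are illustrative choices, not part of the original notes.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 3 (right-tailed test).
xbar, mu0, sd, n = 3720, 3500, 1450, 80

z = (xbar - mu0) / (sd / sqrt(n))
p_value = 1 - NormalDist().cdf(z)   # area to the right of z

# Rejecting H0 at level alpha is equivalent to p_value < alpha.
decisions = {alpha: p_value < alpha for alpha in (0.01, 0.05, 0.10)}

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: {'reject H0' if reject else 'do not reject H0'}")
```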
T-test About One Mean
If the population is normally distributed and the population standard deviation σ is
unknown, the test statistic to use is an instance of Student's T random variable with
n – 1 degrees of freedom, where n is the sample size:
$$\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} \tag{2.4}$$
In either case our table for testing now looks just a little different:
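Test            H0               H1                 Rejection Criterion
Right-Tailed    $\mu = \mu_0$    $\mu > \mu_0$      $\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} > t_{\alpha}$
Left-Tailed     $\mu = \mu_0$    $\mu < \mu_0$      $\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} < -t_{\alpha}$
Two-Tailed      $\mu = \mu_0$    $\mu \neq \mu_0$   $\left|\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}}\right| > t_{\alpha/2}$

where the critical values are with respect to n – 1 degrees of freedom. Yet again
$$\left|\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}}\right| > t_{\alpha/2}$$
simply means that
$$\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} > t_{\alpha/2} \quad \text{or that} \quad \frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} < -t_{\alpha/2}.$$
This means that the test statistic is in one of the tails of the rejection region.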
The difference in testing now is that we are referring to the areas under the T-curve.
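As a rough sketch of the T-test decision rule, the Python function below takes the critical value as an argument, the way one would read it from a t-table for n – 1 degrees of freedom. The function name `t_test`, its parameters, and the sample numbers are made up for illustration; 1.753 and 2.131 are the usual table values of t_0.05 and t_0.025 for 15 degrees of freedom.

```python
from math import sqrt


def t_test(xbar, mu0, s, n, t_crit, tail):
    """Return (t, reject?) for a T-test about one mean.

    t_crit is the positive table value for n - 1 degrees of freedom:
    t_alpha for a one-tailed test, t_{alpha/2} for a two-tailed test.
    """
    t = (xbar - mu0) / (s / sqrt(n))    # test statistic (2.4)
    if tail == "right":
        return t, t > t_crit
    if tail == "left":
        return t, t < -t_crit
    if tail == "two":
        return t, abs(t) > t_crit
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Made-up numbers: n = 16, so 15 degrees of freedom, alpha = 0.05.
t, reject_right = t_test(10.5, 10, 1.0, 16, t_crit=1.753, tail="right")
_, reject_two = t_test(10.5, 10, 1.0, 16, t_crit=2.131, tail="two")
print(t, reject_right, reject_two)
```

With these numbers the same statistic leads to rejection in the right-tailed test but not in the two-tailed test, because the two-tailed test splits α between both tails and so uses the larger critical value.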
Example 4.
Let X be the tumor growth (in millimeters) per day induced in a lab mouse. We have
measurements on 9 consecutive days. There is reason to believe the tumor growth
follows a normal distribution with mean 4 (null hypothesis) and unknown standard
deviation. However, our sample of 9 shows a sample mean of 4.7 and a sample standard
deviation of 1.2. If we use a level of significance of 0.05, should we reject the
hypothesis (initial belief)?
Answer:
$$H_0: \mu = 4, \qquad H_1: \mu > 4$$
$$\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} = \frac{4.7 - 4}{1.2/\sqrt{9}} = 1.75 < 1.860 = t_{0.05}.$$
Fig 5.
Therefore the null hypothesis cannot be rejected based on the data we collected at the
level of significance of 5%.
The p-value of the test is not available in tables; we need to use computer software or
a TI calculator. This value is 0.0591. We can see that we would reject the null
hypothesis if the significance level were 10%.
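One way software can produce this p-value is by numerically integrating the density of Student's t distribution. The sketch below uses Simpson's rule with only the Python standard library; the helper names `t_pdf` and `t_upper_tail` and the number of integration steps are illustrative assumptions, and in practice a statistics library routine would normally be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area to the right of 0 is 0.5.
    return 0.5 - total * h / 3


# Data from Example 4 (right-tailed test, n - 1 = 8 degrees of freedom).
xbar, mu0, s, n = 4.7, 4, 1.2, 9
t = (xbar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t, df=n - 1)
print(f"t = {t:.2f}, p-value = {p_value:.4f}")
```

For a left-tailed test with a negative statistic, the symmetry of the t distribution allows the same helper to be applied to the absolute value of the statistic.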
Example 5.
The US National Research Council currently recommends that females between the ages
of 11 and 50 take in 15 milligrams of iron daily. From a sample of 25 females, researchers
found a sample mean iron intake of 14.1 milligrams. The sample standard deviation was
2.367 milligrams. Can we conclude that the average iron intake of American females is
less than the amount recommended by the USNRC? Use a significance level of 5% and
assume that the intake is normally distributed.
Answer:
$$H_0: \mu = 15, \qquad H_1: \mu < 15$$
$$\frac{\bar{X} - \mu_0}{\sqrt{s^2/n}} = \frac{14.1 - 15}{2.367/\sqrt{25}} = -1.901 < -1.711 = -t_{0.05}.$$
Fig 6.
Yes, we can conclude that the average population intake is less than recommended.
The p-value of the test is 0.0347.
Fig 8.
Homework
In a couple of these problems you will need to calculate the sample statistics yourself
(mean, standard deviation).
8.2: 4 (two-tail), 8 (left-tail), 9 (right-tail), 15.
8.3 (Assume population normally distributed): 3, 5 (two-tail), 9 (left-tail), 13 (right-tail).
More:
1. The average age of a male lion in captivity is 20 years. A survey of 52 lions that died
in zoos on American soil showed an average age at the time of death of 19.4 years and a
standard deviation of 3.4 years.
Do lions in American zoos live shorter lives, using a 5% significance level? What is the
value of the test statistic?
2. The average age at graduation of students of a certain state university is 23.7. The
average age of graduating math students, from a sample of 83, was 24.2 with a standard
deviation of 1.6. Do math students graduate later than the rest of the students in the
university, using a 1% significance level? What is the value of the test statistic?
3. IQ in the population is normally distributed with a mean of 100 and a standard
deviation of 15. A survey of 22 NBA players shows an average IQ of 103 with a
standard deviation of 18. Using a significance level of 5%, can you say that NBA players
have a higher IQ than the population average? What is the value of the test statistic?