Chapter 7: Point Estimation
Definition:
A (point) estimator, $\hat{\theta}$, is a statistic (some function of the sample $X_1, \ldots, X_n$) used to produce a single-value estimate of a parameter $\theta$. An estimate is the value an estimator takes for a particular sample.
Statistic (estimator for ...)           Parameter
Sample mean, median, trimmed mean, …    Population mean, $\mu$
Sample variance, $S^2$                  Population variance, $\sigma^2$
Sample proportion, $p$                  Population proportion, $\theta$
There will be a range of possible estimators for a population parameter, $\theta$. However, some estimators will be sensible to use and some will not. To help us decide whether $\hat{\theta}$ is good to use, we look at its sampling distribution.
Definition:
$\hat{\theta}$ is an unbiased estimator of $\theta$ if
$E[\hat{\theta}] = \theta$.
So on average the observed value of an unbiased estimator will be the true value of the parameter it is trying to estimate.
Result 1:
$\bar{X}$ is an unbiased estimator of $\mu$.
Proof:
$E[\bar{X}] = E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu.$
Therefore, as $E[\bar{X}] = \mu$, $\bar{X}$ is an unbiased estimator of $\mu$.
Result 2:
$S^2$ is an unbiased estimator of $\sigma^2$.
Note: This is why we choose $n - 1$ rather than $n$ as the divisor in the definition of the sample variance.
Ideally we want an estimator with small bias and small standard error.
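Results 1 and 2 can be checked exactly on a small case. The sketch below is illustrative only: it assumes a hypothetical population (the six faces of a fair die) and samples of size 3 drawn independently, lists every equally likely sample, and averages each estimator over all of them. The names `avg`, `sample_variance` and `population` are ours, not part of the notes.

```python
from fractions import Fraction
from itertools import product

def avg(values):
    """Arithmetic mean of a sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = avg(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

# Hypothetical population: faces of a fair die (exact fractions avoid rounding).
population = [Fraction(k) for k in range(1, 7)]
n = 3

mu = avg(population)
sigma2 = avg((x - mu) ** 2 for x in population)

# Every ordered sample of size n is equally likely under independent sampling.
samples = list(product(population, repeat=n))

expected_mean = avg(avg(s) for s in samples)
expected_s2 = avg(sample_variance(s, n - 1) for s in samples)
expected_s2_div_n = avg(sample_variance(s, n) for s in samples)

print(expected_mean == mu)         # the sample mean is unbiased for mu
print(expected_s2 == sigma2)       # divisor n - 1 gives an unbiased estimator
print(expected_s2_div_n < sigma2)  # divisor n underestimates on average
```

Averaging over all samples plays the role of the expectation, so the first two comparisons are the statements $E[\bar{X}] = \mu$ and $E[S^2] = \sigma^2$ for this particular population.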
Example:
Suppose that $X_1, \ldots, X_n$, $n > 1$, is a random sample from $N[\mu, \sigma^2]$. Show that $X_1$, the first observation, is an unbiased estimator of $\mu$. If you were given a choice of using $X_1$ or $\bar{X}$ as your estimator for $\mu$, which would you prefer?
Solution: Now, $X_1 \sim N[\mu, \sigma^2]$, so $E[X_1] = \mu$. Therefore $X_1$ is an unbiased estimator for $\mu$.
Both $X_1$ and $\bar{X}$ are unbiased estimators, so we'll choose the one with the smaller standard error:
s.e.$[X_1]$ = s.d.$[X_1] = \sigma$, and s.e.$[\bar{X}] = \frac{\sigma}{\sqrt{n}}$.
So as $n > 1$, s.e.$[\bar{X}]$ < s.e.$[X_1]$ and so we would prefer to use $\bar{X}$ as an estimator of $\mu$.
Chapter 8: Interval Estimation
8.1 Introduction
The heights (cm) of a random sample of 12 primary school children of a certain age were as follows:
114, 137, 132, 140, 125, 116, 110, 118, 136, 131, 122, 128.
We might be interested in learning about the mean height, $\mu$, of all children of that age. We know that the sample mean can be used as a point estimate for $\mu$; here $\bar{x} = 125.75$ cm. However, because of sampling variability, the true value of $\mu$ may be quite different from this estimated value.
It would be more useful if we could use the data to identify an interval within which we believe the true mean $\mu$ would lie. We call this a confidence interval.
We can show the above data diagrammatically on a dotplot:
[Figure: a dotplot showing the heights of children, on an axis running from 110 to 140. Annotations on the plot read: "It is unlikely that $\mu$ would be here."; "The true value of $\mu$ is likely to be somewhere in the centre of the data."; "Likewise, it is not likely that $\mu$ would be here (if the sample were random)."]
In Statistics, the degree of confidence we have that an interval contains the parameter we are trying to
estimate is expressed as a percentage. For example, if a 95% confidence interval were produced then we
would be 95% confident that the resulting interval would contain the true value of the parameter.
Alternatively, we could produce a 99% confidence interval; this would be wider than a 95% confidence interval.
8.1.1 Definitions
Definition:
An interval $[T_1, T_2]$ is a $100(1 - \alpha)\%$ confidence interval for a parameter $\theta$ if it contains $\theta$ with probability $(1 - \alpha)$.
An alternative way of thinking of this is as follows. If the method for deriving, for example, a 95% confidence interval were to be repeated a large number of times then approximately 95% of the intervals produced would contain the true value of $\theta$.
Note: We have to be very careful when talking about confidence intervals. It is not acceptable, for example, to refer to $\theta$ having a given probability of lying in a confidence interval. This is because, by attaching a probability to $\theta$ lying within the interval, you are creating the impression that $\theta$ is not a fixed quantity. It is the end-points of the confidence interval that are random quantities varying from sample to sample.
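The repeated-sampling interpretation can be illustrated by simulation. The sketch below is only illustrative: it assumes hypothetical values (true mean 125, known standard deviation 10, samples of size 12) and uses the known-variance interval $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ that is derived in Section 8.2. The function names are ours.

```python
import random
from statistics import NormalDist, mean

def z_interval(sample, sigma, confidence):
    """Known-variance confidence interval for the mean of normal data."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width

def coverage(mu, sigma, n, confidence, repetitions, seed=0):
    """Fraction of simulated intervals that contain the fixed true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma, confidence)
        hits += low <= mu <= high   # the end-points vary; mu does not
    return hits / repetitions

# Hypothetical values: true mean 125, s.d. 10, samples of size 12.
print(coverage(mu=125.0, sigma=10.0, n=12, confidence=0.95, repetitions=2000))
```

The printed proportion should be close to 0.95. Throughout the simulation the mean stays fixed; only the end-points of the interval change from sample to sample.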
8.1.2 Example (continued)
Returning to the simple introductory example about the heights of primary school children, a 95% confidence interval for the population mean $\mu$ is shown in the diagram below.
[Figure: dotplot of the heights of children, on an axis running from 110 to 140, with the 95% t-confidence interval for the mean marked by square brackets around $\bar{X}$.]
Later in the chapter, you will find out how to calculate this interval for yourselves. You will also discover
how to find confidence intervals for a population variance.
We start with the most basic situation, namely finding a confidence interval for a population mean when the
population variance is known.
8.2 Confidence Intervals for $\mu$ (Known Population Variance)
8.2.1 Confidence intervals when data follow a normal distribution
Background:
Consider a random sample $X_1, \ldots, X_n$ drawn from a $N[\mu, \sigma^2]$ distribution, where we assume that the population variance, $\sigma^2$, is known.
Problem:
Suppose that we wish to calculate a $100(1 - \alpha)\%$ confidence interval for $\mu$. Then we want to find two statistics $T_1$ and $T_2$ such that:
$P[T_1 \le \mu \le T_2] = 1 - \alpha.$
Note 1:
$T_1$ and $T_2$ are the random variables, not $\mu$.
Note 2:
$(1 - \alpha)$ is usually taken to be 0.9, 0.95 or 0.99. The higher the value of $(1 - \alpha)$, the more confident we are that the confidence interval does in fact contain $\mu$. However, the higher $(1 - \alpha)$ is, the wider the interval becomes and therefore the less informative it is about the location of $\mu$. So there is a trade-off.
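Derivation of confidence interval: We know that if $X_1, \ldots, X_n$ are normally distributed then
$\bar{X} \sim N\left[\mu, \frac{\sigma^2}{n}\right].$
Thus, by applying the standardisation formula,
$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N[0, 1].$
Therefore, writing $z_{\alpha/2}$ for the upper $100(\alpha/2)\%$ point of $N[0, 1]$,
$P\left[-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right] = 1 - \alpha \;\Rightarrow\; P\left[-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha.$
Rearranging further gives:
$P\left[-\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha \;\Rightarrow\; P\left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = 1 - \alpha.$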
We can therefore see that we have the following result:
Result: When we have a sample $X_1, \ldots, X_n$ from a $N[\mu, \sigma^2]$ distribution with known variance $\sigma^2$ then a $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by:
$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$
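As a rough sketch of how this result might be applied in code, the Python function below computes the interval from a sample and a known standard deviation. The function name `z_confidence_interval` is ours; the percentage point comes from the standard library's `NormalDist` rather than from tables, and the data are the beetle weights used in the worked example of this section.

```python
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, confidence):
    """Interval x_bar +/- z_{alpha/2} * sigma / sqrt(n), with sigma assumed known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N[0, 1]
    half_width = z * sigma / len(sample) ** 0.5
    x_bar = mean(sample)
    return x_bar - half_width, x_bar + half_width

beetle_weights = [5.7, 4.9, 5.3, 5.0, 5.4, 5.1, 5.2, 5.2,
                  5.3, 5.4, 5.7, 5.1, 5.6, 5.0, 5.3]
low, high = z_confidence_interval(beetle_weights, sigma=0.2, confidence=0.95)
print(round(low, 3), round(high, 3))   # 5.179 5.381
```

The computed percentage point is 1.95996 rather than the tabulated 1.96, which makes no difference to three decimal places here.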
Example:
A biologist selects 15 beetles at random from a colony she is studying. The weights of these beetles (in g) are as follows:
5.7, 4.9, 5.3, 5.0, 5.4, 5.1, 5.2, 5.2, 5.3, 5.4, 5.7, 5.1, 5.6, 5.0, 5.3.
Assuming that the weights follow a normal distribution with known population standard deviation 0.2 g, calculate a 95% confidence interval for the population mean weight.
Solution: Sample mean $= \frac{5.7 + 4.9 + 5.3 + \ldots + 5.3}{15} = \frac{79.2}{15} = 5.28$ g.
From normal percentage point tables, $z_{0.025} = 1.96$.
Thus, the 95% confidence interval is
$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5.28 \pm 1.96 \times \frac{0.2}{\sqrt{15}} = (5.179, 5.381).$
Exercise:
A new drug to lower blood pressure is given to 20 volunteers and their fall in BP is recorded. From previous
work the standard deviation of the change in BP is known to be 8mmHg and the falls are believed to follow
a normal distribution. The mean fall in the sample is 6mmHg. Find a 99% confidence interval for the mean
fall in BP.
8.2.2 Confidence intervals when sample size n is large
The assumption that the data follow a normal distribution can be relaxed if the sample size, $n$, is large (rule of thumb, $n > 30$). This is because in such situations the Central Limit Theorem can be applied, ensuring that the sample mean, $\bar{X}$, will be approximately normally distributed. Thus we have the result:
Result: When we have a sample $X_1, \ldots, X_n$ from any distribution with mean $\mu$ and known variance $\sigma^2$ then, if the sample size $n$ is large, a $100(1 - \alpha)\%$ confidence interval for $\mu$ is approximately given by:
$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$
Example:
Michael is a keen cyclist and rides his bicycle every day. On a random sample of 44 days he averages 18 miles per day. The standard deviation for all days is known to be 5 miles. Find a 90% confidence interval for his mean daily mileage.
Solution: In this example, the sample size is $n = 44$, i.e. large enough for the Central Limit Theorem to apply.
From tables, the 5% point for a normal distribution is $z_{0.05} = 1.645$. Therefore, the 90% confidence interval is:
$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 18 \pm 1.645 \times \frac{5}{\sqrt{44}} = (16.76, 19.24).$
So we are 90% confident that the mean number of miles travelled per day lies in the interval (16.76, 19.24).
Important: It is the interval which varies from sample to sample, not $\mu$. So for example, if we generated a 95% confidence interval for each of 100 different samples, we would expect about 95 of them to contain $\mu$.
8.3 Confidence Interval for $\mu$ (Unknown Population Variance)
In most situations the population variance is not known. In such situations, some amendment is needed to the formulae presented in Section 8.2. We deal first with the case where the sample size, $n$, is small.
8.3.1 Confidence intervals when $\sigma$ is unknown and n is small
The formulae presented in the previous section for a confidence interval for $\mu$ are written in terms of the population variance, $\sigma^2$. When this is unknown, the formulae cannot be applied.
A confidence interval is instead derived from the following result:
Important result: Suppose $X_1, \ldots, X_n$ is a random sample from a $N[\mu, \sigma^2]$ distribution, where both parameters are unknown. If $S^2$ denotes the sample variance, then
$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}.$
Derivation of confidence interval:
From the above result, we know that
$P\left[-t_{n-1,\alpha/2} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{n-1,\alpha/2}\right] = 1 - \alpha.$
We now need to rearrange the inequality so that $\mu$ is in the centre:
$P\left[-t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \bar{X} - \mu \le t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right] = 1 - \alpha$
$\Rightarrow P\left[-\bar{X} - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le -\mu \le -\bar{X} + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right] = 1 - \alpha$
$\Rightarrow P\left[\bar{X} - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right] = 1 - \alpha.$
Thus the upper and lower end-points for the $100(1 - \alpha)\%$ confidence interval are given by:
$\bar{X} \pm t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}.$
Result: Suppose that $X_1, \ldots, X_n \sim N[\mu, \sigma^2]$ where $\sigma^2$ is unknown. Then a $100(1 - \alpha)\%$ confidence interval for $\mu$ is given by:
$\bar{X} \pm t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}$
where $S$ is the sample standard deviation.
Comparing this result to that given in the previous section, where $\sigma$ was assumed known, two changes can be clearly seen:
• A percentage point from a t distribution is used in place of a normal percentage point;
• The population standard deviation is replaced by the sample standard deviation, $S$.
Example:
The number of hours spent by 10 randomly chosen computer science students completing their assessed
coursework were as follows:
5.5, 1.5, 3.6, 7.2, 2.4, 3.8, 4.0, 1.9, 5.3, 2.7.
Calculate a 99% confidence interval for the mean time spent on the coursework in the population of all students.
Solution: Here, the population variance is unknown. So we must begin by finding the sample mean and variance:
$\bar{x} = \frac{1}{10}\sum_{i=1}^{10} x_i = \frac{37.9}{10} = 3.79$ hours.
$S^2 = \frac{1}{9}\left[\sum_i x_i^2 - \frac{(\sum_i x_i)^2}{10}\right] = \frac{1}{9}\left[172.49 - \frac{37.9^2}{10}\right] = 3.205 \;\Rightarrow\; S = 1.790$ hours.
The appropriate percentage point here comes from a t distribution with 10 - 1 = 9 degrees of freedom: $t_{9,0.005} = 3.250$.
Thus the 99% confidence interval for the population mean $\mu$ is
$\bar{x} \pm t_{9,0.005}\frac{S}{\sqrt{n}} = 3.79 \pm 3.250 \times \frac{1.79}{\sqrt{10}} = (1.95, 5.63).$
Note that in producing this confidence interval we need to assume that the data are normally distributed.
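The same calculation can be sketched in Python, starting from the raw coursework times. This is a minimal sketch: the function name `t_confidence_interval` is ours, and the percentage point $t_{9,0.005} = 3.250$ is passed in as read from tables, because the Python standard library has no t distribution.

```python
from statistics import mean, stdev

def t_confidence_interval(sample, t_point):
    """Interval x_bar +/- t * S / sqrt(n); t_point is read from t tables."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)            # sample standard deviation (divisor n - 1)
    half_width = t_point * s / n ** 0.5
    return x_bar - half_width, x_bar + half_width

hours = [5.5, 1.5, 3.6, 7.2, 2.4, 3.8, 4.0, 1.9, 5.3, 2.7]
low, high = t_confidence_interval(hours, t_point=3.250)   # t_{9, 0.005} from tables
print(round(low, 2), round(high, 2))   # 1.95 5.63
```

If SciPy happens to be available, the tabulated value could instead be obtained as `scipy.stats.t.ppf(0.995, 9)`.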
Exercise:
A tennis player wishes to examine his service performance in a particular match. The speeds (in mph) of 8
randomly selected serves were as follows:
98, 92, 101, 80, 94, 99, 88, 96.
Calculate a 95% confidence interval for this player’s mean service speed in this match.
8.3.2 Confidence intervals for $\mu$ when $\sigma$ is unknown and n is large
We noted in Chapter 3 that a t distribution looks very much like the standard normal distribution when the degrees of freedom are large.
Therefore, in producing a confidence interval for $\mu$ in situations when
i. the population variance is unknown and
ii. the sample size, $n$, is large (e.g. $n > 30$),
we can approximate the percentage point $t_{n-1,\alpha/2}$ that occurs in the formula by $z_{\alpha/2}$.
Moreover, when the sample size is large, the assumption that the data follow a normal distribution is less critical (because the Central Limit Theorem can then be applied). We therefore have the following result:
Given a large ($n > 30$) sample $X_1, \ldots, X_n$, drawn from a distribution with mean $\mu$ and (unknown) variance, a $100(1 - \alpha)\%$ confidence interval for $\mu$ is (approximately) given by:
$\bar{X} \pm z_{\alpha/2}\frac{S}{\sqrt{n}}.$
Example:
A nursery is growing a large number of tomato plants. A sample of 45 plants was taken at random and their
heights were found. If the sample mean and standard deviation were 5.2 cm and 1.3 cm respectively,
calculate a 90% confidence interval for the mean height of the tomato plants in the nursery.
Solution: Here the sample size is $n = 45$, so it can be considered large. Consequently, we can take our percentage points from the standard normal distribution rather than from $t_{44}$ (which incidentally does not appear in tables). The population standard deviation is unknown, so we use the sample standard deviation as an estimate.
From statistical tables, the appropriate 5% point is $z_{0.05} = 1.645$. Therefore, the 90% confidence interval is:
$\bar{X} \pm z_{0.05}\frac{S}{\sqrt{n}} = 5.2 \pm 1.645 \times \frac{1.3}{\sqrt{45}} = (4.88, 5.52).$
Exercise:
75 randomly selected smokers were asked how many cigarettes they had smoked the previous day. The
sample mean and variance were 20 and 196 respectively. Calculate a 95% confidence interval for the
population mean.
8.4 Confidence Intervals for the Population Variance
Let us assume that we have a random sample, $X_1, \ldots, X_n$, drawn from a normal distribution, $N[\mu, \sigma^2]$, with both parameters unknown. In the previous section, we learnt how to produce a confidence interval for $\mu$. We now look at producing a $100(1 - \alpha)\%$ confidence interval for $\sigma^2$.
Idea:
We need to find $T_1$ and $T_2$ such that $P(T_1 \le \sigma^2 \le T_2) = 1 - \alpha$. These values can be obtained by making use of the sampling distribution of $S^2$.
Derivation of confidence interval:
We know that $\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{n-1}$. Therefore,
$P\left[\chi^2_{n-1,1-\alpha/2} \le \frac{(n - 1)S^2}{\sigma^2} \le \chi^2_{n-1,\alpha/2}\right] = 1 - \alpha \;\Rightarrow\; P\left[\frac{1}{\chi^2_{n-1,\alpha/2}} \le \frac{\sigma^2}{(n - 1)S^2} \le \frac{1}{\chi^2_{n-1,1-\alpha/2}}\right] = 1 - \alpha.$
Multiplying throughout by $(n - 1)S^2$ gives:
$P\left[\frac{(n - 1)S^2}{\chi^2_{n-1,\alpha/2}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right] = 1 - \alpha.$
Here $\chi^2_{n-1,\alpha/2}$ is the upper % point and $\chi^2_{n-1,1-\alpha/2}$ is the lower % point of the $\chi^2_{n-1}$ distribution.
We therefore have the following result:
Result: Given a random sample $X_1, \ldots, X_n$ from $N[\mu, \sigma^2]$, a $100(1 - \alpha)\%$ confidence interval for $\sigma^2$ is given by
$\left(\frac{(n - 1)S^2}{\chi^2_{n-1,\alpha/2}}, \; \frac{(n - 1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right).$
Example:
The blood cholesterol levels in a sample of 11 people are as follows:
270, 256, 330, 324, 291, 279, 329, 344, 308, 297, 310.
Calculate 95% confidence intervals for the population mean and standard deviation.
Solution: We first need to calculate the sample mean and variance:
$\bar{x} = \frac{270 + \ldots + 310}{11} = \frac{3338}{11} = 303.45.$
$S^2 = \frac{1}{10}\left[\sum_i x_i^2 - \frac{(\sum_i x_i)^2}{11}\right] = \frac{1}{10}\left[1020584 - \frac{3338^2}{11}\right] = 765.273 \;\Rightarrow\; S = 27.66.$
Confidence intervals for $\mu$ and $\sigma$ can only be produced if the data are normally distributed. So we need to make this assumption.
A 95% confidence interval for $\mu$ is then
$\bar{X} \pm t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} = 303.45 \pm 2.228 \times \frac{27.66}{\sqrt{11}} = (284.87, 322.03),$
using $t_{10,0.025} = 2.228$.
The upper and lower 2.5% points from a $\chi^2_{10}$ distribution are 20.48 and 3.247 respectively. The 95% confidence interval for $\sigma^2$ is therefore
$\left(\frac{(n - 1)S^2}{\chi^2_{n-1,\alpha/2}}, \; \frac{(n - 1)S^2}{\chi^2_{n-1,1-\alpha/2}}\right) = \left(\frac{10 \times 765.273}{20.48}, \; \frac{10 \times 765.273}{3.247}\right) = (373.67, 2356.86).$
Taking square roots of the upper and lower end-points results in the following confidence interval for $\sigma$:
(19.33, 48.55).
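A minimal Python sketch of the variance interval, applied to the cholesterol data, is given below. The function name `variance_confidence_interval` is ours, and the two chi-squared percentage points (20.48 and 3.247) are supplied from tables, as in the worked solution, since the standard library has no chi-squared distribution.

```python
from statistics import variance

def variance_confidence_interval(sample, chi2_upper, chi2_lower):
    """Interval ((n-1)S^2 / upper point, (n-1)S^2 / lower point) for sigma^2.

    chi2_upper and chi2_lower are the upper and lower alpha/2 points of the
    chi-squared distribution with n - 1 degrees of freedom, read from tables.
    """
    n = len(sample)
    sum_sq = (n - 1) * variance(sample)   # variance() uses divisor n - 1
    return sum_sq / chi2_upper, sum_sq / chi2_lower

cholesterol = [270, 256, 330, 324, 291, 279, 329, 344, 308, 297, 310]
var_low, var_high = variance_confidence_interval(cholesterol, 20.48, 3.247)
sd_low, sd_high = var_low ** 0.5, var_high ** 0.5

print(round(var_low, 2), round(var_high, 2))   # 373.67 2356.86
print(round(sd_low, 2), round(sd_high, 2))     # 19.33 48.55
```

Note that the larger percentage point gives the lower end-point, because it appears in the denominator.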
Exercises:
1. A machine puts rice into 400g packets and the standard deviation over a long period is 2.5g. A new
machine is evaluated by means of a random sample of 21 packets whose sample standard deviation is
3.2g. Find a 90% confidence interval for the standard deviation of the new machine.
2. The speeds in mph of 15 randomly selected cars passing a police speed checkpoint were as follows:
27, 31, 34, 30, 32, 38, 26, 30, 32, 34, 31, 29, 41, 35, 33.
Calculate a 99% confidence interval for the population mean and variance.
8.5 Choosing the Sample Size
All the confidence intervals we've looked at depend on the sample size $n$. For example, the confidence interval for $\mu$ with $\sigma$ known is
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$
As $n$ gets larger, the width of the confidence interval decreases, which means that the interval becomes more informative about the unknown parameter.
Example:
There is interest in learning about the mean I.Q. of students at UKC. If the standard deviation of I.Q.s can be assumed to be 20, find the sample size that will ensure that the width of a 99% confidence interval for $\mu$ is less than 4 units.
Solution: As the population s.d. is known, the appropriate formula for the confidence interval for $\mu$ is:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}.$
The width of this is $2 \times z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$. Because the appropriate percentage point is $z_{0.005} = 2.5758$, to find $n$ we need to solve
$2 \times 2.5758 \times \frac{20}{\sqrt{n}} < 4 \;\Rightarrow\; \frac{103.032}{\sqrt{n}} < 4 \;\Rightarrow\; \sqrt{n} > 25.758 \;\Rightarrow\; n > 663.5.$
We would therefore need around 664 students in the sample.
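The sample-size calculation can be sketched as a small Python function. The name `sample_size_for_width` and its parameters are ours; the percentage point is computed with the standard library's `NormalDist` instead of being read from tables.

```python
import math
from statistics import NormalDist

def sample_size_for_width(sigma, max_width, confidence):
    """Smallest n for which the width 2 * z * sigma / sqrt(n) is below max_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    bound = (2 * z * sigma / max_width) ** 2    # n must be strictly above this
    return math.floor(bound) + 1

print(sample_size_for_width(sigma=20, max_width=4, confidence=0.99))   # 664
```

Taking `floor(bound) + 1` rather than rounding ensures that the width is strictly less than the target even if the bound happens to be a whole number.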
Chapter 9: Hypothesis testing
9.1 Introduction
Introductory scenario
Proponents of a particular dieting regime claim that people will, on average, lose 14 pounds if the plan is followed for six weeks. A nutritionist wishes to test this claim; she suspects it to be false.
In order to test the plausibility of the claim (or hypothesis) some data are needed. Relevant data here would
be the weight losses from a sample of say 50 people who followed the diet over 6 weeks. We then need to
assess how consistent the hypothesis is with the observed data.
It is important to note that we cannot absolutely prove or disprove a hypothesis, only gather evidence for or
against it.
Other examples:
• a manufacturer might claim that the mean lifetime of a brand of battery is 110 hours;
• a political party might claim that the proportion of voters who will vote for them in the next general election is 45%.
In each case, a sample could be taken and the sample values used to determine whether or not the hypothesised population value is reasonable.
To introduce hypothesis testing we shall use a specific example of testing hypotheses about a mean $\mu$ when the population variance ($\sigma^2$) is known.
9.2 Testing hypotheses for $\mu$ (known population variance)
9.2.1 Terminology
In hypothesis testing, we wish to choose between two competing hypotheses. These are called the null hypothesis (denoted $H_0$) and alternative hypothesis (denoted $H_1$). Generally, the null hypothesis is the one that we suspect could be false and the alternative hypothesis is the one that we usually hope to be true.
We illustrate this terminology through two examples.
Example 1:
An IQ test is designed so that the average score in the population as a whole is 100 with s.d. 20, and so that
the scores follow a normal distribution.
A random sample of 25 children at a school under investigation takes the test. The sample mean score is $\bar{x} = 108.3$. Is there any evidence that this school has children with an IQ different from the general population?
Let $\mu$ denote the mean I.Q. for all children at that school. The null and alternative hypotheses would then be as follows:
$H_0: \mu = 100$ (i.e. the school has the same mean IQ as the whole population);
$H_1: \mu \ne 100$ (i.e. the school has a different mean IQ from the general population).
The null hypothesis here is the cautious hypothesis which we initially assume to be true, i.e. without any sample data, given any group of children we would initially assume that their average IQ is the same as the general population.
Note that we here have a two-sided alternative hypothesis as we are testing whether the mean IQ differs from the hypothesised value of 100. If we were looking to test whether the mean IQ was greater (or smaller) than this value, we would need to specify a one-sided alternative hypothesis (see later).
Example 2:
Researchers have postulated that, due to differences in diet, Japanese children have a different mean blood cholesterol level compared with British children. Suppose that the mean level for British children is known to be 170. Let $\mu$ represent the mean blood cholesterol level for Japanese children. What hypotheses should the researchers test?
The null hypothesis represents what we initially assume to be true. So without any sample information about Japanese children we'd initially assume that $\mu = 170$ and so
$H_0: \mu = 170.$
The alternative hypothesis is that the cholesterol level of Japanese children differs from that of British children and so
$H_1: \mu \ne 170.$
General formulation (two-sided test)
In general suppose that we have the hypotheses:
$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$.
Background:
To test $H_0$ against $H_1$, we initially assume that $H_0$ is true. We then see how plausible data at least as extreme as our observed data would be under this assumption. So if the probability of observing a sample result at least as extreme as ours is small under $H_0$, then we are unlikely to have observed what we actually have if $H_0$ were true. In this case it therefore looks like $H_0$ is not true. On the other hand, if this probability is large, then we could plausibly have observed the sample we in fact got, and therefore $H_0$ could be true.
The sample mean, $\bar{X}$, is a good estimator for $\mu$, so it makes sense to use our observed sample mean, $\bar{x}$, to test hypotheses about $\mu$.
Theory:
Consider the case where a sample $X_1, \ldots, X_n$ is obtained from a $N[\mu, \sigma^2]$ distribution (where $\sigma^2$ is assumed to be known). The hypotheses we are interested in testing are
$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$.
We know from earlier chapters that
$\bar{X} \sim N\left[\mu, \frac{\sigma^2}{n}\right].$
If the null hypothesis is true, then
$\bar{X} \sim N\left[\mu_0, \frac{\sigma^2}{n}\right] \;\Rightarrow\; Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N[0, 1].$
We will use the distribution of $Z$ to decide whether our sample of data (summarised by the sample mean) could plausibly have been obtained from a normal distribution with mean $\mu_0$. $Z$ is referred to as the test statistic.
The observed value of the test statistic is:
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}.$
If $H_0$ is true, a sampled value $z$ of $Z$ will have come from $N[0, 1]$. In this case we are most likely to observe a value of $z$ which lies in the main body of the distribution (as these values would be the most probable values of $Z$ to observe). Therefore if we observe $z$ in the main body of $N[0, 1]$, this sampled value would support $H_0$. We would then have no evidence to reject $H_0$, or equivalently we could say that we “accept” $H_0$ to be true. Note that this is not the same thing as saying that $H_0$ is true, only that we have no evidence to say that it is false.
Suppose now that the observed value $z$ of $Z$ lies in the tails of the standard normal distribution. Such a value would have been unlikely to occur if $H_0$ were true. So if we observe $z$ outside the main body of $N[0, 1]$, then this sample value would not support $H_0$. We would therefore reject $H_0$.
The range of values of the test statistic that would lead us to reject the null hypothesis is called the critical region. The next problem, then, is to decide how to specify the exact values of our critical region.
In carrying out a hypothesis test there are two types of error we can make.
• Type 1 error. This is when $H_0$ is rejected when in fact it is true.
• Type 2 error. This is when we fail to reject $H_0$ when in fact it is false.
P(type 1 error) is usually denoted $\alpha$ and we call it the size of the test. We can use this value to find a suitable critical region for the test.
A type 1 error is usually thought to be the more serious and we therefore define our test so that we have a suitably low value of $\alpha$. The values of $\alpha$ that are acceptable will vary from situation to situation. The most usual values are 0.1, 0.05, 0.01 or 0.001. Now
$\alpha = P(\text{reject } H_0 \text{ when it is true}) = P(\text{observe } z \text{ in the tails of } N[0, 1]).$
So by setting a value for $\alpha$, we can find our critical region.
For example, if $\alpha = 0.05$, then we will reject $H_0$ if we observe $z > z_{0.025}$ or $z < -z_{0.025}$, i.e. if $z > 1.96$ or $z < -1.96$. So our critical region is $\{z : z > 1.96 \text{ or } z < -1.96\}$. We then say that we have a test at the 5% significance level.
To test between the hypotheses
$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$
when $X_1, \ldots, X_n$ is a random sample from a normal distribution with known population variance:
• use the test statistic $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$;
• reject $H_0$ at the $100\alpha\%$ level if $|z| > z_{\alpha/2}$.
Example:
Consider again the IQ example. Here we had a random sample of 25 children from a particular primary school. The mean IQ in the sample was 108.3. The hypotheses of interest were
$H_0: \mu = 100$ versus $H_1: \mu \ne 100$.
The population standard deviation is known to be 20 and we can assume that IQs follow a normal distribution.
Solution: The test statistic in this situation is given by:
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}.$
The observed value of this test statistic then is:
$z = \frac{108.3 - 100}{20/\sqrt{25}} = \frac{8.3}{20/5} = 2.075.$
For a 5% test, the critical values for the test statistic are $\pm z_{0.025} = \pm 1.96$. As the observed value of the test statistic lies in the critical region, we can reject $H_0$ at the 5% significance level.
For a 1% test, the critical values would be $\pm z_{0.005} = \pm 2.5758$. Since $z = 2.075 < 2.5758$, we would not be able to reject $H_0$ at the 1% significance level.
We interpret these test results as follows. The data provide some evidence (but not strong evidence) to suggest that the mean IQ of children in this primary school differs from the general population.
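The two-sided test procedure can be sketched in Python using the IQ figures. The function names `z_statistic` and `reject_two_sided` are ours; the critical value is computed with the standard library's `NormalDist` rather than looked up in tables.

```python
from statistics import NormalDist

def z_statistic(x_bar, mu_0, sigma, n):
    """Observed value of Z = (x_bar - mu_0) / (sigma / sqrt(n))."""
    return (x_bar - mu_0) / (sigma / n ** 0.5)

def reject_two_sided(z, alpha):
    """True if |z| exceeds the upper alpha/2 point of N[0, 1]."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

z = z_statistic(x_bar=108.3, mu_0=100, sigma=20, n=25)
print(round(z, 3))                                                    # 2.075
print(reject_two_sided(z, alpha=0.05), reject_two_sided(z, alpha=0.01))   # True False
```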
Important notes:
• Always state the level of significance you are using when rejecting or accepting $H_0$.
• Rejection of $H_0$ means there is definite evidence to reject $H_0$. “Acceptance” of $H_0$ means that there is insufficient evidence to reject $H_0$, i.e. $H_0$ may still be untrue, but we do not have enough data to reject it. This is regularly misunderstood.
• Significance tests are commonly conducted at the following levels: 5%, 1% and 0.1%. These significance levels provide varying degrees of evidence against $H_0$:
  5% level: some evidence against $H_0$;
  1% level: strong evidence against $H_0$;
  0.1% level: very strong evidence against $H_0$.
Exercise:
A machine is designed to produce bolts with a mean length of 25mm. The standard deviation of the length
of the bolts is known to be 0.23 mm. After a routine service, a random sample of bolts were measured and
the lengths (in mm) were found to be:
25.5, 25.3, 25.1, 25.6, 24.9, 25.0, 25.4, 25.3, 25.0, 24.8, 25.2, 25.4.
Test to see whether the servicing of the machine has altered the mean length of the bolts it produces.
Assume that the standard deviation is unchanged and that the data can be assumed to follow a normal
distribution.
Note:
If the sample size is large, the assumption that the data are normally distributed is less critical. This is because the central limit theorem ensures that $\bar{X} \sim N\left[\mu, \frac{\sigma^2}{n}\right]$ (approximately) whatever the distribution of $X_1, \ldots, X_n$ when $n$ is large.
Example:
The manager of a telesales department claims that the average time that an operator spends talking to a potential client is 70 seconds. The managing director of the company doubts this claim and times a random sample of 40 telephone calls. The sample mean was 62 seconds. If the population standard deviation is known to be 45 seconds, carry out a hypothesis test at the 5% significance level.
Solution: If $\mu$ denotes the population mean call length, the hypotheses are:
$H_0: \mu = 70$ versus $H_1: \mu \ne 70$.
The sample size here is large ($n > 30$) and so we do not need to assume that the call times follow a normal distribution (the Central Limit Theorem ensures that the distribution of $\bar{X}$ is roughly normal).
The test statistic is
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
giving an observed value
$z = \frac{62 - 70}{45/\sqrt{40}} = -1.124.$
For a 5% test, the critical values would be $\pm z_{0.025} = \pm 1.96$. Since $|z| = 1.124 < 1.96$, there is no evidence to reject $H_0$ at this level, i.e. it is plausible that the mean call length is 70 seconds.
9.2.2 Link between hypothesis tests and confidence intervals
Consider our usual hypotheses:
$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$.
We will be able to reject the null hypothesis at the 5% significance level if a 95% confidence interval for $\mu$ excludes $\mu_0$:
[Figure: dotplot of the data with the 95% C.I. marked; $\mu_0$ lies outside the interval. Interpretation: reject the null hypothesis at the 5% level.]
If a 95% confidence interval for $\mu$ includes $\mu_0$ then we do not have sufficient evidence at the 5% level to reject the null hypothesis:
[Figure: dotplot of the data with the 95% C.I. marked; $\mu_0$ lies inside the interval. Interpretation: “accept” the null hypothesis at the 5% level ($\mu_0$ is a plausible value for the population mean).]
In general, we can reject $H_0$ at the $100\alpha\%$ level if and only if a $100(1 - \alpha)\%$ C.I. for $\mu$ excludes $\mu_0$.
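The link can be checked numerically on the IQ example ($\bar{x} = 108.3$, $\mu_0 = 100$, $\sigma = 20$, $n = 25$): the two-sided z-test rejects $H_0$ at level $\alpha$ exactly when the $100(1 - \alpha)\%$ interval excludes $\mu_0$. The sketch below is illustrative, and the function names `ci_excludes` and `z_test_rejects` are ours.

```python
from statistics import NormalDist

def ci_excludes(x_bar, mu_0, sigma, n, alpha):
    """True if the 100(1 - alpha)% interval x_bar +/- z * sigma / sqrt(n) misses mu_0."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5
    return not (x_bar - half_width <= mu_0 <= x_bar + half_width)

def z_test_rejects(x_bar, mu_0, sigma, n, alpha):
    """True if the two-sided z-test rejects H0 at level alpha."""
    z = (x_bar - mu_0) / (sigma / n ** 0.5)
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

# IQ example: x_bar = 108.3, mu_0 = 100, sigma = 20, n = 25.
for alpha in (0.05, 0.01):
    print(alpha,
          ci_excludes(108.3, 100, 20, 25, alpha),
          z_test_rejects(108.3, 100, 20, 25, alpha))
```

At the 5% level both functions return `True` (the 95% interval excludes 100 and the test rejects); at the 1% level both return `False`.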
9.2.3 p-values
Specifying the size of test, together with the conclusion about whether the result was statistically significant at that level, is one way in which a hypothesis test can be carried out. A more informative way of giving the strength of evidence against a null hypothesis is to calculate a p-value.
The p-value gives the exact observed significance of the data, i.e. it specifies the probability of observing a result at least as extreme as our sample result given that $H_0$ is true. The p-value is often simply denoted by $p$.
Example (IQ example continued):
In the IQ example we observed $z = 2.075$. To calculate the p-value we need to calculate the probability of observing a result which is at least as extreme as this:
$p = P(Z \ge 2.075 \text{ or } Z \le -2.075) = P(Z \ge 2.075) + P(Z \le -2.075) = 2 \times (1 - \Phi(2.075)) = 2 \times 0.019 = 0.038.$
The observed level of significance is therefore 0.038.
This value is consistent with our earlier conclusions: we can reject the null hypothesis at the 5% level but not at the 1% level.
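As a short sketch, the two-sided p-value for the IQ example can be computed in Python with the standard library's normal distribution function. The function name `two_sided_p_value` is ours.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for Z ~ N[0, 1]."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(2.075)
print(round(p, 3))   # 0.038
```

Comparing `p` with 0.05 and 0.01 reproduces the two conclusions directly, without looking up separate critical values for each level.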
9.2.4 One-sided tests
Example (continued):
Consider again the earlier example concerning whether Japanese children have a different mean blood cholesterol level than British children. Because a Japanese diet has less saturated fat than a British diet, researchers might postulate that the mean cholesterol level for Japanese children is in fact lower than that of British children (whose mean level is 170). They may then want to test this via a hypothesis test.
Once again the null hypothesis represents what we initially assume to be true and so again we'd set
$H_0: \mu = 170.$
However, the alternative hypothesis is now that the cholesterol level of Japanese children is less than that of British children and so
$H_1: \mu < 170.$
To test these hypotheses, we'll reject $H_0$ in favour of $H_1$ only if we observe small values of $z$. We wouldn't reject $H_0$ if we observe large values of $z$ this time, because large values of $z$ are now more consistent with $H_0$ than $H_1$. We'll therefore reject $H_0$ only if we observe $z$ in the lower tail of $N[0, 1]$. So as we're only rejecting $H_0$ if $z$ falls in one of the tails of the distribution we call this a one-tailed test.
Note:
A one-tailed test is appropriate only when it is known that deviations from the null hypothesis will be in a particular direction.
Example:
The average mark in an A-level examination paper has traditionally been 58%. After a change in the
syllabus, it is suspected that the A-level paper will now be easier. The marks of 10 randomly chosen
candidates sitting the new syllabus are as follows:
64, 67, 35, 46, 78, 59, 53, 84, 60, 56.
If the population variance is known to be 225, perform a hypothesis test to see whether marks are now
significantly higher than before.
 The hypotheses we wish to test are as follows:
\[ H_0: \mu = 58 \quad \text{versus} \quad H_1: \mu > 58. \]
To carry out this (one-sided) test, we need to assume that the data are normally distributed. The sample
mean is:
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{602}{10} = 60.2. \]
Therefore, the observed value of the test statistic is
\[ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{60.2 - 58}{\sqrt{225 / 10}} = 0.464. \]
For a 5% test, we would reject the null hypothesis if $z > z_{0.05} = 1.6449$. The conclusion then must be to
“accept” the null hypothesis at this level. The data provide no evidence to support the view that the
examination marks are on average higher than before the syllabus change.
Incidentally, the p-value associated with this test can be found as follows:
\[ p = P(Z \ge 0.464) = 1 - \Phi(0.464) \approx 1 - 0.6772 = 0.3228 \text{ (approximately)}. \]
The alternative hypothesis is one-sided, so we find the probability only of values at least as large as
the one we observed.
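The one-sided z-test for the examination marks can be reproduced as follows. This is a sketch under the assumptions stated in the example (normal data, known variance 225); the variable names and the helper `phi` are illustrative, and the critical value 1.6449 is simply the one quoted in the text.

```python
import math
from statistics import mean

def phi(z: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

marks = [64, 67, 35, 46, 78, 59, 53, 84, 60, 56]
mu0 = 58.0        # mean under the null hypothesis
sigma2 = 225.0    # known population variance
n = len(marks)

x_bar = mean(marks)
z = (x_bar - mu0) / math.sqrt(sigma2 / n)

z_crit = 1.6449           # upper 5% point of N[0, 1], as quoted in the text
reject_h0 = z > z_crit    # one-sided test: only large z counts against H0

p_value = 1.0 - phi(z)    # upper-tail probability only

print(f"x_bar = {x_bar:.1f}, z = {z:.3f}, reject H0 at 5%: {reject_h0}")
print(f"p-value = {p_value:.4f}")
```

Because the code does not round z before evaluating the distribution function, its p-value can differ slightly in the third decimal place from a value read off tables.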
 Exercise:
Suppose that the mean systolic blood pressure for white males aged 35-44 is 127.2. A random sample of 13
diabetic males aged 35-44 was taken and their systolic blood pressure was measured. The results are given
below.
119.2, 130.2, 134.4, 120.1, 137.6, 128.0, 136.9, 129.1, 130.6, 127.9, 136.8, 135.4, 142.0.
Suppose that you are told that the standard deviation of systolic blood pressure for white males aged 35-44 is
6.726 and that it can be assumed that the data roughly follow a normal distribution. Investigate whether there
is evidence to suggest that the systolic blood pressure is
i. different
ii. higher
for diabetic 35-44 year old males than for the general population. Calculate the p-value in each case.
7.1.3 Calculating the probability of type 1 and 2 errors
 Example:
A coin is tossed 7 times. Suppose that we want to test the hypotheses:
H0: the coin is fair
versus
H1: the coin is biased in favour of heads.
A test is proposed which rejects H0 if 6 or more heads are observed.
a) What is the probability of a type 1 error?
b) What is the probability of a type 2 error if the coin is in fact biased so that P(heads) = 0.6?
Solution:
a) P(type 1 error) = P(reject H0 | H0 true) = P(6 or more heads | coin fair)
\[ = \binom{7}{6} \left(\frac{1}{2}\right)^6 \left(\frac{1}{2}\right)^1 + \left(\frac{1}{2}\right)^7 = 0.0547 + 0.0078 = 0.0625. \]
b) P(type 2 error) = P(accept H0 | P(heads) = 0.6) = 1 - P(reject H0 | P(heads) = 0.6)
= 1 - P(6 or more heads | P(heads) = 0.6)
\[ = 1 - \left[ \binom{7}{6} 0.6^6 \times 0.4 + 0.6^7 \right] = 1 - 0.131 - 0.028 = 0.841. \]
7.1.7 Power function
Definition: The power function of a test, which we'll denote $\pi(\theta)$, is defined as
\[ \pi(\theta) = P(\text{reject } H_0 \mid \theta). \]
So for each value of $\theta$, we will have a different value for the power of the test.
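As an illustration, the power function of the coin-tossing test (7 tosses, reject H0 if 6 or more heads are observed) can be evaluated directly from the binomial distribution. This is a minimal sketch; the function name `power` and its parameters are hypothetical names chosen for this example.

```python
from math import comb

def power(p: float, n: int = 7, k_reject: int = 6) -> float:
    """P(reject H0 | P(heads) = p) for the test that rejects when heads >= k_reject."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_reject, n + 1))

type1 = power(0.5)         # probability of rejecting H0 when the coin is fair
type2 = 1.0 - power(0.6)   # probability of not rejecting H0 when P(heads) = 0.6

print(f"P(type 1 error) = {type1:.4f}")
print(f"P(type 2 error at P(heads) = 0.6) = {type2:.3f}")

# The power grows as the coin becomes more biased towards heads.
for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"P(heads) = {p:.1f}: power = {power(p):.4f}")
```

Evaluating the power at the null value gives the probability of a type 1 error, and one minus the power at an alternative value gives the probability of a type 2 error there.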
9.2 Hypothesis tests for $\mu$ (unknown population variance)
Recall that when finding a confidence interval for $\mu$ when $\sigma^2$ is unknown we made use of the result:
\[ \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1}, \]
where S is the sample standard deviation. We will use this result again in hypothesis testing.
9.2.1 One-sample t-test
Consider the situation where we have a random sample of observations, $X_1, \ldots, X_n$, drawn from a normal
distribution with unknown mean and variance. We wish to use these data to compare the following
hypotheses about the population mean $\mu$:
\[ H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \ne \mu_0. \]
The relevant test statistic would now be
\[ T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}, \]
which we know follows a t distribution with $n - 1$ degrees of freedom if the null hypothesis is true.
As in the previous section, if H0 is true then we would expect to observe values of T in the main body of the
$t_{n-1}$ distribution. On the other hand, if H0 is not true, then we might expect to observe t in the tails of this
t-distribution.
The critical values that we use as cut-off points between accepting and rejecting the null hypothesis are the
upper and lower $100(\alpha/2)\%$ points of the $t_{n-1}$ distribution. Hence we reject H0 if we observe
\[ t < -t_{n-1, \alpha/2} \quad \text{or} \quad t > t_{n-1, \alpha/2}. \]
One-sample t-test: To test the hypotheses
\[ H_0: \mu = \mu_0 \quad \text{versus} \quad H_1: \mu \ne \mu_0 \]
when $X_1, \ldots, X_n$ follow a normal distribution with unknown variance:
• use the test statistic $T = \dfrac{\bar{X} - \mu_0}{S / \sqrt{n}}$;
• reject H0 at the $100\alpha\%$ level if $|t| > t_{n-1, \alpha/2}$.
This test can easily be adjusted if the alternative hypothesis is one-tailed. For example, if H1 took the form:
\[ H_1: \mu > \mu_0 \]
then we reject the null hypothesis if $t > t_{n-1, \alpha}$.
Note: In performing a one-sample t-test we assume that the data are independent observations from a normal
distribution. This assumption is less critical if the sample size is large (see later).
 Example:
Ten randomly selected ‘pints’ pulled from a campus bar are measured accurately. The amount of beer
(fl.oz) in these ‘pints’ was as follows:
19.96, 19.97, 19.94, 20.01, 19.99, 19.97, 19.95, 19.97, 20.00, 19.98.
Test between the hypotheses $H_0: \mu = 20$ versus $H_1: \mu \ne 20$. Find the associated p-value.
 Here, we begin by finding the sample mean and variance:
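\[ \bar{x} = \frac{19.96 + \cdots + 19.98}{10} = \frac{199.74}{10} = 19.974; \]
\[ S^2 = \frac{1}{n-1} \left[ \sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{n} \right] = \frac{1}{9} \left[ 3989.611 - \frac{199.74^2}{10} \right] = 0.000471 \;\Rightarrow\; S = 0.0217. \]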
For testing between the hypotheses:
\[ H_0: \mu = 20 \quad \text{versus} \quad H_1: \mu \ne 20, \]
we use the following test statistic:
\[ T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}. \]
This test statistic follows a $t_9$ distribution if the null hypothesis is true. Here, its observed value is
\[ t = \frac{19.974 - 20}{0.0217 / \sqrt{10}} = -3.79. \]
The relevant critical values for different sizes of test are:
5% test: $t_{9, 0.025} = 2.262$
1% test: $t_{9, 0.005} = 3.250$
0.1% test: $t_{9, 0.0005} = 4.781$.
Conclusion: Since |t| = 3.79 exceeds 3.250 but not 4.781, we can reject the null hypothesis at the 1% level
but not at the 0.1% level. There is strong evidence that the average beer contents are not 20 fl.oz. (i.e. a pint).
Note that in performing this test it is necessary to assume that the measurements follow a normal
distribution.
The p-value associated with this test is:
\[ p = P(T \le -3.79 \text{ or } T \ge 3.79) = P(T \le -3.79) + P(T \ge 3.79). \]
But,
\[ P(T \ge 3.79) = 1 - P(T < 3.79) = 1 - 0.9979 = 0.0021. \]
So, the p-value is 0.0021 × 2 = 0.0042 (or 0.42%).
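The calculations in this example can be reproduced with the Python standard library alone. In the sketch below, the tail probability of the t-distribution is approximated by numerically integrating its density with Simpson's rule; this is an assumption made to keep the example self-contained, not the method used in the notes (which rely on tables), and the helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
import math
from statistics import mean, stdev

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t) for t >= 0: Simpson's rule on [0, t], then symmetry about 0."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

pints = [19.96, 19.97, 19.94, 20.01, 19.99, 19.97, 19.95, 19.97, 20.00, 19.98]
mu0 = 20.0
n = len(pints)

x_bar = mean(pints)
s = stdev(pints)                            # sample standard deviation (divisor n - 1)
t_obs = (x_bar - mu0) / (s / math.sqrt(n))  # observed test statistic

p_value = 2 * t_upper_tail(abs(t_obs), n - 1)  # two-sided p-value

print(f"x_bar = {x_bar:.3f}, s = {s:.4f}, t = {t_obs:.2f}")
print(f"p-value = {p_value:.4f}")
```

Small differences from the hand calculation are to be expected, because the code carries the sample standard deviation at full precision rather than rounding it to 0.0217.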
 Example:
The lengths (in mm) of a sample of 7 beetles, chosen from a particular island, were measured and found to
be:
29, 34, 26, 31, 38, 33, 36.
The mean length of the beetles on the island is usually 36 mm, but due to recent adverse weather conditions
it is believed that their growth may have been stunted. Perform a hypothesis test to assess whether the data
provide any evidence to support this view.
 We must again assume that the lengths follow a normal distribution. The hypotheses we wish to test are
\[ H_0: \mu = 36 \quad \text{versus} \quad H_1: \mu \ne 36. \]
It can be shown that the sample mean and standard deviation are 32.4286 mm and 4.1173 mm respectively.
Therefore the observed test statistic is:
\[ t = \frac{32.4286 - 36}{4.1173 / \sqrt{7}} = -2.29. \]
Because $|t| = 2.29 < t_{6, 0.025} = 2.447$, we are unable to reject the null hypothesis at the 5% level. We have no evidence
to suggest that the beetles’ average length has decreased.
 Exercise:
The mean weight (in kg) of British children of a certain age is 32 kg. A random sample of American
children of this same age gave the following set of weights:
38, 34, 35, 43, 47, 40, 31, 39, 37, 42, 36, 35, 29, 38.
Perform a test (stating the necessary distributional assumptions) to assess whether there appears to be a
difference in the mean weights of American and British children at this age.
9.1.4 Hypothesis tests for $\mu$ for large samples (unknown population variance)
When the sample size is large (say n > 30) the distribution of the sample mean should be approximately
normal whatever the distribution of the original data. We therefore do not need to make the assumption of
normality in the one-sample t-test for large n.
Further, when the sample size is large, the distribution of the test statistic,
\[ T = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}, \]
will be approximately standard normal. [Recall: the t-distribution becomes approximately N[0, 1] as the
degrees of freedom increase.]
 Example:
A machine putting cereal into boxes should be set so that the average content of each box weighs 510 g.
The machine is serviced, after which the weight of cereal in a random sample of 38 boxes is checked. The
sample mean was 513.4 g and the sample variance was 67.8 g$^2$.
Test to see if there has been a change to the average content of the boxes.
 The hypotheses here are:
\[ H_0: \mu = 510 \quad \text{versus} \quad H_1: \mu \ne 510. \]
The observed value of the test statistic is:
\[ t = \frac{\bar{x} - \mu_0}{S / \sqrt{n}} = \frac{513.4 - 510}{\sqrt{67.8 / 38}} = 2.545. \]
As the sample size is large, the critical points for this test should be approximately those from a standard
normal. Thus, $z_{0.025} = 1.96$ and $z_{0.005} = 2.5758$. Since the observed value 2.545 lies between these, we can
reject the null hypothesis at the 5% level, but there is not quite enough evidence to reject it at the 1% level.
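The large-sample test for the cereal boxes needs only the summary statistics. The sketch below uses the normal approximation described above; the variable names and the helper `phi` are illustrative, and the p-value is an addition for comparison rather than something computed in the example.

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = 38          # sample size
x_bar = 513.4   # sample mean (g)
s2 = 67.8       # sample variance (g^2)
mu0 = 510.0     # mean under the null hypothesis (g)

z_obs = (x_bar - mu0) / math.sqrt(s2 / n)

# Two-sided test against the normal critical points quoted in the text.
reject_5pct = abs(z_obs) > 1.96
reject_1pct = abs(z_obs) > 2.5758

p_value = 2 * (1 - phi(abs(z_obs)))  # approximate two-sided p-value

print(f"z = {z_obs:.3f}, reject at 5%: {reject_5pct}, reject at 1%: {reject_1pct}")
print(f"approximate p-value = {p_value:.4f}")
```

The approximate p-value falls between 0.01 and 0.05, which matches rejecting at the 5% level but not at the 1% level.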
9.3 Hypothesis tests for the population variance
We here assume that $X_1, \ldots, X_n$ follow a normal distribution, $N[\mu, \sigma^2]$, where $\mu$ is unknown. We are now
interested in testing hypotheses about $\sigma^2$.
When finding a confidence interval for $\sigma^2$ when $\mu$ was unknown, we used the fact that
\[ \frac{(n-1) S^2}{\sigma^2} \sim \chi^2_{n-1}. \]
We will use this fact to define a hypothesis test for $\sigma^2$.
Suppose that the null hypothesis is $H_0: \sigma^2 = \sigma_0^2$. Our test statistic then is
\[ Y = \frac{(n-1) S^2}{\sigma_0^2}, \]
which has a chi-squared distribution with $n - 1$ degrees of freedom under H0.
Then, if H0 is true we would expect to observe values of Y in the main body of a $\chi^2_{n-1}$ distribution and if H0
is not true, then we might expect to observe Y in the tails of this distribution.
So, if we have $H_1: \sigma^2 \ne \sigma_0^2$, then we'll reject H0 if we observe
\[ y > \chi^2_{n-1, \alpha/2} \quad \text{or} \quad y < \chi^2_{n-1, 1-\alpha/2}. \]
One-sided alternative hypotheses can be tested by using the critical points $\chi^2_{n-1, \alpha}$ or $\chi^2_{n-1, 1-\alpha}$, as
appropriate.
Result: To test the hypotheses
\[ H_0: \sigma^2 = \sigma_0^2 \quad \text{versus} \quad H_1: \sigma^2 \ne \sigma_0^2 \]
when $\mu$ is unknown and $X_1, \ldots, X_n$ follow a normal distribution:
• use the test statistic $Y = \dfrac{(n-1) S^2}{\sigma_0^2}$;
• reject H0 at the $100\alpha\%$ level if $y > \chi^2_{n-1, \alpha/2}$ or $y < \chi^2_{n-1, 1-\alpha/2}$.
Adjust for 1-tailed tests accordingly.
 Example:
Historically it is known that the journey time between 2 points is normally distributed with a standard
deviation of 6 minutes. After roadworks a sample of 10 journey times is found to have a sample standard
deviation of 5 mins. Is there evidence of a change in the population variance?
 Want to test
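\[ H_0: \sigma^2 = 36 \quad \text{versus} \quad H_1: \sigma^2 \ne 36. \]
We've observed
\[ y = \frac{9 \times 25}{36} = 6.25. \]
If H0 is true, $Y \sim \chi^2_{n-1} = \chi^2_9$. For a 5% test, the appropriate critical points are
\[ \chi^2_{9, 0.975} = 2.70 \quad \text{and} \quad \chi^2_{9, 0.025} = 19.02. \]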
Our observed test statistic lies between these critical points. Therefore, there is no evidence at the 5%
significance level to reject H0, and so no evidence of a change in the population variance.
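The decision rule for the variance test can be written as a short function. This is a sketch only: the function name `variance_test` and its parameters are hypothetical, and the critical points are passed in as read from chi-squared tables (here the values 2.70 and 19.02 quoted above) rather than computed.

```python
def variance_test(n: int, s2: float, sigma0_sq: float,
                  chi2_lower: float, chi2_upper: float) -> tuple[float, bool]:
    """Return the statistic y = (n - 1) s^2 / sigma0^2 and whether H0 is rejected.

    chi2_lower and chi2_upper are the tabulated critical points of the
    chi-squared distribution with n - 1 degrees of freedom (two-sided test).
    """
    y = (n - 1) * s2 / sigma0_sq
    reject = y < chi2_lower or y > chi2_upper
    return y, reject

# Journey-time example: n = 10, sample s.d. 5 mins, hypothesised s.d. 6 mins.
y, reject_h0 = variance_test(n=10, s2=5.0**2, sigma0_sq=6.0**2,
                             chi2_lower=2.70, chi2_upper=19.02)

print(f"y = {y:.2f}, reject H0 at 5%: {reject_h0}")
```

For a one-tailed test, a single critical point would be compared against y instead of the pair used here.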