Download Statistics for EES 3. From standard error to t-Test

Document related concepts
no text concepts found
Transcript
Statistics for EES
3. From standard error to t-Test
Dirk Metzler
http://evol.bio.lmu.de/_statgen
May 14, 2013
1 / 31
Contents
1
The Normal Distribution
2
The t-Test for paired samples
Example: Orientation of pied flycatchers
2 / 31
The Normal Distribution
Contents
1
The Normal Distribution
2
The t-Test for paired samples
Example: Orientation of pied flycatchers
3 / 31
The Normal Distribution
Density of the normal distribution
0.4
0.1
0.2
0.3
µ+σ
0.0
Normaldichte
µ
µ+σ
−1
0
1
2
3
4
5
4 / 31
The Normal Distribution
Density of the normal distribution
0.4
0.2
0.3
µ+σ
0.0
0.1
Normaldichte
µ
µ+σ
−1
0
1
2
3
4
5
The normal distribution is also called Gauß distribution
4 / 31
The Normal Distribution
Density of the normal distribution
The normal distribution is also called Gauß distribution
(after Carl Friedrich Gauß, 1777-1855)
4 / 31
0.025
The Normal Distribution
0.020
●●●
●● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.015
●
●
●
●
●
●
●
●
●
●
0.005
0.010
●
0.000
dbinom(400:600, 1000, 0.5)
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●●●
●
●
●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
400
450
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●●●
●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
500
550
600
400:600
5 / 31
0.025
The Normal Distribution
0.020
●●●
●● ●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
0.015
●
●
●
●
●
●
●
●
●
●
0.005
0.010
●
0.000
dbinom(400:600, 1000, 0.5)
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●●●
●
●
●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
400
450
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●●●
●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
500
550
600
400:600
5 / 31
The Normal Distribution
Density of standard normal distribution
A random variable Z with density
x2
1
f (x) = √ · e− 2
2π
0.0
0.1
0.2
0.3
0.4
“Gaussian bell”
−3
−2
−1
0
1
2
3
is called standard normally distributed.
6 / 31
The Normal Distribution
Density of standard normal distribution
A random variable Z with density
x2
1
f (x) = √ · e− 2
2π
“Gaussian bell”
0.0
0.1
0.2
0.3
0.4
for short: Z ∼
N (0, 1)
−3
−2
−1
0
1
2
3
is called standard normally distributed.
6 / 31
The Normal Distribution
Density of standard normal distribution
A random variable Z with density
x2
1
f (x) = √ · e− 2
2π
“Gaussian bell”
0.2
0.3
0.4
for short: Z ∼
N (0, 1)
0.1
EZ = 0
0.0
Var Z = 1
−3
−2
−1
0
1
2
3
is called standard normally distributed.
6 / 31
The Normal Distribution
If Z is N (0, 1) distributed, then X = σ · Z + µ is normally
distributed with mean µ and variance σ 2 , for short:
X ∼ N (0, 1)
7 / 31
The Normal Distribution
If Z is N (0, 1) distributed, then X = σ · Z + µ is normally
distributed with mean µ and variance σ 2 , for short:
X ∼ N (0, 1)
X has the density
f (x) = √
1
2πσ
·e
−
(x−µ)2
2σ 2
.
7 / 31
The Normal Distribution
Keep in mind
If Z ∼ N (µ, σ 2 ), then:
Pr(|Z − µ| > σ) <
1
3
8 / 31
The Normal Distribution
Keep in mind
If Z ∼ N (µ, σ 2 ), then:
Pr(|Z − µ| > σ) < 31
Pr(|Z − µ| > 2 · σ) < 0.05
8 / 31
The Normal Distribution
Keep in mind
If Z ∼ N (µ, σ 2 ), then:
Pr(|Z − µ| > σ) < 31
Pr(|Z − µ| > 2 · σ) < 0.05
Pr(|Z − µ| > 3 · σ) < 0.003
8 / 31
The Normal Distribution
2πσ
·e
−
(x−µ)2
2σ 2
0.00
0.05
0.10
0.15
0.20
0.25
f (x) = √
1
0
2
µ−σ
4
µ µ+σ
6
8
10
9 / 31
The Normal Distribution
Densities need integrals
0.00
0.05
0.10
0.15
0.20
0.25
If Z is a random variable with density f (x)
0
2
4
6
a
8
b
10
,
then
Pr(Z ∈ [a, b]) =
10 / 31
The Normal Distribution
Densities need integrals
0.00
0.05
0.10
0.15
0.20
0.25
If Z is a random variable with density f (x)
0
2
4
6
8
a
b
then
Z
Pr(Z ∈ [a, b]) =
10
,
b
f (x)dx.
a
10 / 31
The Normal Distribution
Example: Let Z ∼ N (µ = 5, σ 2 = 2.25).
Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5)
11 / 31
The Normal Distribution
Example: Let Z ∼ N (µ = 5, σ 2 = 2.25).
Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5)
Computation of Pr(Z ∈ [3, 4]):
11 / 31
The Normal Distribution
Example: Let Z ∼ N (µ = 5, σ 2 = 2.25).
Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5)
Computation of Pr(Z ∈ [3, 4]):
Pr(Z ∈ [3, 4]) = Pr(Z < 4) − Pr(Z < 3)
11 / 31
The Normal Distribution
Example: Let Z ∼ N (µ = 5, σ 2 = 2.25).
Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5)
Computation of Pr(Z ∈ [3, 4]):
Pr(Z ∈ [3, 4]) = Pr(Z < 4) − Pr(Z < 3)
> pnorm(4,mean=5,sd=1.5)-pnorm(3,mean=5,sd=1.5)
11 / 31
The Normal Distribution
Example: Let Z ∼ N (µ = 5, σ 2 = 2.25).
Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5)
Computation of Pr(Z ∈ [3, 4]):
Pr(Z ∈ [3, 4]) = Pr(Z < 4) − Pr(Z < 3)
> pnorm(4,mean=5,sd=1.5)-pnorm(3,mean=5,sd=1.5)
[1] 0.1612813
11 / 31
The Normal Distribution
Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)?
12 / 31
The Normal Distribution
Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)?
Answer: For each x ∈ R holds Pr(Z = x) = 0
12 / 31
The Normal Distribution
Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)?
Answer: For each x ∈ R holds Pr(Z = x) = 0
P
How about EZ = a a · Pr(Z = a) ?
12 / 31
The Normal Distribution
Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)?
Answer: For each x ∈ R holds Pr(Z = x) = 0
P
How about EZ = a a · Pr(Z = a) ?
Needs a new definition:
Z
∞
x · f (x)dx
EZ =
−∞
12 / 31
The Normal Distribution
Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)?
Answer: For each x ∈ R holds Pr(Z = x) = 0
P
How about EZ = a a · Pr(Z = a) ?
Needs a new definition:
Z
∞
x · f (x)dx
EZ =
−∞
But we already know the result EZ = µ.
12 / 31
The t-Test for paired samples
Contents
1
The Normal Distribution
2
The t-Test for paired samples
Example: Orientation of pied flycatchers
13 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is A significantly different from B?
14 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is A significantly different from B?
No: A is in 2/3 range of µB
14 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is B significantly different from C?
14 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is B significantly different from C?
No: 2/3 ranges of µB and µC overlap
14 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is C significantly different from D?
14 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is C significantly different from D?
We do not know yet.
14 / 31
The t-Test for paired samples
sample means ± standard errors
A
B
C
D
Is C significantly different from D?
We do not know yet.
(Answer will be “no”.)
14 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Contents
1
The Normal Distribution
2
The t-Test for paired samples
Example: Orientation of pied flycatchers
15 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Pied flycatchers (Ficedula hypoleuca)
Foto (c) Simon Eugster
http://en.wikipedia.org/wiki/File:Ficedula hypoleuca NRM.jpg
16 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Wiltschko, W.; Gesson, M.; Stapput, K.; Wiltschko, R.
Light-dependent magnetoreception in birds: interaction of at least
two different receptors.
Naturwissenschaften 91.3, pp. 130-4, 2004.
Wiltschko, R.; Ritz, T.; Stapput, K.; Thalau, P.;
Wiltschko, W.
Two different types of light-dependent
responses to magnetic fields in birds.
Curr Biol 15.16, pp. 1518-23, 2005.
Wiltschko, R.; Stapput, K.; Bischof, H. J.;
Wiltschko, W.
Light-dependent magnetoreception in birds:
increasing intensity of monochromatic light
changes the nature of the response.
Front Zool, 4, 2007.
17 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Direction of flight with blue light.
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Direction of flight with blue light.
Another flight of the same bird with
blue light.
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
All flight’s directions of this bird in
blue light.
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
All flight’s directions of this bird in
blue light.
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
All flight’s directions of this bird in
blue light.
corresponding exit points
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Directions of this bird’s flights in
green light.
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Directions of this bird’s flights in
green light.
corresponding exit points
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
corresponding exit points
arrow heads: barycenter of exit
points for green light
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
arrow heads: barycenter of exit
points for green light
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
arrow heads: barycenter of exit
points for green light
the same for “blue” directions
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
arrow heads: barycenter of exit
points for green light
the same for “blue” directions
the more variable the directions the shorter the arrows!
18 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Question of interest
Does the color of monochromatic light
influence the orientation?
19 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Question of interest
Does the color of monochromatic light
influence the orientation?
Experiment: For 17 birds compare
barycenter vector lengths for blue and green
light.
19 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
0.5
Pies flycathchers :
barycenter vector lengths for
green light and for blue light b, n=17
0.4
●
0.3
●
●
0.2
●
●
●
●
●
●
●
●
0.1
●
●
●
●
0.0
with green light
●
0.0
0.1
0.2
0.3
0.4
0.5
with blue light
20 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Compute for each bird the distance to the
diagonal,
21 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Compute for each bird the distance to the
diagonal,
i.e.
x := “green length” − “blue length”
21 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Compute for each bird the distance to the
diagonal,
i.e.
x := “green length” − “blue length”
−0.05
0.00
0.05
0.10
0.15
0.20
21 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
22 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
x = 0.0518
22 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
x = 0.0518
s = 0.0912
22 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
x = 0.0518
s = 0.0912
s
SEM = √
n
22 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
x = 0.0518
s = 0.0912
s
0.912
SEM = √ = √
n
17
22 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
x = 0.0518
s = 0.0912
s
0.912
SEM = √ = √
= 0.022
n
17
22 / 31
The t-Test for paired samples
−0.05
0.00
0.05
0.10
Example: Orientation of pied flycatchers
0.15
0.20
Can the theoretical mean be µ = 0? (i.e. the expectation value
for a randomly chosen bird)
x = 0.0518
s = 0.0912
s
0.912
SEM = √ = √
= 0.022
n
17
22 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Is |x − µ| ≈ 0.0518 a large deviation?
23 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Is |x − µ| ≈ 0.0518 a large deviation?
Large?
23 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Is |x − µ| ≈ 0.0518 a large deviation?
Large? Large compared to what?
23 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
Is |x − µ| ≈ 0.0518 a large deviation?
Large? Large compared to what?
Large compared the the standard error?
23 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
|x − µ|
0.0518
√ ≈
≈ 2.34
0.022
s/ n
24 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
|x − µ|
0.0518
√ ≈
≈ 2.34
0.022
s/ n
So, x is more than 2.3 standard deviations away from µ = 0.
24 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
|x − µ|
0.0518
√ ≈
≈ 2.34
0.022
s/ n
So, x is more than 2.3 standard deviations away from µ = 0.
How probable is this when 0 is the true theoretical mean?
24 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
|x − µ|
0.0518
√ ≈
≈ 2.34
0.022
s/ n
So, x is more than 2.3 standard deviations away from µ = 0.
How probable is this when 0 is the true theoretical mean? in
other words:
Is this deviation significant?
24 / 31
The t-Test for paired samples
We know:
Example: Orientation of pied flycatchers
x −µ
√
σ/ n
is asymptotically (for large n) standard-normally distributed.
25 / 31
The t-Test for paired samples
We know:
Example: Orientation of pied flycatchers
x −µ
√
σ/ n
is asymptotically (for large n) standard-normally distributed.
σ is the theoretical standard deviation, but we do not know it.
25 / 31
The t-Test for paired samples
We know:
Example: Orientation of pied flycatchers
x −µ
√
σ/ n
is asymptotically (for large n) standard-normally distributed.
σ is the theoretical standard deviation, but we do not know it.
Instead of σ we use the sample standard deviation s.
25 / 31
The t-Test for paired samples
We know:
Example: Orientation of pied flycatchers
x −µ
√
σ/ n
is asymptotically (for large n) standard-normally distributed.
σ is the theoretical standard deviation, but we do not know it.
Instead of σ we use the sample standard deviation s. We have
applied the “2/3 rule of thumb” which is based on the normal
distribution.
25 / 31
The t-Test for paired samples
We know:
Example: Orientation of pied flycatchers
x −µ
√
σ/ n
is asymptotically (for large n) standard-normally distributed.
σ is the theoretical standard deviation, but we do not know it.
Instead of σ we use the sample standard deviation s. We have
applied the “2/3 rule of thumb” which is based on the normal
distribution.
For questions about the significance this approximation is too
imprecise if σ is replaced by s (and n is not extremely large).
25 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
General Rule
If X1 , . . . , Xn are independently drawn from a normal distributed
with mean µ and
n
s2 =
1 X
(Xi − X )2 ,
n−1
i=1
then
X −µ
√
s/ n
is t-distributed with n − 1 degrees of freedom (df).
26 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
General Rule
If X1 , . . . , Xn are independently drawn from a normal distributed
with mean µ and
n
s2 =
1 X
(Xi − X )2 ,
n−1
i=1
then
X −µ
√
s/ n
is t-distributed with n − 1 degrees of freedom (df).
The t-distribution is also called Student-distribution, since
Gosset published it using this synonym.
26 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
0.2
0.0
0.1
density
0.3
0.4
Density of the t-distribution
−4
−2
0
2
4
Density of t-distribution with 16 df
27 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
0.2
0.0
0.1
density
0.3
0.4
Density of the t-distribution
−4
−2
0
2
4
Density of t-distribution with 16 df
Density of standard normal distribution
27 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
0.2
0.0
0.1
density
0.3
0.4
Density of the t-distribution
−4
−2
0
2
4
Density of t-distribution with 16 df
Density of standard normal distribution µ = 0 and σ 2 =
16
14
27 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
0.2
0.0
0.1
density
0.3
Density of the t-distribution
−4
−2
0
2
4
Density of t-distribution with 5 df
Density of normal distribution with µ = 0 and σ 2 =
5
3
27 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
How (im)probable is a deviation by
2.35 standard errors?
Pr(T = 2.34) =
28 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
How (im)probable is a deviation by
2.35 standard errors?
Pr(T = 2.34) = 0
28 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
How (im)probable is a deviation by at least
2.35 standard errors?
Pr(T = 2.34) = 0
does not help!
28 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
How (im)probable is a deviation by at least
2.35 standard errors?
Pr(T = 2.34) = 0
does not help!
0.2
0.1
0.0
density
0.3
0.4
Compute Pr(T ≥ 2.34), the so-called p-value.
−4
−2
−2.34
0
2
4
2.34
28 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
How (im)probable is a deviation by at least
2.35 standard errors?
Pr(T = 2.34) = 0
does not help!
0.1
0.2
The total area
magenta-colored
is the p-value.
of the
regions
0.0
density
0.3
0.4
Compute Pr(T ≥ 2.34), the so-called p-value.
−4
−2
−2.34
0
2
4
2.34
28 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
R does that for us:
> pt(-2.34,df=16)+pt(2.34,df=16,lower.tail=FALSE)
[1] 0.03257345
29 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
R does that for us:
> pt(-2.34,df=16)+pt(2.34,df=16,lower.tail=FALSE)
[1] 0.03257345
Comparison with normal distribution:
> pnorm(-2.34)+pnorm(2.34,lower.tail=FALSE)
[1] 0.01928374
> pnorm(-2.34,sd=16/14)+
+ pnorm(2.34,sd=16/14,lower.tail=FALSE)
[1] 0.04060902
29 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
t-Test with R
> x <- length$green-length$blue
> t.test(x)
One Sample t-test
data: x
t = 2.3405, df = 16, p-value = 0.03254
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.004879627 0.098649784
sample estimates:
mean of x
0.05176471
30 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
We note:
p − Wert = 0.03254
31 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
We note:
p − Wert = 0.03254
i.e.: If the Null hypothesis “just random”, in our case µ = 0
holds, any deviation like the observed one or even more is very
improbable.
31 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
We note:
p − Wert = 0.03254
i.e.: If the Null hypothesis “just random”, in our case µ = 0
holds, any deviation like the observed one or even more is very
improbable.
If we always reject the null hypothesis if and only if the p-value is
below a level of significance of 0.05, then we can say following
holds:
31 / 31
The t-Test for paired samples
Example: Orientation of pied flycatchers
We note:
p − Wert = 0.03254
i.e.: If the Null hypothesis “just random”, in our case µ = 0
holds, any deviation like the observed one or even more is very
improbable.
If we always reject the null hypothesis if and only if the p-value is
below a level of significance of 0.05, then we can say following
holds:
If the null hypothesis holds, then the probability that we reject it,
is only 0.05.
31 / 31
Related documents