Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Statistics for EES 3. From standard error to t-Test Dirk Metzler http://evol.bio.lmu.de/_statgen May 14, 2013 1 / 31 Contents 1 The Normal Distribution 2 The t-Test for paired samples Example: Orientation of pied flycatchers 2 / 31 The Normal Distribution Contents 1 The Normal Distribution 2 The t-Test for paired samples Example: Orientation of pied flycatchers 3 / 31 The Normal Distribution Density of the normal distribution 0.4 0.1 0.2 0.3 µ+σ 0.0 Normaldichte µ µ+σ −1 0 1 2 3 4 5 4 / 31 The Normal Distribution Density of the normal distribution 0.4 0.2 0.3 µ+σ 0.0 0.1 Normaldichte µ µ+σ −1 0 1 2 3 4 5 The normal distribution is also called Gauß distribution 4 / 31 The Normal Distribution Density of the normal distribution The normal distribution is also called Gauß distribution (after Carl Friedrich Gauß, 1777-1855) 4 / 31 0.025 The Normal Distribution 0.020 ●●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.015 ● ● ● ● ● ● ● ● ● ● 0.005 0.010 ● 0.000 dbinom(400:600, 1000, 0.5) ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 400 450 ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●●● ●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 500 550 600 400:600 5 / 31 0.025 The Normal Distribution 0.020 ●●● ●● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.015 ● ● ● ● ● ● ● ● ● ● 0.005 0.010 ● 0.000 dbinom(400:600, 1000, 0.5) ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●●● ● ● ● ● ● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 400 450 ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●● ●● ●●● ●●●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●● 500 550 600 400:600 5 / 31 The Normal Distribution Density of standard normal distribution A random variable Z with density x2 1 f (x) = √ · e− 2 2π 0.0 0.1 0.2 0.3 0.4 “Gaussian bell” −3 −2 −1 0 1 2 3 is called standard normally distributed. 6 / 31 The Normal Distribution Density of standard normal distribution A random variable Z with density x2 1 f (x) = √ · e− 2 2π “Gaussian bell” 0.0 0.1 0.2 0.3 0.4 for short: Z ∼ N (0, 1) −3 −2 −1 0 1 2 3 is called standard normally distributed. 6 / 31 The Normal Distribution Density of standard normal distribution A random variable Z with density x2 1 f (x) = √ · e− 2 2π “Gaussian bell” 0.2 0.3 0.4 for short: Z ∼ N (0, 1) 0.1 EZ = 0 0.0 Var Z = 1 −3 −2 −1 0 1 2 3 is called standard normally distributed. 6 / 31 The Normal Distribution If Z is N (0, 1) distributed, then X = σ · Z + µ is normally distributed with mean µ and variance σ 2 , for short: X ∼ N (0, 1) 7 / 31 The Normal Distribution If Z is N (0, 1) distributed, then X = σ · Z + µ is normally distributed with mean µ and variance σ 2 , for short: X ∼ N (0, 1) X has the density f (x) = √ 1 2πσ ·e − (x−µ)2 2σ 2 . 7 / 31 The Normal Distribution Keep in mind If Z ∼ N (µ, σ 2 ), then: Pr(|Z − µ| > σ) < 1 3 8 / 31 The Normal Distribution Keep in mind If Z ∼ N (µ, σ 2 ), then: Pr(|Z − µ| > σ) < 31 Pr(|Z − µ| > 2 · σ) < 0.05 8 / 31 The Normal Distribution Keep in mind If Z ∼ N (µ, σ 2 ), then: Pr(|Z − µ| > σ) < 31 Pr(|Z − µ| > 2 · σ) < 0.05 Pr(|Z − µ| > 3 · σ) < 0.003 8 / 31 The Normal Distribution 2πσ ·e − (x−µ)2 2σ 2 0.00 0.05 0.10 0.15 0.20 0.25 f (x) = √ 1 0 2 µ−σ 4 µ µ+σ 6 8 10 9 / 31 The Normal Distribution Densities need integrals 0.00 0.05 0.10 0.15 0.20 0.25 If Z is a random variable with density f (x) 0 2 4 6 a 8 b 10 , then Pr(Z ∈ [a, b]) = 10 / 31 The Normal Distribution Densities need integrals 0.00 0.05 0.10 0.15 0.20 0.25 If Z is a random variable with density f (x) 0 2 4 6 8 a b then Z Pr(Z ∈ [a, b]) = 10 , b f (x)dx. a 10 / 31 The Normal Distribution Example: Let Z ∼ N (µ = 5, σ 2 = 2.25). Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5) 11 / 31 The Normal Distribution Example: Let Z ∼ N (µ = 5, σ 2 = 2.25). Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5) Computation of Pr(Z ∈ [3, 4]): 11 / 31 The Normal Distribution Example: Let Z ∼ N (µ = 5, σ 2 = 2.25). Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5) Computation of Pr(Z ∈ [3, 4]): Pr(Z ∈ [3, 4]) = Pr(Z < 4) − Pr(Z < 3) 11 / 31 The Normal Distribution Example: Let Z ∼ N (µ = 5, σ 2 = 2.25). Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5) Computation of Pr(Z ∈ [3, 4]): Pr(Z ∈ [3, 4]) = Pr(Z < 4) − Pr(Z < 3) > pnorm(4,mean=5,sd=1.5)-pnorm(3,mean=5,sd=1.5) 11 / 31 The Normal Distribution Example: Let Z ∼ N (µ = 5, σ 2 = 2.25). Pr(Z < a) can be computed with pnorm(a,mean=5,sd=1.5) Computation of Pr(Z ∈ [3, 4]): Pr(Z ∈ [3, 4]) = Pr(Z < 4) − Pr(Z < 3) > pnorm(4,mean=5,sd=1.5)-pnorm(3,mean=5,sd=1.5) [1] 0.1612813 11 / 31 The Normal Distribution Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)? 12 / 31 The Normal Distribution Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)? Answer: For each x ∈ R holds Pr(Z = x) = 0 12 / 31 The Normal Distribution Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)? Answer: For each x ∈ R holds Pr(Z = x) = 0 P How about EZ = a a · Pr(Z = a) ? 12 / 31 The Normal Distribution Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)? Answer: For each x ∈ R holds Pr(Z = x) = 0 P How about EZ = a a · Pr(Z = a) ? Needs a new definition: Z ∞ x · f (x)dx EZ = −∞ 12 / 31 The Normal Distribution Let Z ∼ N (µ, σ 2 ). How to compute Pr(Z = 5)? Answer: For each x ∈ R holds Pr(Z = x) = 0 P How about EZ = a a · Pr(Z = a) ? Needs a new definition: Z ∞ x · f (x)dx EZ = −∞ But we already know the result EZ = µ. 12 / 31 The t-Test for paired samples Contents 1 The Normal Distribution 2 The t-Test for paired samples Example: Orientation of pied flycatchers 13 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is A significantly different from B? 14 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is A significantly different from B? No: A is in 2/3 range of µB 14 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is B significantly different from C? 14 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is B significantly different from C? No: 2/3 ranges of µB and µC overlap 14 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is C significantly different from D? 14 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is C significantly different from D? We do not know yet. 14 / 31 The t-Test for paired samples sample means ± standard errors A B C D Is C significantly different from D? We do not know yet. (Answer will be “no”.) 14 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Contents 1 The Normal Distribution 2 The t-Test for paired samples Example: Orientation of pied flycatchers 15 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Pied flycatchers (Ficedula hypoleuca) Foto (c) Simon Eugster http://en.wikipedia.org/wiki/File:Ficedula hypoleuca NRM.jpg 16 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Wiltschko, W.; Gesson, M.; Stapput, K.; Wiltschko, R. Light-dependent magnetoreception in birds: interaction of at least two different receptors. Naturwissenschaften 91.3, pp. 130-4, 2004. Wiltschko, R.; Ritz, T.; Stapput, K.; Thalau, P.; Wiltschko, W. Two different types of light-dependent responses to magnetic fields in birds. Curr Biol 15.16, pp. 1518-23, 2005. Wiltschko, R.; Stapput, K.; Bischof, H. J.; Wiltschko, W. Light-dependent magnetoreception in birds: increasing intensity of monochromatic light changes the nature of the response. Front Zool, 4, 2007. 17 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Direction of flight with blue light. 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Direction of flight with blue light. Another flight of the same bird with blue light. 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers All flight’s directions of this bird in blue light. 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers All flight’s directions of this bird in blue light. 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers All flight’s directions of this bird in blue light. corresponding exit points 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Directions of this bird’s flights in green light. 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Directions of this bird’s flights in green light. corresponding exit points 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers corresponding exit points arrow heads: barycenter of exit points for green light 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers arrow heads: barycenter of exit points for green light 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers arrow heads: barycenter of exit points for green light the same for “blue” directions 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers arrow heads: barycenter of exit points for green light the same for “blue” directions the more variable the directions the shorter the arrows! 18 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Question of interest Does the color of monochromatic light influence the orientation? 19 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Question of interest Does the color of monochromatic light influence the orientation? Experiment: For 17 birds compare barycenter vector lengths for blue and green light. 19 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers 0.5 Pies flycathchers : barycenter vector lengths for green light and for blue light b, n=17 0.4 ● 0.3 ● ● 0.2 ● ● ● ● ● ● ● ● 0.1 ● ● ● ● 0.0 with green light ● 0.0 0.1 0.2 0.3 0.4 0.5 with blue light 20 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Compute for each bird the distance to the diagonal, 21 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Compute for each bird the distance to the diagonal, i.e. x := “green length” − “blue length” 21 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Compute for each bird the distance to the diagonal, i.e. x := “green length” − “blue length” −0.05 0.00 0.05 0.10 0.15 0.20 21 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) 22 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) x = 0.0518 22 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) x = 0.0518 s = 0.0912 22 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) x = 0.0518 s = 0.0912 s SEM = √ n 22 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) x = 0.0518 s = 0.0912 s 0.912 SEM = √ = √ n 17 22 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) x = 0.0518 s = 0.0912 s 0.912 SEM = √ = √ = 0.022 n 17 22 / 31 The t-Test for paired samples −0.05 0.00 0.05 0.10 Example: Orientation of pied flycatchers 0.15 0.20 Can the theoretical mean be µ = 0? (i.e. the expectation value for a randomly chosen bird) x = 0.0518 s = 0.0912 s 0.912 SEM = √ = √ = 0.022 n 17 22 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Is |x − µ| ≈ 0.0518 a large deviation? 23 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Is |x − µ| ≈ 0.0518 a large deviation? Large? 23 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Is |x − µ| ≈ 0.0518 a large deviation? Large? Large compared to what? 23 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers Is |x − µ| ≈ 0.0518 a large deviation? Large? Large compared to what? Large compared the the standard error? 23 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers |x − µ| 0.0518 √ ≈ ≈ 2.34 0.022 s/ n 24 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers |x − µ| 0.0518 √ ≈ ≈ 2.34 0.022 s/ n So, x is more than 2.3 standard deviations away from µ = 0. 24 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers |x − µ| 0.0518 √ ≈ ≈ 2.34 0.022 s/ n So, x is more than 2.3 standard deviations away from µ = 0. How probable is this when 0 is the true theoretical mean? 24 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers |x − µ| 0.0518 √ ≈ ≈ 2.34 0.022 s/ n So, x is more than 2.3 standard deviations away from µ = 0. How probable is this when 0 is the true theoretical mean? in other words: Is this deviation significant? 24 / 31 The t-Test for paired samples We know: Example: Orientation of pied flycatchers x −µ √ σ/ n is asymptotically (for large n) standard-normally distributed. 25 / 31 The t-Test for paired samples We know: Example: Orientation of pied flycatchers x −µ √ σ/ n is asymptotically (for large n) standard-normally distributed. σ is the theoretical standard deviation, but we do not know it. 25 / 31 The t-Test for paired samples We know: Example: Orientation of pied flycatchers x −µ √ σ/ n is asymptotically (for large n) standard-normally distributed. σ is the theoretical standard deviation, but we do not know it. Instead of σ we use the sample standard deviation s. 25 / 31 The t-Test for paired samples We know: Example: Orientation of pied flycatchers x −µ √ σ/ n is asymptotically (for large n) standard-normally distributed. σ is the theoretical standard deviation, but we do not know it. Instead of σ we use the sample standard deviation s. We have applied the “2/3 rule of thumb” which is based on the normal distribution. 25 / 31 The t-Test for paired samples We know: Example: Orientation of pied flycatchers x −µ √ σ/ n is asymptotically (for large n) standard-normally distributed. σ is the theoretical standard deviation, but we do not know it. Instead of σ we use the sample standard deviation s. We have applied the “2/3 rule of thumb” which is based on the normal distribution. For questions about the significance this approximation is too imprecise if σ is replaced by s (and n is not extremely large). 25 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers General Rule If X1 , . . . , Xn are independently drawn from a normal distributed with mean µ and n s2 = 1 X (Xi − X )2 , n−1 i=1 then X −µ √ s/ n is t-distributed with n − 1 degrees of freedom (df). 26 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers General Rule If X1 , . . . , Xn are independently drawn from a normal distributed with mean µ and n s2 = 1 X (Xi − X )2 , n−1 i=1 then X −µ √ s/ n is t-distributed with n − 1 degrees of freedom (df). The t-distribution is also called Student-distribution, since Gosset published it using this synonym. 26 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers 0.2 0.0 0.1 density 0.3 0.4 Density of the t-distribution −4 −2 0 2 4 Density of t-distribution with 16 df 27 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers 0.2 0.0 0.1 density 0.3 0.4 Density of the t-distribution −4 −2 0 2 4 Density of t-distribution with 16 df Density of standard normal distribution 27 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers 0.2 0.0 0.1 density 0.3 0.4 Density of the t-distribution −4 −2 0 2 4 Density of t-distribution with 16 df Density of standard normal distribution µ = 0 and σ 2 = 16 14 27 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers 0.2 0.0 0.1 density 0.3 Density of the t-distribution −4 −2 0 2 4 Density of t-distribution with 5 df Density of normal distribution with µ = 0 and σ 2 = 5 3 27 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers How (im)probable is a deviation by 2.35 standard errors? Pr(T = 2.34) = 28 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers How (im)probable is a deviation by 2.35 standard errors? Pr(T = 2.34) = 0 28 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers How (im)probable is a deviation by at least 2.35 standard errors? Pr(T = 2.34) = 0 does not help! 28 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers How (im)probable is a deviation by at least 2.35 standard errors? Pr(T = 2.34) = 0 does not help! 0.2 0.1 0.0 density 0.3 0.4 Compute Pr(T ≥ 2.34), the so-called p-value. −4 −2 −2.34 0 2 4 2.34 28 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers How (im)probable is a deviation by at least 2.35 standard errors? Pr(T = 2.34) = 0 does not help! 0.1 0.2 The total area magenta-colored is the p-value. of the regions 0.0 density 0.3 0.4 Compute Pr(T ≥ 2.34), the so-called p-value. −4 −2 −2.34 0 2 4 2.34 28 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers R does that for us: > pt(-2.34,df=16)+pt(2.34,df=16,lower.tail=FALSE) [1] 0.03257345 29 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers R does that for us: > pt(-2.34,df=16)+pt(2.34,df=16,lower.tail=FALSE) [1] 0.03257345 Comparison with normal distribution: > pnorm(-2.34)+pnorm(2.34,lower.tail=FALSE) [1] 0.01928374 > pnorm(-2.34,sd=16/14)+ + pnorm(2.34,sd=16/14,lower.tail=FALSE) [1] 0.04060902 29 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers t-Test with R > x <- length$green-length$blue > t.test(x) One Sample t-test data: x t = 2.3405, df = 16, p-value = 0.03254 alternative hypothesis: true mean is not equal to 0 95 percent confidence interval: 0.004879627 0.098649784 sample estimates: mean of x 0.05176471 30 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers We note: p − Wert = 0.03254 31 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers We note: p − Wert = 0.03254 i.e.: If the Null hypothesis “just random”, in our case µ = 0 holds, any deviation like the observed one or even more is very improbable. 31 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers We note: p − Wert = 0.03254 i.e.: If the Null hypothesis “just random”, in our case µ = 0 holds, any deviation like the observed one or even more is very improbable. If we always reject the null hypothesis if and only if the p-value is below a level of significance of 0.05, then we can say following holds: 31 / 31 The t-Test for paired samples Example: Orientation of pied flycatchers We note: p − Wert = 0.03254 i.e.: If the Null hypothesis “just random”, in our case µ = 0 holds, any deviation like the observed one or even more is very improbable. If we always reject the null hypothesis if and only if the p-value is below a level of significance of 0.05, then we can say following holds: If the null hypothesis holds, then the probability that we reject it, is only 0.05. 31 / 31