
Statistical inference

Definition: generalization from a sample to a population.

Two cases:
- Does a sample belong to a hypothetical population?
- Do two samples belong to the same hypothetical population?

Statistical inference: 1st possibility

[Figure: one sample with mean $\bar{x} = 96$ next to a hypothetical population with mean $\mu = 100$.]
Inference: does the sample come from a population whose mean is 100, or from a population whose mean differs from 100?

Statistical inference: 2nd possibility

[Figure: two samples with means $\bar{x}_1 = 104$ and $\bar{x}_2 = 110$ next to a hypothetical population with mean $\mu = 100$.]
Inference: $\mu_1 - \mu_2 = 0$, or $\mu_1 - \mu_2 \neq 0$?

Hypotheses

Case 1 (one sample):
$H_0: \mu = k$
$H_1: \mu \neq k$
$H_0$ = null hypothesis, $H_1$ = alternative hypothesis, $\mu$ = mean of the population, $k$ = constant.

Case 2 (two samples):
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
$H_0$ = null hypothesis, $H_1$ = alternative hypothesis, $\mu_1$ = mean of the first population, $\mu_2$ = mean of the second population.

We test $H_0$.

Decision

- From the sample(s) we decide whether or not to reject the null hypothesis.
- When we make an inference, we are never certain that we made the right decision.

                        Population
Sample decision     Identical    Different
Identical           Good         Error 2
Different           Error 1      Good

Two types of error:
- Error 1: we inferred that two groups belong to two different populations when they do not. We rejected $H_0$ when $H_0$ was true.
- Error 2: we inferred that two groups belong to the same population when they do not. We kept $H_0$ when $H_0$ was false.

1- Inference about the mean of a population

Sampling Distribution of the Mean

[Figure: samples of size $n$ are drawn from the population; their means $\bar{x}_1, \bar{x}_2, \ldots$ form the sampling distribution of the mean. Population: $\mu = 72$, $\sigma = 3$. Sampling distribution: $\mu_{\bar{x}} = 72$, $\sigma_{\bar{x}} = ?$]

Characteristics:
- It follows a normal curve.
- Its mean is equal to the mean of the population.
- Its standard deviation, called the standard error, is equal to $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

The larger the sample size, the smaller the standard error.
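The characteristics listed above can be checked with a small simulation. The sketch below is only an illustration: it assumes a normally distributed population (the slides do not state the population's shape), uses the slide values $\mu = 72$ and $\sigma = 3$, and the helper name `simulate_sample_means`, the seed, and the choice of 10,000 samples per size are assumptions made for this example.

```python
import random
import statistics

# Hypothetical helper (not from the slides): draw n_samples samples of size n
# from an ASSUMED normal population and return the mean of each sample.
def simulate_sample_means(mu, sigma, n, n_samples=10_000, seed=0):
    rng = random.Random(seed)  # fixed seed so repeated runs give the same draws
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(n_samples)
    ]

MU, SIGMA = 72, 3  # population parameters used on the slides

# For each sample size, store (mean of the sample means, their standard deviation).
results = {}
for n in (9, 16, 36, 144):
    means = simulate_sample_means(MU, SIGMA, n)
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    theoretical_se = SIGMA / n ** 0.5  # sigma / sqrt(n)
    print(f"n={n:3d}  mean={results[n][0]:.4f}  "
          f"sd={results[n][1]:.4f}  sigma/sqrt(n)={theoretical_se:.4f}")
```

For each sample size, the mean of the simulated sample means should sit close to 72 and their standard deviation close to $\sigma / \sqrt{n}$, although the exact digits depend on the random draws.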
Sampling Distribution of the Mean: simulation results

Population: $\mu = 72$, $\sigma = 3$. For each sample size $n$, 10000 sample means ($\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{10000}$) were drawn.

n       Mean of the sample means    Standard deviation of the sample means
9       71.9958                     0.9959
16      71.9984                     0.74696
36      72.0146                     0.50165
144     72.0014                     0.24972

Test of Significance

- If we suppose that the null hypothesis is true, what is the probability of observing the given sample mean?
- If it is unlikely, we reject $H_0$; otherwise we keep $H_0$.
- Unlikely: 5% or 1% = $\alpha$ = significance threshold.

$z_{\bar{x}} = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$

Test of Significance
Example: one-sided

- $H_0: \mu = 72$
- $H_1: \mu < 72$ (based on previous studies)
- $\alpha = 0.05$ (5%), so $z_\alpha = 1.65$
- $\sigma = 9$, $\bar{x} = 65$, $n = 36$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{9}{\sqrt{36}} = \frac{9}{6} = 1.5$

$z_{\bar{x}} = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{65 - 72}{1.5} = -4.67$

Because $z_{\bar{x}} = -4.67$ lies beyond the critical value in the direction of $H_1$ ($-4.67 < -1.65$), we reject the null hypothesis and accept the alternative hypothesis.

Test of Significance
Example: two-sided

- $H_0: \mu = 72$
- $H_1: \mu \neq 72$
- $\alpha = 0.05$ (5%), so $z_\alpha = 1.96$
- $\sigma = 9$, $\bar{x} = 68$, $n = 36$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{9}{\sqrt{36}} = \frac{9}{6} = 1.5$

$z_{\bar{x}} = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{68 - 72}{1.5} = -2.667$

Because the absolute value of $z_{\bar{x}}$ (2.667) is greater than $z_\alpha$ (1.96), we reject the null hypothesis and accept the alternative hypothesis.

Confidence intervals

We are never sure that the mean of our sample is exactly the real mean of the population. Therefore, instead of giving the mean only, it is possible to quantify our level of certainty by specifying a confidence interval around the mean.
$CI_{1-\alpha}: \bar{x} - z_\alpha \sigma_{\bar{x}} \leq \mu \leq \bar{x} + z_\alpha \sigma_{\bar{x}}$

Confidence intervals
Example: CI = 95%

- $\bar{x} = 50.7$, $n = 100$, $\sigma = 20$
- $\alpha = 1 - CI = 1 - 0.95 = 0.05$, so $z_\alpha = 1.96$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{100}} = \frac{20}{10} = 2$

$CI_{0.95}: 50.7 - 1.96 \times 2 \leq \mu \leq 50.7 + 1.96 \times 2$
$CI_{0.95}: 50.7 - 3.92 \leq \mu \leq 50.7 + 3.92$
$CI_{0.95}: 46.78 \leq \mu \leq 54.62$

Therefore, we can be 95% confident that the mean of the population is between 46.78 and 54.62.

Confidence intervals
Example: CI = 99%

- $\bar{x} = 50.7$, $n = 100$, $\sigma = 20$
- $\alpha = 1 - CI = 1 - 0.99 = 0.01$, so $z_\alpha = 2.58$

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{100}} = \frac{20}{10} = 2$

$CI_{0.99}: 50.7 - 2.58 \times 2 \leq \mu \leq 50.7 + 2.58 \times 2$
$CI_{0.99}: 50.7 - 5.16 \leq \mu \leq 50.7 + 5.16$
$CI_{0.99}: 45.54 \leq \mu \leq 55.86$

Therefore, we can be 99% confident that the mean of the population is between 45.54 and 55.86.

2- Inference for the difference between two population means

Distribution of sample mean differences

[Figure: pairs of samples of size $n$ are drawn from the population ($\mu = 72$, $\sigma = 3$); the differences $\bar{x}_1 - \bar{x}_2$ form the distribution of sample mean differences, with $\mu_{\bar{x}_1 - \bar{x}_2} = 0$ and $\sigma_{\bar{x}_1 - \bar{x}_2} = ?$]

Characteristics:
- It follows a normal distribution.
- Its mean is equal to 0 ($\mu_1 - \mu_2 = 0$).
- Its standard deviation, called the standard error of the mean difference, is equal to $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2}$.

Decision rule

$z_{\bar{x}_1 - \bar{x}_2} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}}$, because $(\mu_1 - \mu_2) = 0$.

Test of Significance
Example: What is the probability of the observed difference between the following groups?

- $H_0: \mu_1 = \mu_2$ ($\mu_1 - \mu_2 = 0$)
- $H_1: \mu_1 \neq \mu_2$ ($\mu_1 - \mu_2 \neq 0$)
- $\alpha = 0.05$ (5%)
- $\bar{x}_1 = 50$, $\bar{x}_2 = 48$
- $\sigma_1 = 5$, $\sigma_2 = 5$
- $n_1 = 36$, $n_2 = 36$

$\sigma_{\bar{x}_1} = \frac{\sigma_1}{\sqrt{n_1}} = \frac{5}{\sqrt{36}} = \frac{5}{6} = 0.833$

$\sigma_{\bar{x}_2} = \frac{\sigma_2}{\sqrt{n_2}} = \frac{5}{\sqrt{36}} = \frac{5}{6} = 0.833$

$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\left(\frac{5}{6}\right)^2 + \left(\frac{5}{6}\right)^2} = 1.18$

$z_{\bar{x}_1 - \bar{x}_2} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{50 - 48}{1.18} = 1.69$

Critical $z = 1.96$
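The arithmetic of the three z tests worked through above (one-sided, two-sided, and two-sample) can be reproduced with a short sketch. The function names `standard_error`, `z_score`, and `critical_z` are hypothetical helpers introduced for illustration, and the critical values are taken from Python's `statistics.NormalDist` rather than from the rounded table values (1.65 and 1.96) used on the slides.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical helpers (names are not from the slides).
def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def z_score(observed, expected, se):
    """Distance between observed and expected values in standard-error units."""
    return (observed - expected) / se

def critical_z(alpha, two_sided=True):
    """Positive critical value of the standard normal for threshold alpha."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)

# One-sided example: H0: mu = 72, H1: mu < 72, sigma = 9, n = 36, sample mean 65.
se = standard_error(9, 36)                               # 1.5
z_one = z_score(65, 72, se)                              # about -4.67
reject_one = z_one < -critical_z(0.05, two_sided=False)  # lower-tail test

# Two-sided example: H0: mu = 72, H1: mu != 72, sample mean 68.
z_two = z_score(68, 72, se)                              # about -2.67
reject_two = abs(z_two) > critical_z(0.05)

# Two-sample example: sample means 50 and 48, sigma = 5, n = 36 in both groups.
se_diff = sqrt(standard_error(5, 36) ** 2 + standard_error(5, 36) ** 2)
# Unrounded z is about 1.70; the slide value 1.69 uses the standard error rounded to 1.18.
z_diff = z_score(50 - 48, 0, se_diff)
reject_diff = abs(z_diff) > critical_z(0.05)

print(f"one-sided:  z = {z_one:.3f}, reject H0: {reject_one}")
print(f"two-sided:  z = {z_two:.3f}, reject H0: {reject_two}")
print(f"two-sample: z = {z_diff:.3f}, reject H0: {reject_diff}")
```

Comparing the absolute value of z with the critical value in the two-sided cases, and the signed value in the one-sided case, keeps the decision consistent with the direction stated in $H_1$.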
Because the observed $z$ (1.69) is lower than the critical value ($z_\alpha = 1.96$), we keep the null hypothesis.

Confidence intervals

$CI_{1-\alpha}: (\bar{x}_1 - \bar{x}_2) - z_\alpha \sigma_{\bar{x}_1 - \bar{x}_2} \leq \mu_1 - \mu_2 \leq (\bar{x}_1 - \bar{x}_2) + z_\alpha \sigma_{\bar{x}_1 - \bar{x}_2}$

Test of Significance
Example: a 95% confidence interval

- $H_0: \mu_1 = \mu_2$ ($\mu_1 - \mu_2 = 0$)
- $H_1: \mu_1 \neq \mu_2$ ($\mu_1 - \mu_2 \neq 0$)
- $\alpha = 0.05$ (5%), so $z_\alpha = 1.96$
- $\bar{x}_1 = 50$, $\bar{x}_2 = 48$
- $\sigma_1 = 5$, $\sigma_2 = 5$
- $n_1 = 36$, $n_2 = 36$
- $\sigma_{\bar{x}_1 - \bar{x}_2} = 1.18$

$CI_{0.95}: (50 - 48) - 1.96 \times 1.18 \leq \mu_1 - \mu_2 \leq (50 - 48) + 1.96 \times 1.18$
$CI_{0.95}: 2 - 2.3128 \leq \mu_1 - \mu_2 \leq 2 + 2.3128$
$CI_{0.95}: -0.3128 \leq \mu_1 - \mu_2 \leq 4.3128$

Therefore, we can be 95% confident that the mean difference between the populations is between -0.3128 and 4.3128. Because this interval contains 0, it agrees with the decision to keep the null hypothesis.
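The confidence intervals in this document all follow the same pattern, an estimate plus or minus $z_\alpha$ times a standard error, so they can be sketched with one small function. The name `confidence_interval` is a hypothetical helper, and the critical value comes from `statistics.NormalDist` instead of the rounded table values (1.96 and 2.58), so the results differ slightly from the slide figures in the last digits.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical helper (name is not from the slides): two-sided interval
# estimate +/- z * se for a given confidence level.
def confidence_interval(estimate, se, level=0.95):
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%, 2.58 for 99%
    margin = z * se
    return estimate - margin, estimate + margin

# One-sample examples: sample mean 50.7, sigma = 20, n = 100.
se_mean = 20 / sqrt(100)                               # 2.0
ci95 = confidence_interval(50.7, se_mean, 0.95)        # close to (46.78, 54.62)
ci99 = confidence_interval(50.7, se_mean, 0.99)        # close to (45.54, 55.86)

# Two-sample example: means 50 and 48, sigma = 5 and n = 36 in both groups.
se_diff = sqrt((5 / sqrt(36)) ** 2 + (5 / sqrt(36)) ** 2)
# Close to (-0.31, 4.31); the slides round the standard error to 1.18 first.
ci_diff = confidence_interval(50 - 48, se_diff, 0.95)

for label, (low, high) in (("95% mean", ci95), ("99% mean", ci99),
                           ("95% difference", ci_diff)):
    print(f"{label}: [{low:.2f}, {high:.2f}]")
```

An interval for the difference that contains 0 corresponds to keeping $H_0$ in the two-sided test at the same $\alpha$, which is the case for the two-sample example.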
