LECTURE 2: HYPOTHESIS TESTING

Distributions (continued); Maximum Likelihood; Parametric hypothesis tests (chi-squared goodness of fit, t-test, F-test)

Supplementary Readings: Wilks, chapters 4, 5; Bevington, P.R., Robinson, D.K., Data Reduction and Error Analysis for the Physical Sciences, McGraw-Hill, 1992.

Gamma Distribution

$$P(x) = \frac{(x/\beta)^{\alpha-1}\,\exp(-x/\beta)}{\beta\,\Gamma(\alpha)}$$

$\beta$: scale parameter
$\alpha$: shape parameter

The Gamma distribution is a more general form of the Chi-Squared distribution. Setting $\alpha = N/2$ and $\beta = 2$ gives

$$P_N(x) = \frac{x^{(N-2)/2}\,\exp(-x/2)}{\Gamma(N/2)\,2^{N/2}}$$

Beta Distribution

$$P(x) = \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\,x^{p-1}\,(1-x)^{q-1}$$

Lognormal Distribution

[Figure: Example, Science]

Hypothesis Testing

Test Statistic
Null Hypothesis ($H_0$)
Alternative Hypothesis ($H_A$)

Gaussian Series?

Mean (an estimate of $\mu$?):

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Standard deviation (an estimate of $\sigma$?):

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

Variance (an estimate of $\sigma^2$?): $s^2$

[Figure: NINO3 (90-150W, 5S-5N) series]
[Figure: Histogram of the series. Gaussian?]

Gaussian Distribution (cont)

Z is a test statistic! Here Z denotes the standardized variable, conventionally $Z = (x - \mu)/\sigma$.

How do we invoke the Gaussian null hypothesis? Can we use $P_G$ alone?

A more readily applicable form of the Gaussian null hypothesis is provided by the integral of the Gaussian distribution.

Two-sided (two-tailed) test! For example, p = 0.05.

Central Limit Theorem

For a sum of a large number of arbitrary independent, identically distributed (IID) quantities, the PDF of the sum approaches a Gaussian distribution. Why?

Consequence: the distribution of a mean quantity is approximately Gaussian for a large enough sample size.

Method of Maximum Likelihood

The most probable value of the statistic of interest is the value at which the joint probability distribution peaks.
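The statement that the most probable value sits at the peak of the joint probability distribution can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are synthetic Gaussian draws, sigma is treated as known, and the names `gaussian_log_likelihood` and `grid_peak` are hypothetical helpers, not part of the lecture. It scans a grid of candidate means and picks the one with the largest log of the joint probability.

```python
import math
import random


def gaussian_log_likelihood(data, mu, sigma):
    """Log of the joint Gaussian probability of `data` for given mu and sigma."""
    n = len(data)
    return (-n * math.log(sigma)
            - n * math.log(math.sqrt(2.0 * math.pi))
            - 0.5 * sum(((x - mu) / sigma) ** 2 for x in data))


def grid_peak(data, sigma, mu_grid):
    """Return the candidate mu on the grid where the log-likelihood is largest."""
    return max(mu_grid, key=lambda mu: gaussian_log_likelihood(data, mu, sigma))


# Assumption: synthetic data drawn from a Gaussian with mean 2 and sigma 1.
random.seed(0)
data = [random.gauss(2.0, 1.0) for _ in range(200)]

# Candidate means from 1.0 to 3.0 in steps of 0.001.
mu_grid = [1.0 + 0.001 * i for i in range(2001)]

mu_hat = grid_peak(data, sigma=1.0, mu_grid=mu_grid)
sample_mean = sum(data) / len(data)
print("grid peak:", mu_hat, "sample mean:", sample_mean)
```

For a Gaussian with fixed sigma the log-likelihood is a downward parabola in mu, so the grid peak should agree with the sample mean to within the grid spacing.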
Consider the Gaussian distribution. The joint probability of N independent measurements is

$$P(x_1,\ldots,x_N) = \prod_{i=1}^{N} P_i = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{N} \exp\left[-\frac{1}{2}\sum_{i=1}^{N}\left(\frac{x_i-\mu}{\sigma}\right)^2\right]$$

The most probable values of $\mu$ and $\sigma$ are obtained by maximizing P with respect to these parameters.

Method of Maximum Likelihood

It is easiest to work with the log-likelihood function:

$$L(\mu,\sigma) = -N\ln\sigma - N\ln\sqrt{2\pi} - \frac{1}{2}\sum_{i=1}^{N}\left(\frac{x_i-\mu}{\sigma}\right)^2$$

We want to maximize L relative to the two parameters of interest:

$$\frac{\partial L(\mu,\sigma)}{\partial\mu} = 0, \qquad \frac{\partial L(\mu,\sigma)}{\partial\sigma} = 0$$

From the first condition:

$$\frac{\partial L}{\partial\mu} = \frac{1}{\sigma^2}\sum_{i=1}^{N}\left(x_i-\mu\right) = 0 \quad\Rightarrow\quad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \bar{x}$$

From the second condition:

$$\frac{\partial L}{\partial\sigma} = -\frac{N}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{N}\left(x_i-\mu\right)^2 = 0 \quad\Rightarrow\quad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i-\mu\right)^2$$

But we know that

$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2$$

Maximum likelihood estimates are often biased estimates!

Central Limit Theorem

What is the standard deviation in the mean

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i \; ?$$

Uncertainties of Gaussian distributed quantities add in quadrature:

$$\sigma_{\bar{x}}^2 = \sigma_{x_1}^2\left(\frac{\partial\bar{x}}{\partial x_1}\right)^2 + \sigma_{x_2}^2\left(\frac{\partial\bar{x}}{\partial x_2}\right)^2 + \cdots = N\,\sigma^2\left(\frac{1}{N}\right)^2 = \frac{\sigma^2}{N}$$

so that

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$$

Chi-Squared

$$\chi^2 = \sum_{i=1}^{N}\left(\frac{x_i-\mu}{\sigma}\right)^2$$

With this definition the joint Gaussian probability is

$$P(x_1,\ldots,x_N) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{N}\exp\left(-\frac{\chi^2}{2}\right)$$

and $x = \chi^2$ follows the Chi-Squared distribution

$$P_N(x) = \frac{x^{(N-2)/2}\,\exp(-x/2)}{\Gamma(N/2)\,2^{N/2}}$$

[Figure: Chi-Squared distribution, n = 5]

Mean of $\chi^2$: $\langle\chi^2\rangle = N$
Variance of $\chi^2$: $\sigma^2_{\chi^2} = 2N$

Reduced Chi-Squared

$$\chi^2_\nu = \chi^2/\nu$$

[Figure: Reduced Chi-Squared distribution]

Histogram

How do we determine if the observed histogram is consistent with a particular distribution (e.g. Gaussian)? "Goodness of fit":

$$\chi^2 = \sum_{i=1}^{N}\frac{\left(g_i-h_i\right)^2}{\sigma^2(h_i)}$$

Here $h_i$ is the observed histogram value in bin i, $g_i$ is the value given by the candidate distribution, and N is the number of bins.

What is $\sigma^2(h_i)$? Taking $\sigma^2(h_i) = h_i$:

$$\chi^2 = \sum_{i=1}^{N}\frac{\left(g_i-h_i\right)^2}{h_i}$$

Use the reduced Chi-Squared distribution, $\chi^2_\nu = \chi^2/\nu$, with

$\nu = N-2$ (sigma estimated from data)
$\nu = N-3$ (mu and sigma estimated from data)
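The histogram goodness-of-fit recipe above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the data are synthetic Gaussian draws, the bin edges are an arbitrary choice, `gaussian_cdf` and `histogram_chi_squared` are hypothetical helper names, and bins with zero observed count are skipped (an assumption made here because sigma^2(h_i) = h_i would otherwise put a zero in the denominator). Both mu and sigma are estimated from the data, so the sketch uses nu = N - 3.

```python
import math
import random


def gaussian_cdf(x, mu, sigma):
    """Cumulative Gaussian probability up to x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def histogram_chi_squared(data, edges):
    """Return (chi2, nu, chi2 / nu) for a Gaussian fitted to `data`.

    h_i: observed count in bin i; g_i: count expected from the fitted Gaussian.
    Uses sigma^2(h_i) = h_i and nu = N - 3 (mu and sigma estimated from data).
    """
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / (n - 1))

    nbins = len(edges) - 1
    h = [0] * nbins
    for x in data:
        for i in range(nbins):
            if edges[i] <= x < edges[i + 1]:
                h[i] += 1
                break  # values outside the edges are simply not counted

    g = [n * (gaussian_cdf(edges[i + 1], mu, sigma)
              - gaussian_cdf(edges[i], mu, sigma)) for i in range(nbins)]

    # Assumption: skip empty bins to avoid dividing by h_i = 0.
    used = [(gi, hi) for gi, hi in zip(g, h) if hi > 0]
    chi2 = sum((gi - hi) ** 2 / hi for gi, hi in used)
    nu = len(used) - 3
    return chi2, nu, chi2 / nu


# Assumption: synthetic standard-normal data and ten bins of width 0.5.
random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(2000)]
edges = [-2.5 + 0.5 * i for i in range(11)]

chi2, nu, reduced = histogram_chi_squared(data, edges)
print("chi2 =", chi2, "nu =", nu, "reduced chi2 =", reduced)
```

A reduced Chi-Squared value near 1 suggests the histogram is consistent with the candidate distribution; values much larger than 1 suggest it is not.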