Statistical Data Analysis and
Simulation
João R. T. de Mello Neto
Jorge Andre Swieca School
Campos do Jordão, January 2003
Questions
• What is probability? How do we quantify it?
• What is the probability that something happens?
• What is the value of a given parameter?
• What is the uncertainty in a given parameter?
• Is this fit acceptable?
• What is the likelihood that a given signal is physics and not background?
• How does one separate signal from background?
Chance
The conception of chance enters into the very first
steps of scientific activity, in virtue of the fact that no
observation is absolutely correct.
Max Born
Natural Philosophy of Cause and Chance, p. 47
Chance is a devil and a god at the same time.
Machado de Assis
Lectures
• Basics: random variables, probability, distributions
• Random numbers, minimization techniques
• Maximum likelihood and chi-square methods
• Goodness of fit, limits
• Applications: pattern recognition in the LHCb muon system, sigma particle fitting in E791, Bayesian coin, …
First lecture
Basics: random variables,
probabilities and distributions
Jorge Andre Swieca School
Campos do Jordão, January 2003
References
• G. Cowan, Statistical Data Analysis, Oxford, 1998;
• R. Barlow, Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences, J. Wiley & Sons, 1989;
• W. L. Martinez, A. R. Martinez, Computational Statistics Handbook with MATLAB, Chapman & Hall, 2002.
Random Variables
• Random experiment: the outcome cannot be predicted with certainty
  (errors in the measuring process; fundamental unpredictability)
• Statistics: model and analyze the outcomes
• Sample space S = set of all possible outcomes
• Die: X = {1, 2, 3, 4, 5, 6} → discrete random variable
• Period of a pendulum → continuous random variable
Probability
• Quantifies the degree of randomness;
• Definition in terms of set theory:
S composed of events A (subsets of S);
P(A): a real number satisfying three axioms:
1. for every A, P(A) ≥ 0
2. if A ∩ B = Ø (disjoint), then P(A ∪ B) = P(A) + P(B)
3. P(S) = 1
Consequences:
P(Ā) = 1 − P(A)
P(Ø) = 0
P(A ∪ Ā) = 1
if A ⊂ B, then P(A) ≤ P(B)
0 ≤ P(A) ≤ 1
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
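These consequences are easy to check by brute force. A minimal Monte Carlo sketch (the die and the events A, B are our own illustrative choices, not from the slides):

import random

random.seed(42)
N = 100_000
A = {1, 2, 3}          # example event: die shows 1, 2 or 3
B = {2, 4, 6}          # example event: die shows an even number

n_A = n_B = n_AB = n_AorB = 0
for _ in range(N):
    x = random.randint(1, 6)            # one throw of a fair die
    in_A, in_B = x in A, x in B
    n_A += in_A
    n_B += in_B
    n_AB += in_A and in_B
    n_AorB += in_A or in_B

# addition rule: P(A U B) = P(A) + P(B) - P(A n B) = 3/6 + 3/6 - 1/6 = 5/6
print(n_AorB / N)                       # ~ 0.833
print(n_A / N + n_B / N - n_AB / N)     # ~ 0.833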
Intuitive approach
[Venn diagram: sample space S with 10 elements; A contains 3, B contains 5, A ∩ B contains 2]
Conditional probability P(A|B): probability of event A given B
P(A) = 3/10,  P(B) = 5/10
P(A|B) = (events in A and B)/(events in B)
       = [(events in A and B)/total] / [(events in B)/total]
       = P(A ∩ B)/P(B) = 2/5
P(B|A) = P(B ∩ A)/P(A) = 2/3
P(A ∩ B) = P(B|A)P(A) = 2/3 × 3/10 = 2/10 = P(A|B)P(B)
Intuitive approach
Independent probabilities: A and B are independent if P(A|B) = P(A)
[first Venn diagram: S with 10 elements; A contains 3, B contains 5, A ∩ B contains 2]
P(A) = 3/10,  P(B) = 5/10,  P(A|B) = 2/5 ≠ P(A) → not independent
[second Venn diagram: S with 10 elements; A contains 5, B contains 4, A ∩ B contains 2]
P(A) = 5/10,  P(B) = 4/10,  P(A ∩ B) = 2/10
P(A|B) = (2/10)/(4/10) = 2/4 = P(A) → independent
Bayes Theorem
P(A|B) = P(A ∩ B)/P(B)
P(B|A) = P(B ∩ A)/P(A)
⇒ P(A|B)P(B) = P(B|A)P(A)
⇒ P(A|B) = P(B|A)P(A)/P(B)
If the Aᵢ are disjoint, Aᵢ ∩ Aⱼ = Ø for i ≠ j, and S = ∪ᵢ Aᵢ, then
B = B ∩ S = B ∩ (∪ᵢ Aᵢ) = ∪ᵢ (B ∩ Aᵢ)
P(B) = Σᵢ P(B ∩ Aᵢ) = Σᵢ P(B|Aᵢ)P(Aᵢ)   (law of total probability)
P(A|B) = P(B|A)P(A) / Σᵢ P(B|Aᵢ)P(Aᵢ)
Cherenkov counter
beam: 90% π, 10% K; signal efficiency for π: 95%; false signals (from K): 6%
P(π) = 0.9,  P(K) = 0.1
P(s|π) = 0.95,  P(s̄|π) = 0.05
P(s|K) = 0.06,  P(s̄|K) = 0.94
P(π|s) = P(s|π)P(π) / [P(s|π)P(π) + P(s|K)P(K)] = 99.3%
P(K|s) = 0.7%
P(K|s̄) = P(s̄|K)P(K) / [P(s̄|π)P(π) + P(s̄|K)P(K)] = 67.6%
P(π|s̄) = 32.4%
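The same arithmetic in a few lines of Python (a sketch; the variable names are ours, the numbers are from the slide):

# Bayes theorem for the Cherenkov counter
p_pi, p_K = 0.9, 0.1                  # priors: beam composition
p_s_pi, p_s_K = 0.95, 0.06            # P(s|pi), P(s|K): signal probabilities

p_s = p_s_pi * p_pi + p_s_K * p_K     # P(s) by the law of total probability
print(p_s_pi * p_pi / p_s)            # P(pi|s) ~ 0.993

p_ns_pi, p_ns_K = 1.0 - p_s_pi, 1.0 - p_s_K   # no-signal probabilities
p_ns = p_ns_pi * p_pi + p_ns_K * p_K
print(p_ns_K * p_K / p_ns)            # P(K|no signal) ~ 0.676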
AIDS positive
"About 0.01 percent of men with no known risk behaviour are infected with HIV (base rate). If such a man has the virus, there is a 99.9 percent chance that the test result will be positive (sensitivity). If a man is not infected, there is a 99.99 percent chance that the test result will be negative (specificity)."
What is the chance that a man who tests positive actually has the virus?
p(d) = 0.0001,  p(p|d) = 0.999,  p(n|d̄) = 0.9999
p(p|d̄) = 0.0001,  p(n|d) = 0.001
p(d|p) = p(p|d)p(d) / p(p) = p(p|d)p(d) / [p(p|d)p(d) + p(p|d̄)p(d̄)] ≈ 0.5
Reckoning with Risk, G. Gigerenzer, 2002
AIDS positive
natural frequencies (no known risk behaviour):
10 000 men
→ 1 with HIV: 1 positive, 0 negative
→ 9 999 without HIV: 1 positive, 9 998 negative
p(d|p) = 1/2
Many examples: in mammography screening, only 1 out of 10 positives!
Gigerenzer, 2002
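The natural-frequency picture can also be reproduced by simulation. A sketch (the sample size and seed are our choices):

import random

random.seed(1)
N = 1_000_000
base_rate   = 0.0001    # p(d): fraction of men infected
sensitivity = 0.999     # p(positive | infected)
specificity = 0.9999    # p(negative | not infected)

true_pos = false_pos = 0
for _ in range(N):
    if random.random() < base_rate:                # infected man
        true_pos += random.random() < sensitivity
    else:                                          # not infected
        false_pos += random.random() > specificity

# fraction of positives who actually have the virus:
print(true_pos / (true_pos + false_pos))   # ~ 0.5, up to statistical fluctuations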
Probability
What is the meaning of P(A)?
Frequentist: limit of relative frequencies
S: possible outcomes of an experiment (repeatable)
A: occurrence of a given outcome (event)
P(A) = lim_{n→∞} (number of occurrences of A in n measurements)/n
• consistent with the probability axioms
• the usual interpretation in standard textbooks
• appropriate to particle physics (many repeatable events)
• more problematic for unique phenomena: the big bang, rain tomorrow
Probability
Bayesian (subjective):
elements of S: hypotheses or propositions (true or false)
P(A) = degree of belief that hypothesis A is true
Hypothesis "a measurement will yield a given outcome a certain fraction of the time": subjective probabilities include the frequentist interpretation.
"m1 ≤ me ≤ m2 with P = 95%" is a Bayesian interpretation!
Bayesian statistics: interpretation of Bayes theorem
Probability
P(A|B) = P(B|A)P(A)/P(B)
A: a given theory is correct;
B: data will yield a particular result;
P(theory|data) = P(data|theory) P(theory) / P(data)
P(data|theory): likelihood;  P(theory): prior (a priori);  P(theory|data): posterior (a posteriori)
Distributions
x: continuous random variable
f(x): probability density function
probability to observe x in the interval [x, x + dx] = f(x)dx
normalization: ∫_S f(x)dx = 1
cumulative distribution function:
F(x) = ∫₋∞ˣ f(x′)dx′
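These definitions translate directly into numerics. A sketch with an exponential density (our choice of example) built on a grid:

import numpy as np

lam = 2.0                                # decay constant (our choice)
x = np.linspace(0.0, 10.0, 10_001)       # grid; the tail beyond 10 is negligible
f = lam * np.exp(-lam * x)               # exponential p.d.f. f(x)

# F(x) = integral of f from 0 up to x, via the trapezoidal rule
F = np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))))

print(F[-1])     # normalization: ~ 1.0
print(F[1000])   # F(1) = 1 - exp(-2) ~ 0.8647  (x[1000] = 1.0)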
Distributions
joint p.d.f. f(x, y):
P(A ∩ B) = prob. of x in [x, x + dx] and y in [y, y + dy] = f(x, y) dx dy
marginal p.d.f.:
P(A) = [∫ f(x, y) dy] dx = f_x(x) dx,   f_x(x) = ∫ f(x, y) dy
conditional p.d.f.:
P(B|A) = P(A ∩ B)/P(A) = f(x, y) dx dy / [f_x(x) dx]
h(y|x) = f(x, y)/f_x(x) = f(x, y) / ∫ f(x, y) dy
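On a grid, the marginal becomes a sum over y and the conditional a ratio. A sketch with f(x, y) = x + y on the unit square (our example; it is properly normalized):

import numpy as np

# joint p.d.f. f(x, y) = x + y on [0,1]^2; its integral over the square is 1
y = np.linspace(0.0, 1.0, 1001)
dy = y[1] - y[0]
x0 = 0.3

f_joint = x0 + y                 # f(x0, y) along the slice x = x0
fx = np.sum(f_joint) * dy        # marginal f_x(x0) = integral of f dy
print(fx)                        # ~ x0 + 1/2 = 0.8 (analytic value)

h = f_joint / fx                 # conditional h(y|x0) = f(x0, y)/f_x(x0)
print(np.sum(h) * dy)            # h(y|x0) integrates to 1 in y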
Distributions
expectation value: E[x] = ∫ x f(x) dx = μ
population variance: V[x] = E[(x − E[x])²] = ∫ (x − μ)² f(x) dx = σ²
covariance: V_xy = E[(x − μ_x)(y − μ_y)] = E[xy] − μ_x μ_y = ∫∫ x y f(x, y) dx dy − μ_x μ_y
correlation coefficient: ρ_xy = V_xy/(σ_x σ_y)
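Sample-based estimates of these quantities (a sketch; the construction of the correlated pair is ours):

import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# build a correlated pair from independent standard Gaussians:
# y = 0.6 u + 0.8 v gives sigma_y = 1 and rho_xy = 0.6 exactly
u, v = rng.standard_normal(n), rng.standard_normal(n)
x, y = u, 0.6 * u + 0.8 * v

print(np.mean(x * y) - np.mean(x) * np.mean(y))  # V_xy = E[xy] - mu_x mu_y ~ 0.6
print(np.corrcoef(x, y)[0, 1])                   # rho_xy = V_xy/(sigma_x sigma_y) ~ 0.6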
Distributions
binomial
• process with a given number N of identical trials, each with two possible outcomes: success (probability p), failure (probability 1 − p)
• what is the probability of n successes (and N − n failures)?
probability for a particular sequence: pⁿ(1 − p)^(N−n)
order does not matter: number of sequences = N!/(n!(N − n)!)
f(n; N, p) = [N!/(n!(N − n)!)] pⁿ(1 − p)^(N−n)
(a probability, not a probability density)
E[n] = Np
V[n] = E[n²] − (E[n])² = Np(1 − p)
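The p.m.f. and the moments check out numerically. A sketch with scipy (N, p, n are arbitrary example values):

import math
from scipy.stats import binom

N, p, n = 10, 0.3, 4

# f(n; N, p) = N!/(n!(N-n)!) p^n (1-p)^(N-n)
f = math.comb(N, n) * p**n * (1 - p)**(N - n)
print(f, binom.pmf(n, N, p))               # identical: ~ 0.2001

print(binom.mean(N, p), binom.var(N, p))   # E[n] = Np = 3.0, V[n] = Np(1-p) = 2.1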
binomial
[five chambers C1, C2, C3, C4, C5 along a track]
individual chamber efficiency: 0.95
track: at least 3 points
3 chambers: f(3; 3, 0.95) = 0.95³ = 0.857
4 chambers: f(3; 4, 0.95) + f(4; 4, 0.95) = 0.171 + 0.815 = 0.986
5 chambers: f(3; 5, 0.95) + f(4; 5, 0.95) + f(5; 5, 0.95) = 0.021 + 0.204 + 0.774 = 0.999
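A sketch reproducing these tracking efficiencies with scipy (the variable names are ours):

from scipy.stats import binom

eff = 0.95                       # single-chamber efficiency (from the slide)

for N in (3, 4, 5):
    # P(track) = P(n >= 3 hits) = sum over n = 3..N of f(n; N, eff)
    p_track = sum(binom.pmf(n, N, eff) for n in range(3, N + 1))
    print(N, round(p_track, 3))  # 0.857, 0.986, 0.999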
Poisson
binomial limit: N large, p very small, Np → ν
f(n; ν) = (νⁿ/n!) e^(−ν)
particular events, but no idea of the number of trials;
sharp events occurring in a continuum:
• Geiger counter near a radioactive source;
• number of flashes of lightning in a storm;
E[n] = ν,  V[n] = ν
Poisson
Proof: ν events expected in some interval; split the interval into N sections;
probability that a given section contains an event: p = ν/N
probability of n events in N sections:
f(n; ν/N, N) = [N!/(n!(N − n)!)] (ν/N)ⁿ (1 − ν/N)^(N−n)
N → ∞ with n finite:
N!/(N − n)! = N(N − 1)(N − 2)...(N − n + 1) → Nⁿ
(1 − ν/N)^(N−n) → (1 − ν/N)^N → e^(−ν)
⇒ f(n; ν) = (νⁿ/n!) e^(−ν)
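The limit can be watched numerically. A sketch comparing the binomial with fixed Np = ν to the Poisson value (ν and n are example values):

from scipy.stats import binom, poisson

nu, n = 2.0, 3
print(poisson.pmf(n, nu))              # (nu^n/n!) e^(-nu) ~ 0.1804

for N in (10, 100, 1000, 10_000):
    # binomial with p = nu/N approaches the Poisson value as N grows
    print(N, binom.pmf(n, N, nu / N))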
Poisson
Fatal horse kicks: number of Prussian soldiers kicked to death by horses. In ten different army corps, over 20 years, there were 122 deaths:
ν = 122 deaths / 200 (corps × years) = 0.610 per (corps × year)
no deaths: P(0; 0.61) = 0.5434 → expected number of (corps × years) with no deaths: 200 × 0.5434 = 108.7
one death: P(1; 0.61) = 0.3315 → expected number with one death: 200 × 0.3315 = 66.3

deaths per (corps × year):   0      1     2     3    4
actual number:              109     65    22    3    1
Poisson prediction:         108.7   66.3  20.2  4.1  0.6
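The whole table in a few lines (a sketch with scipy):

from scipy.stats import poisson

nu = 122 / 200                    # 0.61 deaths per (corps x year)
actual = [109, 65, 22, 3, 1]      # observed number of (corps x years) with n deaths

for n_deaths, obs in enumerate(actual):
    expected = 200 * poisson.pmf(n_deaths, nu)
    print(n_deaths, obs, round(expected, 1))   # 108.7, 66.3, 20.2, 4.1, 0.6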
Gaussian
f(x; μ, σ) = [1/√(2πσ²)] exp(−(x − μ)²/(2σ²))
E[x] = μ,  V[x] = σ²
standard Gaussian (μ = 0, σ = 1):
φ(x) = (1/√(2π)) exp(−x²/2)
cumulative: Φ(x) = ∫₋∞ˣ φ(x′)dx′, evaluated numerically
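In practice φ and Φ come from a library. A sketch with scipy.stats.norm:

from scipy.stats import norm

print(norm.pdf(0.0))                        # phi(0) = 1/sqrt(2 pi) ~ 0.3989
print(norm.cdf(1.0))                        # Phi(1) ~ 0.8413, evaluated numerically
print(norm.cdf(1.0) - norm.cdf(-1.0))       # P(|x| < 1) ~ 0.683: the 1-sigma band

mu, sigma = 10.0, 2.0                       # general Gaussian via loc and scale
print(norm.pdf(12.0, loc=mu, scale=sigma))  # f(12; mu = 10, sigma = 2)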
Gaussian
in N dimensions (x and μ column vectors):
f(x; μ, V) = 1/[(2π)^(N/2) |V|^(1/2)] × exp[−½ (x − μ)ᵀ V⁻¹ (x − μ)]
V: symmetric N×N covariance matrix, with N(N + 1)/2 independent parameters
E[xᵢ] = μᵢ,  V[xᵢ] = Vᵢᵢ,  cov[xᵢ, xⱼ] = Vᵢⱼ
in 2 dimensions, with V₁₂ = ρσ₁σ₂:
f(x₁, x₂; μ₁, μ₂, σ₁, σ₂, ρ) = 1/[2πσ₁σ₂√(1 − ρ²)] ×
exp{ −1/[2(1 − ρ²)] [ ((x₁ − μ₁)/σ₁)² + ((x₂ − μ₂)/σ₂)² − 2ρ((x₁ − μ₁)/σ₁)((x₂ − μ₂)/σ₂) ] }
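A sketch sampling the 2-dimensional case and recovering V and ρ (the numerical values are our example):

import numpy as np

rng = np.random.default_rng(0)

s1, s2, rho = 1.0, 2.0, 0.8            # example sigma_1, sigma_2, rho
mu = np.array([0.0, 0.0])
V = np.array([[s1**2,         rho * s1 * s2],
              [rho * s1 * s2, s2**2        ]])   # covariance matrix, V_12 = rho s1 s2

x = rng.multivariate_normal(mu, V, size=100_000)
print(np.cov(x, rowvar=False))              # ~ V
print(np.corrcoef(x, rowvar=False)[0, 1])   # ~ rho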
Central limit theorem
the sum of N independent continuous random variables xᵢ with means μᵢ and variances σᵢ² becomes, in the limit N → ∞, a Gaussian random variable with
μ = Σᵢ₌₁ᴺ μᵢ  and  σ² = Σᵢ₌₁ᴺ σᵢ²
regardless of the form of the individual p.d.f.s of the xᵢ.
Formal justification for treating measurement errors as Gaussian random variables:
total error = sum of a large number of small contributions
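A quick demonstration: the sum of N = 12 uniform variables (μᵢ = 1/2, σᵢ² = 1/12) is already very close to a Gaussian with μ = 6 and σ² = 1 (a sketch; 12 is the classic choice because it makes σ² exactly 1):

import numpy as np

rng = np.random.default_rng(0)
N = 12                                     # number of summed variables

# each uniform(0,1) has mu_i = 1/2 and sigma_i^2 = 1/12
s = rng.random((100_000, N)).sum(axis=1)   # 100 000 sums of N uniforms

print(s.mean())   # ~ mu = N * 1/2 = 6.0
print(s.var())    # ~ sigma^2 = N * 1/12 = 1.0
# a histogram of s is visually indistinguishable from a Gaussian(6, 1)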
Central limit theorem
Actually used in practice: algorithm R632, CERN library.