* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download PowerPoint Presentation - Experimental Elementary Particle Physics
		                    
		                    
								Survey							
                            
		                
		                
                            
                            
								Document related concepts							
                        
                        
                    
						
						
							Transcript						
					
					Probability and Statistics What is probability? What is statistics? 1 Probability and Statistics  Probability   Formally defined using a set of axioms Seeks to determine the likelihood that a given event or observation or measurement will or has happened  What is the probability of throwing a 7 using two dice?  Statistics   Used to analyze the frequency of past events Uses a given sample of data to assess a probabilistic model’s validity or determine values of its parameters  After observing several throws of two dice, can I determine whether or not they are loaded  Also depends on what we mean by probability 2 Probability and Statistics We perform an experiment to collect a number of top quarks      How do we extract the best value for its mass? What is the uncertainty of our best value? Is our experiment internally consistent? Is this value consistent with a given theory, which itself may contain uncertainties? Is this value consistent with other measurements of the top quark mass? 3 Probability and Statistics CDF “discovery” announced 4/11/2011 4 Probability and Statistics 5 Probability and Statistics Pentaquark search - how can this occur? 2003 – 6.8s effect 2005 – no effect 6 Probability  Let the sample space S be the space of all possible outcomes of an experiment  Let x be a possible outcome    Then P(x found in [x,x+dx]) = f(x)dx f(x) is called the probability density function (pdf) It may be called f(x;q) since the pdf could depend on one or more parameters q  Often we will want to determine q from a set of measurements  Of course x must be somewhere so   f x dx  1  7 Probability  Definitions of mean and variance are given in terms of expectation values Ex   xf x dx       V x  E x  Ex  E x    s 2 2 2 2 8 Probability  Definitions of covariance and correlation coefficient   Vxy  covx, y   E x   x  y   y   E xy   x  y  xy  covx, y  s xs y if x, y independen t then f x,y  f x x  f y  y  and then E x,y   xyf x,ydxdy  x  y and so covx, y   0 9 Probability  Error propagation   Consider y  x  with vari ables x   x1 , x2 ,...xn    and   E x  and Vij  cov xi , x j   n  y    then TS expanding y  x   y        xi  i  i 1  xi  x   we find   E  y  x   y   n   y y  2  E y  x   y       Vij i , j 1   xi x j  x    2  10 Probability  This gives the familiar error propagation formulas for sums (or differences) and products (or ratio) 2 2 2 Using V  y   s y  E y  E  y    we find for y  x1  x2 s y2  s 12  s 22  2 covx1 , x2  and for y  x1  x2 covx1 , x2   2  2 2 2 y x1 x2 x1 x2 s y2 s 12 s 22 11 Uniform Distribution  Let  1 for α  x  β  f ( x;  ,  )      otherwise  0  x 1 E x    xf  x dx   dx      2     V x   E  x  E x   2  2 1 1 1   2 V x     x      dx      2 12      What is the position resolution of a silicon or multiwire proportional chamber with detection elements of space x? 12 Binomial Distribution  Consider N independent experiments (Bernoulli trials)  Let the outcome of each be pass or fail  Let the probability of pass = p Probabilit y of n successes  p n 1  p  N! But there are permuation s for n! N  n ! N distinguis hable objects, grouping them n at a time N! N n n   f (n; N , p )  p 1 p n! N  n ! N n 13 Permutations  Quick review Number of permutatio ns for N elements  N ! This considers each element distinguis hable But n elements of the first type are indistingu ishable so n! of the elements lead to the same situation Ditto for the remaining N-n  elements of the second type Thus accounting for these irrelevant permutatio ns leads to N N! number of unique permuation s is    N-n !n!  n  14 Binomial Distribution  For the mean and variance we obtain (using small tricks) N En   nf n; N , p   Np n 0   V n  E n  En  Np1  p  2 2  And note with the binomial theorem that N n N n N f (n; N , p)    p 1  p    p  1  p   1  n 0 n 0  n  N N 15 Binomial Distribution Binomial pdf 16 Binomial Distribution Examples      Coin flip (p=1/2) Dice throw (p=1/6) Branching ratio of nuclear and particle decays (p=Br) Detector or trigger efficiencies (pass or not pass) Blood group B or not blood group B 17 Binomial Distribution  It’s baseball season! What is the probability of a 0.300 hitter getting 4 hits in one game? N  4, p  0.3 4! 4 0 f 4;4,0.3   0.3  0.7  0.0081 4!0! E n  4  0.3  1.2 V n  4  0.3  0.7  0.84 Expect 1.2  0.84 hits for a 0.300 hitter 18 Poisson Distribution  Consider when N  p0 E n   Np  v The Poisson pdf is n v v f n; v   e n! and one finds E n   v V n   s 2  v 19 Poisson Distribution N! N n f n; N , p   p n 1  p  n! N  n ! N!  N n for N large N  n ! p 2 N  n N  n  1 1  p   1  pN  n    ... 2! N 2 p2 N n 1  p   1  Np   ...  e  Np  e v 2! n n v N n v v e f n; N , p   pe  for large N and small p n! n! N n 20 Poisson Distribution  Poisson pdf 21 Poisson Distribution  Examples  Particles detected from radioactive decays  Sum of two Poisson processes is a Poisson process       Particles detected from scattering of a beam on target with cross section s Cosmic rays observed in a time interval t Number of entries in a histogram bin when data is accumulated over a fixed time interval Number of Prussian soldiers kicked to death by horses Infant mortality QC/failure rate predictions 22 Poisson Distribution  Let Let N~1020 atoms 1 1 Let p   12 ~ 2  1020 /s τ 10 y Then v  Np  2 / s and 20 e 2 P (0;2)   0.135 0! 21 e 2 P (1;2)   0.271 1! 22 e 2 P ( 2;2)   0.271 2! 23 Gaussian Distribution  Gaussian distribution  Important because of the central limit theorem  For n independent variables x1,x2,…,xN that are distributed according to any pdf, then the sum y=∑xi will have a pdf that approaches a Gaussian for large N  Examples are almost any measurement error (energy resolution, position resolution, …) E  y    i i V y   s i i 24 Gaussian Distribution  The familiar Gaussian pdf is 2  1  x      f  x;  , s   exp  2  2 s 2s 2   E x    V x   s 2 25 Gaussian Distribution  Some useful properties of the Gaussian distribution are in range ±s) = 0.683 in range ±2s) = 0.9555 in range ±3s) = 0.9973 outside range ±3s) = 0.0027 outside range ±5s) = 5.7x10-7  P(x P(x P(x P(x P(x  P(x in range ±0.6745s) = 0.5     26 c2 Distribution  Chi-square distribution 1 f  z; n   n / 2 z n / 21e  z / 2 z  0  2 n / 2  n  1,2,... is the number of degrees of freedom Ez   n V z   2n The usefulness of this pdf is that for n independen t xi , i , s i2 n z i 1 xi  i  2 s i2 follows the c 2 distributi on with n d.o.f. 27 c2 Distribution 28 Probability 29 Probability  Probability can be defined in terms of Kolmogorov axioms  The probability is a real-valued function defined on subsets A,B,… in sample space S For every subset A in S, P A  0 If A  B  0, P A  B   P A  PB  PS   1  This means the probability is a measure in which the measure of the entire sample space is 1 30 Probability  We further define the conditional probability P(A|B) read P(A) given B P A  B  P A | B   P B   Bayes’ theorem Using P A  B   PB  A PB | AP A P A | B   P B  31 Probability  For disjoint Ai PB    PB | Ai P Ai  i then PB | AP A P A | B    PB | Ai P Ai  i  Usually one treats the Ai as outcomes of a repeatable experiment 32 Probability  Usually one treats the Ai as outcomes of a repeatable experiment  Then P(A) is usually assigned a value equal to the limiting frequency of occurrence of A A P  A  lim n  n  Called frequentist statistics  But Ai could also be interpreted as hypotheses, each of which is true or false   Then P(A) represents the degree of belief that hypothesis A is true Called Bayesian statistics 33 Bayes’ Theorem  Suppose in the general population   P(disease) = 0.001 P(no disease) = 0.999  Suppose there is a test to check for the disease   P(+, disease) = 0.98 P(-, disease) = 0.02  But also   P(+, no disease) = 0.03 P(-, no disease) = 0.97  You are tested for the disease and it comes back +. Should you be worried? 34 Bayes’ Theorem  Apply Bayes’ theorem P(, disease ) P(disease ) P(disease ,)  P(, disease ) P(disease )  P(, no disease ) P(no disease ) 0.98  0.001 P(disease ,)   0.032 0.98  0.001  0.03  0.999  3.2% of people testing positive have the disease  Your degree of belief about having the disease is 3.2% 35 Bayes’ Theorem  Is athlete A guilty of drug doping?  Assume a population of athletes in this sport   P(drug) = 0.005 P(no drug) = 0.995  Suppose there is a test to check for the drug   P(+, drug) = 0.99 P(-, drug) = 0.01  But also   P(+, no drug) = 0.004 P(-, no drug) = 0.996  The athlete is tested positive. Is he/she involved in drug doping? 36 Bayes’ Theorem  Apply Bayes’ theorem P(, drug ) P(drug ) P(drug ,)  P(, drug ) P(drug )  P(, no drug ) P(no drug ) 0.99  0.005 P(drug ,)   0.55 0.99  0.005  0.004  0.995 and P(, no drug ) P(no drug ) P(no drug ,)  P(, no drug ) P(no drug )  P(, drug ) P(drug ) 0.004  0.995 P(no drug ,)   0.45 0.004  0.995  0.99  0.005 37  ??? Binomial Distribution  Calculating efficiencies  Usually use e instead of p E n  n E n   Ne  e    e N N V n   s 2  Ne 1  e  We don' t know the true e but we can use our estimate e  s e 1  e  1 e    n 1  n / N  N N N This is the error on the efficiency 38 Binomial Distribution  But there is a problem   If n=0, (e’) = 0 If n=N, (e’) = 0  Actually we went wrong in assuming the best estimate for e is n/N  We should really have used the most probable value of e given n and N  A proper treatment uses Bayes’ theorem but lucky for us (in HEP) the solution is implemented in ROOT    h_num->Sumw2() h_den->Sumw2() h_eff->Divide(h_num,h_den,1.0,1.0,”B”) 39
 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                            