Data, Statistics, Probability 3: Probability distributions
Christopher Grigoriou
Executive MBA - HEC Lausanne 2007/2008

Probability Distributions
Discrete distributions:
    Binomial
    Uniform
Continuous distributions:
    Uniform
    Exponential
    Normal
    t-distribution
Confidence intervals for the Normal distribution
Normal approximation to the Binomial distribution

Binomial distribution
E.g. customer enquiries
Each customer makes a booking with probability p.
Hypothesis: the behaviour of different customers is independent.
Consider the total number of calls per day (n).
Random variable: X = number of these n customers who make a booking
==> Binomial distribution with parameters n and p
X ~ B(n,p), E(X) = np, Var(X) = np(1-p)
Types of question:
    Probability that more than half the customers make a booking
    Probability of at least x bookings

Example: A probability tree approach
Number of enquiries = 3 (n)
Probability that an enquiry results in a booking = 0.4 (p)
Probability that these 3 calls result in exactly 2 (r) bookings
    = Pr(r = 2 | n = 3, p = 0.4) = 3 × 0.4^2 × 0.6 = 0.288
Probability that these 3 calls result in at least 2 bookings
    = Pr(r ≥ 2 | n = 3, p = 0.4) = 0.288 + 0.4^3 = 0.352

Distribution of Proportions
Same setting as the customer enquiries example: X ~ B(n,p), E(X) = np, Var(X) = np(1-p)
Random variable: Y = X/n = proportion of customers who make a booking
E(Y) = E(X/n) = (1/n) E(X) = p
Var(Y) = Var(X/n) = (1/n^2) Var(X) = p(1-p)/n

Normal Distribution
Example: Estimate the $ - CHF exchange rate for next June ==> Random variable X
E.g. E(X) = 1.6, SD(X) = 0.2
Assume: Normal distribution
Values close to the mean are most likely ==> Small probability of extreme values
[Figure: probability density of X, horizontal axis from 1 to 2.2. Central area marked 68%; 2.5% in each tail beyond E(X) - 2SD = 1.2 and E(X) + 2SD = 2.0; E(X) = 1.6.]

Shape and location
[Figure, shape: densities of N(1.6, 0.2) and N(1.6, 0.05), horizontal axis from 1 to 2.2.]
[Figure, location: densities of N(0, 0.2) and N(1.6, 0.2), horizontal axes from -0.6 to 0.6 and from 1 to 2.2.]

Asymmetric distributions
Distributions: X1, X2
E(X1) = E(X2)
Var(X1) = Var(X2)
Assume these are profit distributions. Which do you prefer?
[Figure: densities of X1 and X2.]

Relating the Density function to the Cumulative distribution
[Figure: the density of X (68% central area, 2.5% in each tail, horizontal axis from 1 to 2.2) next to its cumulative distribution, vertical axis from 0.00 to 1.00.]

Standard Normal Distribution N(0,1)
X ~ N(1.6, 0.2)
X - E(X) ~ N(0, 0.2)
(X - E(X)) / SD(X) ~ N(0,1)
(X - E(X)) / SD(X) = z-score
[Figure: the three densities, with horizontal axes from 1.2 to 2.2, from -0.6 to 0.6 and from -3 to 3.]

Some critical values of N(0,1)
Pr(Z > z)    z
50%          0
45%          0.13
40%          0.25
35%          0.39
30%          0.52
25%          0.67
20%          0.84
15.90%       1
10%          1.28
5%           1.64 (round to 2)
2.50%        1.96
2.30%        2
1%           2.33
0.62%        2.5
0.50%        2.57
0.13%        3
0.10%        3.09

Confidence intervals
Example: X = $ - CHF exchange rate for next June
X ~ N(µ = 1.6, σ = 0.2)
95% Confidence interval: 1.6 ± 1.96 × 0.2 = [1.208, 1.992]
90% Confidence interval: 1.6 ± 1.64 × 0.2 = [1.272, 1.928]

t-distribution
Similar to the Standard normal distribution
Family of distributions with one parameter: degrees of freedom n
As n goes to infinity, t(n) converges to N(0,1)

Tail probability Pr(T > t-score), by degrees of freedom
t-score      1      5      10     20     30     50     100
0.0        0.500  0.500  0.500  0.500  0.500  0.500  0.500
1.0        0.250  0.182  0.170  0.165  0.163  0.161  0.160
2.0        0.148  0.051  0.037  0.030  0.027  0.025  0.024
2.5        0.121  0.027  0.016  0.011  0.009  0.008  0.007
3.0        0.102  0.015  0.007  0.004  0.003  0.002  0.002

Critical t-score for a given tail probability, by degrees of freedom
Tail prob.   1      5      10     20     30     50     100
50.0%       0.00   0.00   0.00   0.00   0.00   0.00   0.00
10.0%       3.08   1.48   1.37   1.33   1.31   1.30   1.29
5.0%        6.31   2.02   1.81   1.72   1.70   1.68   1.66
2.5%       12.71   2.57   2.23   2.09   2.04   2.01   1.98
1.0%       31.82   3.36   2.76   2.53   2.46   2.40   2.36

[Figure: densities of N(0,1), t(2) and t(10), horizontal axis from -3 to 3.]

Normal approximation to the binomial distribution
X = Number of successes
Assume X ∼ B(n,p)
If np > 5 and n(1-p) > 5 and 0.1 < p < 0.9
Then X can be approximated by N(µ = np, σ^2 = np(1-p))
X/n = Proportion of successes
If X ∼ N(µ = np, σ^2 = np(1-p))
Then X/n ∼ N(µ = p, σ^2 = p(1-p)/n)
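
The binomial probabilities and the normal approximation rule above can be checked numerically. The following is a minimal sketch in Python using only the standard library. All function names are hypothetical, and both the larger example (n = 50, p = 0.4, at least 25 bookings) and the optional continuity correction are assumptions added for illustration; neither appears on the slides.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact Pr(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k: int, n: int, p: float) -> float:
    """Exact Pr(X >= k) for X ~ B(n, p)."""
    return sum(binom_pmf(j, n, p) for j in range(k, n + 1))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Pr(X <= x) for a Normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def rule_of_thumb_ok(n: int, p: float) -> bool:
    """Condition stated on the slide: np > 5, n(1-p) > 5 and 0.1 < p < 0.9."""
    return n * p > 5 and n * (1 - p) > 5 and 0.1 < p < 0.9


def normal_approx_tail(k: int, n: int, p: float, continuity: bool = False) -> float:
    """Approximate Pr(X >= k) using N(mu = np, sigma^2 = np(1-p)).

    continuity=True moves the cut-off from k to k - 0.5. This is a common
    textbook refinement and an assumption here, not part of the slides.
    """
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    cutoff = k - 0.5 if continuity else k
    return 1.0 - normal_cdf(cutoff, mu, sigma)


if __name__ == "__main__":
    # Probability tree example: 3 enquiries, booking probability 0.4.
    print("Pr(r = 2)  =", round(binom_pmf(2, 3, 0.4), 3))
    print("Pr(r >= 2) =", round(binom_tail(2, 3, 0.4), 3))
    # With n = 3 the rule of thumb fails (np = 1.2), so no approximation here.
    print("approximation allowed for n = 3:", rule_of_thumb_ok(3, 0.4))

    # Hypothetical larger day: 50 enquiries, at least 25 bookings.
    n, p, k = 50, 0.4, 25
    print("approximation allowed for n = 50:", rule_of_thumb_ok(n, p))
    print("exact tail           :", binom_tail(k, n, p))
    print("normal approximation :", normal_approx_tail(k, n, p))
    print("with continuity corr.:", normal_approx_tail(k, n, p, continuity=True))
```

Comparing the last three printed values gives a feel for how close the approximation is when the rule of thumb holds; the same `normal_cdf` helper can be used to reproduce entries of the N(0,1) critical value table.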