Review
• Probability
• Basic definitions: random experiment, sample space, elementary outcome, event
• Basic operations: conditional probability
• Bayes' theorem

Objectives
• Random variables
  • Discrete random variable
  • Continuous random variable
• Two probability distributions
  • Binomial distribution
  • Normal distribution

Random variables
• A random variable is a function that assigns a numeric value to each outcome in a sample space. We usually denote a random variable by a capital letter such as X, Y or Z.
• NOTE: (1) Randomness; (2) Numeric values
• Example 1: Randomly select a student from a class. X = the student's number of siblings. X could be 0, 1, 2, …
• Example 2: Randomly select a student from a class. X = the student's height. X could be any value greater than 0.

Two types of random variables
1. Discrete random variable: its possible values form a set of discrete (isolated) values. E.g. X = number of siblings.
2. Continuous random variable: its possible values cannot be enumerated; there are infinitely many of them, and every single value has probability zero, Pr(X = x) = 0 for every x. E.g. X = the student's height.

EG1. Tossing two coins
Let X = number of heads.
Notation: X is the random variable; x is an observed value.

Outcome    TT    HT    TH    HH
x           0     1     1     2

Probability distribution function
• A probability distribution function (pdf), also called a probability mass function, is a mathematical relationship, or rule, that assigns to any possible value x of a discrete random variable X the probability Pr(X = x).

Probability distribution of the random variable X = number of heads

Outcome      TT    HT, TH    HH
x             0       1       2
Pr(X = x)    1/4     1/2     1/4

[Figure: probability histogram of X]

EG2. Tossing two dice
Y: the sum of the dots on the two dice. What are the possible values of Y?

Probability distribution of the random variable Y: the sum of the dots on the two dice
[Figure: probability histogram of Y]

Relative frequency
In practice, a probability can be estimated by the relative frequency of an event "in the long run".
$\text{Probability} = \dfrac{\text{frequency of occurrences}}{\text{frequency of all possible occurrences}}$, with $0 \le \text{Probability} \le 1$.
The relative frequency histogram should look very much like the probability histogram if the experiment is repeated many times.

Data set vs. probability distributions
Sample properties (based on the data set):
• Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
• Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
Model or population properties (based on the probability distribution, where R is the number of possible values of X):
• Population mean: $\mu = \sum_{i=1}^{R} x_i \Pr(X = x_i)$
• Population variance: $\sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 \Pr(X = x_i)$

Mean of Random Variable
The mean or expected value of X, denoted E(X) or µ, is defined as
$E(X) = \mu = \sum_{i=1}^{R} x_i \Pr(X = x_i)$
It is the sum of the possible values, each weighted by its probability.
The expectation represents the "average" value of the random variable.

Mean of X
X = number of heads.

Outcome      TT    HT, TH    HH
x             0       1       2
Pr(X = x)    1/4     1/2     1/4
x Pr(X = x)   0      1/2     1/2

$E(X) = \mu = \sum_{i=1}^{3} x_i \Pr(X = x_i) = 0 + 1/2 + 1/2 = 1$

Variance of Random Variable
The variance of X is the expected squared distance from the population mean:
$\mathrm{Var}(X) = \sigma^2 = \sum_{i=1}^{R} (x_i - \mu)^2 \Pr(X = x_i)$
The standard deviation σ is the square root of the variance:
$\mathrm{sd}(X) = \sigma = \sqrt{\mathrm{Var}(X)}$

Variance of X
X = number of heads, µ = 1.

x        Pr(X = x)    (x − µ)² Pr(X = x)
0        0.25         (0 − 1)² × 0.25 = 0.25
1        0.5          (1 − 1)² × 0.5  = 0
2        0.25         (2 − 1)² × 0.25 = 0.25
Total                 0.50

Thus σ² = 0.5.
Summary: µ and σ are computed from the probability distribution. They are population properties.

Two types of random variables
1. Discrete random variable: its possible values form a set of discrete (isolated) values.
2. Continuous random variable: its possible values cannot be enumerated; there are infinitely many of them, and every single value has probability zero, Pr(X = x) = 0 for every x.

Continuous random variables
A balanced spinning pointer can stop anywhere on the circle.
X: the proportion of the total circumference at which it lands.
X can be any value between 0 and 1, so it has infinitely many possible values.
Pr(0.25 ≤ X ≤ 0.75) = 0.5
Pr(X = 0.5) = 0, because X can take on an infinite number of values.
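The coin-toss calculations above (population mean, population variance, and the claim that relative frequencies approach the probabilities in the long run) can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `simulate_heads`, the seed, and the number of trials are illustrative choices.

```python
import random
from collections import Counter
from fractions import Fraction

# Probability distribution of X = number of heads in two fair coin tosses
# (values taken from the coin-toss table in the slides).
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Population mean: the possible values, each weighted by its probability.
mu = sum(x * p for x, p in pmf.items())

# Population variance: expected squared distance from the mean.
var = sum((x - mu) ** 2 * p for x, p in pmf.items())


def simulate_heads(n_trials, seed=0):
    """Toss two fair coins n_trials times; return the relative frequency of each x.

    The seed is an arbitrary assumption, fixed only so that runs are repeatable.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_trials))
    return {x: counts[x] / n_trials for x in pmf}


rel_freq = simulate_heads(100_000)

print("mean =", mu, "variance =", var)
for x in sorted(pmf):
    # Probability next to the simulated relative frequency.
    print(x, float(pmf[x]), round(rel_freq[x], 3))
```

Exact fractions are used for the distribution so that the mean and variance come out as exact values rather than floating-point approximations; the simulated relative frequencies should only be close to the probabilities, not equal to them.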
Probability density function (pdf) of X
• The curve y = f(x) is the probability density function (pdf) of the random variable X.
• Pr(a ≤ X ≤ b) is the area under the curve between the x values a and b:
  $\Pr(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
• The total area under the density curve over the entire range of possible values of the random variable is 1:
  $\Pr(-\infty \le X \le \infty) = \int_{-\infty}^{\infty} f(x)\,dx = 1$
• The pdf has large values in regions of high probability and small values in regions of low probability.
• Pr(X = x) = 0 for any specific value x.
• Generally, no distinction is made between probabilities such as Pr(X < x) and Pr(X ≤ x), or Pr(a ≤ X ≤ b) and Pr(a < X < b), when X is continuous.

Expectation and variance of a continuous random variable
• Mean µ (center of the probability density):
  $E(X) = \mu = \int_{-\infty}^{\infty} x f(x)\,dx$
• Variance σ² (spread of the probability density):
  $\mathrm{Var}(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
• The standard deviation σ is the square root of the variance, that is, $\sigma = \sqrt{\mathrm{Var}(X)}$.

Two distributions
• Binomial: discrete
• Normal: continuous

Bernoulli trial
Examples:
• A heads-or-tails coin toss
• A win-or-lose football game
• A pass-or-fail automotive smog inspection
Properties:
• Two outcomes: success or failure
• The success probability p is the same in each trial
• Trials are independent

Binomial random variable
X is the number of successes in n repeated Bernoulli trials, each with success probability p.
• The success probability p is the same in each trial
• Trials are independent

Binomial random variable
Probability distribution: the probability of obtaining k successes in n trials, with success probability p, is
$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n-k}$
• $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ counts all possible ways of getting k successes and n − k failures, where $n! = n \times (n-1) \times \cdots \times 1$.
• $p^{k} (1 - p)^{n-k}$ is the probability of getting k successes and n − k failures in one particular order.

Mean and Variance of the Binomial Distribution
$\mu = np$
$\sigma^2 = np(1 - p)$

Exercise
Newborns were screened for HIV in a Massachusetts hospital. The positive rate for inner-city babies is p = 0.01. Suppose 500 newborns are screened.
1. What is the exact binomial probability of 5 HIV-positive test results?
Answer:
$P(X = 5) = \binom{500}{5}\, 0.01^{5}\, (1 - 0.01)^{495} = 0.176$
EXCEL: BINOMDIST(5,500,0.01,FALSE)
2. What is the exact binomial probability of at least 5 HIV-positive test results?
Answer:
$P(X \ge 5) = 1 - P(X \le 4) = 1 - F(4) = 1 - 0.44 = 0.56$
EXCEL: F(4) = BINOMDIST(4,500,0.01,TRUE)

Normal distribution
• The normal distribution is also called the Gaussian distribution, after the well-known mathematician Carl Friedrich Gauss (1777-1855, "the Prince of Mathematicians").

Normal distribution
• The normal distribution is very useful
• Many things closely follow a normal distribution:
  • Heights of people
  • Errors in measurement
  • Blood pressure
  • Scores on a test
• Many other distributions can be made approximately normal by transformation (the binomial, among others)
• Most statistical methods considered in this text are based on the normal distribution

The pdf of the normal distribution
• The normal distribution is defined by its pdf, which for some parameters µ and σ is given by
  $f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Other properties of the normal pdf
• Mean = median = mode
• Symmetry about the center
• 50% of values are less than the mean

Location is measured by µ
• In the graph, µ2 > µ1

Spread is measured by σ²
• In the graph, σ2 > σ1

Standard normal distribution N(0, 1)
• A normal distribution with mean 0 and variance 1 is called a standard normal distribution, denoted N(0, 1).
• In the following, we will examine the standard normal distribution N(0, 1) in detail.
• We will see that any information concerning a general normal distribution N(µ, σ²) can be obtained from appropriate manipulations of an N(0, 1) distribution.

Density of the standard normal N(0, 1)
With µ = 0 and σ = 1:
$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$

Properties of the standard normal N(0, 1)
• It can be shown that about 68% of the area under the standard normal density lies between -1 and +1, about 95% of the area lies between -2 and +2, and about 99% lies between -2.5 and +2.5.
NOTE: You will see that, more precisely, Pr(-1 < X < 1) = 0.6827, Pr(-1.96 < X < 1.96) = 0.95, Pr(-2.576 < X < 2.576) = 0.99.

Cumulative probability
• The cumulative distribution function (cdf) of a standard normal distribution is denoted by
  $F(a) = \Phi(a) = P(Z \le a)$, where Z ~ N(0, 1)
Excel: F(a): NORMSDIST(a)

$P(a \le Z \le b) = F(b) - F(a)$
• E.g. $P(-1 \le Z \le 1) = F(1) - F(-1) = 0.8413 - 0.1587 = 0.6826$
Excel: F(1): NORMSDIST(1); F(-1): NORMSDIST(-1)

$P(Z \ge a) = 1 - F(a)$
• E.g. $P(Z \ge 1) = 1 - F(1) = 1 - 0.8413 = 0.1587$
Excel: F(1): NORMSDIST(1)

How to standardize the normal distribution?
$Z = \dfrac{X - \mu}{\sigma}$
Then Z has a standard normal distribution, Z ~ N(0, 1).

Standardization
• If X ~ N(µ, σ²) and $Z = \dfrac{X - \mu}{\sigma}$, then Z ~ N(0, 1).
• Then
  $P(a < X < b) = P\!\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = F\!\left(\frac{b - \mu}{\sigma}\right) - F\!\left(\frac{a - \mu}{\sigma}\right)$

Use standardization for many problems
• Example: If X ~ N(80, 12²), what is Pr(90 < X < 100)?
• Solution:
  $\Pr(90 < X < 100) = \Pr\!\left(\frac{90 - 80}{12} < \frac{X - 80}{12} < \frac{100 - 80}{12}\right)$
  $= \Pr(0.833 < Z < 1.667)$
  $= F(1.667) - F(0.833)$
  $= 0.9522 - 0.7977$
  $= 0.155$

Always draw a graph…

Exercise
• Suppose we know that among men aged 30-34 who have ever smoked, the mean number of years they smoked is 12.8, with a standard deviation of 5.1 years. Assuming that the duration of smoking is normally distributed, what proportion of men in this age group have smoked for more than 20 years?
Answer: We have X ~ N(12.8, 5.1²) and we need to compute P(X > 20):
  $P(X > 20) = 1 - P(X \le 20)$
  $= 1 - P\!\left(Z \le \frac{20 - 12.8}{5.1}\right)$
  $= 1 - F(1.412)$
  $= 1 - 0.9210 = 0.079$
EXCEL: 1 − NORMDIST(20,12.8,5.1,TRUE), or 1 − NORMSDIST(1.412)
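The standardization steps used in the two worked problems above can also be reproduced without Excel. The following is a minimal Python sketch, not part of the original slides: the helper names `phi`, `normal_prob_between` and `normal_prob_above` are hypothetical, and the standard normal cdf is computed from the error function `math.erf` in the Python standard library rather than read from a table.

```python
import math


def phi(z):
    """Standard normal cdf F(z) = P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


def normal_prob_above(a, mu, sigma):
    """P(X > a) for X ~ N(mu, sigma^2), computed as 1 - F((a - mu) / sigma)."""
    return 1.0 - phi((a - mu) / sigma)


# Example from the slides: X ~ N(80, 12^2), Pr(90 < X < 100).
p_example = normal_prob_between(90, 100, 80, 12)

# Exercise from the slides: X ~ N(12.8, 5.1^2), P(X > 20).
p_smokers = normal_prob_above(20, 12.8, 5.1)

print(p_example)
print(p_smokers)
```

The two printed values should agree with the slide answers (0.155 and 0.079) up to the rounding used in the table lookups.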