Continuous Random Variables: Basics
Berlin Chen
Department of Computer Science & Information Engineering
National Taiwan Normal University
Reference:
- D. P. Bertsekas, J. N. Tsitsiklis, Introduction to Probability, Sections 3.1-3.3

Continuous Random Variables
• Random variables with a continuous range of possible values are quite common
– The velocity of a vehicle
– The temperature of a day
– The blood pressure of a person
– etc.
[Figure: sample space Ω with the events {a ≤ outcome ≤ b}, {c ≤ outcome ≤ d}, {e ≤ outcome ≤ f}, and the points a', b', c', d', e', f' on the x-axis]
Probability-Berlin Chen 2

Probability Density Functions (1/2)
• A random variable $X$ is called continuous if its probability law can be described in terms of a nonnegative function $f_X$ ($f_X \ge 0$), called the probability density function (PDF) of $X$, which satisfies
\[ P(X \in B) = \int_B f_X(x)\,dx \]
for every subset $B$ of the real line.
– The probability that the value of $X$ falls within an interval is
\[ P(a \le X \le b) = \int_a^b f_X(x)\,dx \]

Probability Density Functions (2/2)
• Illustration of a PDF
[Figure: the PDF $f_X(x)$ plotted over the x-axis, with the events {a ≤ outcome ≤ b}, {c ≤ outcome ≤ d}, {e ≤ outcome ≤ f} of the sample space Ω and the points a', b', c', d', e', f']
• Notice that
– For any single value $a$, we have
\[ P(X = a) = \int_a^a f_X(x)\,dx = 0 \]
– Including or excluding the endpoints of an interval has no effect on its probability
\[ P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b) \]
– Normalization property
\[ \int_{-\infty}^{\infty} f_X(x)\,dx = P(-\infty < X < \infty) = 1 \]

Interpretation of the PDF
• For an interval $[x, x+\delta]$ with very small length $\delta$, we have
\[ P([x, x+\delta]) = \int_x^{x+\delta} f_X(t)\,dt \approx f_X(x) \cdot \delta \]
– Therefore, $f_X(x)$ can be viewed as the “probability mass per unit length” near $x$
• $f_X(x)$ is not the probability of any particular event, and it is not restricted to be less than or equal to one

Continuous Uniform Random Variable
• A random variable $X$ that takes values in an interval $[a, b]$, with all subintervals of the same length being equally likely, is called uniform (or uniformly distributed)
\[ f_X(x) = \begin{cases} \dfrac{1}{b-a}, & \text{if } a \le x \le b, \\ 0, & \text{otherwise.} \end{cases} \]
• Normalization property
\[ \int_{-\infty}^{\infty} f_X(x)\,dx = \int_a^b \frac{1}{b-a}\,dx = 1 \]

Random Variable with Piecewise Constant PDF
• Example 3.2. Alvin’s driving time to work is between 15 and 20 minutes if the day is sunny, and between 20 and 25 minutes if the day is rainy, with all times being equally likely in each case. Assume that a day is sunny with probability 2/3 and rainy with probability 1/3. What is the PDF of the driving time, viewed as a random variable $X$?
\[ f_X(x) = \begin{cases} c_1, & \text{if } 15 \le x < 20, \\ c_2, & \text{if } 20 \le x \le 25, \\ 0, & \text{otherwise.} \end{cases} \]
\[ \frac{2}{3} = P(\text{sunny day}) = \int_{15}^{20} f_X(x)\,dx = \int_{15}^{20} c_1\,dx = 5c_1 \]
\[ \frac{1}{3} = P(\text{rainy day}) = \int_{20}^{25} f_X(x)\,dx = \int_{20}^{25} c_2\,dx = 5c_2 \]
\[ \therefore\ c_1 = \frac{2}{15}, \quad c_2 = \frac{1}{15} \]
[Figure: the piecewise constant PDF $f_X(x)$, with height 2/15 between x = 15 and x = 20 and height 1/15 between x = 20 and x = 25]

Exponential Random Variable
• An exponential random variable $X$ has a PDF of the form
\[ f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0, \\ 0, & \text{otherwise,} \end{cases} \]
– $\lambda$ is a positive parameter characterizing the PDF
• Normalization property
\[ \int_{-\infty}^{\infty} f_X(x)\,dx = \int_0^{\infty} \lambda e^{-\lambda x}\,dx = \left. -e^{-\lambda x} \right|_0^{\infty} = 1 \]
• The probability that $X$ exceeds a certain value decreases exponentially
\[ P(X \ge a) = \int_a^{\infty} \lambda e^{-\lambda x}\,dx = e^{-\lambda a} \]

Normal (or Gaussian) Random Variable
• A continuous random variable $X$ is said to be normal or Gaussian if it has a PDF of the form
\[ f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty \]
– The parameters $\mu$ and $\sigma^2$ are respectively its mean and variance (to be shown later on)
• Normalization property
\[ \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx = 1 \quad \text{(see the end-of-chapter problems)} \]

Standard Normal Random Variable
• A normal random variable $Y$ with zero mean $\mu = 0$ and unit variance $\sigma^2 = 1$ is said to be a standard normal
\[ f_Y(y) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}, \quad -\infty < y < \infty \]
• Normalization property
\[ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\,dy = 1 \]
• The standard normal is symmetric around $y = 0$

The PDF of a Random Variable Can be Arbitrarily Large
• Example 3.3. A PDF can be arbitrarily large. Consider a random variable $X$ with PDF
\[ f_X(x) = \begin{cases} \dfrac{1}{2\sqrt{x}}, & \text{if } 0 < x \le 1, \\ 0, & \text{otherwise,} \end{cases} \]
– The PDF value becomes infinitely large as $x$ approaches zero
• Normalization property
\[ \int_0^1 f_X(x)\,dx = \int_0^1 \frac{1}{2\sqrt{x}}\,dx = \left. \sqrt{x} \right|_0^1 = 1 \]

Expectation of a Continuous Random Variable (1/2)
• Let $X$ be a continuous random variable with PDF $f_X$
– The expectation of $X$ is defined as
\[ E[X] = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx \]
– The expectation of a function $g(X)$ has the form
\[ E[g(X)] = \int_{-\infty}^{\infty} g(x) \cdot f_X(x)\,dx \quad \text{(see the end-of-chapter problems)} \]
– The variance of $X$ is defined by
\[ \mathrm{var}(X) = E\big[(X - E[X])^2\big] = \int_{-\infty}^{\infty} (x - E[X])^2 \cdot f_X(x)\,dx \]
• We have
\[ \mathrm{var}(X) = E[X^2] - (E[X])^2 \ge 0 \]

Expectation of a Continuous Random Variable (2/2)
• If $Y = aX + b$, where $a$ and $b$ are given scalars, then
\[ E[Y] = aE[X] + b, \quad \mathrm{var}(Y) = a^2\,\mathrm{var}(X) \]

Illustrative Examples (1/3)
• Mean and Variance of the Uniform Random Variable $X$
\[ f_X(x) = \begin{cases} \dfrac{1}{b-a}, & \text{if } a \le x \le b, \\ 0, & \text{otherwise.} \end{cases} \]
\[ E[X] = \int_a^b x f_X(x)\,dx = \int_a^b x\,\frac{1}{b-a}\,dx = \frac{1}{b-a} \cdot \left. \frac{x^2}{2} \right|_a^b = \frac{b+a}{2} \]
\[ E[X^2] = \int_a^b x^2 f_X(x)\,dx = \frac{1}{b-a} \cdot \left. \frac{x^3}{3} \right|_a^b = \frac{b^2 + ab + a^2}{3} \]
\[ \therefore\ \mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{b^2 + ab + a^2}{3} - \left( \frac{b+a}{2} \right)^2 = \frac{(b-a)^2}{12} \]

Illustrative Examples (2/3)
• Mean and Variance of the Exponential Random Variable $X$
\[ f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0, \\ 0, & \text{otherwise,} \end{cases} \]
\[ E[X] = \int_0^{\infty} x f_X(x)\,dx = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \left. -x e^{-\lambda x} \right|_0^{\infty} + \int_0^{\infty} e^{-\lambda x}\,dx = 0 - \left. \frac{1}{\lambda} e^{-\lambda x} \right|_0^{\infty} = \frac{1}{\lambda} \]
\[ \left( \because\ \frac{d(-x e^{-\lambda x})}{dx} = \lambda x e^{-\lambda x} - e^{-\lambda x} \right) \]
\[ E[X^2] = \int_0^{\infty} x^2 \lambda e^{-\lambda x}\,dx = \left. -x^2 e^{-\lambda x} \right|_0^{\infty} + \int_0^{\infty} 2x e^{-\lambda x}\,dx = 0 + \frac{2}{\lambda} \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda} E[X] = \frac{2}{\lambda^2} \]
\[ \left( \because\ \frac{d(-x^2 e^{-\lambda x})}{dx} = x^2 \lambda e^{-\lambda x} - 2x e^{-\lambda x} \right) \]
\[ \therefore\ \mathrm{var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2} \]

Illustrative Examples (3/3)
• Mean and Variance of the Normal Random Variable $X$
\[ f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty \]
\[ \text{Let } Y = \frac{X - \mu}{\sigma} \ \Rightarrow\ f_Y(y) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}, \quad -\infty < y < \infty \]
\[ E[Y] = \int_{-\infty}^{\infty} y\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\,dy = \left. -\frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}} \right|_{-\infty}^{\infty} = 0 \]
\[ \Rightarrow\ E[X] = \sigma E[Y] + \mu = 0 + \mu = \mu \]
\[ \mathrm{var}(Y) = \int_{-\infty}^{\infty} (y - E[Y])^2\,\frac{1}{\sqrt{2\pi}}\, e^{-\frac{y^2}{2}}\,dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2 e^{-\frac{y^2}{2}}\,dy = \frac{1}{\sqrt{2\pi}} \left[ -y e^{-\frac{y^2}{2}} \right]_{-\infty}^{\infty} + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2}}\,dy = 0 + 1 = 1 \]
\[ \left( \because\ \frac{d\big(-y e^{-\frac{y^2}{2}}\big)}{dy} = y^2 e^{-\frac{y^2}{2}} - e^{-\frac{y^2}{2}} \right) \]
\[ \therefore\ \mathrm{var}(X) = \sigma^2\,\mathrm{var}(Y) = \sigma^2 \]

Cumulative Distribution Functions
• The cumulative distribution function (CDF) of a random variable $X$ is denoted by $F_X(x)$ and provides the probability $P(X \le x)$
\[ F_X(x) = P(X \le x) = \begin{cases} \displaystyle\sum_{k \le x} p_X(k), & \text{if } X \text{ is discrete,} \\ \displaystyle\int_{-\infty}^{x} f_X(t)\,dt, & \text{if } X \text{ is continuous.} \end{cases} \]
– The CDF $F_X(x)$ accumulates probability up to $x$
– The CDF $F_X(x)$ provides a unified way to describe all kinds of random variables mathematically

Properties of a CDF (1/3)
• The CDF $F_X(x)$ is monotonically non-decreasing
\[ \text{if } x_i \le x_j, \text{ then } F_X(x_i) \le F_X(x_j) \]
• The CDF $F_X(x)$ tends to 0 as $x \to -\infty$, and to 1 as $x \to \infty$
• If $X$ is discrete, then $F_X(x)$ is a piecewise constant function of $x$

Properties of a CDF (2/3)
• If $X$ is continuous, then $F_X(x)$ is a continuous function of $x$
– Example with a constant PDF, $f_X(x) = \frac{1}{b-a}$ for $a \le x \le b$:
\[ F_X(x) = P(X \le x) = \int_a^x f_X(t)\,dt = \int_a^x \frac{1}{b-a}\,dt = \frac{x-a}{b-a}, \quad \text{for } a \le x \le b \]
– Example with a linearly increasing PDF, $f_X(x) = c(x-a)$ for $a \le x \le b$:
\[ \int_a^b c(x-a)\,dx = \left. \frac{c}{2}(x-a)^2 \right|_a^b = \frac{c}{2}(b-a)^2 = 1 \ \Rightarrow\ c = \frac{2}{(b-a)^2} \ \Rightarrow\ f_X(b) = \frac{2(b-a)}{(b-a)^2} = \frac{2}{b-a} \]
\[ F_X(x) = P(X \le x) = \int_a^x f_X(t)\,dt = \int_a^x \frac{2(t-a)}{(b-a)^2}\,dt = \frac{(x-a)^2}{(b-a)^2}, \quad \text{for } a \le x \le b \]

Properties of a CDF (3/3)
• If $X$ is discrete and takes integer values, the PMF and the CDF can be obtained from each other by summing or differencing
\[ F_X(k) = P(X \le k) = \sum_{i=-\infty}^{k} p_X(i), \]
\[ p_X(k) = P(X \le k) - P(X \le k-1) = F_X(k) - F_X(k-1) \]
• If $X$ is continuous, the PDF and the CDF can be obtained from each other by integration or differentiation
\[ F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt, \quad f_X(x) = \frac{dF_X(x)}{dx} \]
– The second equality is valid only for those $x$ at which the CDF has a derivative (e.g., for a random variable with a piecewise constant PDF, the CDF has no derivative at the points where the PDF jumps)

An Illustrative Example (1/2)
• Example 3.6. The Maximum of Several Random Variables. You are allowed to take a certain test three times, and your final score will be the maximum of the test scores. Thus,
\[ X = \max\{X_1, X_2, X_3\} \]
where $X_1, X_2, X_3$ are the three test scores and $X$ is the final score
– Assume that your score in each test takes one of the values from 1 to 10 with equal probability 1/10, independently of the scores in other tests.
– What is the PMF $p_X$ of the final score?
Trick: compute first the CDF and then the PMF!

An Illustrative Example (2/2)
\[ \because\ F_X(k) = P(X \le k) = P(X_1 \le k, X_2 \le k, X_3 \le k) = P(X_1 \le k)\,P(X_2 \le k)\,P(X_3 \le k) = \left( \frac{k}{10} \right)^3 \]
\[ \therefore\ p_X(k) = P(X \le k) - P(X \le k-1) = \left( \frac{k}{10} \right)^3 - \left( \frac{k-1}{10} \right)^3, \quad k = 1, \ldots, 10 \]

CDF of the Standard Normal
• The CDF of the standard normal $Y$, denoted as $\Phi(y)$, is recorded in a table and is a very useful tool for calculating various probabilities, including those involving general normal random variables
\[ \Phi(y) = P(Y \le y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-t^2/2}\,dt \]
– The table only provides the value of $\Phi(y)$ for $y \ge 0$
– Because of the symmetry of the PDF, the CDF at negative values of $y$ can be computed from the corresponding positive ones
\[ \Phi(-0.5) = P(Y \le -0.5) = P(Y \ge 0.5) = 1 - P(Y \le 0.5) = 1 - \Phi(0.5) = 1 - 0.6915 = 0.3085 \]

Table of the CDF of Standard Normal
[Table: values of $\Phi(y)$ for $y \ge 0$; the table itself is not reproduced in this text]

CDF Calculation of the Normal
• The CDF of a normal random variable $X$ with mean $\mu$ and variance $\sigma^2$ is obtained using the standard normal table as
\[ P(X \le x) = P\left( \frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma} \right) = P\left( Y \le \frac{x - \mu}{\sigma} \right) = \Phi\left( \frac{x - \mu}{\sigma} \right) \]

An Illustrative Example
• Example 3.7. Using the Normal Table. The annual snowfall at a particular geographic location is modeled as a normal random variable with a mean of $\mu = 60$ inches and a standard deviation of $\sigma = 20$ inches. What is the probability that this year’s snowfall will be at least 80 inches?
\[ P(X \ge 80) = 1 - P(X \le 80) = 1 - P\left( Y \le \frac{80 - 60}{20} \right) = 1 - \Phi(1) = 1 - 0.8413 = 0.1587 \]

Relation between the Geometric and Exponential (1/2)
• The CDF of the geometric
\[ F_{\mathrm{geo}}(n) = \sum_{k=1}^{n} p(1-p)^{k-1} = p\,\frac{1 - (1-p)^n}{1 - (1-p)} = 1 - (1-p)^n, \quad \text{for } n = 1, 2, \ldots \]
• The CDF of the exponential
\[ F_{\mathrm{exp}}(x) = \int_0^x \lambda e^{-\lambda t}\,dt = \left. -e^{-\lambda t} \right|_0^x = 1 - e^{-\lambda x}, \quad \text{for } x > 0 \]
• Compare the above two CDFs and let
\[ e^{-\lambda x} = (1-p)^n \ \Rightarrow\ x = n \cdot \frac{-\ln(1-p)}{\lambda} \]
\[ \text{Let } \delta = \frac{-\ln(1-p)}{\lambda} > 0 \ \Rightarrow\ x = n \cdot \delta \quad \left( \therefore\ 1 - p = e^{-\lambda \delta} \right) \]

Relation between the Geometric and Exponential (2/2)
\[ \therefore\ F_{\mathrm{exp}}(\delta n) = 1 - e^{-\lambda \delta n} = 1 - (1-p)^n = F_{\mathrm{geo}}(n), \quad n = 1, 2, \ldots \]

Recitation
• SECTION 3.1 Continuous Random Variables and PDFs
– Problems 2, 3, 4
• SECTION 3.2 Cumulative Distribution Functions
– Problems 6, 7, 8
• SECTION 3.3 Normal Random Variables
– Problems 9, 10, 12

Homework-3
• Chapter 2: Problems 15, 18, 21
• Chapter 3: Problems 2, 5 (Due 12/01)
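
Supplementary sketch: normal-table lookups in code
The table-based calculations for Φ(−0.5) and for the snowfall example (Example 3.7) can be checked numerically. The sketch below is not part of the original slides. It assumes the standard identity Φ(y) = ½(1 + erf(y/√2)) in place of a printed table, and the function names `standard_normal_cdf` and `normal_cdf` are illustrative choices only.

```python
import math


def standard_normal_cdf(y: float) -> float:
    """Phi(y) = P(Y <= y) for a standard normal Y.

    Assumption: uses the identity Phi(y) = 0.5 * (1 + erf(y / sqrt(2)))
    instead of the lookup table referred to in the slides.
    """
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for a normal X, obtained by standardizing to Y = (X - mu) / sigma."""
    return standard_normal_cdf((x - mu) / sigma)


# Symmetry: Phi(-0.5) = 1 - Phi(0.5)
phi_neg = standard_normal_cdf(-0.5)

# Example 3.7: snowfall with mu = 60 inches, sigma = 20 inches, P(X >= 80)
p_snow = 1.0 - normal_cdf(80.0, mu=60.0, sigma=20.0)

print(round(phi_neg, 4))  # 0.3085
print(round(p_snow, 4))   # 0.1587
```

Both values agree with the table-based results to four decimal places.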
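
Supplementary sketch: PMF of the maximum from its CDF
The "compute first the CDF and then the PMF" trick of Example 3.6 can be verified by brute force. The sketch below is not from the slides. It computes p_X(k) = (k/10)³ − ((k−1)/10)³ with exact fractions and compares it against an enumeration of all 1000 equally likely score triples. The helper names `cdf_max` and `pmf_max` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

N_SCORES = 10  # each test score is uniform on 1..10
N_TESTS = 3    # the final score is the maximum of three tests


def cdf_max(k: int) -> Fraction:
    """F_X(k) = P(max <= k) = (k / 10)^3, by independence of the three scores."""
    return Fraction(k, N_SCORES) ** N_TESTS


def pmf_max(k: int) -> Fraction:
    """p_X(k) = F_X(k) - F_X(k - 1), i.e. differencing the CDF."""
    return cdf_max(k) - cdf_max(k - 1)


pmf = {k: pmf_max(k) for k in range(1, N_SCORES + 1)}

# Brute-force check: enumerate every triple of scores and count its maximum.
counts = Counter(max(t) for t in product(range(1, N_SCORES + 1), repeat=N_TESTS))
total = N_SCORES ** N_TESTS
brute = {k: Fraction(counts[k], total) for k in range(1, N_SCORES + 1)}

print(pmf == brute)  # True
print(pmf[10])       # 271/1000
```

Exact fractions are used here so that the comparison with the enumeration does not depend on floating-point rounding.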
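
Supplementary sketch: geometric and exponential CDFs agree at x = nδ
The relation F_exp(δn) = F_geo(n) with δ = −ln(1−p)/λ can be checked numerically. The sketch below is not from the slides, and the values p = 0.3 and λ = 2.0 are arbitrary illustrative choices.

```python
import math


def geometric_cdf(n: int, p: float) -> float:
    """F_geo(n) = 1 - (1 - p)^n for n = 1, 2, ..."""
    return 1.0 - (1.0 - p) ** n


def exponential_cdf(x: float, lam: float) -> float:
    """F_exp(x) = 1 - exp(-lam * x) for x > 0."""
    return 1.0 - math.exp(-lam * x)


p = 0.3    # illustrative success probability (assumption, not from the slides)
lam = 2.0  # illustrative rate parameter (assumption, not from the slides)

# delta is chosen so that exp(-lam * delta) = 1 - p
delta = -math.log(1.0 - p) / lam

# Largest disagreement between the two CDFs over n = 1..20
max_gap = max(
    abs(exponential_cdf(n * delta, lam) - geometric_cdf(n, p))
    for n in range(1, 21)
)
print(delta > 0, max_gap < 1e-12)  # True True
```

Any p in (0, 1) and λ > 0 should give the same agreement, up to floating-point rounding.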