Discrete Random Variables
Chia-Ping Chen, Professor
Department of Computer Science and Engineering
National Sun Yat-sen University

Probability

Random Variables
A random variable is a function whose domain is a sample space and whose codomain is the set of real numbers,
X : Ω → R
That is, an outcome ω of a random experiment is mapped to X(ω), which is a real number.

Example
Flip a coin until the first head shows up. The number of flips is a random variable:
X(ωn) = n
(where ωn is the outcome of n − 1 tails followed by a head).

Discrete Random Variable
A random variable is discrete if its image is finite or countable. For a discrete random variable X, the image can be represented by
{x1, x2, ...}
For example, in the previous example the image is {1, 2, ...}.

Probability of a Discrete Random Variable
Let (Ω, F, P(·)) be a probability model, and let X be a discrete random variable defined on Ω. For the probability of X = x to be well defined, it is required that {ω | X(ω) = x} is an event in F.

Probability Mass Function
For every xi in the image of X, the set of outcomes mapped by X to xi,
{X = xi} = {ω ∈ Ω | X(ω) = xi},
is an event in F. The probability of the event {X = x}, as a function of x, is called the probability mass function (PMF) of X, denoted by
pX(x) = P(X = x)

Abstraction
It can be tedious to start with a random experiment and derive the PMF of a random variable. More often than not, we begin with the PMF:
- a discrete random variable is specified by a PMF
- leaving the details of the random experiment behind
- applicable in different scenarios

Examples
What do the following items have in common?
head or tail, brother or sister, catch-up or fall-behind, win or lose, pass or fail, yes or no, busy or idle, strike or ball
They can all be analyzed by a random variable X taking a value in {0, 1}.

Common Discrete Random Variables

Experiment: Flip a Coin Once
Flip a coin once. The number of heads,
X(head) = 1, X(tail) = 0,
is a random variable.

Abstraction: Bernoulli Random Variable
A Bernoulli random variable takes a value of either 1 or 0. The PMF of a Bernoulli random variable is
pX(x) = p if x = 1, and pX(x) = 1 − p if x = 0
It is denoted by X ∼ Bernoulli(p).

Experiment: Flip a Coin n Times
Flip a coin n times. The number of times X that a head shows up is a random variable.

Abstraction: Binomial Random Variable
A binomial random variable takes a value in {0, 1, ..., n}. The PMF of a binomial random variable is
pX(x) = C(n, x) p^x (1 − p)^(n − x)
where C(n, x) is the binomial coefficient. It is denoted by X ∼ binomial(n, p).

Experiment: Just Flip
Flip a coin until a head shows up, then stop. The number of flips X is a random variable.

Abstraction: Geometric Random Variable
A geometric random variable takes a value in N = {1, 2, ...}. The PMF of a geometric random variable is
pX(x) = (1 − p)^(x − 1) p
where 0 ≤ p ≤ 1 is a parameter. It is denoted by X ∼ geometric(p).

Limit of a Binomial Random Variable
Suppose Y ∼ binomial(n, p) with n ≫ 1 and p ≪ 1. Then
pY(x) = C(n, x) p^x (1 − p)^(n − x) ≈ (np)^x e^(−np) / x!
since C(n, x) ≈ n^x / x! for x ≪ n, and
(1 − p)^(n − x) ≈ (1 − p)^n = ((1 − p)^(−1/p))^(−np) ≈ e^(−np)
because (1 − p)^(−1/p) ≈ e for small p.
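How good is this approximation? A quick numeric check is easy to write. The following Python sketch (the parameter values n = 1000 and p = 0.01 are illustrative choices, not taken from the slides) compares the exact binomial PMF with the limiting expression (np)^x e^(−np)/x! for a few values of x:

    from math import comb, exp, factorial

    def binomial_pmf(x, n, p):
        """Exact binomial PMF: C(n, x) p^x (1 - p)^(n - x)."""
        return comb(n, x) * p**x * (1 - p)**(n - x)

    def limit_pmf(x, lam):
        """The limiting expression (np)^x e^(-np) / x!, with lam = n*p."""
        return lam**x * exp(-lam) / factorial(x)

    n, p = 1000, 0.01            # illustrative: n >> 1 and p << 1
    lam = n * p
    for x in range(6):
        print(x, binomial_pmf(x, n, p), limit_pmf(x, lam))

With these values the two columns agree closely, and the agreement improves as n grows with np held fixed.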
Poisson Random Variables
A Poisson random variable takes a value in {0} ∪ N. The PMF of a Poisson random variable is
pX(x) = e^(−λ) λ^x / x!
where λ > 0 is a parameter. It is denoted by X ∼ Poisson(λ).

Binomial and Poisson
A binomial random variable can be approximated by a Poisson random variable, and vice versa. Specifically, if n ≫ 1 and p ≪ 1, then
binomial(n, p) ≈ Poisson(np)

Coin and Random Variables
random experiment                            abstraction
flip a coin once                             Bernoulli
flip a coin n times                          binomial
flip it until the first head shows up        geometric
flip it many, many times, with rare heads    Poisson

PMFs
Note the similarity between binomial(20, 0.5) and Poisson(10).

Functions of a Random Variable
A function of a random variable is a random variable. Let Y = g(X). Then
ω → X(ω) → Y(ω) = g(X(ω))
so Y is a random variable.

PMF of Y
For y in the image of Y, define the set Sy = {x | g(x) = y}. Then
{ω | Y(ω) = y} = ∪_{x ∈ Sy} {ω | X(ω) = x}
⇒ pY(y) = P(Y = y) = P({ω | Y(ω) = y}) = P(∪_{x ∈ Sy} {X = x}) = Σ_{x ∈ Sy} P(X = x) = Σ_{x ∈ Sy} pX(x)

Example 2.1 Function of a Uniform Random Variable
Suppose the PMF of random variable X is
pX(x) = 1/9 if x is an integer in [−4, 4], and 0 otherwise.
Find pY(y) for Y = |X|.

Expectation and Variance

Expectation
The expectation (or mean, or expected value) of a discrete random variable X is
E[X] = Σ_x x pX(x)

Example 2.2
Consider 2 independent coin tosses, each with a probability of 3/4 for a head, and let X be the number of heads obtained. Determine the PMF and the mean of X.

Variance and Standard Deviation
variance: var(X) = E[(X − E[X])^2]
standard deviation: σX = sqrt(var(X))

Example 2.3
Suppose the PMF of X is
pX(x) = 1/9 if x is an integer in [−4, 4], and 0 otherwise.
Find the variance of X through the PMF of Y = (X − E[X])^2.

Expectation of a Function of a Random Variable
Let X be a random variable and Y = g(X). Then
E[Y] = Σ_x g(x) pX(x)
That is,
E[g(X)] = Σ_x g(x) pX(x)

Proof
E[Y] = Σ_y y pY(y)
     = Σ_y y Σ_{x ∈ Sy} pX(x)
     = Σ_y Σ_{x ∈ Sy} y pX(x)
     = Σ_y Σ_{x ∈ Sy} g(x) pX(x)
     = Σ_x g(x) pX(x)

Example 2.3 (continued)
Suppose the PMF of X is
pX(x) = 1/9 if x is an integer in [−4, 4], and 0 otherwise.
Find the variance of X.

Linear Function of a Random Variable
Y = aX + b
⇒ E[Y] = Σ_x (ax + b) pX(x) = a E[X] + b
⇒ var(Y) = E[(aX + b − (aE[X] + b))^2] = E[(aX − aE[X])^2] = a^2 E[(X − E[X])^2] = a^2 var(X)

An Equality for Variance
Variance is the difference of the second moment and the squared mean.
var(X) = E[(X − E[X])^2]
       = Σ_x (x − E[X])^2 pX(x)
       = Σ_x (x^2 − 2x E[X] + E^2[X]) pX(x)
       = E[X^2] − 2E^2[X] + E^2[X]
       = E[X^2] − E^2[X]

Example 2.4 Time to School
If the weather is good (with probability 0.6), Alice walks the 2 miles to school at a speed of 5 miles per hour. Otherwise, she rides her motorcycle at a speed of 30 miles per hour. What is the mean time T for Alice to get to school?
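T takes only two values, 2 miles / 5 mph = 24 minutes (walking) or 2 miles / 30 mph = 4 minutes (riding), so its mean and variance follow directly from the definitions above. A minimal Python sketch of that hand computation, working in minutes (the slide itself leaves the numbers as an exercise):

    # Example 2.4: the travel time T takes two values with the given probabilities.
    pmf_T = {
        24.0: 0.6,    # good weather: walk 2 miles at 5 mph  -> 24 minutes
        4.0: 0.4,     # bad weather:  ride 2 miles at 30 mph ->  4 minutes
    }
    mean_T = sum(t * p for t, p in pmf_T.items())
    second_moment = sum(t**2 * p for t, p in pmf_T.items())
    var_T = second_moment - mean_T**2     # var(T) = E[T^2] - E^2[T]
    print(mean_T, var_T)                  # about 16 minutes and 96 minutes^2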
Example 2.5 Bernoulli Random Variable
Find the mean and variance of X ∼ Bernoulli(p).

Example 2.6
What are the mean and the variance of a roll of a 6-sided die?

Discrete Uniform Random Variable
A discrete uniform random variable has a constant PMF. For a uniform random variable X taking an integer value in [a, b], denoted by
X ∼ uniform[a, b]
the PMF is
pX(x) = 1/(b − a + 1)

Mean and Variance
E[X] = Σ_{x=a}^{b} x pX(x) = (1/(b − a + 1)) Σ_{x=a}^{b} x = (a + b)/2

Y = X − a + 1
⇒ Y ∼ uniform[1, b − a + 1]
⇒ var(X) = var(Y) = E[Y^2] − E^2[Y]
         = (1/(b − a + 1)) Σ_{y=1}^{b−a+1} y^2 − ((b − a + 2)/2)^2
         = (b − a)(b − a + 2)/12

Example 2.7 Poisson Random Variable
Find the mean and variance of Z ∼ Poisson(λ).

Mean
pZ(z) = e^(−λ) λ^z / z!
⇒ E[Z] = Σ_{z=0}^{∞} z e^(−λ) λ^z / z!
       = e^(−λ) Σ_{z=1}^{∞} λ^z / (z − 1)!
       = e^(−λ) Σ_{z'=0}^{∞} λ^(z'+1) / z'!
       = λ e^(−λ) Σ_{z'=0}^{∞} λ^z' / z'!
       = λ e^(−λ) e^λ
       = λ

Variance
E[Z^2] = Σ_{z=0}^{∞} z^2 e^(−λ) λ^z / z!
       = Σ_{z=1}^{∞} z(z − 1) e^(−λ) λ^z / z! + Σ_{z=1}^{∞} z e^(−λ) λ^z / z!
       = e^(−λ) Σ_{z=2}^{∞} λ^z / (z − 2)! + λ
       = e^(−λ) Σ_{z'=0}^{∞} λ^(z'+2) / z'! + λ
       = λ^2 e^(−λ) Σ_{z'=0}^{∞} λ^z' / z'! + λ
       = λ^2 e^(−λ) e^λ + λ
       = λ^2 + λ
⇒ var(Z) = E[Z^2] − E^2[Z] = (λ^2 + λ) − λ^2 = λ

Example: Geometric Random Variable
For G ∼ geometric(p), the mean is
E[G] = 1/p
and the variance is
var(G) = (1 − p)/p^2

Mean (Expectation)
E[G] = Σ_{k=1}^{∞} k pG(k)
     = p + Σ_{k=2}^{∞} k pG(k)
     = p + Σ_{k'=1}^{∞} (k' + 1) pG(k' + 1)
     = p + Σ_{k'=1}^{∞} (k' + 1)(1 − p) pG(k')
     = p + (1 − p)(E[G] + 1)
⇒ p E[G] = 1
⇒ E[G] = 1/p

Variance
E[G^2] = Σ_{k=1}^{∞} k^2 pG(k)
       = p + Σ_{k=2}^{∞} k^2 pG(k)
       = p + Σ_{k'=1}^{∞} (k' + 1)^2 pG(k' + 1)
       = p + Σ_{k'=1}^{∞} (k'^2 + 2k' + 1)(1 − p) pG(k')
       = p + (1 − p)(E[G^2] + 2E[G] + 1)
⇒ E[G^2] = (2 − p)/p^2
⇒ var(G) = E[G^2] − E^2[G] = (1 − p)/p^2

Example 2.8 Game
Consider a game where a player is given 2 questions. Question A is answered correctly with probability 0.8, and the prize money is 100. Question B is answered correctly with probability 0.5, and the prize money is 200. If the first question attempted is answered incorrectly, the quiz terminates. If it is answered correctly, the player attempts the second question to earn more money. Which question should be answered first to maximize the expected total prize money?

Multiple Random Variables

Two Random Variables
Let X and Y be discrete random variables. For any x in the image of X and y in the image of Y, since the sets
{ω | X(ω) = x}, {ω | Y(ω) = y}
are events, the set
{ω | X(ω) = x, Y(ω) = y} = {ω | X(ω) = x} ∩ {ω | Y(ω) = y}
is also an event.

Joint Probability Mass Function
The probability of the event {X = x, Y = y}, as a function of x and y, is called the joint probability mass function (joint PMF) of X and Y, denoted by
pXY(x, y) = P(X = x, Y = y)

Marginal Probability
The marginal probability mass function (marginal PMF) of a random variable can be computed from the joint PMF:
pX(x) = Σ_y pXY(x, y)
This follows from
P(X = x) = Σ_y P(X = x, Y = y)
Similarly,
pY(y) = Σ_x pXY(x, y)
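Marginalization is just a row or column sum of the joint PMF. The joint table of Example 2.9 below (Fig. 2.10) is not reproduced in this text, so the following Python sketch uses a small made-up joint PMF purely to illustrate the formulas; the numbers are not from the textbook.

    # Hypothetical joint PMF pXY(x, y), stored as {(x, y): probability}.
    # The entries sum to 1; they are illustrative only, not Fig. 2.10.
    p_xy = {
        (1, 1): 0.10, (1, 2): 0.20, (1, 3): 0.10,
        (2, 1): 0.25, (2, 2): 0.05, (2, 3): 0.30,
    }

    # Marginal PMFs: pX(x) = sum over y of pXY(x, y), and pY(y) = sum over x.
    p_x, p_y = {}, {}
    for (x, y), prob in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + prob
        p_y[y] = p_y.get(y, 0.0) + prob

    print(p_x)   # {1: 0.4, 2: 0.6}            (up to floating-point rounding)
    print(p_y)   # {1: 0.35, 2: 0.25, 3: 0.4}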
Example 2.9 Joint Probability Table
Consider two random variables X and Y, described by the joint probability table shown in Fig. 2.10. Find the marginal PMFs.

Function of Multiple Random Variables
A function of multiple random variables is a random variable. For Z = g(X, Y), surely
Z(ω) = g(X(ω), Y(ω))
is a mapping from Ω to R.

PMF of Z
For z in the image of Z, define Sz = {(x, y) | g(x, y) = z}. Then
{Z = z} = ∪_{(x,y) ∈ Sz} {X = x, Y = y}
⇒ pZ(z) = P(Z = z) = Σ_{(x,y) ∈ Sz} pXY(x, y)
⇒ E[Z] = Σ_z z pZ(z) = Σ_x Σ_y g(x, y) pXY(x, y)

Example 2.9 (continued)
Find the PMF and the expectation of Z = X + 2Y.

Linear Function
Z = a1 X1 + a2 X2 + · · · + an Xn
⇒ E[Z] = E[a1 X1 + a2 X2 + · · · + an Xn]
       = Σ_{x1,...,xn} pX1...Xn(x1, ..., xn) [a1 x1 + · · · + an xn]
       = Σ_{x1,...,xn} pX1...Xn(x1, ..., xn) a1 x1 + · · · + Σ_{x1,...,xn} pX1...Xn(x1, ..., xn) an xn
       = Σ_{x1} pX1(x1) a1 x1 + · · · + Σ_{xn} pXn(xn) an xn
       = a1 E[X1] + · · · + an E[Xn]

Example 2.10 Grade A
Each student in a 300-student class has a probability of 1/3 of getting an A, independent of any other student. What is the mean of X, the number of students that get an A?

Example 2.11
Suppose n people throw their hats in a box and then each picks one hat at random. What is the expected value of H, the number of people retrieving their own hats?

Let Xi indicate whether the i-th person picks his own hat.
⇒ H = X1 + · · · + Xn
⇒ E[H] = Σ_{i=1}^{n} E[Xi] = Σ_{i=1}^{n} P(Xi = 1)
       = Σ_{i=1}^{n} [(n − i + 1)/n] · [1/(n − i + 1)]
         (own hat still in the box) × (own hat retrieved)
       = Σ_{i=1}^{n} 1/n
       = 1

Conditional Probability and Independence

Conditional Probability Mass Function
The conditional probability mass function (conditional PMF) of a random variable X conditioned on an event A of non-zero probability (a non-null event) is defined by
pX|A(x) = P({X = x} ∩ A) / P(A)
Probability is re-distributed to A and re-normalized by P(A).

Normalization
Σ_x pX|A(x) = 1
since
A = ∪_x ({X = x} ∩ A)
⇒ P(A) = Σ_x P({X = x} ∩ A)
⇒ 1 = Σ_x pX|A(x)

Example 2.12
Let X be the roll of a fair 6-sided die. Let A be the event that the roll is an even number. What is the conditional PMF of X given A?

Example 2.13
A student will take a certain test repeatedly until he passes it, each time with a probability p of passing, independent of previous attempts, up to a maximum of n attempts. What is the conditional PMF of the number of attempts K, given that the student passes the test?

A = {the student eventually passes the test}
⇒ A^c = {the student fails the test n times}
⇒ P(A^c) = (1 − p)^n
⇒ P(A) = 1 − (1 − p)^n
⇒ pK|A(k) = P(K = k | A) = P({K = k} ∩ A) / P(A) = (1 − p)^(k−1) p / (1 − (1 − p)^n), k = 1, 2, ..., n

Conditional Probability Mass Function
The conditional probability mass function of X conditioned on a random variable Y is defined by
pX|Y(x|y) = P(X = x | Y = y)
That is,
pX|Y(x|y) = pXY(x, y) / pY(y)
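Continuing the made-up joint table from the earlier sketch, the conditional PMF is obtained by dividing each joint probability by the corresponding marginal, exactly as in the definition above; again the numbers are illustrative, not from the textbook.

    # Same illustrative joint PMF as before (not from the textbook).
    p_xy = {
        (1, 1): 0.10, (1, 2): 0.20, (1, 3): 0.10,
        (2, 1): 0.25, (2, 2): 0.05, (2, 3): 0.30,
    }

    # Marginal pY(y), the normalizer for conditioning on Y = y.
    p_y = {}
    for (x, y), prob in p_xy.items():
        p_y[y] = p_y.get(y, 0.0) + prob

    # Conditional PMF pX|Y(x|y) = pXY(x, y) / pY(y).
    p_x_given_y = {(x, y): prob / p_y[y] for (x, y), prob in p_xy.items()}

    # Each conditional PMF pX|Y(.|y) sums to 1, as any PMF must.
    print(all(abs(sum(v for (x, yy), v in p_x_given_y.items() if yy == y) - 1) < 1e-12
              for y in p_y))   # True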
Chain Rule
The marginal PMF, conditional PMF, and joint PMF are related by
pXY(x, y) = pX(x) pY|X(y|x) = pY(y) pX|Y(x|y)
This is known as the chain rule.

Example 2.14
Professor May B. Right answers each of her students' questions incorrectly with probability 1/4, independent of other questions. In each lecture, she is asked 0, 1, or 2 questions with equal probability 1/3. Let X and Y be, respectively, the number of questions she is asked and the number of questions she answers wrong in a given lecture. Find pXY(x, y). What is the probability that she answers at least one question incorrectly?

Example 2.15
Consider a transmitter that is sending messages over a computer network. Let Y be the length of a message, and X be the travel time of the message. Suppose the PMF of Y is
pY(10^2) = 5/6, pY(10^4) = 1/6
Furthermore, X depends on Y statistically:
X = 10^(−4) Y with probability 1/2
X = 10^(−3) Y with probability 1/3
X = 10^(−2) Y with probability 1/6
We want to know the PMF of X.

Conditional Expectation
An expectation based on conditional probability is called a conditional expectation. The conditional expectation of X conditioned on an event A is
E[X|A] = Σ_x x pX|A(x)
The conditional expectation of X conditioned on Y = y is
E[X|Y = y] = Σ_x x pX|Y(x|y)

Total Expectation Theorem
The expectation of a random variable is the expectation of a conditional expectation, i.e.
E[X] = Σ_y pY(y) E[X|Y = y]
or, more concisely,
E[X] = E[E[X|Y]]
Note that E[X|Y] is a function of Y, so it is a random variable.

Proof
E[X] = Σ_x x pX(x)
     = Σ_x x Σ_y pXY(x, y)
     = Σ_x x Σ_y pX|Y(x|y) pY(y)
     = Σ_x Σ_y x pX|Y(x|y) pY(y)
     = Σ_y pY(y) Σ_x x pX|Y(x|y)
     = Σ_y pY(y) E[X|Y = y]

General Form of Total Expectation Theorem
Let A1, ..., An be a partition of the sample space Ω. Then
E[X] = Σ_{i=1}^{n} P(Ai) E[X|Ai]
Proof:
E[X] = Σ_x x pX(x)
     = Σ_x x Σ_{Ai} P({X = x} ∩ Ai)
     = Σ_x x Σ_{Ai} pX|Ai(x) P(Ai)
     = Σ_{Ai} P(Ai) Σ_x x pX|Ai(x)
     = Σ_{Ai} P(Ai) E[X|Ai]

Example 2.16
A message transmitted by a computer in Boston through a data network is destined for New York with probability 0.5, for Chicago with probability 0.3, and for San Francisco with probability 0.2. The transmission time X is random. The mean transmission time is 0.05 seconds for a message destined for New York, 0.1 seconds for a message destined for Chicago, and 0.3 seconds for a message destined for San Francisco. What is E[X]?

Mean and Variance of a Geometric Random Variable
A = {first attempt successful}
⇒ A^c = {first attempt unsuccessful}
E[X] = P(A) E[X|A] + P(A^c) E[X|A^c]
     = p · 1 + (1 − p)(1 + E[X])
⇒ E[X] = 1/p
E[X^2] = P(A) E[X^2|A] + P(A^c) E[X^2|A^c]
       = p · 1^2 + (1 − p) E[(1 + X)^2]
       = p + (1 − p)(1 + 2E[X] + E[X^2])
⇒ E[X^2] = 2/p^2 − 1/p
⇒ var(X) = E[X^2] − E^2[X] = (1 − p)/p^2

Example 2.17 Programming
You write a program over and over, and each time there is a probability p that it works correctly, independent of previous attempts. What are the mean and variance of X, the number of tries until the program works correctly?
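Example 2.17 is exactly the geometric setting analyzed above, so the answers are E[X] = 1/p and var(X) = (1 − p)/p^2. A short Monte Carlo sketch in Python can be used to check them numerically; the value p = 0.25 and the number of runs are illustrative choices, and the sample statistics only approximate the exact values.

    import random

    def tries_until_success(p):
        """Simulate one run of Example 2.17: count attempts until the program works."""
        k = 1
        while random.random() >= p:   # this attempt fails with probability 1 - p
            k += 1
        return k

    p, runs = 0.25, 200_000           # illustrative values
    samples = [tries_until_success(p) for _ in range(runs)]

    mean_est = sum(samples) / runs
    var_est = sum((s - mean_est)**2 for s in samples) / runs
    print(mean_est, 1 / p)            # both near 4
    print(var_est, (1 - p) / p**2)    # both near 12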
Independence of a Random Variable and an Event
Random variable X and event A are independent if
P({X = x} ∩ A) = P(X = x) P(A)
for every x in the image of X.

Invariance of PMF
Suppose random variable X and event A are independent.
⇒ P({X = x} ∩ A) = P(X = x) P(A)
⇒ P({X = x} ∩ A) / P(A) = P(X = x)
⇒ pX|A(x) = pX(x)
The PMF of X is invariant with respect to the occurrence of A.

Example 2.19
Consider 2 independent tosses of a fair coin. Let X be the number of heads and A be the event that the number of heads is even. Show that X and A are not independent.

Independent Random Variables
Random variables X and Y are said to be independent if
pXY(x, y) = pX(x) pY(y)
for all x in the image of X and y in the image of Y. This is denoted by X ⊥⊥ Y.

Invariance of Probability
X ⊥⊥ Y
⇒ pXY(x, y) = pX(x) pY(y)
⇒ pXY(x, y) / pX(x) = pY(y)
⇒ pY|X(y|x) = pY(y)
⇒ pX|Y(x|y) = pX(x)

Properties
X ⊥⊥ Y
⇒ E[XY] = Σ_{x,y} pXY(x, y) x y
        = Σ_{x,y} pX(x) pY(y) x y
        = (Σ_x pX(x) x)(Σ_y pY(y) y)
        = E[X] E[Y]
⇒ var(X + Y) = E[(X + Y)^2] − E^2[X + Y]
             = E[X^2 + 2XY + Y^2] − (E^2[X] + 2E[X]E[Y] + E^2[Y])
             = E[X^2] − E^2[X] + E[Y^2] − E^2[Y]
             = var(X) + var(Y)

Example 2.20 Variance of a Binomial Random Variable
The variance of a binomial random variable X ∼ binomial(n, p) is
var(X) = np(1 − p)

Proof
Consider n independent tosses of a coin (a head shows up with probability p). The number of heads
X = H1 + · · · + Hn,
where Hi is the Bernoulli random variable for toss i, is a binomial random variable:
X ∼ binomial(n, p)
Since the Hi are independent, we have
var(X) = var(H1) + · · · + var(Hn) = Σ_{i=1}^{n} p(1 − p) = np(1 − p)

Example 2.21 Approval Rate
We wish to estimate the approval rate of Trump. We ask n persons at random. Define
Ai = 1 if the i-th person approves of Trump, and 0 otherwise.
The approval rate of the sampled n persons is
Rn = (A1 + · · · + An)/n
Suppose A1, ..., An are independent Bernoulli random variables with mean p and variance p(1 − p). Find the mean and variance of Rn.

Example 2.22 Simulation
The probability of a well-defined event, say A, can be difficult to compute. Yet whether A occurs in a trial is easy to decide. In this case, we can estimate the probability of A by simulation.

Quality
Carry out n trials. Associate trial i with a Bernoulli random variable
Ai = 1 if the outcome of trial i is in A, and 0 otherwise.
Consider the relative frequency
Fn = (A1 + · · · + An)/n
⇒ E[Fn] = (1/n) Σ_{i=1}^{n} E[Ai] = P(A)
⇒ var(Fn) = (1/n^2) Σ_{i=1}^{n} var(Ai) = (1/n) P(A)(1 − P(A)) → 0 as n → ∞
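As a concrete illustration of the last point, the sketch below estimates a probability that is easy to verify by hand, P(two fair dice sum to 7) = 1/6, and prints the relative frequency Fn together with var(Fn) = P(A)(1 − P(A))/n for increasing n. The event and the sample sizes are illustrative choices, not from the slides.

    import random

    def trial():
        """One trial: the indicator of the event A = {two fair dice sum to 7}."""
        return 1 if random.randint(1, 6) + random.randint(1, 6) == 7 else 0

    true_p = 1 / 6
    for n in (100, 10_000, 1_000_000):
        f_n = sum(trial() for _ in range(n)) / n            # relative frequency Fn
        print(n, f_n, true_p, true_p * (1 - true_p) / n)    # estimate, P(A), var(Fn)

As n grows, the printed estimate settles near 1/6 while var(Fn) shrinks like 1/n, which is exactly the quality guarantee stated above.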