CPSC 531: Systems Modeling and Simulation
Discrete Random Variables
Dr. Anirban Mahanti, Department of Computer Science, University of Calgary
mahanti@cpsc.ucalgary.ca

Random Variables

A random variable is a real-valued mapping that assigns a numerical value to each possible outcome of an experiment. Formally, a random variable X on a sample space S is a function X : S → ℜ that assigns to each s ∈ S a real number X(s).

Notation: upper-case letters (e.g., X, Y, Z) denote random variables; lower-case letters (e.g., x, y, z) denote values attained by a random variable.

Examples

Consider the arrival of packets at a router. Let X be the number of packets that arrive per unit time. X is a random variable that can take the values {0, 1, 2, …}.

Consider rolling a pair of dice. Let X equal the sum of the dice on a roll. If we think of the sample points as pairs (i, j), where i is the value rolled by the first die and j is the value rolled by the second, then X(s) = i + j. X is a random variable that can take any value between 2 and 12.

Another Example

Consider tossing two fair coins (i.e., one toss followed by another). Define a random variable X as the number of heads seen in the experiment. For this experiment, the sample space is S = {(H,H), (H,T), (T,H), (T,T)}, and the mapping of s ∈ S to a real number X(s) is as follows:

    s       X(s)
    (H,H)   2
    (H,T)   1
    (T,H)   1
    (T,T)   0

Event Space of Random Variables

A random variable partitions its sample space into mutually exclusive and collectively exhaustive events. Define Ax = {X = x} to be the subset of S consisting of all sample points s to which the random variable X assigns the value x:

    Ax = {s ∈ S | X(s) = x}

Clearly, Ax satisfies the following properties:

    Ax ∩ Ay = ∅ for all x ≠ y
    ∪x∈ℜ Ax = S

Event Space: An Example

Consider again the coin-tossing experiment. Enumerate all events Ax of the random variable X.
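The enumeration asked for here can be sketched by brute force (Python, not part of the slides):

```python
from itertools import product

# Sample space for two fair coin tosses: each outcome is a pair (toss1, toss2).
S = list(product("HT", repeat=2))

# Random variable X: number of heads in an outcome s.
def X(s):
    return s.count("H")

# Partition S into events A_x = {s in S : X(s) = x}.
events = {}
for s in S:
    events.setdefault(X(s), []).append(s)

print(events)
```

The dictionary keys are the values x attained by X, and each value is the event Ax, matching the analytic enumeration below.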
    A0 = {s ∈ S | X(s) = 0} = {(T,T)}
    A1 = {s ∈ S | X(s) = 1} = {(H,T), (T,H)}
    A2 = {s ∈ S | X(s) = 2} = {(H,H)}

Discrete Random Variables and PMF

A random variable X is said to be discrete if the number of possible values of X is finite or, at most, a countably infinite sequence of different values. Discrete random variables are characterized by the probabilities of the values attained by the variable. These probabilities are referred to as the probability mass function (PMF) of X. Mathematically, we define the PMF as:

    pX(x) = P(X = x) = P({s | X(s) = x}) = Σ_{s : X(s) = x} P(s)

Properties of the PMF

0 ≤ pX(x) ≤ 1 for all x ∈ ℜ; this follows from the axioms of probability.

Σ_{x∈ℜ} pX(x) = 1; this is true because the random variable X assigns some value x ∈ ℜ to each sample point s ∈ S. A discrete random variable takes finitely or countably infinitely many different values, say x1, x2, …, so the condition above can be restated as Σ_i pX(xi) = 1.

Terminology: the probability mass function is also referred to as the discrete density function.

Cumulative Distribution Function

The PMF is defined for a specific value x of the random variable. How can we compute the probability of a set A ⊂ ℜ? Since the events Ax are disjoint, we compute

    P({s | X(s) ∈ A}) = Σ_{xi ∈ A} P({s | X(s) = xi})

We write P({s | X(s) ∈ A}) as P(X ∈ A); if A = (a, b), we write P(X ∈ A) as P(a < X < b).

The cumulative distribution function (CDF) of a random variable X is a function FX(t), −∞ < t < ∞, defined as:

    FX(t) = P(−∞ < X ≤ t) = P(X ≤ t) = Σ_{x ≤ t} pX(x)

Properties of the CDF

0 ≤ F(x) ≤ 1 for all −∞ < x < ∞.

F(x) is a monotone nondecreasing function of x: if x1 ≤ x2, then F(x1) ≤ F(x2).

lim_{x→∞} F(x) = 1 and lim_{x→−∞} F(x) = 0.

Note: in the above, the random variable is not explicitly specified.
If we are talking about two random variables, say X and Y, at the same time, then we should explicitly specify them in the CDFs, e.g., FX(t) and FY(t).

Terminology: the cumulative distribution function may also be called the probability distribution function or simply the distribution function.

Expectation

The PMF of a random variable provides several numbers. The expectation (or mean) of X is one way to summarize the information provided by the PMF.

Definition:

    E[X] = Σ_x x pX(x) = μ

i.e., the weighted average of the possible values of X.

Properties of the Mean

    E[cX] = c E[X]
    E[Σ_{i=1}^{n} ci Xi] = Σ_{i=1}^{n} ci E[Xi]

where the ci's are constants. The second property holds even if the Xi's are not independent.

Variance

The variance of a random variable X, denoted Var(X) or σ², is defined as the expected value of (X − E[X])²:

    Var(X) = E[(X − E[X])²]

Var(X) can also be calculated as:

    Var(X) = E[X²] − (E[X])²

where E[X²] is the second moment of X.

Properties of Variance

Variance is a measure of the dispersion of a random variable about its mean.

    Var(X) ≥ 0
    Var(cX) = c² Var(X)
    Var(Σ_{i=1}^{n} Xi) = Σ_{i=1}^{n} Var(Xi), if the Xi's are independent

Common Discrete Random Variables

Bernoulli random variable
Binomial random variable
Geometric random variable
Poisson random variable

Bernoulli Random Variable

Consider an experiment whose outcome can be either success or failure. If X is a random variable that characterizes this experiment, we can say X = 1 for success and X = 0 for failure. The PMF of this random variable is given by:

    pX(1) = P(X = 1) = p
    pX(0) = P(X = 0) = 1 − p = q

where p is the probability of success of the experiment.
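The PMF, CDF, mean, and variance definitions above can be checked on the two-coin "number of heads" example (Python sketch, not part of the slides):

```python
# PMF of X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
assert sum(pmf.values()) == 1.0  # a PMF sums to 1

# CDF: F_X(t) = sum of pmf(x) over x <= t.
def cdf(t):
    return sum(p for x, p in pmf.items() if x <= t)

# Mean: E[X] = sum of x * pmf(x).
mean = sum(x * p for x, p in pmf.items())

# Variance two ways: E[(X - E[X])^2] and the shortcut E[X^2] - (E[X])^2.
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

print(cdf(1), mean, var_def, var_shortcut)  # 0.75 1.0 0.5 0.5
```

Both variance formulas agree, as the definition above guarantees.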
Bernoulli Distribution

The CDF of a Bernoulli random variable X with parameter p = 1 − q is given by:

    F(x) = 0,  x < 0
    F(x) = q,  0 ≤ x < 1
    F(x) = 1,  x ≥ 1

[Figure 1: CDF of a Bernoulli random variable — a step function jumping from 0 to q at x = 0 and from q to 1 at x = 1.]

Mean and variance:

    E[X] = p
    Var(X) = p(1 − p)

Binomial Random Variable

Consider n Bernoulli trials, where each trial can result in a success with probability p. The number of successes X in such an n-trial sequence is a binomial random variable. The PMF of this random variable is given by:

    pX(k) = C(n, k) p^k (1 − p)^(n−k),  k = 0, 1, 2, …, n
    pX(k) = 0,  otherwise

where p is the probability of success of a Bernoulli trial and C(n, k) is the binomial coefficient.

[Figure: binomial PMFs for n = 10 with p = 1/4, 1/2, and 3/4, plotting P(X = k) against the number of successes k.]

[Figure: binomial PMF for n = 100, p = 0.02, illustrating the large-n, small-p case.]

CDF of a Binomial R.V.

    FX(t) = Σ_{i=0}^{⌊t⌋} C(n, i) p^i (1 − p)^(n−i),  t ≥ 0

How did we compute pX(k)? From the independent-trials assumption, p^k (1 − p)^(n−k) is the probability of any particular sequence of outcomes that results in k successes, and there are C(n, k) such sequences. Hence we calculate pX(k) as shown above.

Mean and variance:

    E[X] = np
    Var(X) = np(1 − p)

Geometric Random Variable

The number of Bernoulli trials, X, until the first success is a geometric random variable. Its PMF is given by:

    pX(k) = (1 − p)^(k−1) p,  k = 1, 2, …
    pX(k) = 0,  otherwise

The CDF is given by:

    FX(t) = Σ_{i=1}^{⌊t⌋} (1 − p)^(i−1) p = 1 − (1 − p)^⌊t⌋,  t ≥ 0

Mean and variance:

    E[X] = 1/p
    Var(X) = (1 − p)/p²

[Figure: geometric PMF for p = 0.5, plotting P(X = k) against the number of trials until the first success, k.]

Example: Modeling Packet Loss

A geometric r.v. gives the number of trials required to get the first success: pX(k) = (1 − p)^(k−1) p, k = 1, 2, …, where p is the probability of success of a trial.

Modeling packet losses seen at a router: we can use a Bernoulli process {Y0, Y1, Y2, …}, where Yi represents a Bernoulli trial for packet number i. We can say:

    P(Yi = 1) = p (i.e., a packet loss)
    P(Yi = 0) = 1 − p (i.e., no loss)

So the number of packet transmissions up to and including the first loss, X, is geometrically distributed:

    P(X = n) = p (1 − p)^(n−1),  n = 1, 2, …

(the "good length" distribution).

Example: Modeling Packet Loss (…)

Model: suppose each bit transmitted through a channel is received correctly with probability 1 − p and corrupted with probability p. Each transmission is an independent Bernoulli experiment, and p is assumed constant over time. The sender S transmits a packet (PKT) to the receiver R, which replies with an ACK; a timer times out if no ACK is received. Assume each packet has length l bits.

Questions:
1) How many trials do we need to successfully deliver a packet?
2) How does (1) depend on the channel bit error rate (BER), p?
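Before the analysis, a quick simulation sketch (Python, not from the slides; the values of p and l below are illustrative choices, not given in the example):

```python
import random

random.seed(531)

p = 0.001   # per-bit error probability (illustrative)
l = 1000    # packet length in bits (illustrative)
q = (1 - p) ** l  # probability a packet gets through error-free

def trials_to_deliver():
    """Count Bernoulli(q) trials until the first error-free packet."""
    k = 1
    while random.random() >= q:  # probability 1 - q of a corrupted packet
        k += 1
    return k

samples = [trials_to_deliver() for _ in range(20_000)]
est_mean = sum(samples) / len(samples)
print(round(q, 4), round(est_mean, 2))  # est_mean should be close to 1/q ≈ 2.72
```

The simulated mean number of trials matches the geometric mean 1/q derived next.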
Example: Modeling Packet Loss (…)

P(no error in the transmission of a packet) = (1 − p)^l = q.

P1, P2, P3, … are packet transmission trials, and X is the number of trials needed to successfully transmit a packet. Therefore:

    X is geometrically distributed with success probability q
    E[X] = 1/q

As p → 1, q → 0, so E[X] → ∞, which coincides well with intuition. Also, for fixed p, as l → ∞, q → 0 and again E[X] → ∞.

Poisson Random Variable

A discrete random variable X that takes only nonnegative integer values is said to be Poisson with parameter λ > 0 if X has the following PMF:

    pX(k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, …
    pX(k) = 0,  otherwise

The Poisson PMF with parameter λ is a good approximation of the binomial PMF with parameters n and p, provided λ = np, n is very large, and p is very small.

Poisson Random Variable (…)

Note that

    Σ_{k=0}^{∞} z^k / k! = e^z, for any real or complex number z

Therefore

    Σ_{k=0}^{∞} pX(k) = e^(−λ) e^λ = 1

[Figure: Poisson PMFs for λ = 0.5, 1, and 5, plotting P(X = k) against the number of events k.]

[Figure: side-by-side comparison of the binomial PMF (n = 100, p = 0.02) and the Poisson PMF (λ = 2), illustrating that a binomial distribution with large n and small p can be approximated by a Poisson distribution with λ = np.]

Poisson Random Variable (cont.)

CDF of a Poisson random variable:

    FX(t) = Σ_{k=0}^{⌊t⌋} e^(−λ) λ^k / k!,  t ≥ 0

Mean and variance:

    E[X] = λ
    Var(X) = λ

Consider N independent Poisson random variables Xi, i = 1, 2, …, N, with parameters λi. Then X = X1 + X2 + … + XN is also a Poisson r.v., with parameter λ = λ1 + λ2 + … + λN.

Deriving the Mean of a Poisson R.V.

A Poisson r.v. has PMF:

    pX(k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, …
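Before deriving the mean analytically, a quick numerical check (Python sketch, not from the slides) that this PMF is normalized, has mean λ, and approximates the binomial PMF for large n and small p:

```python
from math import comb, exp, factorial

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

lam = 2.0
# Normalization and mean of the Poisson PMF (summing far into the tail).
total = sum(poisson_pmf(k, lam) for k in range(100))
mean = sum(k * poisson_pmf(k, lam) for k in range(100))

# Poisson(lam = n*p) vs. binomial(n, p) for large n, small p.
n, p = 100, 0.02
max_gap = max(abs(poisson_pmf(k, lam) - binom_pmf(k, n, p)) for k in range(n + 1))
print(round(total, 6), round(mean, 6), round(max_gap, 4))
```

The largest pointwise gap between the two PMFs is well under 0.01, consistent with the approximation claim above.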
E[X] can be calculated as:

    E[X] = Σ_{k=0}^{∞} k e^(−λ) λ^k / k!
         = Σ_{k=1}^{∞} k e^(−λ) λ^k / k!
         = λ Σ_{k=1}^{∞} e^(−λ) λ^(k−1) / (k−1)!
         = λ Σ_{m=0}^{∞} e^(−λ) λ^m / m!
         = λ Σ_{m=0}^{∞} pX(m)
         = λ

where the last sum equals 1 by the axioms of probability.

Example: Job Arrivals

Consider modeling the number of job arrivals at a shop in an interval (0, t]. Let λ be the arrival rate of jobs. In an interval Δt → 0:

    P(one arrival in Δt) = λ Δt
    P(two or more arrivals in Δt) is negligible

Divide the interval (0, t] into n subintervals of equal length, and assume the arrival of jobs in each subinterval is independent of arrivals in the other subintervals.

Example: Job Arrivals (…)

As n → ∞, the interval can be viewed as a sequence of Bernoulli trials with

    p = λ Δt = λt / n

The number of successes k in n trials is given by the binomial PMF:

    C(n, k) p^k (1 − p)^(n−k)

Example: Job Arrivals (…)

Substitute p = λt/n to get:

    C(n, k) (λt/n)^k (1 − λt/n)^(n−k),  k = 0, 1, …, n

Letting n → ∞, the above reduces to:

    e^(−λt) (λt)^k / k!,  k = 0, 1, 2, …

Setting t = 1, the probability of k events in the time interval (0, 1] is:

    e^(−λ) λ^k / k!

which is the Poisson distribution.

Example: ALOHA Protocol

The ALOHA protocol was developed in the 1970s at the University of Hawaii. It is a Medium Access Control (MAC) layer protocol developed for sharing wireless channels: it allows multiple users to use a single communication channel.

Example: ALOHA - Basic Idea

Very, very simple: let users transmit whenever they have data to be sent. If two or more users send data at the same time, a collision occurs and the packets are destroyed. Upon collision, the sender waits for a random amount of time and resends the packet.

Modeling question: what is the throughput of an ALOHA channel?
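As a numeric preview of the analysis that follows (Python sketch, not from the slides), the throughput relation S = G e^(−2G) can be maximized by a simple grid scan:

```python
from math import exp

# ALOHA throughput as a function of offered load G: S = G * e^(-2G).
def throughput(G):
    return G * exp(-2 * G)

# Scan offered loads G; the maximum lands at G = 1/2, S_max = 1/(2e) ≈ 0.184.
grid = [i / 1000 for i in range(1, 2001)]
best_G = max(grid, key=throughput)
print(best_G, round(throughput(best_G), 4))  # 0.5 0.1839
```

The numeric optimum agrees with the calculus derivation on the throughput slide below.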
Example: ALOHA - Model

Assume an infinite population of users generating N frames per frame time, where the frame time is the amount of time required to transmit a fixed-length frame, and 0 < N < 1.

"Poisson model": the Poisson distribution predicts the number of events that occur in a time period.

Question: what is the rate of generation of frames, G? Frames generated include new plus retransmitted frames. At low loads G ≈ N; at high loads G > N.

Example: ALOHA - Model (…)

Let S be the rate of successful frame transmissions, and let p0 be the probability of a successful transmission, so p0 = S/G.

Question: what is the vulnerable period? It is 2t, i.e., two frame times.

Example: ALOHA - Model (…)

The mean number of frames generated in the vulnerable period 2t is 2G. Why? E[X] = 2G, where X = X1 + X2: we assume X1 and X2 are each Poisson distributed with rate G, so X is also Poisson, with rate 2G.

The probability that no other traffic is initiated in the vulnerable period is therefore pX(0) = e^(−2G), from the Poisson model.

Example: ALOHA - Throughput

    p0 = S/G = e^(−2G)  ⇒  S = G e^(−2G)

S is maximized where dS/dG = 0:

    1 · e^(−2G) + G(−2) e^(−2G) = 0
    e^(−2G)(1 − 2G) = 0  ⇒  G = 1/2

    S_max = (1/2) e^(−2·(1/2)) = 1/(2e) = 0.184

The ALOHA protocol's maximum channel utilization is 18.4%.

Jointly Distributed Random Variables

Joint PMFs of Multiple R.V.'s

Probabilistic models may involve more than one random variable. These random variables are defined for the same experiment and sample space, and they may have relationships among them.

Let Z = (X, Y) be defined on sample space S. For each sample point s in S, the random variables X and Y each take one of their possible values, e.g., X(s) = x, Y(s) = y. Z is then a 2-dimensional vector satisfying:

    Z : S → ℜ², with Z(s) = z = (x, y)

Joint PMF (…)
The joint PMF of X and Y (or the joint PMF of the random vector Z) is defined as:

    pZ(z) = P(Z = z) = P({X = x}, {Y = y})

Properties of this PMF:

    pZ(z) ≥ 0, z ∈ ℜ²
    {z | pZ(z) ≠ 0} is a subset of ℜ²
    pX(x) = Σ_y pX,Y(x, y)
    pY(y) = Σ_x pX,Y(x, y)

pX(x) and pY(y) are the marginal PMFs of X and Y, respectively.

Conditional PMF

Conditioning on an Event

We look at conditional PMFs given the occurrence of a certain event, or given the value of another random variable. The conditional PMF of a random variable X, conditioned on an event A with P(A) > 0, is defined as:

    pX|A(x) = P(X = x | A) = P({X = x} ∩ A) / P(A)

Calculate pX|A(x) by adding the probabilities of outcomes that result in X = x and belong to the conditioning event A, and then normalize by dividing by P(A).

Example: A Web Surfer

A web surfer repeatedly attempts to connect to a web server, up to a maximum of n times. Each attempt has probability p of being successful. What is the PMF of the number of attempts, given that the surfer successfully connects to the web server?

Example: A Web Surfer (…)

Let A be the event that the web surfer successfully connects (within at most n attempts). Let X be the number of attempts needed to establish a connection, assuming an unlimited number of attempts could be made. Clearly, X is a geometric random variable with parameter p, and A = {X ≤ n}. Then:

    P(A) = Σ_{j=1}^{n} (1 − p)^(j−1) p

and

    pX|A(k) = (1 − p)^(k−1) p / Σ_{j=1}^{n} (1 − p)^(j−1) p,  k = 1, 2, …, n
    pX|A(k) = 0,  otherwise

Conditioning on Another R.V.

Let X and Y be two random variables associated with the same experiment. Suppose Y equals y; this provides some information regarding the value of X.
This information is captured by the conditional PMF:

    pX|Y(x | y) = P(X = x, Y = y) / P(Y = y) = pX,Y(x, y) / pY(y)

The conditional PMF pX|Y(x | y) satisfies the normalization property:

    Σ_x pX|Y(x | y) = 1
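These definitions can be illustrated with a small sketch (Python; the joint PMF values and the web-surfer parameters p and n below are hypothetical, chosen only for illustration):

```python
# A hypothetical joint PMF p_{X,Y}(x, y) on {0,1} x {0,1}.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

# Marginal: p_Y(y) = sum over x of p_{X,Y}(x, y).
def p_Y(y):
    return sum(p for (x, yy), p in joint.items() if yy == y)

# Conditional: p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y); sums to 1 over x.
def p_X_given_Y(x, y):
    return joint[(x, y)] / p_Y(y)

norm = sum(p_X_given_Y(x, 1) for x in (0, 1))

# Web-surfer example: geometric PMF conditioned on A = {X <= n}.
p, n = 0.3, 5
P_A = sum((1 - p) ** (j - 1) * p for j in range(1, n + 1))
cond = {k: (1 - p) ** (k - 1) * p / P_A for k in range(1, n + 1)}

print(round(norm, 10), round(sum(cond.values()), 10))  # 1.0 1.0
```

Both conditional PMFs sum to 1, as the normalization property requires.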