STAT512/432 – 2011

Notes on order statistics of discrete random variables

In STAT 512/432 we will almost always focus on the order statistics of continuous random variables. Despite this, these notes discuss order statistics, in particular the maximum and the minimum, of $n$ discrete random variables. We start with some basic background theory.

Some background theory

Example: the uniform distribution

For a discrete uniform distribution, all possible values of the (discrete) random variable have the same probability. In these notes we focus attention on the discrete random variable $Y$ that is equally likely to take each of the values $0, 1, 2, \ldots, 9$. That is,

$$\text{Probability}(Y = y) = \frac{1}{10}, \qquad y = 0, 1, \ldots, 9. \tag{1}$$

The mean of $Y$ is, clearly,

$$\text{mean of } Y = \sum_{y=0}^{9} y \cdot \frac{1}{10} = 4.5. \tag{2}$$

A slightly more complicated calculation shows that the variance of $Y$ is 8.25.

The cumulative distribution function of a discrete random variable

The cumulative distribution function $F(y)$ of any discrete random variable $Y$ is the probability that the random variable takes a value less than or equal to $y$. Thus, using a standard notation,

$$F(y) = \sum_{a \le y} \Pr(Y = a). \tag{3}$$

The maximum of n discrete random variables

Definitions

The maximum of a number of independent random variables is frequently used in statistics. As an introduction to the relevant theory, we now consider properties of the maximum of $n$ iid discrete random variables $Y_1, Y_2, \ldots, Y_n$, each having the same probability distribution with common cumulative distribution function $F(y)$. We denote this maximum by $Y_{\max}$. Given data $y_1, y_2, \ldots, y_n$, we write the observed (data) value of $Y_{\max}$ as $y_{\max}$. There can be ties: for example, if $n = 5$ and $y_1 = 7$, $y_2 = 9$, $y_3 = 6$, $y_4 = 9$, $y_5 = 6$, then $y_{\max} = 9$. In this case two of the observations are tied at the maximum value.
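The background quantities above can be checked numerically. The following Python sketch is an illustration only and not part of the original notes; the names `pmf`, `cdf` and `data` are chosen here for convenience. It uses exact rational arithmetic to recompute the mean and variance of the uniform distribution on 0, 1, ..., 9, evaluates the cumulative distribution function from its definition, and finds the observed maximum (with its ties) for the five data values in the example.

```python
from fractions import Fraction

# pmf of the discrete uniform variable Y on 0, 1, ..., 9, as in (1).
pmf = {y: Fraction(1, 10) for y in range(10)}

def cdf(y):
    """F(y): sum of Pr(Y = a) over all possible values a <= y, as in (3)."""
    return sum(p for a, p in pmf.items() if a <= y)

# Mean as in (2), and variance as the mean squared deviation from it.
mean_y = sum(y * p for y, p in pmf.items())
var_y = sum((y - mean_y) ** 2 * p for y, p in pmf.items())

# The tie example from the text: n = 5 observations.
data = [7, 9, 6, 9, 6]
y_max = max(data)
ties_at_max = data.count(y_max)

print(float(mean_y), float(var_y))  # 4.5 8.25
print(cdf(3))                       # 2/5
print(y_max, ties_at_max)           # 9 2
```

Exact fractions are used so that the results agree with the hand calculations without floating-point rounding.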
Theory

Since the maximum of $n$ quantities is less than or equal to any number $y$ if and only if all of these quantities are less than or equal to $y$, we have

$$\text{Prob}(Y_{\max} \le y) = \big[F(y)\big]^n. \tag{4}$$

For integer-valued random variables, the event $Y_{\max} \ge y$ is the complement of the event $Y_{\max} \le y - 1$, so that

$$\text{Prob}(Y_{\max} \ge y) = 1 - \big[F(y - 1)\big]^n, \tag{5}$$

and therefore

$$\text{Prob}(Y_{\max} = y) = \big[F(y)\big]^n - \big[F(y - 1)\big]^n. \tag{6}$$

Example: the uniform distribution again

As an example, we use the theory above to consider properties of the maximum of $n$ random variables, each having the uniform distribution considered above. The cumulative distribution function $F(y)$ of a random variable having this uniform distribution is easily seen to be given by

$$F(y) = \frac{y + 1}{10}, \qquad y = 0, 1, 2, \ldots, 9. \tag{7}$$

Then from (6) and (7), the probability that $Y_{\max} = y$ is given by

$$P_{\max}(y) = \left(\frac{y + 1}{10}\right)^n - \left(\frac{y}{10}\right)^n, \qquad y = 0, 1, 2, \ldots, 9. \tag{8}$$

Mean and variance of a maximum of n random variables from the above uniform distribution

From the above theory, the mean value $\mu_{\max}$ of $Y_{\max}$, the maximum of $n$ random variables from the uniform distribution considered above, is given by

$$\mu_{\max} = \sum_{y=0}^{9} y \left[ \left(\frac{y + 1}{10}\right)^n - \left(\frac{y}{10}\right)^n \right]. \tag{9}$$

This simplifies, after some algebra, to

$$\mu_{\max} = 9 - \left(\frac{1}{10}\right)^n - \left(\frac{2}{10}\right)^n - \cdots - \left(\frac{9}{10}\right)^n. \tag{10}$$

In Homework 3 you will be asked to verify this calculation.

As a check on this calculation, the case $n = 1$ gives

$$\mu_{\max} = 9 - \frac{1}{10} - \frac{2}{10} - \cdots - \frac{9}{10} = 9 - \frac{1 + 2 + \cdots + 9}{10} = 4.5. \tag{11}$$

One can also find the value of the variance of $Y_{\max}$, but we do not do this here.

One immediate conclusion that can be drawn from (10) is that, as $n \to \infty$, the mean of $Y_{\max}$ approaches 9. It can also be shown that the variance of $Y_{\max}$ approaches 0 as $n \to \infty$. Both these conclusions "make sense." The following table indicates the rate at which these occur.
value of n           5          10         20
mean of Ymax         7.79175    8.508566   8.866059
variance of Ymax     1.928782   0.622878   0.142478

The minimum of n discrete random variables

Properties of the minimum $Y_{\min}$ of $n$ independently and identically distributed random variables can be found in a manner similar to that used for the maximum. The minimum is greater than or equal to $y$ if and only if every one of the $n$ random variables is greater than or equal to $y$. Using the notation above, we get

$$\text{Prob}(Y_{\min} \ge y) = \big[1 - F(y - 1)\big]^n, \tag{12}$$

$$\text{Prob}(Y_{\min} \ge y + 1) = \big[1 - F(y)\big]^n, \tag{13}$$

so that

$$\text{Prob}(Y_{\min} = y) = \big[1 - F(y - 1)\big]^n - \big[1 - F(y)\big]^n. \tag{14}$$

From (14) one can find the mean and the variance of $Y_{\min}$. We do not give the details here, and note only that in the case of the uniform distribution considered above, the mean of $Y_{\min}$ is

$$\left(\frac{1}{10}\right)^n + \left(\frac{2}{10}\right)^n + \cdots + \left(\frac{9}{10}\right)^n. \tag{15}$$

Note that as $n \to \infty$, this mean approaches 0. This makes sense.

The general order statistic of n discrete random variables

$Y_{\min}$ and $Y_{\max}$ are examples of order statistics: specifically, $Y_{\min}$ is the first order statistic and $Y_{\max}$ is the $n$th order statistic. One can also define the second, third, ..., $(n - 1)$th order statistics. However, the theory for these for a discrete random variable becomes quite complicated, so we do not consider it here.
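The formulas for the maximum and the minimum in the uniform example can be evaluated directly. The Python sketch below is an illustration under stated assumptions, not part of the original notes: the function names `F`, `pmf_max`, `pmf_min` and `mean_and_variance` are hypothetical, and `F` is extended by 0 below the support and 1 above it. It computes the probabilities in (6) and (14) with exact fractions, checks the resulting means against the closed forms (10) and (15), and prints the mean and variance of the maximum for n = 5, 10, 20 so that they can be compared with the tabulated values.

```python
from fractions import Fraction

def F(y):
    """CDF (7) of the uniform distribution on 0..9, extended outside the support."""
    if y < 0:
        return Fraction(0)
    if y >= 9:
        return Fraction(1)
    return Fraction(y + 1, 10)

def pmf_max(y, n):
    """Prob(Ymax = y) = F(y)^n - F(y-1)^n, equation (6)."""
    return F(y) ** n - F(y - 1) ** n

def pmf_min(y, n):
    """Prob(Ymin = y) = [1 - F(y-1)]^n - [1 - F(y)]^n, equation (14)."""
    return (1 - F(y - 1)) ** n - (1 - F(y)) ** n

def mean_and_variance(pmf, n):
    """Mean and variance of a pmf on 0..9, computed directly from the definition."""
    mean = sum(y * pmf(y, n) for y in range(10))
    second_moment = sum(y * y * pmf(y, n) for y in range(10))
    return mean, second_moment - mean ** 2

for n in (5, 10, 20):
    power_sum = sum(Fraction(k, 10) ** n for k in range(1, 10))
    mean_max, var_max = mean_and_variance(pmf_max, n)
    mean_min, _ = mean_and_variance(pmf_min, n)
    assert mean_max == 9 - power_sum   # closed form (10)
    assert mean_min == power_sum       # closed form (15)
    print(n, round(float(mean_max), 6), round(float(var_max), 6))
# first line printed: 5 7.79175 1.928782
```

The same `mean_and_variance` helper applied to `pmf_min` gives the variance of the minimum, which the notes do not tabulate.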