5 Special Probability Distributions

5.1 Some Discrete Probability Distributions
5.2 Some Continuous Probability Distributions
5.3 Normal and Poisson Approximations to the Binomial Distribution

Learning Objectives

It is expected that you will be able to do the following:
1. Describe the characteristics of a given probability distribution function.
2. Compute probabilities using a given probability distribution function.
3. Describe the characteristics of the Normal probability distribution.
4. Determine the probability of an observation in a given interval on a Normal probability distribution.
5. Use the Normal probability distribution to approximate the Binomial distribution.
6. Use the Poisson probability distribution to approximate the Binomial probability distribution.

A group or family of probability distributions is a collection of probability distributions indexed by a quantity called a parameter. A parameter is a numerical characteristic of a probability distribution. Very often, observations generated by different random experiments behave similarly. The random variables associated with these experiments can then be described by essentially the same probability distribution and represented by a common formula. Let us first look at some random experiments that generate discrete random variables, and at their corresponding probability functions.

5.1 Some Discrete Probability Distributions

5.1.1 Discrete Uniform Distribution

Consider the experiment of randomly selecting a number from a set consisting of the numbers $x_1, x_2, \ldots, x_N$. If we let X be the number selected and assume that each number is equally likely to be chosen, then the probability of each number being selected is 1/N. A special case is when the numbers $x_1, x_2, \ldots, x_N$ coincide with the numbers 1, 2, ..., N. In this case, the random variable X is called a discrete uniform random variable with parameter N.
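The equally likely setup just described can be explored by direct enumeration. The following is a minimal sketch, not part of the original text: the function name `discrete_uniform_moments` is hypothetical, and exact fractions are used so that no rounding enters. It computes the mean and variance of a number drawn uniformly from 1, 2, ..., N straight from the probabilities 1/N.

```python
from fractions import Fraction


def discrete_uniform_moments(n):
    """Mean and variance of X uniform on {1, ..., n}, computed exactly.

    Hypothetical helper for illustration: each value has probability 1/n.
    """
    p = Fraction(1, n)
    mean = sum(x * p for x in range(1, n + 1))          # E[X]
    second = sum(x * x * p for x in range(1, n + 1))    # E[X^2]
    return mean, second - mean ** 2                     # var = E[X^2] - (E[X])^2


mean, var = discrete_uniform_moments(6)
print(mean, var)  # 7/2 35/12
```

For N = 6 this is the fair-die case; the enumeration agrees with the closed forms (N + 1)/2 and (N^2 - 1)/12.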
Definition 5.1 Discrete Uniform Distribution
A random variable X having a probability mass function given by

$$ f(x) = f(x; N) = \begin{cases} \dfrac{1}{N} & \text{for } x = 1, 2, \ldots, N \\ 0 & \text{otherwise} \end{cases} \;=\; \frac{1}{N}\, I_{\{1,2,\ldots,N\}}(x) $$

where the parameter N ranges over the positive integers, is called a discrete uniform random variable.

Theorem 5.1 If X has a discrete uniform distribution, then

$$ E[X] = \frac{N+1}{2} \quad \text{and} \quad \mathrm{var}[X] = \frac{N^2 - 1}{12}. $$

Proof:

$$ E[X] = \sum_{i=1}^{N} i \cdot \frac{1}{N} = \frac{1}{N} \cdot \frac{N(N+1)}{2} = \frac{N+1}{2} $$

$$ \mathrm{var}[X] = E[X^2] - (E[X])^2 = \frac{1}{N}\sum_{i=1}^{N} i^2 - \left(\frac{N+1}{2}\right)^2 = \frac{(N+1)(2N+1)}{6} - \frac{(N+1)^2}{4} = \frac{N^2 - 1}{12} $$

Example 1
In the die rolling experiment, each possible outcome in the sample space {1, 2, 3, 4, 5, 6} occurs with probability 1/6. Here X, the number of dots on the upturned face of the die, is a discrete random variable with probability mass function f(x; 6) = 1/6 for x = 1, 2, 3, 4, 5, 6. The expected value of X is E[X] = 3.5 and its variance is var[X] = 35/12. ///

In Example 1, X is a discrete uniform random variable since its possible values are 1 to 6, and each of these values is equally likely to occur, with probability 1/6.

5.1.2 Bernoulli Distribution

Now consider an experiment that can result in only one of two possible outcomes, either a success or a failure. Such an experiment is called a Bernoulli trial. If we define a random variable X to be equal to 1 when the trial results in a success and 0 otherwise, then X is called a Bernoulli random variable. With the probability of success p a constant, this distribution is formally defined as follows:

Definition 5.2 Bernoulli Distribution
A random variable X is defined to have a Bernoulli distribution if the probability mass function of X is given by

$$ f_X(x) = f_X(x; p) = \begin{cases} p^x (1-p)^{1-x} & \text{for } x = 0 \text{ or } 1 \\ 0 & \text{otherwise} \end{cases} \;=\; p^x (1-p)^{1-x}\, I_{\{0,1\}}(x) $$

where the parameter p satisfies 0 < p < 1. The quantity 1 - p is often denoted by q.

Theorem 5.2 If X has a Bernoulli distribution, then E[X] = p and var[X] = pq.

Proof:
$$ E[X] = 0 \cdot q + 1 \cdot p = p $$

$$ \mathrm{var}[X] = E[X^2] - (E[X])^2 = 0^2 \cdot q + 1^2 \cdot p - p^2 = p - p^2 = p(1-p) = pq $$

5.1.3 Binomial Distribution

Consider now n performances of a Bernoulli trial, that is, n performances of an experiment with only two possible outcomes, labeled success and failure, where the probability of success in each trial is a constant p and the probability of failure is q = 1 - p. This type of experiment is called a binomial random experiment, and the random variable X that denotes the number of successes in the n trials is called a binomial random variable.

The binomial experiment has the following characteristics:
(1) The experiment consists of n trials.
(2) The n trials are independent.
(3) Each trial can result in only two possible outcomes, either a success or a failure.
(4) The probability of a success, denoted by p, is constant from trial to trial.

The binomial experiment is similar to selecting n objects with replacement from a group that consists of two types of items, say types A and B, where selecting an object of type A is a success and selecting an object of type B is a failure. Since the object is returned to the group before each draw, the probability of success p (the proportion of type A objects in the group) is the same in every draw.

Definition 5.3 Binomial Distribution
A random variable X is defined to have a Binomial distribution if the probability mass function of X is given by

$$ f_X(x) = f_X(x; n, p) = \begin{cases} \dbinom{n}{x} p^x (1-p)^{n-x} & \text{for } x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases} \;=\; \binom{n}{x} p^x (1-p)^{n-x}\, I_{\{0,1,2,\ldots,n\}}(x) $$

The coefficient $\binom{n}{x}$ in the probability function of the binomial distribution is the number of ways in which n trials can produce exactly x successes and n - x failures.

Theorem 5.3 If X has a Binomial distribution, then E[X] = np and var[X] = npq.

Example 2
Find the probability of obtaining exactly 4 heads in 7 tosses of a fair coin.

Solution: This is a binomial random experiment.
There are n = 7 tosses (trials), there are only two possible outcomes in each toss, and the probability of obtaining a head in each toss is a constant p, which equals 1/2 since the coin is fair. A head is regarded as a success, so the random variable of interest X is the number of heads in 7 tosses. The possible values of X are 0, 1, 2, 3, 4, 5, 6, and 7, and we want the probability that X = 4.

$$ P(X = 4) = f\!\left(4; 7, \tfrac{1}{2}\right) = \binom{7}{4} \left(\tfrac{1}{2}\right)^4 \left(1 - \tfrac{1}{2}\right)^{7-4} = \frac{7!}{3!\,4!} \left(\tfrac{1}{2}\right)^4 \left(\tfrac{1}{2}\right)^3 = \frac{35}{128} = 0.2734 $$ ///

Example 3
An archer can hit the bull's eye 80% of the time. If he is allowed to shoot 5 arrows, find the probability that he will get exactly 3 bull's eyes.

Solution
n = 5, x = 3, and p = 0.8

$$ P(X = 3) = f(3; 5, 0.8) = \binom{5}{3} (0.8)^3 (0.2)^2 = 0.2048 $$ ///

Example 4
In Example 3, the probability distribution of X is given by the following table:

Value of X    P(X = x)
0             $f(0; 5, 0.8) = \binom{5}{0}(0.8)^0(0.2)^5 = 0.00032$
1             $f(1; 5, 0.8) = \binom{5}{1}(0.8)^1(0.2)^4 = 0.0064$
2             $f(2; 5, 0.8) = \binom{5}{2}(0.8)^2(0.2)^3 = 0.0512$
3             $f(3; 5, 0.8) = \binom{5}{3}(0.8)^3(0.2)^2 = 0.2048$
4             $f(4; 5, 0.8) = \binom{5}{4}(0.8)^4(0.2)^1 = 0.4096$
5             $f(5; 5, 0.8) = \binom{5}{5}(0.8)^5(0.2)^0 = 0.32768$

and the cumulative distribution function of X is

$$ F_X(a) = \sum_{x \le a} \binom{5}{x} (0.8)^x (0.2)^{5-x}\, I_{\{0,1,2,3,4,5\}}(x) $$

or

$$ F_X(x) = \begin{cases} 0 & x < 0 \\ 0.00032 & 0 \le x < 1 \\ 0.00672 & 1 \le x < 2 \\ 0.05792 & 2 \le x < 3 \\ 0.26272 & 3 \le x < 4 \\ 0.67232 & 4 \le x < 5 \\ 1 & x \ge 5 \end{cases} $$

5.1.4 Poisson Distribution

The Poisson random variable X, defined as the number of outcomes occurring during a given time interval or in a specified region, appears in many natural phenomena. Some examples are the following:
(1) number of bacteria in a given culture
(2) number of radioactive particle emissions per unit of time
(3) number of telephone calls per minute
(4) number of typing errors per page

A Poisson experiment has the following characteristics:
(1) The numbers of occurrences in different time intervals or specified regions are independent, provided the intervals or regions do not overlap.
(2) The probability that exactly one outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region, and does not depend on the number of outcomes occurring outside this time interval or region.
(3) The probability that more than one outcome will occur in such a short interval or small region is negligible.

For a Poisson random variable, the probabilities depend only on $\lambda$, the average number of outcomes occurring in the given time interval or specified region. The Poisson distribution is defined as follows:

Definition 5.4 Poisson Distribution
A random variable X is defined to have a Poisson distribution if the probability mass function of X is given by

$$ f_X(x) = f_X(x; \lambda) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!} & \text{for } x = 0, 1, 2, \ldots \\ 0 & \text{otherwise} \end{cases} \;=\; \frac{e^{-\lambda} \lambda^x}{x!}\, I_{\{0,1,2,\ldots\}}(x) $$

where the parameter $\lambda$ satisfies $\lambda > 0$.

Theorem 5.4 If X has a Poisson distribution, then E[X] = var[X] = $\lambda$.

Example 5
If a secretary makes on the average $\lambda = 6$ mistakes per page, the probability that she will make exactly 4 mistakes on a given page is

$$ f(4) = f(4; 6) = \frac{e^{-6}\, 6^4}{4!} = 0.1339. $$ ///

Example 6
In Example 5, what is the probability that the secretary will make at least 4 mistakes on a given page?

Solution:

$$ P(X \ge 4) = 1 - P(X < 4) = 1 - [P(X=0) + P(X=1) + P(X=2) + P(X=3)] = 1 - f(0) - f(1) - f(2) - f(3) $$

$$ = 1 - \frac{e^{-6} 6^0}{0!} - \frac{e^{-6} 6^1}{1!} - \frac{e^{-6} 6^2}{2!} - \frac{e^{-6} 6^3}{3!} = 1 - 0.00248 - 0.01487 - 0.04462 - 0.08924 = 0.84879 $$ ///

5.1.5 Hypergeometric Distribution

Sampling without replacement is associated with the Hypergeometric distribution. Consider a box containing M objects, of which K are labeled success and M - K are labeled failure. From this box, n objects are selected without replacement. The random variable of interest is X, the number of success items among the n items selected. X has a Hypergeometric distribution.
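The box model above turns into a probability by counting: choose x of the K success items, choose the remaining n - x from the M - K failure items, and divide by the number of ways to choose any n of the M objects. The following is a minimal sketch of that count, not taken from the original text; the function name `hypergeom_pmf` and the box sizes are hypothetical, and only `math.comb` from the Python standard library is assumed.

```python
from math import comb


def hypergeom_pmf(x, M, K, n):
    """P(X = x) when n objects are drawn without replacement from M objects,
    K of which are successes. Hypothetical helper for illustration."""
    # Outcomes that cannot happen get probability zero.
    if x < 0 or x > n or x > K or n - x > M - K:
        return 0.0
    return comb(K, x) * comb(M - K, n - x) / comb(M, n)


# Hypothetical box: 10 objects, 4 of them successes, 3 drawn without replacement.
probs = [hypergeom_pmf(x, M=10, K=4, n=3) for x in range(4)]
print(probs[2])  # 0.3
```

Summing `probs` gives 1 up to floating-point rounding, which is a quick sanity check that the counts cover every possible sample.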
Definition 5.5 Hypergeometric Distribution
A random variable X is defined to have a Hypergeometric distribution if the probability mass function of X is given by

$$ f_X(x) = f_X(x; M, K, n) = \begin{cases} \dfrac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} & \text{for } x = 0, 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases} \;=\; \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}}\, I_{\{0,1,2,\ldots,n\}}(x) $$

where M is a positive integer, K is a nonnegative integer that is at most M, and n is a positive integer that is at most M.

Theorem 5.5 If X has a Hypergeometric distribution, then

$$ E[X] = n \cdot \frac{K}{M} \quad \text{and} \quad \mathrm{var}[X] = n \cdot \frac{K}{M} \cdot \frac{M-K}{M} \cdot \frac{M-n}{M-1}. $$

Example 7
A truck carries 50 boxes of toys, of which 10 are defective. Eight boxes will be delivered to the first customer. What is the probability that the first delivery will include exactly 3 defective boxes?

Solution
M = 50, K = 10, M - K = 40, n = 8, x = 3, n - x = 5

$$ P(X = 3) = f(3; 50, 10, 8) = \frac{\binom{K}{x}\binom{M-K}{n-x}}{\binom{M}{n}} = \frac{\binom{10}{3}\binom{40}{5}}{\binom{50}{8}} = 0.1471 $$ ///

5.2 Some Continuous Probability Distributions

Continuous random variables and their associated probability density functions result from experiments defined over a continuous sample space. Three important continuous distributions will be discussed: the continuous uniform distribution, the exponential distribution, and the normal distribution.

5.2.1 Continuous Uniform Distribution

The uniform distribution is the simplest continuous distribution. It is used for modeling situations in which subintervals of equal length within an interval [a, b] are equally likely to contain the outcome.

Definition 5.6 Continuous Uniform Distribution
If the probability density function of a random variable X is given by

$$ f_X(x) = f_X(x; a, b) = \frac{1}{b-a}\, I_{[a,b]}(x) $$

where the parameters a and b satisfy $-\infty < a < b < \infty$, then the random variable X is defined to be uniformly distributed over the interval [a, b].
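A short worked consequence of this density may help fix ideas. For any subinterval [c, d] with a ≤ c < d ≤ b (the symbols c and d are introduced here only for illustration), integrating the constant density gives

```latex
P(c < X < d) \;=\; \int_{c}^{d} \frac{1}{b-a}\,dx \;=\; \frac{d-c}{b-a}
```

so the probability depends only on the length of the subinterval and not on where it sits inside [a, b].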
Theorem 5.6 If X has a continuous uniform distribution, then

$$ E[X] = \frac{a+b}{2} \quad \text{and} \quad \mathrm{var}[X] = \frac{(b-a)^2}{12}. $$

Proof:

$$ E[X] = \int_a^b x f(x)\,dx = \int_a^b x \cdot \frac{1}{b-a}\,dx = \frac{1}{b-a} \left[ \frac{x^2}{2} \right]_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2} $$

$$ \mathrm{var}[X] = E[X^2] - (E[X])^2 = \int_a^b x^2 \cdot \frac{1}{b-a}\,dx - \left(\frac{a+b}{2}\right)^2 = \frac{1}{b-a}\left[\frac{x^3}{3}\right]_a^b - \left(\frac{a+b}{2}\right)^2 = \frac{b^2 + ab + a^2}{3} - \left(\frac{a+b}{2}\right)^2 = \frac{(b-a)^2}{12} $$

The continuous uniform distribution has the following characteristics:
1. The possible values of X are restricted to some interval [a, b] of real numbers.
2. Within the interval [a, b], any value is as likely to occur as any other value.

Example 8
A dispenser is designed to give out 6 to 8 ounces of beverage in plastic glasses. If the amount X dispensed is a uniform random variable, what is the probability that the amount released in a glass is
(a) less than 7 ounces
(b) more than 7.5 ounces
(c) exactly 7.5 ounces

Solution
The values of X range from 6 to 8 ounces and its probability density function is

$$ f(x) = f(x; 6, 8) = \frac{1}{2}\, I_{[6,8]}(x) $$

(a) $P(X < 7) = \int_6^7 \frac{1}{2}\,dx = \left[\frac{x}{2}\right]_6^7 = \frac{7-6}{2} = \frac{1}{2} = 0.5$

(b) $P(X > 7.5) = \int_{7.5}^8 \frac{1}{2}\,dx = \left[\frac{x}{2}\right]_{7.5}^8 = \frac{8 - 7.5}{2} = \frac{0.5}{2} = 0.25$

(c) $P(X = 7.5) = 0$ ///

5.2.2 Exponential Distribution

Random variables denoting the time until some prescribed event occurs, such as the time until a piece of equipment breaks down, the time until a light bulb burns out, or the time until an accident occurs, usually follow an exponential distribution.

Definition 5.7 Exponential Distribution
If the probability density function of a random variable X is given by

$$ f_X(x) = f_X(x; \lambda) = \lambda e^{-\lambda x}\, I_{[0,\infty)}(x) $$

where $\lambda > 0$, then X is defined to have an exponential distribution.
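Because this density has a simple antiderivative, interval probabilities for an exponential random variable can be written in closed form. The short derivation below is a supplementary sketch; the symbols s and t (with 0 ≤ s < t) are introduced here only for illustration.

```latex
F_X(x) \;=\; \int_{0}^{x} \lambda e^{-\lambda u}\,du \;=\; 1 - e^{-\lambda x}, \qquad x \ge 0

P(s < X < t) \;=\; F_X(t) - F_X(s) \;=\; e^{-\lambda s} - e^{-\lambda t}
```

In particular P(X > x) equals e^{-λx}, the form that is convenient when working with waiting times.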
Theorem 5.7 If X has an exponential distribution, then

$$ E[X] = \frac{1}{\lambda} \quad \text{and} \quad \mathrm{var}[X] = \frac{1}{\lambda^2}. $$

Proof: Integrating by parts,

$$ E[X] = \int_0^\infty x \lambda e^{-\lambda x}\,dx = \left[-x e^{-\lambda x}\right]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = 0 - \left[\frac{e^{-\lambda x}}{\lambda}\right]_0^\infty = \frac{1}{\lambda} $$

$$ E[X^2] = \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx = \left[-x^2 e^{-\lambda x}\right]_0^\infty + \int_0^\infty 2x e^{-\lambda x}\,dx = 0 + \frac{2}{\lambda} E[X] = \frac{2}{\lambda^2} $$

$$ \mathrm{var}[X] = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2} $$ ///

The most interesting property of this distribution is its "memoryless" property: if the lifetime of an item is exponentially distributed, then an item that has been in use for some hours is as good as a new item with regard to the amount of time remaining until it fails. Among continuous distributions, only the exponential distribution possesses this property.

Example 9
Let the exponential random variable X denote the time until a small meteorite first lands anywhere in the desert. Assume that the expected value of X is 10 days and that the time is currently midnight. What is the probability that a meteorite first lands some time between 6 in the morning and 6 in the evening of the first day?

Solution
X is measured in days and $E[X] = 1/\lambda = 10$. Thus $\lambda = 1/10$. If a day starts at midnight and ends at midnight, then 6 am and 6 pm correspond to 1/4 and 3/4 of a day, respectively. Thus, the desired probability is

$$ P\!\left(\tfrac{1}{4} < X < \tfrac{3}{4}\right) = \int_{1/4}^{3/4} \frac{1}{10} e^{-x/10}\,dx = \left[-e^{-x/10}\right]_{1/4}^{3/4} = e^{-1/40} - e^{-3/40} = 0.0476 $$ ///

5.2.3 The Normal Distribution

One of the most widely used continuous probability distributions is the normal probability distribution. Many variables are approximately normally distributed and can therefore be represented by the normal distribution. The graph of this distribution has the following characteristics:
1. It is bell-shaped and has a single peak at the center of the distribution.
2. The mean, median, and mode are at the center of the distribution.
3. It is symmetric about the mean.
4. It is a continuous curve that is asymptotic to the x-axis (it never touches the x-axis).
5. The total area under the curve is equal to 1, or 100%.
6.
The position of the normal distribution on the x-axis is determined by the mean, $\mu$, and the spread of the distribution is determined by the standard deviation, $\sigma$.

Since the normal distribution is used very often, a special notation is used for it. If the random variable X is normally distributed with mean $\mu$ and variance $\sigma^2$, it is written as $X \sim N(\mu, \sigma^2)$. The notation $\phi_{\mu,\sigma^2}(x)$ is used for the density function of $X \sim N(\mu, \sigma^2)$, and $\Phi_{\mu,\sigma^2}(x)$ for the cumulative distribution function.

Definition 5.8 Normal Distribution
A random variable X is defined to be normally distributed if its density is given by

$$ \phi_{\mu,\sigma^2}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / (2\sigma^2)} $$

where the parameters $\mu$ and $\sigma^2$ satisfy $-\infty < \mu < \infty$ and $\sigma^2 > 0$.

Theorem 5.8 If X is a normal random variable, then E[X] = $\mu$ and var[X] = $\sigma^2$.

Below are some examples of graphs of normal distributions. These graphs are called normal curves. Figure 5.1 is a sketch of two normal curves having the same variance but different means. The two curves have the same form but are located at different positions along the horizontal axis. Figure 5.2 shows two normal curves with the same mean but different variances; they are centered at the same position on the horizontal axis but have different forms. The normal curve with the larger variance is lower and has a wider spread. Figure 5.3 sketches two normal curves with different means and different variances.

Figure 5.1 Normal curves with $\mu_1 \ne \mu_2$ and $\sigma_1^2 = \sigma_2^2$

Figure 5.2 Normal curves with $\mu_1 = \mu_2$ and $\sigma_1^2 \ne \sigma_2^2$

Figure 5.3 Normal curves with $\mu_1 \ne \mu_2$ and $\sigma_1^2 \ne \sigma_2^2$

5.2.3.1 Areas Under the Normal Curve

The graph of any continuous probability distribution is constructed so that the area under the curve bounded by the lines at $X = x_1$ and $X = x_2$ equals the probability that the random variable assumes a value between $x_1$ and $x_2$. Thus for the normal curve in Figure 5.4, the shaded area represents the probability $P(x_1 < X < x_2)$.

Figure 5.4 Shaded area is equal to $P(x_1 < X < x_2)$.
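The shaded-area probability can also be evaluated numerically rather than read from a picture. The sketch below is a supplementary illustration and not part of the original text: the function names and the values in the usage line are hypothetical, and it relies on the standard identity that the standard normal cumulative distribution function equals (1 + erf(z/√2))/2, with `erf` taken from Python's `math` module.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), using Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_interval_prob(x1, x2, mu=0.0, sigma=1.0):
    """P(x1 < X < x2): the area under the normal curve between x1 and x2."""
    return normal_cdf(x2, mu, sigma) - normal_cdf(x1, mu, sigma)


# Hypothetical values: X ~ N(50, 10^2), area within one standard deviation of the mean.
print(round(normal_interval_prob(40, 60, mu=50, sigma=10), 4))  # 0.6827
```

Results computed this way may differ from table-based answers in the last decimal place, because table lookups round the standardized value to two decimals first.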
Once the parameters $\mu$ and $\sigma^2$ are specified, the graph of the probability density function of $N(\mu, \sigma^2)$ is completely determined. If a table of probabilities is available for the normal distribution under study, it is easier to use this table than to use integral calculus. However, it would be hopelessly tedious to set up separate tables or curves for every conceivable pair of $\mu$ and $\sigma^2$. Fortunately, every normal random variable X can be transformed to a new random variable Z that is also normally distributed but with mean 0 and variance 1. Z is called the standard normal random variable or standard score. In symbols, $Z \sim N(0, 1)$, and any value of a normal random variable X with mean $\mu$ and variance $\sigma^2$ can be transformed to its standard score or standard normal value Z using the formula

$$ Z = \frac{X - \mu}{\sigma}. $$

Table A1 in the appendix shows the areas under the standard normal curve corresponding to P(0 < Z < z).

Definition 5.9 Standard Normal Distribution
The distribution of a normal random variable with mean zero and variance equal to 1 is called a standard normal distribution.

The probability that the normal random variable X with mean $\mu$ and variance $\sigma^2$ will assume a value between $x_1$ and $x_2$, where $x_1 < x_2$, is equal to the probability that the standard normal random variable Z will assume a value in the interval $z_1$ to $z_2$, where $z_1 = \dfrac{x_1 - \mu}{\sigma}$ and $z_2 = \dfrac{x_2 - \mu}{\sigma}$, that is,

$$ P(x_1 < X < x_2) = P(z_1 < Z < z_2) = \Phi_{0,1}(z_2) - \Phi_{0,1}(z_1). $$

Below are some examples on determining probabilities of normal random variables by finding equivalent areas under the standard normal curve, using Table A1.

Example 10
Let Z be a random variable with the standard normal distribution. Find
(a) P(0 < Z < 1.34)
(b) P(-0.65 < Z < 0)
(c) P(-1.38 < Z < 2.56)
(d) P(0.75 < Z < 1.45)
(e) P(-2.32 < Z < -0.34)
(f) P(|Z| < 2.34)
(g) P(|Z| > 1.13)

Solution
(a) P(0 < Z < 1.34) is equal to the area under the standard normal curve between 0 and 1.34.
Thus, from Table A1, look down the first column to 1.3, then move right to the column for the second decimal digit 4. The entry is 0.4099. Therefore, P(0 < Z < 1.34) = 0.4099.

(b) Since the normal curve is symmetric,
P(-0.65 < Z < 0) = P(0 < Z < 0.65) = 0.2422

(c) P(-1.38 < Z < 2.56) = P(-1.38 < Z < 0) + P(0 < Z < 2.56)
= P(0 < Z < 1.38) + P(0 < Z < 2.56)
= 0.4162 + 0.4948 = 0.9110

(d) P(0.75 < Z < 1.45) = P(0 < Z < 1.45) - P(0 < Z < 0.75)
= 0.4265 - 0.2734 = 0.1531

(e) P(-2.32 < Z < -0.34) = P(-2.32 < Z < 0) - P(-0.34 < Z < 0)
= P(0 < Z < 2.32) - P(0 < Z < 0.34)
= 0.4898 - 0.1331 = 0.3567

(f) P(|Z| < 2.34) = P(-2.34 < Z < 2.34)
= P(-2.34 < Z < 0) + P(0 < Z < 2.34)
= P(0 < Z < 2.34) + P(0 < Z < 2.34)
= 0.4904 + 0.4904 = 0.9808

(g) P(|Z| > 1.13) = P(Z < -1.13) + P(Z > 1.13) = 2 P(Z > 1.13)
= 2[0.5 - P(0 < Z < 1.13)] = 2(0.5 - 0.3708) = 2(0.1292) = 0.2584 ///

Example 11
Let X be a normal random variable with mean 5 and variance 16. Find
(a) P(5 < X < 10)
(b) P(X < 18)

Solution
$\mu = 5$ and $\sigma = 4$, so

$$ Z = \frac{X - \mu}{\sigma} = \frac{X - 5}{4} \quad \text{and} \quad P(x_1 < X < x_2) = P(z_1 < Z < z_2). $$

(a) $P(5 < X < 10) = P\!\left(\dfrac{5-5}{4} < Z < \dfrac{10-5}{4}\right) = P(0 < Z < 1.25) = 0.3944$

(b) $P(X < 18) = P\!\left(Z < \dfrac{18-5}{4}\right) = P(Z < 3.25) = 0.5 + P(0 < Z < 3.25) = 0.5 + 0.4994 = 0.9994$ ///

Example 12
Let X be a normal random variable with mean $\mu = 20$ and standard deviation $\sigma = 5$. Find the value of x for the following probabilities:
(a) P(20 < X < x) = 0.4948
(b) P(X > x) = 0.9382

Solution
(a) $P(20 < X < x) = P\!\left(\dfrac{20-20}{5} < Z < \dfrac{x-20}{5}\right) = P(0 < Z < z) = 0.4948$
From Table A1, P(0 < Z < 2.56) = 0.4948. Thus,
$\dfrac{x - 20}{5} = 2.56$, and x = 32.8

(b) $P(X > x) = P\!\left(Z > \dfrac{x-20}{5}\right) = 0.9382$ and $z = \dfrac{x-20}{5}$
Since the area to the right of z exceeds 0.5, the standard score z must lie to the left of zero. From Table A1, P(0 < Z < 1.54) = 0.4382, which by symmetry is equal to P(-1.54 < Z < 0). Thus, the standard value that we are looking for is z = -1.54.
$\dfrac{x - 20}{5} = -1.54$
x = (-1.54)(5) + 20 = 12.3 ///

5.2.3.2 Applications of the Normal Distribution

Below are some problems in which the distribution of the data, or of the random variables under consideration, is closely approximated by the normal distribution.
Example 13
A certain town in Nueva Ecija received an average rainfall, according to PAGASA, of 9.32 centimeters for the month of April. Assuming a normal distribution with a standard deviation of 2.85 centimeters, find the probability that next April the town receives
(a) less than 11.92 centimeters of rain
(b) more than 4 centimeters but not over 8 centimeters of rain
(c) more than 14.2 centimeters of rain

Solution
X = rainfall in cm, $X \sim N(9.32, 2.85^2)$

(a) $P(X < 11.92) = P\!\left(Z < \dfrac{11.92 - 9.32}{2.85}\right) = P(Z < 0.91)$
= P(Z < 0) + P(0 < Z < 0.91) = 0.5 + 0.3186 = 0.8186

(b) $P(4 < X < 8) = P\!\left(\dfrac{4 - 9.32}{2.85} < Z < \dfrac{8 - 9.32}{2.85}\right) = P(-1.87 < Z < -0.46)$
= P(0.46 < Z < 1.87) = P(0 < Z < 1.87) - P(0 < Z < 0.46)
= 0.4693 - 0.1772 = 0.2921

(c) $P(X > 14.2) = P\!\left(Z > \dfrac{14.2 - 9.32}{2.85}\right) = P(Z > 1.71)$
= 0.5 - P(0 < Z < 1.71) = 0.5 - 0.4564 = 0.0436 ///

Example 14
The weights of pineapples received by DOLE have a mean of 1.4 kilos and a standard deviation of 0.12 kilo. Assume the weights are normally distributed.
(a) What percentage of all these pineapples is heavier than 1.6 kilos?
(b) What percentage of the pineapples is between 1.25 and 1.55 kilos?

Solution
X = weight of a pineapple received by DOLE, $X \sim N(1.4, 0.12^2)$

(a) $P(X > 1.6) = P\!\left(Z > \dfrac{1.6 - 1.4}{0.12}\right) = P(Z > 1.67)$
= 0.5 - P(0 < Z < 1.67) = 0.5 - 0.4525 = 0.0475
Hence 4.75% of all these pineapples are heavier than 1.6 kilos.

(b) $P(1.25 < X < 1.55) = P\!\left(\dfrac{1.25 - 1.4}{0.12} < Z < \dfrac{1.55 - 1.4}{0.12}\right) = P(-1.25 < Z < 1.25)$
= P(-1.25 < Z < 0) + P(0 < Z < 1.25)
= 2 P(0 < Z < 1.25) = 2(0.3944) = 0.7888
Thus, 78.88% of the pineapples weigh between 1.25 and 1.55 kilos. ///

Example 15
The IQs of the 2000 first year student applicants to CLSU are approximately normally distributed with a mean of 115 and a standard deviation of 10. If the university requires an IQ of at least 95, how many of these students will not be accepted on this basis, with other qualifications disregarded?
Solution
X = IQ of a first year student applicant, $X \sim N(115, 100)$

$P(X < 95) = P\!\left(Z < \dfrac{95 - 115}{10}\right) = P(Z < -2) = P(Z > 2)$
= 0.5 - P(0 < Z < 2) = 0.5 - 0.4772 = 0.0228

Thus 2.28% of the student applicants have an IQ below 95. Therefore, 0.0228 x 2000 = 45.6, or about 46 students, will not be accepted based on the IQ qualification. ///

Example 16
The amount of hot tea that a dispenser puts into 6-ounce disposable styro-cups varies from cup to cup and follows a normal distribution with a standard deviation of 0.1 ounce. If only 10 percent of the cups are to contain less than 6 ounces of hot tea, what must the average fill of the cups be?

Solution
X = amount of hot tea dispensed by the machine, $X \sim N(\mu, 0.1^2)$

$P(X < 6) = P\!\left(Z < \dfrac{6 - \mu}{0.1}\right) = P(Z < z) = 0.10$

Since the area to the left of z is less than 0.5, z lies to the left of 0 and is therefore negative.
P(Z < z) = P(Z > -z) = 0.50 - P(0 < Z < -z) = 0.10
Thus, P(0 < Z < -z) = 0.40.
From Table A1, -z = 1.28 (the closest value).
Solving for $\mu$ from $-z = \dfrac{\mu - 6}{0.10}$:
$\mu$ = 6 + (0.10)(1.28) = 6.128 ounces. ///

5.3 Normal and Poisson Approximations to the Binomial

5.3.1 Normal Approximation to the Binomial Distribution

When n, the number of trials, is very large and p, the probability of a success on an individual trial, is close to 0.5, the normal distribution can be used to closely approximate the binomial distribution. This normal approximation is especially useful when the binomial probability function would otherwise have to be evaluated repeatedly to obtain the desired probability. The normal approximation gives a very close answer and simplifies the solution.

Theorem 5.9 If X is a binomial random variable with mean $\mu = np$ and variance $\sigma^2 = npq$, then the limiting form of the distribution of

$$ Z = \frac{X - np}{\sqrt{npq}} $$

as $n \to \infty$ is the standard normal distribution.

Example 17
A balanced coin is flipped 12 times.
(a) What is the exact probability of getting 5 heads?
(b) Find the normal approximation to the probability in (a).
Solution
(a) Using p = 0.5,

$$ P(X = 5) = \binom{12}{5} (0.5)^5 (0.5)^7 = 792\,(0.5)^{12} = 0.1934 $$

(b) To find the normal approximation to this probability, the value 5 is represented by the interval from 4.5 to 5.5, as a correction for continuity. The mean is $\mu = np = 12(0.5) = 6$, and $\sigma = \sqrt{np(1-p)} = \sqrt{12(0.5)(0.5)} = 1.732$.

$P(4.5 < X < 5.5) = P\!\left(\dfrac{4.5 - 6}{1.732} < Z < \dfrac{5.5 - 6}{1.732}\right) = P(-0.87 < Z < -0.29)$
= P(0.29 < Z < 0.87) = P(0 < Z < 0.87) - P(0 < Z < 0.29)
= 0.3078 - 0.1141 = 0.1937
(an approximate probability that differs from the exact probability by only about 0.0003) ///

Example 18
Suppose that in the preceding example the fair coin is tossed 100 times. What is the probability of obtaining at least 40 heads?

Solution
To solve this problem using the formula for the binomial distribution, we would have to find the sum of the probabilities corresponding to 40, 41, 42, ..., 100 heads, or subtract from 1 the sum of the probabilities of 0, 1, 2, ..., 39 heads. This is a lot of work, but if the normal approximation is used, only the area to the right of 39.5 needs to be found. For continuity, 40 is represented by the interval from 39.5 to 40.5, 41 by the interval from 40.5 to 41.5, and so on.

Since $\mu = np = 100(0.5) = 50$ and $\sigma = \sqrt{np(1-p)} = \sqrt{100(0.5)(0.5)} = 5$,

$P(X > 39.5) = P\!\left(Z > \dfrac{39.5 - 50}{5}\right) = P(Z > -2.1)$
= P(-2.1 < Z < 0) + P(Z > 0) = 0.4821 + 0.5 = 0.9821 ///

5.3.2 Poisson Approximation to the Binomial Distribution

If the parameter n is very large or approaches infinity and p approaches 0 in such a way that np remains constant, say equal to $\lambda$, then the Poisson distribution very closely approximates the Binomial distribution. For a fixed x,

$$ \binom{n}{x} p^x (1-p)^{n-x} \approx \frac{e^{-\lambda} \lambda^x}{x!}. $$

Example 19
MMDA records show that the probability is 0.00007 that a car will have a flat tire while driving through the EDSA - Aurora Boulevard crossing. Find the probability that among 25000 cars passing through this crossing, at least three will have a flat tire.
Solution
It is very cumbersome to use the formula for the binomial distribution to answer this problem, since it involves binomial coefficients and powers with n = 25000 that many scientific calculators cannot evaluate directly. A practical way to answer it is through the Poisson approximation, with $\lambda = np = 25000(0.00007) = 1.75$:

$P(X \ge 3) = 1 - P(X < 3) = 1 - [P(X=0) + P(X=1) + P(X=2)]$

$$ = 1 - \frac{e^{-1.75}(1.75)^0}{0!} - \frac{e^{-1.75}(1.75)^1}{1!} - \frac{e^{-1.75}(1.75)^2}{2!} $$

= 1 - (0.174) - (0.304) - (0.266) = 1 - (0.744) = 0.256 ///

Activities:

For numbers 1 - 5, identify the distribution of the random variable, find the mean and variance, and answer the question asked in the problem.

1. Consider the experiment of rolling a fair die and define X as the random variable that assigns 1 if the number of dots that appears is even and 0 if the number of dots that appears is odd.
a. What are the possible values of X?
b. Find P(X = 1) and P(X = 0).

2. A fitness club has 15 members, 10 of which prefer the exercise bicycle and 8 prefer the aerobic stepper. Suppose 7 members are selected at random. Find the probability that at most 2 use the bicycle.

3. If 15% of the patients who take a certain medication get a headache, find the probability that if 6 people take the medication, 2 will get a headache.

4. A 700-page book has 140 typographical errors randomly distributed over its pages. Find the probability that any given page has exactly 2 errors.

5. Suppose the time until a certain bulb fails to light up follows an exponential distribution with a mean of 8000 hours. Find the probability that a bulb burns out some time between 6000 and 7000 hours.

Solve the following problems completely. Use Table A1 for problems involving the normal distribution.

6. Find z if the area under the normal curve
(a) between 0 and z is 0.3531
(b) to the right of z is 0.0197
(c) to the right of z is 0.9265
(d) between -z and z is 0.8444

7. An employee travels by train every day going to his office.
Being a crossword puzzle buff, he uses his 30 minutes on the train to answer a puzzle published in either newspaper A or newspaper B. Timing himself in this activity, he knows that it takes him an average of 25.5 minutes to solve a puzzle from newspaper A with a standard deviation of 4 minutes, while a puzzle from newspaper B also takes him 25.5 minutes on the average, with a standard deviation of 2 minutes. What is the probability that he will complete the puzzle if he buys
(a) newspaper A
(b) newspaper B

8. The kalamansi trees in an orchard have a mean height of 5.34 feet with a standard deviation of 0.5 feet. Assuming that the distribution of the heights of these trees is approximately normal, find
(a) what percentage of the trees are less than 5 feet tall
(b) what percentage are at least 5.5 feet tall
(c) what percentage of the trees are between 5 and 5.5 feet tall

9. In number (8), if there are 3000 kalamansi trees in the orchard, how many are
(a) less than 5 feet tall
(b) at least 5.5 feet tall
(c) between 5 and 5.5 feet tall

10. A youth advocate organization conducts an annual walkathon for charitable purposes. They have established that, on the average, 6% of the participants fail to finish the walk. What is the probability that fewer than 20 walkers out of 1000 walkers will fail to finish?

11. The registrar of a university assigns its students to sections following a rule that a class should be equally populated by male and female students. Suppose a professor in a class randomly picks a student to answer a question by drawing lots. What is the probability that out of 50 questions, a male student will be called upon
(a) more than 30 times
(b) fewer than 20 times