Dependence and Measuring of Its Magnitudes

Boyan Dimitrov
Math Dept., Kettering University, Flint, Michigan 48504, USA

Outline

What I intend to tell you here cannot be found in any contemporary textbook. I am not sure you can find it even in the older textbooks on Probability and Statistics. I read it (1963) in the Bulgarian textbook on Probability written by the Bulgarian mathematician Nikola Obreshkov. Later I never met it in other textbooks or monographs, and there is not a single word about these basic measures in the Encyclopedia of Statistical Sciences published more than 20 years later.

1. Introduction

All the prerequisites are:
- what the probability P(A) of a random event A is;
- when we have dependence between random events;
- what the conditional probability P(A | B) of a random event A is, when another event B occurs;
- several basic probability rules related to pairs of events;
- some interpretations of these facts, related to the probability content.

Those familiar with Probability and Statistics will find some known things here; let us not blame the textbooks for the gaps we fill in. For beginners, let what is written here be a challenge to get deeper into the essence of the concept of dependence.

2. Dependent events. Connection between random events

Let A and B be two arbitrary random events. A and B are independent only when the probability of their joint occurrence equals the product of the probabilities of their individual occurrence, i.e. when

$$P(A \cap B) = P(A)\,P(B). \qquad (1)$$

Independence is equivalent to the fact that the conditional probability of one of the events, given that the other event occurs, does not change and remains equal to its original unconditional probability, i.e.

$$P(A \mid B) = P(A). \qquad (2)$$

The inconvenience of (2) as a definition of independence is that it requires P(B) > 0, i.e. B has to be a possible event. Otherwise, when P(B) = 0, the conditional probability

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \qquad (3)$$

is not defined.

A zero event, as well as a sure event, is independent of any other event, including itself. The most important fact is that when the equality (1) does not hold, the events A and B are dependent. Dependence in the world of uncertainty is a complex concept, and the textbooks avoid discussion in this regard. In the classical approach, equation (3) is used to determine conditional probabilities and serves as a base for further rules in operations with probabilities. Here we establish a concept of dependence and show ways to interpret and measure it when A and B are dependent events.

Definition 1. The number

$$\delta(A, B) = P(A \cap B) - P(A)\,P(B)$$

is called the connection between the events A and B.
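Since the connection is built from three probabilities only, it is straightforward to compute. Below is a minimal sketch (Python; the function name `connection` is ours, for illustration only) that also checks two of the properties listed below:

```python
def connection(p_a, p_b, p_ab):
    """Connection delta(A, B) = P(A and B) - P(A) P(B)  (Definition 1)."""
    return p_ab - p_a * p_b

# A zero event or a sure event is independent of any event (property d1):
assert connection(0.0, 0.7, 0.0) == 0.0   # P(A) = 0 forces P(A and B) = 0
assert connection(1.0, 0.7, 0.7) == 0.0   # P(A) = 1 forces P(A and B) = P(B)

# Property d6: the connection of an event with itself is P(A)[1 - P(A)]
p = 0.05
assert abs(connection(p, p, p) - p * (1 - p)) < 1e-12
```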
The connection has the following properties:

(δ1) The connection between two random events equals zero if and only if these events are independent. This includes the cases when one of the events is a zero event or a sure event.

(δ2) The connection between A and B is symmetric: $\delta(A, B) = \delta(B, A)$.

(δ3) $\delta(\bar A, B) = -\delta(A, B)$.

(δ4) $\delta\big(\bigcup_j A_j, B\big) = \sum_j \delta(A_j, B)$ for mutually exclusive events $A_j$.

(δ5) $\delta(A \cup C, B) = \delta(A, B) + \delta(C, B) - \delta(A \cap C, B)$.

(δ6) $\delta(A, A) = -\delta(A, \bar A) = P(A)[1 - P(A)]$.

These properties show that the connection function between events is analogous in its properties to the probability function (additivity, continuity).

(δ7) The connection between the complementary events $\bar A$ and $\bar B$ is the same as the one between A and B: $\delta(\bar A, \bar B) = \delta(A, B)$.

(δ8) If the occurrence of A implies the occurrence of B, i.e. if $A \subseteq B$, then $\delta(A, B) = P(A) - P(A)P(B) = P(A)[1 - P(B)]$, and the connection between A and B is positive. The two events are called positively associated.

(δ9) When A and B are mutually exclusive, the connection between A and B is negative: $\delta(A, B) = -P(A)P(B)$.

(δ10) When $\delta(A, B) > 0$, the occurrence of one of the two events increases the conditional probability for the occurrence of the other event. The following is true:

$$P(A \mid B) = P(A) + \frac{\delta(A, B)}{P(B)}.$$

(δ11) The connection between any two events A and B satisfies the inequalities

$$\max\{-P(A)P(B),\ -[1 - P(A)][1 - P(B)]\} \le \delta(A, B) \le \min\{P(A)[1 - P(B)],\ [1 - P(A)]P(B)\}.$$

We call these the Fréchet–Hoeffding inequalities. They also indicate that the value of the connection as a measure of dependence lies between -1/4 and +1/4.

One more representation of the connection between the two events A and B:

$$\delta(A, B) = [P(A \mid B) - P(A)]\,P(B).$$

If the connection is negative, the occurrence of one event decreases the chances for the other one to occur. Knowledge of the connection can be used for the calculation of posterior probabilities, similarly to the Bayes rule. We call A and B positively associated when $\delta(A, B) > 0$ and negatively associated when $\delta(A, B) < 0$.

Example 1. There are 1000 observations on the stock market. In 80 cases there was a significant increase in the oil prices (event A). Simultaneously, a significant increase at the Money Market (event B) was registered in 50 cases. A significant increase in both investments (event $A \cap B$) was observed on 20 occasions. The frequency-estimated probabilities are

$$P(A) = \frac{80}{1000} = .08, \qquad P(B) = \frac{50}{1000} = .05, \qquad P(A \cap B) = \frac{20}{1000} = .02.$$

Definition 1 gives $\delta(A, B) = .02 - (.08)(.05) = .016$. If it is known that there is a significant increase in the investments in the money market, then the probability to also see a significant increase in the oil price is

$$P(A \mid B) = .08 + \frac{.016}{.05} = .4.$$

Analogously, if we have the information that there is a significant increase in the oil prices on the market, then the chances to also get significant gains in the money market on the same day are

$$P(B \mid A) = .05 + \frac{.016}{.08} = .25.$$

Here we assume knowledge of the connection $\delta(A, B)$ and of the individual prior probabilities P(A) and P(B) only. This seems much more natural in real life than what the Bayes rule requires.

Remark 1. Introduce the indicators of the random events: $I_A = 1$ when the event A occurs and $I_A = 0$ when the complementary event occurs. Then $E(I_A) = P(A)$ and

$$\operatorname{Cov}(I_A, I_B) = E(I_A I_B) - E(I_A)E(I_B) = P(A \cap B) - P(A)P(B) = \delta(A, B).$$

Therefore, the connection between two random events equals the covariance between their indicators.
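The Bayes-like update of property (δ10) is easy to verify with the numbers of Example 1. A minimal sketch (Python; the function names are ours):

```python
def connection(p_a, p_b, p_ab):
    """delta(A, B) = P(A and B) - P(A) P(B)  (Definition 1)."""
    return p_ab - p_a * p_b

def posterior(p_a, p_b, delta):
    """Property (d10): P(A | B) = P(A) + delta(A, B) / P(B)."""
    return p_a + delta / p_b

# Example 1: oil prices up (A), money market up (B), both up (A and B)
p_a, p_b, p_ab = 0.08, 0.05, 0.02
d = connection(p_a, p_b, p_ab)     # 0.016
print(posterior(p_a, p_b, d))      # P(A | B) = 0.08 + 0.016/0.05 = 0.40
print(posterior(p_b, p_a, d))      # P(B | A) = 0.05 + 0.016/0.08 = 0.25
```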
Comment. The numerical value of the connection does not by itself indicate the magnitude of dependence between A and B. The strongest connection must hold when A = B. In that case $\delta(A, B) = P(A) - P^2(A)$. Let us see the numbers. Assume A = B and P(A) = .05. Then the event A with itself has the very low connection value .0475. Moreover, the value of the connection varies together with the probability of the event A. Now let P(A) = .3 and P(B) = .4, where A may occur together with B as well as with $\bar B$, and P(A | B) = .6. Then

$$\delta(A, B) = (.6 - .3)(.4) = .12.$$

This connection is about 3 times stronger than the previously considered one, despite the fact that in the first case the occurrence of B guarantees the occurrence of A.

3. Regression coefficients as measures of dependence between random events

The conditional probability P(A | B) is the measure of the chances for the event A to occur when it is already known that the other event B has occurred. When B is a zero event, the conditional probability cannot be defined; it is convenient in such cases to set P(A | B) = P(A).

Definition 2. The regression coefficient of the event A with respect to the event B is

$$r_B(A) = P(A \mid B) - P(A \mid \bar B).$$

The regression coefficient is always defined, for any pair of events A and B (zero, sure, or arbitrary).

Properties:

(r1) The equality $r_B(A) = r_A(B) = 0$ takes place if and only if the two events are independent.

(r2) $r_A(A) = 1$; $r_{\bar A}(A) = -1$.

(r3) $r_B\big(\bigcup_j A_j\big) = \sum_j r_B(A_j)$ for mutually exclusive events $A_j$.

(r4) $r_S(A) = r_\varnothing(A) = 0$ for the sure event S and the zero event $\varnothing$.

(r5) The regression coefficients $r_B(A)$ and $r_A(B)$ are numbers with equal signs: the sign of the connection $\delta(A, B)$. However, their numerical values are not always equal. For $r_B(A) = r_A(B)$ to be valid, it is necessary and sufficient that $P(A)[1 - P(A)] = P(B)[1 - P(B)]$.

(r6) The regression coefficients are numbers between -1 and 1, i.e. $-1 \le r_B(A) \le 1$ and $-1 \le r_A(B) \le 1$.

(r6.1) The equality $r_B(A) = 1$ holds only when the random event A coincides (is equivalent) with the event B. Then the equality $r_A(B) = 1$ is also valid.

(r6.2) The equality $r_B(A) = -1$ holds only when the random event A coincides (is equivalent) with $\bar B$, the complement of the event B. Then $r_A(B) = -1$ is also valid, and respectively $B = \bar A$.

(r7) It is fulfilled that $r_{\bar B}(A) = -r_B(A)$, $r_B(\bar A) = -r_B(A)$, and $r_{\bar B}(\bar A) = r_B(A)$.

(r8) $r_B(A) = \dfrac{\delta(A, B)}{P(B)[1 - P(B)]}$.

(r9) $r_A(B) = \dfrac{\delta(A, B)}{P(A)[1 - P(A)]}$.

For the case $A \subseteq B$ we have

$$\delta(A, B) = P(A)[1 - P(B)], \qquad r_B(A) = \frac{P(A)}{P(B)}, \qquad r_A(B) = \frac{1 - P(B)}{1 - P(A)}.$$

For the case $B \subseteq A$ we have

$$\delta(A, B) = P(B)[1 - P(A)], \qquad r_B(A) = \frac{1 - P(A)}{1 - P(B)}, \qquad r_A(B) = \frac{P(B)}{P(A)}.$$

In the general case of compatible events ($A \cap B \neq \varnothing$) the measures of dependence may be positive or negative. If P(A) = P(B) = .5 and $P(A \cap B) = .3$, then the connection and both regression coefficients are positive; if $P(A \cap B) = .1$, all these measures are negative. The sign and magnitude of the dependence measured by the regression coefficients can be interpreted as a trend toward one of the extreme situations: A = B, $A = \bar B$, or independence of the two events.

Example 1 (continued). We calculate here the two regression coefficients $r_B(A)$ and $r_A(B)$ according to the data of Example 1. The regression coefficient of the significant increase of the oil prices on the market (event A) with regard to the significant increase in the Money Market return (event B) has the numerical value

$$r_B(A) = \frac{.016}{(.05)(.95)} = .3368.$$

At the same time we have

$$r_A(B) = \frac{.016}{(.08)(.92)} = .2174.$$

There exists some asymmetry in the dependence between random events: it is possible for one event to have a stronger dependence on the other than the reverse.
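The computation behind Example 1 (continued) amounts to the ratio in properties (r8)-(r9). A minimal sketch (Python; the function name and code are ours; the zero/sure-event convention follows Definition 2):

```python
def regression_coefficient(p_a, p_b, p_ab):
    """r_B(A) = delta(A, B) / (P(B)[1 - P(B)])  (property r8).

    For a zero or sure event B the ratio is undefined; by the convention
    P(A | zero event) = P(A) of Definition 2 the coefficient is then 0.
    """
    if p_b in (0.0, 1.0):
        return 0.0
    return (p_ab - p_a * p_b) / (p_b * (1.0 - p_b))

# Example 1 again: note the asymmetry of the two coefficients
print(regression_coefficient(0.08, 0.05, 0.02))  # r_B(A) ~ 0.3368
print(regression_coefficient(0.05, 0.08, 0.02))  # r_A(B) ~ 0.2174
```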
The true meaning of the specific numerical values of these regression coefficients is still to be clarified. We believe it is possible to use them for measuring the magnitude of dependence between events. According to the distance of the regression coefficient from zero (where independence sits), the values could be classified as follows:
- within .05 of zero: one event is almost independent of the other;
- distance between .05 and .2: weakly dependent;
- distance between .2 and .45: moderately dependent;
- $|r_B(A)|$ from .45 to .8: dependent in average;
- above .8: strongly dependent.

This classification is quite conditional, made up by the author.

The regression coefficients satisfy the inequalities

$$\max\left\{-\frac{P(A)}{1 - P(B)},\ -\frac{1 - P(A)}{P(B)}\right\} \le r_B(A) \le \min\left\{\frac{P(A)}{P(B)},\ \frac{1 - P(A)}{1 - P(B)}\right\},$$

$$\max\left\{-\frac{P(B)}{1 - P(A)},\ -\frac{1 - P(B)}{P(A)}\right\} \le r_A(B) \le \min\left\{\frac{P(B)}{P(A)},\ \frac{1 - P(B)}{1 - P(A)}\right\}.$$

These are also called Fréchet–Hoeffding inequalities.

4. Correlation between two random events

Definition 3. The correlation coefficient between two events A and B is the number

$$R_{A,B} = \pm\sqrt{r_B(A)\, r_A(B)},$$

whose sign, plus or minus, is the common sign of the two regression coefficients. An equivalent representation is

$$R_{A,B} = \frac{\delta(A, B)}{\sqrt{P(A)P(\bar A)P(B)P(\bar B)}} = \frac{P(A \cap B) - P(A)P(B)}{\sqrt{P(A)P(\bar A)P(B)P(\bar B)}}.$$

Remark 3. The correlation coefficient $R_{A,B}$ between the events A and B equals the correlation coefficient between the indicators $I_A$ and $I_B$ of the two random events.

Properties:

(R1) $R_{A,B} = 0$ if and only if the two events A and B are independent.

(R2) The correlation coefficient is always a number between -1 and +1, i.e. $-1 \le R_{A,B} \le 1$.

(R2.1) The equality to 1 holds if and only if the events A and B are equivalent, i.e. when A = B.

(R2.2) The equality to -1 holds if and only if the events A and $\bar B$ are equivalent.

(R3) The correlation coefficient has the same sign as the other measures of dependence between the two random events A and B, and this is the sign of the connection.

(R4) The knowledge of $R_{A,B}$ allows calculating the posterior probability of one of the events under the condition that the other one has occurred. For instance, P(B | A) is determined by the rule

$$P(B \mid A) = P(B) + R_{A,B}\sqrt{\frac{P(\bar A)P(B)P(\bar B)}{P(A)}}.$$

The net increase or decrease of the posterior probability compared to the prior probability equals the quantity added to P(B), and depends only on the value of the mutual correlation. Analogously,

$$P(B \mid \bar A) = P(B) - R_{A,B}\sqrt{\frac{P(A)P(B)P(\bar B)}{P(\bar A)}}.$$

(R5) It is fulfilled that $R_{\bar A, B} = R_{A, \bar B} = -R_{A,B}$ and $R_{\bar A, \bar B} = R_{A,B}$.

(R6) $R_{A,A} = 1$; $R_{A,\bar A} = -1$.

(R7) Particular cases. When $A \subseteq B$, then

$$R_{A,B} = \sqrt{\frac{P(A)P(\bar B)}{P(\bar A)P(B)}}; \qquad R_{A,S} = R_{A,\varnothing} = 0.$$

If $A \cap B = \varnothing$, then

$$R_{A,B} = -\sqrt{\frac{P(A)P(B)}{P(\bar A)P(\bar B)}}.$$
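Definition 3 and property (R4) translate directly into code. A minimal sketch (Python; the function names are ours) computing $R_{A,B}$ from the three probabilities and using it for the posterior update:

```python
import math

def correlation(p_a, p_b, p_ab):
    """R_{A,B} = delta(A, B) / sqrt(P(A)P(A')P(B)P(B'))  (Definition 3)."""
    delta = p_ab - p_a * p_b
    return delta / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))

def posterior_from_R(p_a, p_b, r):
    """Property (R4): P(B | A) = P(B) + R * sqrt(P(A')P(B)P(B') / P(A))."""
    return p_b + r * math.sqrt((1 - p_a) * p_b * (1 - p_b) / p_a)

R = correlation(0.08, 0.05, 0.02)        # Example 1 data
print(R)                                 # ~ 0.2706
print(posterior_from_R(0.08, 0.05, R))   # recovers P(B | A) = 0.25
```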
The use of the numerical values of the correlation coefficient is similar to the use of the two regression coefficients. The closer $R_{A,B}$ is to zero, the "closer" the two events A and B are to independence. Let us note once again that $R_{A,B} = 0$ if and only if the two events are independent. The closer $R_{A,B}$ is to 1, the "denser one within the other" are the events A and B, and when $R_{A,B} = 1$ the two events coincide (are equivalent). The closer $R_{A,B}$ is to -1, the "denser one within the other" are the events A and $\bar B$, and when $R_{A,B} = -1$ the two events A and $\bar B$ coincide (are equivalent).

These interpretations are convenient when conducting research and investigations associated with qualitative (non-numeric) factors and characteristics. Such studies are common in sociology, ecology, jurisprudence, medicine, criminology, design of experiments, and other similar areas.

The Fréchet–Hoeffding inequalities for the correlation coefficient are

$$\max\left\{-\sqrt{\frac{P(A)P(B)}{P(\bar A)P(\bar B)}},\ -\sqrt{\frac{P(\bar A)P(\bar B)}{P(A)P(B)}}\right\} \le R_{A,B} \le \min\left\{\sqrt{\frac{P(A)P(\bar B)}{P(\bar A)P(B)}},\ \sqrt{\frac{P(\bar A)P(B)}{P(A)P(\bar B)}}\right\}.$$

Example 1 (continued). With the numerical values of the two regression coefficients from the previous section we get

$$R_{A,B} = \sqrt{(.3368)(.2174)} = .2706.$$

Analogously to the use of the regression coefficients, the numeric value of the correlation coefficient could be used for classifications of the degree of mutual dependence. Practical implementation will give a clearer indication about the rules of such classifications. The correlation coefficient is a number in between the two regression coefficients. It is symmetric, absorbs the imbalance (the asymmetry) of the two regression coefficients, and is a balanced measure of dependence between the two events.

Examples can be given in a variety of areas of our life. For instance: the possible degree of dependence between tornado touchdowns in Kansas (event A) and in Alabama (event B); in sociology, a family with 3 or more children (event A) and an income above the average (event B); in medicine, someone getting an infarct (event A) and a stroke (event B). More examples, far better and more meaningful, are expected when the value of this approach is assessed.

5. Empirical estimations

The measures of dependence between random events are built from their probabilities. This makes them very attractive and, at the same time, easy for statistical estimation and practical use.

Let, in N independent experiments (observations), the random event A occur $k_A$ times, the random event B occur $k_B$ times, and the event $A \cap B$ occur $k_{AB}$ times. Then statistical estimators of our measures of dependence are, respectively,

$$\hat\delta(A, B) = \frac{k_{AB}}{N} - \frac{k_A}{N}\cdot\frac{k_B}{N};$$

$$\hat r_B(A) = \frac{\dfrac{k_{AB}}{N} - \dfrac{k_A}{N}\cdot\dfrac{k_B}{N}}{\dfrac{k_B}{N}\left(1 - \dfrac{k_B}{N}\right)}, \qquad \hat r_A(B) = \frac{\dfrac{k_{AB}}{N} - \dfrac{k_A}{N}\cdot\dfrac{k_B}{N}}{\dfrac{k_A}{N}\left(1 - \dfrac{k_A}{N}\right)};$$

$$\hat R(A, B) = \frac{\dfrac{k_{AB}}{N} - \dfrac{k_A}{N}\cdot\dfrac{k_B}{N}}{\sqrt{\dfrac{k_A}{N}\left(1 - \dfrac{k_A}{N}\right)\dfrac{k_B}{N}\left(1 - \dfrac{k_B}{N}\right)}}.$$

Each of these estimators may be simplified when the fractions in the numerator and denominator are multiplied by $N^2$; we will not get into detail. The estimators are all consistent; the estimator of the connection $\delta(A, B)$ is also unbiased, i.e. there is no systematic error in this estimate. The proposed estimators can be used for practical purposes with reasonable interpretations and explanations, as shown in our discussion and in our example.
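All four estimators of this section depend only on the counts $k_A$, $k_B$, $k_{AB}$ and N. A minimal sketch (Python; the function name is ours) bundling them:

```python
import math

def estimate_measures(n, k_a, k_b, k_ab):
    """Empirical estimators of Section 5 from counts in N observations.

    Returns (delta_hat, r_hat of A w.r.t. B, r_hat of B w.r.t. A, R_hat).
    """
    pa, pb, pab = k_a / n, k_b / n, k_ab / n
    delta = pab - pa * pb
    return (delta,
            delta / (pb * (1 - pb)),
            delta / (pa * (1 - pa)),
            delta / math.sqrt(pa * (1 - pa) * pb * (1 - pb)))

# Example 1 counts: N = 1000, k_A = 80, k_B = 50, k_AB = 20
print(estimate_measures(1000, 80, 50, 20))
# -> (0.016, 0.3368..., 0.2173..., 0.2706...)
```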
6. Some warnings and some recommendations

The introduced measures of dependence between random events are not transitive. It is possible that A is positively associated with B, and B is positively associated with a third event C, but A is negatively associated with C. To see this, imagine A and B compatible (non-empty intersection), and B and C compatible, while A and C are mutually exclusive and therefore have a negative connection. Mutually exclusive events have a negative connection; for the non-exclusive pairs (A, B) and (B, C) every kind of dependence is possible.

One can use the measures of dependence between random events to compare degrees of dependence. We recommend the use of the regression coefficient for measuring degrees of dependence. For instance, if $|r_B(A)| < |r_C(A)|$, then we say that the event A has a stronger association with C than with B. In a similar way, an entire ranking of the associations of any fixed event with any collection of other events can be built.

7. An illustration of possible applications

The data are from Alan Agresti, Categorical Data Analysis (2006).

Table 1: Observed frequencies of income and job satisfaction.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total (marginal)
< 6,000         | 20                | 24               | 80                   | 82             | 206
6,000-15,000    | 22                | 38               | 104                  | 125            | 289
15,000-25,000   | 13                | 28               | 81                   | 113            | 235
> 25,000        | 7                 | 18               | 54                   | 92             | 171
Total (marginal)| 62                | 108              | 319                  | 412            | 901

Table 2: Empirical estimations of the probabilities $p_{ij}$, $p_{i.}$, $p_{.j}$ for each particular case.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total
< 6,000         | .02220            | .02664           | .08879               | .09101         | .22864
6,000-15,000    | .02442            | .04217           | .11543               | .13873         | .32075
15,000-25,000   | .01443            | .03108           | .08990               | .12542         | .26083
> 25,000        | .00776            | .01998           | .05993               | .10211         | .18978
Total           | .06881            | .11987           | .35405               | .45727         | 1.00000

Table 3: Empirical estimations of the connection function for each particular category of income and job satisfaction.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total
< 6,000         | 0.006467          | -0.00077         | 0.00784              | -0.01354       | 0
6,000-15,000    | 0.0023491         | 0.003722         | 0.001868             | -0.00794       | 0
15,000-25,000   | -0.003517         | -0.00019         | -0.00245             | 0.00615        | 0
> 25,000        | -0.005298         | -0.00277         | -0.00726             | 0.015329       | 0
Total           | 0                 | 0                | 0                    | 0              | 0

[Figure: surface of the connection function (Z) between the income level (Y) and the job satisfaction level (X), according to Table 3.]

Table 4: Empirical estimations of the regression coefficients $r_{Satisfaction_j}(IncomeGroup_i)$ of each income category with respect to the job satisfaction categories.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total
< 6,000         | 0.109327          | -0.00727         | 0.034281             | -0.05456       | NA
6,000-15,000    | 0.036603          | 0.035276         | 0.00817              | -0.03199       | NA
15,000-25,000   | -0.054899         | -0.00176         | -0.0107              | 0.024782       | NA
> 25,000        | -0.005298         | -0.02625         | -0.03175             | 0.061761       | NA
Total           | 0                 | 0                | 0                    | 0              | NA

[Figure: surface of the regression coefficients of satisfaction with respect to income level, according to Table 4.]
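The entries of Tables 3-5 follow directly from the counts of Table 1, treating each income group as the event A and each satisfaction category as the event B. A sketch (Python; ours, not the author's code) whose output can be compared with those tables:

```python
# Observed frequencies from Table 1 (rows: income groups, cols: satisfaction)
counts = [
    [20, 24, 80, 82],
    [22, 38, 104, 125],
    [13, 28, 81, 113],
    [7, 18, 54, 92],
]
n = sum(sum(row) for row in counts)              # 901 observations
row_p = [sum(row) / n for row in counts]         # income marginals p_i.
col_p = [sum(col) / n for col in zip(*counts)]   # satisfaction marginals p_.j

for i, row in enumerate(counts):
    for j, k in enumerate(row):
        delta = k / n - row_p[i] * col_p[j]             # Table 3 entry
        r_income = delta / (col_p[j] * (1 - col_p[j]))  # Table 4 entry
        r_satisf = delta / (row_p[i] * (1 - row_p[i]))  # Table 5 entry
        print(i, j, round(delta, 6), round(r_income, 6), round(r_satisf, 6))
```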
Table 5: Empirical estimations of the regression coefficients $r_{IncomeGroup_i}(Satisfaction_j)$ of each job satisfaction category with respect to the income groups.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total
< 6,000         | 0.036670          | -0.00435         | 0.04445              | -0.07677       | 0
6,000-15,000    | 0.010783          | 0.017082         | 0.008576             | -0.03644       | 0
15,000-25,000   | -0.018246         | -0.00096         | -0.01269             | 0.0316         | 0
> 25,000        | -0.034460         | -0.01801         | -0.04723             | 0.099594       | 0
Total           | NA                | NA               | NA                   | NA             | 0

[Figure: surface of the regression coefficient function $r_{IncomeGroup_i}(Satisfaction_j)$, according to Table 5.]

Table 6: Empirical estimations of the correlation coefficient $R(IncomeGroup_i, Satisfaction_j)$ between each particular income group and the categories of job satisfaction.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total
< 6,000         | 0.060838          | -0.005623        | 0.03904              | -0.06472       | NA
6,000-15,000    | 0.019883          | 0.024548         | 0.008371             | -0.03414       | NA
15,000-25,000   | -0.031649         | -0.001302        | -0.01165             | 0.028117       | NA
> 25,000        | -0.053363         | -0.02174         | -0.03872             | 0.078742       | NA
Total           | NA                | NA               | NA                   | NA             | NA

[Figure: surface of the correlation coefficient function, according to Table 6.]

A prediction of the income group. Table 7: Forecast of the probabilities P(A | B) = P(A) + δ(A, B)/P(B) of a particular income group, given the categories of job satisfaction.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Prior P(A)
< 6,000         | 0.32258           | 0.22223          | 0.25079              | 0.19903        | .22864
6,000-15,000    | 0.35484           | 0.35184          | 0.32601              | 0.30339        | .32075
15,000-25,000   | 0.20968           | 0.25926          | 0.25393              | 0.27428        | .26083
> 25,000        | 0.11289           | 0.16666          | 0.16927              | 0.22329        | .18978
Total           | 1.00              | 1.00             | 1.00                 | 1.00           | 1.00

Table 8: Forecast of the probabilities P(B | A) = P(B) + δ(A, B)/P(A) of a particular job satisfaction category, given the income group.

Income (US $)   | Very Dissatisfied | Little Satisfied | Moderately Satisfied | Very Satisfied | Total
< 6,000         | 0.09708           | 0.11651          | 0.38835              | 0.39806        | 1.00
6,000-15,000    | 0.07612           | 0.13149          | 0.35986              | 0.43253        | 1.00
15,000-25,000   | 0.05532           | 0.11915          | 0.34468              | 0.48085        | 1.00
> 25,000        | 0.04094           | 0.10526          | 0.31579              | 0.53802        | 1.00
Prior P(B)      | .06881            | .11987           | .35405               | .45727         | 1.00

8. Conclusions

We discussed four measures of dependence between two random events: the connection, the two regression coefficients, and the correlation coefficient. These measures are equivalent and exhibit natural properties. Their numerical values may serve as an indication of the magnitude of dependence between random events. The measures provide simple ways to detect independence, coincidence, and degree of dependence. If either measure of dependence is known, it allows better prediction of the chance of occurrence of one event, given that the other one occurs. If applied to the events A = [X < x] and B = [Y < y], these measures immediately turn into measures of the LOCAL DEPENDENCE between the random variables X and Y, associated with the point (x, y) on the plane.
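The final remark of the conclusions can also be sketched: with A = [X < x] and B = [Y < y], the connection becomes δ(x, y) = F_{XY}(x, y) - F_X(x) F_Y(y), which is estimable from a sample. A hedged illustration (Python; the simulated data and all names are ours, for illustration only):

```python
import random

def local_connection(xs, ys, x, y):
    """Empirical delta(A, B) for the events A = [X < x], B = [Y < y]."""
    n = len(xs)
    p_a = sum(u < x for u in xs) / n
    p_b = sum(v < y for v in ys) / n
    p_ab = sum(u < x and v < y for u, v in zip(xs, ys)) / n
    return p_ab - p_a * p_b

random.seed(1)
xs = [random.gauss(0, 1) for _ in range(10000)]
ys = [u + random.gauss(0, 1) for u in xs]    # a positively dependent pair
print(local_connection(xs, ys, 0.0, 0.0))    # positive near the center
```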
References

[1] Agresti, A. (2006). Categorical Data Analysis. John Wiley & Sons, New York.
[2] Dimitrov, B. and Yanev, N. (1991). Probability and Statistics, A Textbook. Sofia University "Kliment Ohridski", Sofia (2nd edition 1998, 3rd edition 2007).
[3] Obreshkov, N. (1963). Probability Theory. Nauka i Izkustvo, Sofia (in Bulgarian).
[4] Kotz, S. and Johnson, N. L. (Editors-in-Chief) (1981-1988). Encyclopedia of Statistical Sciences, v. 1 - v. 9. John Wiley & Sons, New York.
[5] Genest, C. and Boies, J.-C. (2003). Testing dependence with Kendall plots. The American Statistician 44, 275-284.
[6] Kolev, N., Goncalves, M. and Dimitrov, B. (2007). On Sibuya's dependence function. Submitted to Annals of the Institute of Statistical Mathematics (AISM).
[7] Nelsen, R. (2006). An Introduction to Copulas, 2nd edition. Springer, New York.
[8] Schweizer, B. and Wolff, E. F. (1981). On nonparametric measures of dependence. Annals of Statistics 9, 879-885.
[9] Sibuya, M. (1960). Bivariate extreme statistics. Annals of the Institute of Statistical Mathematics 11, 195-210.
[10] Sklar, A. (1959). Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Statist. Univ. Paris 8, 229-231.