WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam

WFM 5201: Data Management and Statistical Analysis
Lecture-1: Descriptive Statistics [Measures of central tendency]
Akm Saiful Islam
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)
April, 2008

Descriptive Statistics
* Measures of Central Tendency
* Measures of Location
* Measures of Dispersion
* Measures of Symmetry
* Measures of Peakedness

Measures of Central Tendency
The central tendency is measured by averages. These describe the point about which the various observed values cluster. In mathematics, an average, or central tendency of a data set, refers to a measure of the "middle" or "expected" value of the data set.

The measures covered in this lecture are:
* Arithmetic Mean
* Geometric Mean
* Weighted Mean
* Harmonic Mean
* Median
* Mode

Arithmetic Mean
The arithmetic mean is the sum of a set of observations, positive, negative or zero, divided by the number of observations. If we have $n$ real numbers $x_1, x_2, x_3, \ldots, x_n$, their arithmetic mean, denoted by $\bar{x}$, can be expressed as:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Arithmetic Mean of Group Data
If $z_1, z_2, z_3, \ldots, z_k$ are the mid-values and $f_1, f_2, f_3, \ldots, f_k$ are the corresponding frequencies, where the subscript $k$ stands for the number of classes, then the mean is

$$\bar{z} = \frac{\sum_{i=1}^{k} f_i z_i}{\sum_{i=1}^{k} f_i}$$

Geometric Mean
The geometric mean is defined as the positive $n$-th root of the product of the $n$ observations.
Symbolically,

$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$

It is often used for a set of numbers whose values are meant to be multiplied together or are exponential in nature, such as data on the growth of the human population or interest rates of a financial investment.

Find the geometric mean of the rates of growth: 34, 27, 45, 55, 22, 34

Geometric Mean of Group Data
If the $n$ non-zero and positive variate values $x_1, x_2, \ldots, x_n$ occur $f_1, f_2, \ldots, f_n$ times, respectively, then the geometric mean of the set of observations is defined by:

$$G = \left(x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n}\right)^{1/N} = \left(\prod_{i=1}^{n} x_i^{f_i}\right)^{1/N}, \qquad \text{where } N = \sum_{i=1}^{n} f_i$$

Geometric Mean (Revised Eqn.)
Ungrouped data:

$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n} = \operatorname{AntiLog}\left(\frac{1}{n}\sum_{i=1}^{n} \operatorname{Log} x_i\right)$$

Grouped data:

$$G = \left(x_1^{f_1} x_2^{f_2} x_3^{f_3} \cdots x_n^{f_n}\right)^{1/N} = \operatorname{AntiLog}\left(\frac{1}{N}\sum_{i=1}^{n} f_i \operatorname{Log} x_i\right)$$

Harmonic Mean
The harmonic mean (formerly sometimes called the subcontrary mean) is one of several kinds of average. Typically, it is appropriate when the average of rates is desired. The harmonic mean is the number of observations divided by the sum of the reciprocals of the observations. It is useful for ratios such as speed (= distance/time).

Harmonic Mean of Group Data
The harmonic mean $H$ of the positive real numbers $x_1, x_2, \ldots, x_n$ is defined to be

Ungrouped data:

$$H = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$$

Grouped data, with $N = \sum_{i=1}^{n} f_i$:

$$H = \frac{N}{\sum_{i=1}^{n} \dfrac{f_i}{x_i}}$$

Exercise-1: Find the Arithmetic, Geometric and Harmonic Mean

| Class | Frequency (f) | Mid-value (x) | fx | f Log x (base 10) | f/x |
|---|---|---|---|---|---|
| 20-29 | 3 | 24.5 | 73.5 | 4.17 | 0.122 |
| 30-39 | 5 | 34.5 | 172.5 | 7.69 | 0.145 |
| 40-49 | 20 | 44.5 | 890 | 32.97 | 0.449 |
| 50-59 | 10 | 54.5 | 545 | 17.36 | 0.183 |
| 60-69 | 5 | 64.5 | 322.5 | 9.05 | 0.078 |
| Sum | N = 43 | | 2003.5 | 71.24 | 0.978 |

Weighted Mean
The weighted mean of the positive real numbers $x_1, x_2, \ldots, x_n$ with weights $w_1, w_2, \ldots, w_n$ is defined to be

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$

Median
The median is the middle value of the observations arranged in order of magnitude, such that the number of observations above it is equal to the number of observations below it.

If $n$ is odd:

$$M_e = X_{\frac{n+1}{2}}$$

If $n$ is even:

$$M_e = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right)$$

Median of Group Data

$$M_e = L_0 + \frac{h}{f_0}\left(\frac{N}{2} - F\right)$$

* $L_0$ = Lower class boundary of the median class
* $h$ = Width of the median class
* $f_0$ = Frequency of the median class
* $F$ = Cumulative frequency of the pre-median class
* $N$ = Total number of observations

Steps to find Median of group data
1. Compute the less-than type cumulative frequencies.
2. Determine $N/2$, one-half of the total number of cases.
3. Locate the median class: the first class whose cumulative frequency reaches or exceeds $N/2$.
4. Determine the lower class boundary of the median class. This is $L_0$.
5. Sum the frequencies of all classes prior to the median class. This is $F$.
6. Determine the frequency of the median class. This is $f_0$.
7. Determine the class width of the median class. This is $h$.

Example-3: Find Median

| Age in years | Number of births | Cumulative number of births |
|---|---|---|
| 14.5-19.5 | 677 | 677 |
| 19.5-24.5 | 1908 | 2585 |
| 24.5-29.5 | 1737 | 4322 |
| 29.5-34.5 | 1040 | 5362 |
| 34.5-39.5 | 294 | 5656 |
| 39.5-44.5 | 91 | 5747 |
| 44.5-49.5 | 16 | 5763 |
| All ages | 5763 | - |

Mode
The mode is the value of a distribution for which the frequency is maximum. In other words, the mode is the value of a variable that occurs with the highest frequency. So the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The mode is not necessarily unique.
The list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3.

Example-2: Find Mean, Median and Mode of Ungrouped Data
The weekly pocket money for 9 first-year pupils was found to be: 3, 12, 4, 6, 1, 4, 2, 5, 8

* Mean = 5
* Median = 4
* Mode = 4

Mode of Group Data

$$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times h$$

* $L_1$ = Lower boundary of the modal class
* $\Delta_1$ = Difference in frequency between the modal class and the class before it
* $\Delta_2$ = Difference in frequency between the modal class and the class after it
* $h$ = Class interval

Steps of Finding Mode
1. Find the modal class, which has the highest frequency.
2. Determine $L_1$, the lower class boundary of the modal class.
3. Determine $h$, the interval of the modal class.
4. Compute $\Delta_1$, the difference between the frequency of the modal class and that of the class before it.
5. Compute $\Delta_2$, the difference between the frequency of the modal class and that of the class after it.

Example-4: Find Mode

| Slope Angle (°) | Midpoint (x) | Frequency (f) | Midpoint × frequency (fx) |
|---|---|---|---|
| 0-4 | 2 | 6 | 12 |
| 5-9 | 7 | 12 | 84 |
| 10-14 | 12 | 7 | 84 |
| 15-19 | 17 | 5 | 85 |
| 20-24 | 22 | 0 | 0 |
| Total | | n = 30 | ∑(fx) = 265 |
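The grouped-data formulas for the mean, median and mode can be checked numerically. The following is a minimal Python sketch, not part of the original lecture. The function names `grouped_mean`, `grouped_median` and `grouped_mode` are hypothetical. For the slope-angle table it assumes the class limits 0-4, 5-9, ... correspond to class boundaries -0.5, 4.5, 9.5, ..., which the slides do not state explicitly. It also assumes all classes have equal width.

```python
def grouped_mean(midpoints, freqs):
    """Arithmetic mean of grouped data: sum(f*z) / sum(f)."""
    return sum(f * z for z, f in zip(midpoints, freqs)) / sum(freqs)


def grouped_median(lower_bounds, width, freqs):
    """Median of grouped data: L0 + (h / f0) * (N/2 - F)."""
    total = sum(freqs)
    if total == 0:
        raise ValueError("frequency table is empty")
    half = total / 2
    cumulative = 0  # F: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        # The median class is the first one whose cumulative
        # frequency reaches or exceeds N/2.
        if cumulative + f >= half:
            return lower + (width / f) * (half - cumulative)
        cumulative += f


def grouped_mode(lower_bounds, width, freqs):
    """Mode of grouped data: L1 + d1 / (d1 + d2) * h."""
    if sum(freqs) == 0:
        raise ValueError("frequency table is empty")
    m = max(range(len(freqs)), key=lambda i: freqs[i])  # modal class index
    before = freqs[m - 1] if m > 0 else 0
    after = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - before
    d2 = freqs[m] - after
    return lower_bounds[m] + d1 / (d1 + d2) * width


# Slope-angle data (Example-4); boundaries are an assumption, see above.
slope_lower = [-0.5, 4.5, 9.5, 14.5, 19.5]
slope_mid = [2, 7, 12, 17, 22]
slope_freq = [6, 12, 7, 5, 0]

# Births by age of mother (Example-3); boundaries are given in the table.
age_lower = [14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5]
births = [677, 1908, 1737, 1040, 294, 91, 16]

print(f"slope mean   = {grouped_mean(slope_mid, slope_freq):.2f}")       # 8.83
print(f"slope median = {grouped_median(slope_lower, 5, slope_freq):.2f}")  # 8.25
print(f"slope mode   = {grouped_mode(slope_lower, 5, slope_freq):.2f}")    # 7.23
print(f"median age   = {grouped_median(age_lower, 5, births):.2f}")        # 25.35
```

For the slope-angle data the modal class is 5-9, so the sketch uses a lower boundary of 4.5, a first difference of 12 - 6 = 6 and a second difference of 12 - 7 = 5, giving 4.5 + (6/11) × 5 ≈ 7.23. If the class limits are used directly as boundaries instead, the result shifts by 0.5.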