Range and Percentile (2)

Summary of Prev. Lecture

Central Tendency
- Mode: highest frequency; used with nominal or categorical data
- Median: middle value, which avoids the influence of outliers
- Mean
  - Arithmetic Mean: First and Second Moment
  - Geometric Mean
  - Weighted Mean

Distribution Descriptor 2
Geography, Jinmu Choi

1. Measure of Dispersion (2)
2. Range and Percentile (2)
3. Mean Deviation, Variance, Std. Dev. (3)
4. Weighted Var. and Std. Dev., CV (3)
5. Skewness and Kurtosis (2)
Summary and Next…

Dispersion

- Dispersion: how the values are concentrated or scattered around the mean and along the value line
  - Very similar to the mean
  - Quite different from the mean
  - Just scattered around
- Xa: 1, 3, 5, 7, 9, 11, 13: Mean = 7, Range = 12
- Xb: -11, -5, 1, 7, 13, 19, 25: Mean = 7, Range = 36

Dispersion Measures

- Magnitude of dispersion
  - Range: Maximum - Minimum
  - Percentiles
  - Mean deviations
  - Standard deviations
- Direction and Sharpness
  - Skewness
  - Kurtosis

Range

- Range: Maximum - Minimum
- The greater the range in a data series, the more dispersed the data are
- The range shows only how far the values are scattered, not how they are distributed within that span
- Xb: -11, -5, 1, 7, 13, 19, 25: Mean = 7, Range = 36
- Xc: -11, -10, 6, 7, 8, 24, 25: Mean = 7, Range = 36

Percentiles

- Milestones within the range of data
- Sort the observations, then count ¼, ½, and ¾ of the total observations from the minimum
- Median = ½ from the minimum = 50th percentile
- Xb: -11, -5, 1, 7, 13, 19, 25: Mean = 7, Range = 36, Median (50th percentile) = 7
- Xc: -11, -10, 6, 7, 8, 24, 25: Mean = 7, Range = 36, Median (50th percentile) = 7

Mean Deviation

- Dispersion using all values
- The average absolute difference between each value and the mean

$$D = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$

- Xa: 1, 3, 5, 7, 9, 11, 13: Mean Dev. = 3.4286
- Xb: -11, -5, 1, 7, 13, 19, 25: Mean Dev. = 10.2857
- Concerns only the distance of the values from the mean, not the direction
- 1 2 3 4 5 6 7 8 9: Mean = 5, Mean Dev. = 2.22…
- 1 2 3 4 5 6 7 8 18: Mean = 6, Mean Dev. = 3.33…

Variance

- Squared difference from the mean
- Population variance

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n} = \frac{\sum_{i=1}^{n} x_i^2}{n} - \mu^2$$

- Sample variance

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)}$$

Standard Deviation

- Square root of the averaged squared deviation (the variance)
- Returns to the magnitude or scale of the original dataset
- Example: Mean: 201.23, Var.: 88432.30, Std. Dev.: 297.38

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}} \qquad S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

- For data resembling a normal distribution, the standard deviation gives:
  - About 68% of the data values lie between $\bar{x} - \sigma$ and $\bar{x} + \sigma$
  - About 95% of the data values lie between $\bar{x} - 2\sigma$ and $\bar{x} + 2\sigma$
  - About 99.7% of the data values lie between $\bar{x} - 3\sigma$ and $\bar{x} + 3\sigma$
- 1 2 3 4 5 6 7 8 9: Mean = 5, Std. Dev. = 2.58…
- 1 2 3 4 5 6 7 8 18: Mean = 6, Std. Dev. = 4.76…

Weighted Variance

- Variance for grouped data

$$\sigma_w^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x}_w)^2}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{k} f_i x_i^2}{\sum_{i=1}^{k} f_i} - \bar{x}_w^2$$

  where $k$ is the number of groups, $f_i$ is the frequency of group $i$, $x_i$ is its mid value, and $\bar{x}_w$ is the weighted mean.

- Procedure
  1. Get the range for each group (class)
  2. Get the mid value for each group (class)
  3. Put the mid value in place of each observation
  4. Calculate the variance using the list of mid values

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n - 1)}$$

  Range       Mid value   Observations
  4~50        27          10
  50~200      125         3
  200~1000    600         3

Weighted Standard Deviation

- Square root of the weighted variance

$$\sigma_w = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \bar{x}_w)^2}{\sum_{i=1}^{k} f_i}} = \sqrt{\frac{\sum_{i=1}^{k} f_i x_i^2}{\sum_{i=1}^{k} f_i} - \bar{x}_w^2}$$

- Unweighted vs. weighted statistics
  - Unweighted variance: 88432.30
  - Unweighted std. dev.: 297.38
  - Weighted variance: 1537.7615
  - Weighted std. dev.: 39.21
- Why do they differ? The variation within each group has been removed

Coefficient of Variation

- Problem with the mean, variance, and standard deviation: they are sensitive to scale
  - X: 1 3 5 7 9 11 13: mean 7, std. dev.: 4
  - Y: 10 30 50 70 90 110 130: mean 70, std. dev.: 40
- Coefficient of variation: checks whether two datasets differ only in scale

$$CV = \frac{\sigma}{\mu} \qquad CV = \frac{S}{\bar{x}}$$

  For X and Y above, CV = 4/7 = 40/70 ≈ 0.571, so the two datasets differ only in scale.

- Mean: the center of the data
- Standard deviation: how much dispersion the data have
- Both (CV): difference in magnitude, for comparing multiple datasets

Skewness

- Third moment statistic: directional bias of the distribution of the data

$$Sk = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n \sigma^3}$$

- Use a frequency distribution (histogram)
  - X axis: numerical range
  - Y axis: frequency
- Positive skewness: Bulk < Mean (the bulk of the values lies below the mean)
- Negative skewness: Mean < Bulk (the bulk of the values lies above the mean)

Kurtosis

- Fourth moment statistic: sharpness of the distribution of the data

$$K = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n \sigma^4} - 3$$

- Use a histogram
  - X axis: numerical range
  - Y axis: frequency
- The fourth moment ratio of a normal distribution is 3, so subtracting 3 gives:
  - Normal distribution: K = 0
  - High kurtosis (sharp peak): K > 0
  - Low kurtosis (flat): K < 0

Summary

- Dispersion
  - Range: gives the boundary
  - Percentile: gives the clustering of observations
  - Mean Deviation: magnitude of dispersion
  - Variance and Standard Deviation: magnitude of dispersion
  - Weighted Variance and Standard Deviation: dispersion of grouped values
  - Coefficient of Variation: removes scale differences
- Direction and Sharpness
  - Skewness: direction from the mean
  - Kurtosis: sharpness compared to the normal distribution

Next

- Lab3: Additional Statistics and MAUP
- Lecture 4: Relationship Descriptor
  1. Correlation Analysis (Ch 3, pp.94-107)
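The dispersion measures in this lecture can be checked numerically. The following is a minimal Python sketch, not part of the original slides. It assumes the population forms (dividing by n), which match the slide values such as Std. Dev. = 4 for X. All function and variable names are illustrative. Quartile values depend on the interpolation convention, so only the median is treated as fixed here. The grouped-data call uses the mid values and counts from the weighted variance table, which is not the dataset behind the 1537.7615 figure.

```python
import math
import statistics

# Example series from the slides.
XA = [1, 3, 5, 7, 9, 11, 13]
XB = [-11, -5, 1, 7, 13, 19, 25]
OUTLIER = [1, 2, 3, 4, 5, 6, 7, 8, 18]


def mean(xs):
    return sum(xs) / len(xs)


def value_range(xs):
    """Range: maximum minus minimum."""
    return max(xs) - min(xs)


def mean_deviation(xs):
    """Average absolute distance of the values from their mean."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


def central_moment(xs, order):
    """Average of (x - mean) raised to the given power."""
    m = mean(xs)
    return sum((x - m) ** order for x in xs) / len(xs)


def pop_variance(xs):
    return central_moment(xs, 2)


def pop_std(xs):
    return math.sqrt(pop_variance(xs))


def coeff_variation(xs):
    """Standard deviation divided by the mean (scale-free)."""
    return pop_std(xs) / mean(xs)


def skewness(xs):
    """Third moment statistic; 0 for a symmetric series."""
    return central_moment(xs, 3) / pop_std(xs) ** 3


def kurtosis(xs):
    """Fourth moment statistic minus 3, so a normal distribution gives 0."""
    return central_moment(xs, 4) / pop_std(xs) ** 4 - 3


def weighted_variance(mids, freqs):
    """Variance of grouped data from class mid values and frequencies."""
    total = sum(freqs)
    xw = sum(f * x for x, f in zip(mids, freqs)) / total
    return sum(f * (x - xw) ** 2 for x, f in zip(mids, freqs)) / total


for name, xs in [("Xa", XA), ("Xb", XB), ("outlier", OUTLIER)]:
    print(name, "range:", value_range(xs),
          "mean dev.:", round(mean_deviation(xs), 4),
          "std. dev.:", round(pop_std(xs), 4),
          "median:", statistics.median(xs))

# Quartiles under Python's default ("exclusive") convention; other
# conventions can give different cut points for small samples.
print("Xb quartiles:", statistics.quantiles(XB, n=4))

# Grouped data from the table: mid values 27, 125, 600 with 10, 3, 3 observations.
print("weighted variance:", weighted_variance([27, 125, 600], [10, 3, 3]))
```

Multiplying every value of Xa by 10 leaves `coeff_variation` unchanged, which is the scale-free property the Coefficient of Variation slide describes.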
