LSSG Green Belt Training
Descriptive Statistics

Describing Data: Summary Measures
• Measures of Central Location: Mean, Median, Mode
• Measures of Variation: Range, Variance and Standard Deviation

Mean
• It is the Arithmetic Average of data values.
  Sample Mean: \( \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \)
• The Most Common Measure of Central Tendency
• Affected by Extreme Values (Outliers)
[Figure: two dot plots on a number line. First data set (axis 0 to 10): Mean = 5. Second data set (axis extended to 14): Mean = 6.]

Median
• Important Measure of Central Tendency
• In an ordered array, the median is the "middle" number.
• If n is odd, the median is the middle number.
• If n is even, the median is the average of the 2 middle numbers.
• Not Affected by Extreme Values
[Figure: the same two dot plots. Median = 5 in both.]

Mode
• A Measure of Central Tendency
• Value that Occurs Most Often
• Not Affected by Extreme Values
• There May Not be a Mode
• There May be Several Modes
• Used for Either Numerical or Categorical Data
[Figure: two dot plots. One data set has Mode = 5; the other (axis 0 to 6) has No Mode.]

Measures of Variability
• Range and Inter Quartile Range
• Variance and Standard Deviation
• Coefficient of Variation

Range
• Measure of Variation
• Difference Between Largest & Smallest Observations:
  Range = Highest Value - Lowest Value
• Ignores How Data Are Distributed
[Figure: two differently distributed data sets on an axis from 7 to 12. Both have Range = 12 - 7 = 5.]

Inter Quartile Range
• Difference between the 75th percentile (3rd Quartile) and the 25th percentile (1st Quartile): IQR = Q3 - Q1
• Eliminates Effects of Outliers
• Captures how data are distributed around the median (2nd Quartile)
[Figure: number line marked Q1, Q2, Q3, with the IQR spanning Q1 to Q3.]

Variance
• Important Measure of Variation
• Shows Variation About the Mean
• For the Population: \( \sigma^2 = \frac{\sum (X_i - \mu)^2}{N} \)
• For the Sample: \( s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \)

Standard Deviation
• Most Important Measure of Variation
• Shows Variation About the Mean
• For the Population: \( \sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} \)
• For the Sample: \( s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \)
• For the Population: use N in the denominator.
• For the Sample: use n - 1 in the denominator.

Sample Standard Deviation
\( s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \)
For the Sample: use n - 1 in the denominator.
Data \( X_i \): 10 12 14 15 17 18 18 24
n = 8, Mean = 16
\( s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} \approx 4.31 \)
Sample Standard Deviation = 4.31

Comparing Standard Deviations
[Figure: three dot plots on an axis from 11 to 21, all with the same mean but different spread.]
• Data A: Mean = 15.5, s = 3.3
• Data B: Mean = 15.5, s = 0.92
• Data C: Mean = 15.5, s = 4.57

Coefficient of Variation
• Relative Variation (adjusted for the mean)
• Measured as a %
• Adjusts for differences in magnitude of data
• Comparison of variation across groups
\( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)

Comparing Coefficient of Variation
• Stock A: Average Price last year = $50, Standard Deviation = $5
• Stock B: Average Price last year = $100, Standard Deviation = $5
\( CV = \left( \frac{S}{\bar{X}} \right) \cdot 100\% \)
Coefficient of Variation:
• Stock A: CV = (5 / 50) · 100% = 10%
• Stock B: CV = (5 / 100) · 100% = 5%

Shape of Distribution
• Describes How Data Are Distributed
• Measures of Shape: Symmetric or skewed
• Left-Skewed: Mean < Median < Mode
• Symmetric: Mean = Median = Mode
• Right-Skewed: Mode < Median < Mean

Box Plots
• Captures Many Statistics in One Chart
[Figure: box plot labeled with Min, Q1, Median, Mean, Q3, Max.]
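
The summary measures in this deck can be checked with a few lines of code. What follows is a minimal sketch, not part of the original training material. It uses Python's standard `statistics` module on the eight-value data set from the Sample Standard Deviation slide and on the two stocks from the Coefficient of Variation slide. The function names `describe` and `coefficient_of_variation` are hypothetical, chosen only for illustration. Quartiles are computed with the module's default "exclusive" method, which is an assumption: the slides do not say which quartile convention they use, and other conventions can give slightly different values for Q1 and Q3.

```python
import statistics


def coefficient_of_variation(sd, mean):
    """Return the CV as a percentage: (S / X-bar) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * sd / mean


def describe(values):
    """Return the summary measures covered in the slides (hypothetical helper)."""
    ordered = sorted(values)
    # Assumption: the default "exclusive" quartile method of the statistics
    # module; other quartile conventions may yield different Q1 and Q3.
    q1, _q2, q3 = statistics.quantiles(ordered, n=4)
    mean = statistics.mean(ordered)
    sample_sd = statistics.stdev(ordered)  # n - 1 in the denominator
    return {
        "mean": mean,
        "median": statistics.median(ordered),
        # multimode returns every most-frequent value; if all values occur
        # equally often it returns all of them, i.e. there is no single mode.
        "modes": statistics.multimode(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "population_variance": statistics.pvariance(ordered),  # N in the denominator
        "sample_variance": statistics.variance(ordered),       # n - 1 in the denominator
        "population_sd": statistics.pstdev(ordered),
        "sample_sd": sample_sd,
        "cv_percent": coefficient_of_variation(sample_sd, mean),
    }


if __name__ == "__main__":
    data = [10, 12, 14, 15, 17, 18, 18, 24]  # data set from the slide
    summary = describe(data)
    for name, value in summary.items():
        print(f"{name}: {value}")

    # Stock comparison from the Coefficient of Variation slide.
    print("Stock A CV:", coefficient_of_variation(5, 50))   # 10.0
    print("Stock B CV:", coefficient_of_variation(5, 100))  # 5.0
```

With all eight values included in the sum of squared deviations (both 18s), the sample variance is 130/7 and the sample standard deviation is about 4.31. The population versions divide by N = 8 instead, so they come out slightly smaller.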