Business Statistics: A Decision-Making Approach, 7th Edition
Chapter 3: Describing Data Using Numerical Measures

Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc.

Chapter Goals

After completing this chapter, you should be able to:
- Compute and interpret the mean, median, and mode for a set of data
- Compute the range, variance, and standard deviation and know what these values mean
- Construct and interpret a box and whisker graph
- Compute and explain the coefficient of variation and z scores
- Use numerical measures along with graphs, charts, and tables to describe data

Chapter Topics

- Measures of Center and Location
  - Mean, median, mode
- Other Measures of Location
  - Weighted mean, percentiles, quartiles
- Measures of Variation
  - Range, interquartile range, variance and standard deviation, coefficient of variation
- Using the mean and standard deviation together
  - Coefficient of variation, z-scores

Summary Measures

Describing Data Numerically:
- Center and Location: Mean, Median, Mode
- Other Measures of Location: Weighted Mean, Percentiles, Quartiles
- Variation: Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation

Measures of Center and Location

Overview:
- Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ (sample), $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$ (population)
- Median
- Mode
- Weighted Mean: $\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i}$ (sample), $\mu_W = \frac{\sum w_i x_i}{\sum w_i}$ (population)

Mean (Arithmetic Average)

- The mean is the arithmetic average of data values
- Population mean (N = population size):
  $$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$
- Sample mean (n = sample size):
  $$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Written out term by term, the sample mean is $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$.

Mean (Arithmetic Average) (continued)

- The most common measure of central tendency
- Mean = sum of values divided by the number of values
- Affected by extreme values (outliers)

Example (values plotted on a 0 to 10 axis):
$$\text{Mean} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$$
$$\text{Mean} = \frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$$

Median

- In an ordered array, the median is the “middle” number, i.e., the number that splits the distribution in half
- The median is not affected by extreme values

[Figure: two dot plots on a 0 to 10 axis, each with Median = 3]

Median (continued)

- To find the median, sort the n data values from low to high (sorted data is called a data array)
- Find the value in the i = (1/2)n position
- The ith position is called the Median Index Point
- If i is not an integer, round up to the next highest integer

Median Example (continued)

Data array: 4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24
- Note that n = 13
- Find the i = (1/2)n position: i = (1/2)(13) = 6.5
- Since 6.5 is not an integer, round up to 7
- The median is the value in the 7th position: Md = 12

Shape of a Distribution

- Describes how data is distributed
- Symmetric or skewed
  - Left-Skewed: Mean < Median (longer tail extends to left)
  - Symmetric: Mean = Median
  - Right-Skewed: Median < Mean (longer tail extends to right)
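The mean and median rules above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original slides: the function names are made up for the example, and the even-n branch (averaging the two middle values) is an assumption based on the usual convention, since the slides only describe the round-up case.

```python
import math


def mean(values):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)


def median_index_point(values):
    """Median using the slides' index rule i = (1/2)n.

    If i is not an integer, round up and take the value in that position.
    Assumption (not on the slides): if i is an integer, average the values
    in positions i and i + 1.
    """
    data = sorted(values)            # the "data array"
    n = len(data)
    i = n / 2
    if i != int(i):
        return data[math.ceil(i) - 1]    # positions are 1-based
    i = int(i)
    return (data[i - 1] + data[i]) / 2


no_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]
data_array = [4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24]

print(mean(no_outlier), median_index_point(no_outlier))      # 3.0 3
print(mean(with_outlier), median_index_point(with_outlier))  # 4.0 3
print(median_index_point(data_array))                        # 12
```

The outlier 10 moves the mean from 3 to 4 while the median stays at 3, so Median < Mean, which matches the right-skewed case listed above.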
Mode

- A measure of location
- The value that occurs most often
- Not affected by extreme values
- Used for either numerical or categorical data
- There may be no mode
- There may be several modes

[Figure: two dot plots, one labeled Mode = 5 and one labeled No Mode]

Weighted Mean

- Used when values are grouped by frequency or relative importance

Example: Sample of 26 Repair Projects

  Days to Complete    Frequency
  5                   4
  6                   12
  7                   8
  8                   2

Weighted Mean Days to Complete:
$$\bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \cdot 5) + (12 \cdot 6) + (8 \cdot 7) + (2 \cdot 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days}$$

Review Example

- Five houses on a hill by the beach

House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000

Summary Statistics

House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000 (Sum: $3,000,000)
- Mean: ($3,000,000/5) = $600,000
- Median: middle value of ranked data = $300,000
- Mode: most frequent value = $100,000

Which measure of location is the “best”?

- Mean is generally used, unless extreme values (outliers) exist
- Then Median is often used, since the median is not sensitive to extreme values.
  - Example: Median home prices may be reported for a region, because they are less sensitive to outliers

Other Location Measures

Percentiles: the pth percentile in a data array (where 0 ≤ p ≤ 100):
- p% are less than or equal to this value
- (100 – p)% are greater than or equal to this value
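Looking back at the mode, weighted mean, and house-price slides above, the arithmetic can be checked with a short Python sketch. The function names are hypothetical, and the "no mode" rule used here (no value occurs more than once) is an assumption made for illustration; the slides only say that a data set may have no mode.

```python
from collections import Counter


def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights w_i."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def modes(values):
    """Return every most-frequent value.

    Assumption: if no value occurs more than once, report no mode (empty list).
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return [value for value, count in counts.items() if count == top]


# Repair-project example: days to complete, weighted by frequency.
days = [5, 6, 7, 8]
frequencies = [4, 12, 8, 2]
print(round(weighted_mean(days, frequencies), 2))   # 6.31

# House-price review example.
house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]
print(sum(house_prices) / len(house_prices))        # 600000.0
print(sorted(house_prices)[len(house_prices) // 2]) # 300000 (middle of 5 ranked values)
print(modes(house_prices))                          # [100000]
```

The single $2,000,000 house pulls the mean to twice the median, which is why the median is often the preferred summary for prices.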
Quartiles:
- 1st quartile = 25th percentile
- 2nd quartile = 50th percentile = median
- 3rd quartile = 75th percentile

Percentiles

- The pth percentile in an ordered array of n values is the value in the ith position, where
  $$i = \frac{p}{100}(n)$$
- If i is not an integer, round up to the next higher integer value

Example: Find the 60th percentile in an ordered array of 19 values.
$$i = \frac{p}{100}(n) = \frac{60}{100}(19) = 11.4$$
So use the value in the i = 12th position.

Quartiles

- Quartiles split the ranked data into 4 equal groups:
  25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
- Note that the second quartile (the 50th percentile) is the median

Quartiles

- Example: Find the first quartile

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22 (n = 9)

Q1 = 25th percentile, so find i:
$$i = \frac{25}{100}(9) = 2.25$$
so round up and use the value in the 3rd position: Q1 = 13

Box and Whisker Plot

- A graphical display of data using a central “box” and extended “whiskers”

[Figure: box and whisker plot with labels Outliers (*), Lower Limit, 1st Quartile, Median, 3rd Quartile, Upper Limit; each of the four segments holds 25% of the data]

Constructing the Box and Whisker Plot

- The lower limit is Q1 – 1.5 (Q3 – Q1)
- The upper limit is Q3 + 1.5 (Q3 – Q1)
- The center box extends from Q1 to Q3
- The line within the box is the median
- The whiskers extend to the smallest and largest values within the calculated limits
- Outliers are plotted outside the calculated limits
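The percentile position rule and the box-and-whisker limits above can be sketched as follows. This is a minimal illustration with hypothetical function names. One assumption is labeled in the code: the slides do not say what to do when i is already an integer, so the sketch simply uses the value in position i.

```python
import math


def percentile_position(p, n):
    """Position i = (p/100) * n, rounded up when it is not an integer."""
    return max(1, math.ceil(p * n / 100))


def percentile_value(ordered, p):
    """Value in the i-th position (1-based) of an ordered array.

    Assumption: when i is an integer, use the value in position i as-is.
    """
    return ordered[percentile_position(p, len(ordered)) - 1]


def box_plot_limits(q1, q3):
    """Lower and upper limits: Q1 - 1.5(Q3 - Q1) and Q3 + 1.5(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


print(percentile_position(60, 19))   # 12

ordered = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1 = percentile_value(ordered, 25)   # 13 (3rd position)
q3 = percentile_value(ordered, 75)   # 18 (7th position)
print(q1, q3)
print(box_plot_limits(q1, q3))       # (5.5, 25.5)
```

Every value in this sample lies between 5.5 and 25.5, so a box and whisker plot of it would show no outliers.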
Shape of Box and Whisker Plots

- The Box and central line are centered between the endpoints if data is symmetric around the median
- (A Box and Whisker plot can be shown in either vertical or horizontal format)

Distribution Shape and Box and Whisker Plot

[Figure: three box and whisker plots labeled Left-Skewed, Symmetric, and Right-Skewed, each marking Q1, Q2, Q3]

Box-and-Whisker Plot Example

- Below is a Box-and-Whisker plot for the following data:

  0 2 2 2 3 3 4 5 6 11 27

  Min = 0, Q1 = 2, Q2 = 3, Q3 = 6, Max = 27

[Figure: box and whisker plot with axis marks at 0, 2, 3, 6, 12 and an outlier (*) at 27]

- Upper limit = Q3 + 1.5 (Q3 – Q1) = 6 + 1.5 (6 – 2) = 12
- 27 is above the upper limit so is shown as an outlier
- This data is right skewed, as the plot depicts

Measures of Variation

- Range
- Interquartile Range
- Variance
  - Population Variance
  - Sample Variance
- Standard Deviation
  - Population Standard Deviation
  - Sample Standard Deviation
- Coefficient of Variation

Variation

- Measures of variation give information on the spread or variability of the data values.

[Figure: distributions with the same center but different variation]

Range

- Simplest measure of variation
- Difference between the largest and the smallest observations:
  Range = x_maximum – x_minimum

Example (dot plot on a 0 to 14 axis): Range = 14 - 1 = 13
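The box-and-whisker example and the range definition above can be reproduced with a short Python sketch. The helper name `value_at_percentile` is hypothetical; it applies the round-up position rule used earlier in the chapter, and the handling of an integer position is an assumption.

```python
import math


def value_at_percentile(ordered, p):
    """Value in position i = (p/100) * n, rounding i up if not an integer.

    Assumption: an integer i is used as-is.
    """
    i = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[i - 1]


data = sorted([0, 2, 2, 2, 3, 3, 4, 5, 6, 11, 27])

q1, q2, q3 = (value_at_percentile(data, p) for p in (25, 50, 75))
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Outliers fall outside the limits; whiskers stop at the most extreme
# values that are still inside them.
outliers = [x for x in data if x < lower_limit or x > upper_limit]
inside = [x for x in data if lower_limit <= x <= upper_limit]
whiskers = (min(inside), max(inside))

data_range = max(data) - min(data)

print(q1, q2, q3)                  # 2 3 6
print(lower_limit, upper_limit)    # -4.0 12.0
print(outliers, whiskers)          # [27] (0, 11)
print(data_range)                  # 27
```

The range of 27 is driven entirely by the single outlier, while the whiskers stop at 11.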
Disadvantages of the Range

- Ignores the way in which data are distributed

  [Figure: two dot plots on a 7 to 12 axis with different distributions, each with Range = 12 - 7 = 5]

- Sensitive to outliers

  1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
  Range = 5 - 1 = 4

  1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
  Range = 120 - 1 = 119

Interquartile Range

- Can eliminate some outlier problems by using the interquartile range
- Eliminate some high- and low-valued observations and calculate the range from the remaining values.
- Interquartile range = 3rd quartile – 1st quartile

Interquartile Range Example

  X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
  (each of the four segments holds 25% of the data)

Interquartile range = 57 – 30 = 27

Variance

- Average of squared deviations of values from the mean
- Population variance:
  $$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
- Sample variance:
  $$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Standard Deviation

- Most commonly used measure of variation
- Shows variation about the mean
- Has the same units as the original data
- Population standard deviation:
  $$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
- Sample standard deviation:
  $$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Calculation Example: Sample Standard Deviation

Sample Data (x_i): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16

$$s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}}$$
$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} = 4.3095$$
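The variance and standard deviation formulas above translate directly into code. The sketch below uses hypothetical function names and reproduces the worked example. The cross-check against Python's standard `statistics` module is an addition for illustration, not something the slides use.

```python
import math
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


sample = [10, 12, 14, 15, 17, 18, 18, 24]

x_bar = sum(sample) / len(sample)
squared_deviations = sum((x - x_bar) ** 2 for x in sample)
s = math.sqrt(sample_variance(sample))

print(x_bar)                # 16.0
print(squared_deviations)   # 130.0
print(round(s, 4))          # 4.3095

# Cross-check with the standard library's sample standard deviation.
print(math.isclose(s, statistics.stdev(sample)))   # True
```

Dividing by n - 1 rather than N is the only difference between the sample and population versions, so for the same data the sample variance is always the larger of the two.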
Comparing Standard Deviations

Same mean, but different standard deviations:

[Figure: three dot plots on an 11 to 21 axis]
- Data A: Mean = 15.5, s = 3.338
- Data B: Mean = 15.5, s = 0.9258
- Data C: Mean = 15.5, s = 4.57

Coefficient of Variation

- Measures relative variation
- Always in percentage (%)
- Shows variation relative to mean
- Is used to compare two or more sets of data measured in different units
- Population:
  $$CV = \left(\frac{\sigma}{\mu}\right) \cdot 100\%$$
- Sample:
  $$CV = \left(\frac{s}{\bar{x}}\right) \cdot 100\%$$

Comparing Coefficients of Variation

- Stock A:
  - Average price last year = $50
  - Standard deviation = $5
  $$CV_A = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$
- Stock B:
  - Average price last year = $100
  - Standard deviation = $5
  $$CV_B = \left(\frac{s}{\bar{x}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$
- Both stocks have the same standard deviation, but stock B is less variable relative to its price

The Empirical Rule

If the data distribution is bell-shaped, then the interval:
- μ ± 1σ contains about 68% of the values in the population or the sample
- μ ± 2σ contains about 95% of the values in the population or the sample
- μ ± 3σ contains about 99.7% of the values in the population or the sample

Tchebysheff’s Theorem

- Regardless of how the data are distributed, at least (1 - 1/k²) of the values will fall within k standard deviations of the mean
- Examples:

  At least               within
  (1 - 1/1²) = 0%        k = 1 (μ ± 1σ)
  (1 - 1/2²) = 75%       k = 2 (μ ± 2σ)
  (1 - 1/3²) = 89%       k = 3 (μ ± 3σ)
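The coefficient of variation and the Tchebysheff bound above are both one-line formulas, sketched here in Python with hypothetical function names. The dictionary of empirical-rule percentages simply restates the approximate figures from the slides so the two rules can be printed side by side.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


def tchebysheff_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    return 1 - 1 / k ** 2


cv_a = coefficient_of_variation(5, 50)    # Stock A
cv_b = coefficient_of_variation(5, 100)   # Stock B
print(cv_a, cv_b)                         # 10.0 5.0

# Approximate empirical-rule coverage for bell-shaped data, from the slides.
empirical_rule = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (1, 2, 3):
    bound = tchebysheff_lower_bound(k)
    print(f"k={k}: at least {bound:.0%} (any shape), "
          f"about {empirical_rule[k]:.1%} (bell-shaped)")
```

The Tchebysheff figures are guaranteed minimums for any distribution, which is why they sit well below the empirical-rule percentages that apply only to bell-shaped data.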
Standardized Data Values

- A standardized data value refers to the number of standard deviations a value is from the mean
- Standardized data values are sometimes referred to as z-scores

Standardized Population Values

$$z = \frac{x - \mu}{\sigma}$$

where:
- x = original data value
- μ = population mean
- σ = population standard deviation
- z = standard score (number of standard deviations x is from μ)

Standardized Sample Values

$$z = \frac{x - \bar{x}}{s}$$

where:
- x = original data value
- $\bar{x}$ = sample mean
- s = sample standard deviation
- z = standard score (number of standard deviations x is from $\bar{x}$)

Standardized Value Example

- IQ scores in a large population have a bell-shaped distribution with mean μ = 100 and standard deviation σ = 15. Find the standardized score (z-score) for a person with an IQ of 121.

Answer:
$$z = \frac{x - \mu}{\sigma} = \frac{121 - 100}{15} = 1.4$$

Someone with an IQ of 121 is 1.4 standard deviations above the mean.

Using Microsoft Excel

- Descriptive Statistics are easy to obtain from Microsoft Excel
- Use menu choice: Data / data analysis / descriptive statistics
- Enter details in dialog box

Using Excel

- Select: Data / data analysis / descriptive statistics

Using Excel (continued)

- Enter dialog box details
- Check box for summary statistics
- Click OK
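For readers working outside Excel, a rough counterpart can be sketched with Python's standard `statistics` module. It is not a reproduction of Excel's output. It uses the house price data from the review example and also applies the z-score formula from the standardized-value slides. The `z_score` helper and the choice of which statistics to report are assumptions made for illustration.

```python
import statistics

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

summary = {
    "mean": statistics.mean(house_prices),
    "median": statistics.median(house_prices),
    "mode": statistics.mode(house_prices),
    "sample std dev": statistics.stdev(house_prices),   # divides by n - 1
    "range": max(house_prices) - min(house_prices),
}

for name, value in summary.items():
    print(f"{name}: {value:,.0f}")


def z_score(x, center, spread):
    """Number of standard deviations that x lies from the center."""
    return (x - center) / spread


# Population example from the slides: IQ of 121 with mu = 100, sigma = 15.
print(round(z_score(121, 100, 15), 2))   # 1.4

# Sample example: the most expensive house relative to the sample mean and s.
z_top = z_score(max(house_prices), summary["mean"], summary["sample std dev"])
print(round(z_top, 2))                   # 1.75
```

With a mean of $600,000 and a sample standard deviation of $800,000, the $2,000,000 house sits 1.75 standard deviations above the mean.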
Excel output

Microsoft Excel descriptive statistics output, using the house price data:

House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000

Chapter Summary

- Described measures of center and location
  - Mean, median, mode, weighted mean
- Discussed percentiles and quartiles
- Created Box and Whisker Plots
- Illustrated distribution shapes
  - Symmetric, skewed

Chapter Summary (continued)

- Described measures of variation
  - Range, interquartile range, variance, standard deviation, coefficient of variation
- Discussed Tchebysheff’s Theorem
- Calculated standardized data values