Chapter 3 – Section 2
Measures of Dispersion
Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 3 Section 2 – Slide 1 of 27
Chapter 3 – Section 2
● Learning objectives
1 The range of a variable
2 The variance of a variable
3 The standard deviation of a variable
4 Use the Empirical Rule
5 Use Chebyshev’s inequality
Chapter 3 – Section 2
● Comparing two sets of data
● The measures of central tendency (mean,
median, mode) compare the “average” or
“typical” values of two sets of data
● The measures of dispersion in this section
compare how far “spread out” the data values
are
Chapter 3 – Section 2
● The range of a variable is the largest data value
minus the smallest data value
● Compute the range of
6, 1, 2, 6, 11, 7, 3, 3
● The largest value is 11
● The smallest value is 1
● Subtracting the two … 11 – 1 = 10 … the range
is 10
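As a quick check, the computation above can be sketched in Python. The variable names are illustrative and do not come from the slides.

```python
# Data set from the slide
data = [6, 1, 2, 6, 11, 7, 3, 3]

largest = max(data)     # 11
smallest = min(data)    # 1
data_range = largest - smallest

print(data_range)  # prints 10
```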
Chapter 3 – Section 2
● The range only uses two values in the data set –
the largest value and the smallest value
● The range is not resistant: a single extreme
value can change it drastically
● Suppose we made a mistake and
6, 1, 2
was recorded as
6000, 1, 2
● The range is now ( 6000 – 1 ) = 5999
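The effect of a single recording error can be illustrated with a short sketch. It assumes the error affects only the first value of the eight-value data set used earlier and that the other values are unchanged; `value_range` is a hypothetical helper name.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

correct = [6, 1, 2, 6, 11, 7, 3, 3]
mistyped = [6000, 1, 2, 6, 11, 7, 3, 3]  # assumption: only the first value was mistyped

print(value_range(correct))   # prints 10
print(value_range(mistyped))  # prints 5999
```

One bad entry out of eight moves the range from 10 to 5999, which is what "not resistant" means here.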
Chapter 3 – Section 2
● The variance is based on the deviation from the
mean
 $(x_i - \mu)$ for populations
 $(x_i - \bar{x})$ for samples
● To treat positive and negative differences
alike, we square the deviations
 $(x_i - \mu)^2$ for populations
 $(x_i - \bar{x})^2$ for samples
Chapter 3 – Section 2
● The population variance of a variable is the sum
of these squared deviations divided by the
number in the population
$$\sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \dots + (x_N - \mu)^2}{N} = \frac{\sum (x_i - \mu)^2}{N}$$
● The population variance is represented by $\sigma^2$
● Note: For accuracy, use as many decimal places
as allowed by your calculator
Chapter 3 – Section 2
● Compute the population variance of
6, 1, 2, 11
● Compute the population mean first
μ = (6 + 1 + 2 + 11) / 4 = 5
● Now compute the squared deviations
$(1-5)^2 = 16$, $(2-5)^2 = 9$, $(6-5)^2 = 1$, $(11-5)^2 = 36$
● Average the squared deviations
(16 + 9 + 1 + 36) / 4 = 15.5
● The population variance $\sigma^2$ is 15.5
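The same steps can be sketched in Python. The function name `population_variance` is an illustrative choice; the cross-check uses `statistics.pvariance` from the standard library.

```python
import statistics

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n                            # population mean
    squared_devs = [(x - mu) ** 2 for x in values]  # squared deviations
    return sum(squared_devs) / n

population = [6, 1, 2, 11]
sigma2 = population_variance(population)
print(sigma2)                             # prints 15.5
print(statistics.pvariance(population))   # standard-library cross-check
```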
Chapter 3 – Section 2
● The sample variance of a variable is the sum of
these squared deviations divided by one less
than the number in the sample
$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1} = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
● The sample variance is represented by $s^2$
● We say that this statistic has n – 1 degrees of
freedom
Chapter 3 – Section 2
● Compute the sample variance of
6, 1, 2, 11
● Compute the sample mean first
$\bar{x} = (6 + 1 + 2 + 11) / 4 = 5$
● Now compute the squared deviations
$(1-5)^2 = 16$, $(2-5)^2 = 9$, $(6-5)^2 = 1$, $(11-5)^2 = 36$
● Divide the sum of the squared deviations by n – 1 = 3
(16 + 9 + 1 + 36) / 3 ≈ 20.7
● The sample variance $s^2$ is approximately 20.7
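A matching sketch for the sample case follows. Only the denominator changes, from the number of values to one less than that number. The function name `sample_variance` is illustrative; `statistics.variance` is the standard-library equivalent.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n                            # sample mean
    squared_devs = [(x - xbar) ** 2 for x in values]  # squared deviations
    return sum(squared_devs) / (n - 1)                # n - 1 degrees of freedom

sample = [6, 1, 2, 11]
s2 = sample_variance(sample)
print(round(s2, 1))                            # prints 20.7
print(round(statistics.variance(sample), 1))   # standard-library cross-check
```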
Chapter 3 – Section 2
● Why are the population variance (15.5) and the
sample variance (20.7) different for the same set
of numbers?
● In the first case, { 6, 1, 2, 11 } was the entire
population (divide by N)
● In the second case, { 6, 1, 2, 11 } was just a
sample from the population (divide by n – 1)
● These are two different situations
Chapter 3 – Section 2
● Why do we use different formulas?
● The reason is that the sample mean is computed
from the same data, so the deviations from it tend
to be smaller than the deviations from the
population mean
● If we used “n” in the denominator for the sample
variance calculation, we would get a “biased”
result
● Bias here means that we would tend to
underestimate the true variance
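The bias can be illustrated exactly on a tiny case. The sketch below is not from the slides: it treats { 6, 1, 2, 11 } as the whole population, lists every ordered sample of size 2 drawn with replacement (an assumed sampling scheme), and averages the two candidate formulas over all 16 samples.

```python
from itertools import product

population = [6, 1, 2, 11]
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # true variance, 15.5

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))  # all 16 ordered samples, with replacement

avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(sigma2)             # prints 15.5
print(avg_div_n)          # prints 7.75, too small on average
print(avg_div_n_minus_1)  # prints 15.5, matches the true variance on average
```

Dividing by n gives an average of 7.75, well below the true 15.5, while dividing by n – 1 averages to exactly 15.5 in this setting.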
Chapter 3 – Section 2
● The standard deviation is the square root of the
variance
● The population standard deviation
 Is the square root of the population variance ($\sigma^2$)
 Is represented by σ
● The sample standard deviation
 Is the square root of the sample variance ($s^2$)
 Is represented by s
Chapter 3 – Section 2
● If the population is { 6, 1, 2, 11 }
 The population variance $\sigma^2 = 15.5$
 The population standard deviation $\sigma = \sqrt{15.5} \approx 3.9$
● If the sample is { 6, 1, 2, 11 }
 The sample variance $s^2 \approx 20.7$
 The sample standard deviation $s = \sqrt{20.7} \approx 4.5$
● The population standard deviation and the
sample standard deviation apply in different
situations
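Both standard deviations can be checked with Python's standard library. This is a minimal sketch: `statistics.pstdev` divides by N and `statistics.stdev` divides by n – 1, and the variable names are illustrative.

```python
import math
import statistics

data = [6, 1, 2, 11]

# Treating the data as an entire population (divide by N)
sigma = math.sqrt(statistics.pvariance(data))
# Treating the data as a sample (divide by n - 1)
s = math.sqrt(statistics.variance(data))

print(round(sigma, 1))  # prints 3.9
print(round(s, 1))      # prints 4.5

# The library also provides the square roots directly
print(round(statistics.pstdev(data), 1), round(statistics.stdev(data), 1))
```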
Chapter 3 – Section 2
● The standard deviation is very useful for
estimating probabilities
Chapter 3 – Section 2
● The empirical rule
● If the distribution is roughly bell shaped, then
 Approximately 68% of the data will lie within 1
standard deviation of the mean
 Approximately 95% of the data will lie within 2
standard deviations of the mean
 Approximately 99.7% of the data (i.e. almost all) will
lie within 3 standard deviations of the mean
Chapter 3 – Section 2
● For a variable with mean 17 and standard
deviation 3.4
 Approximately 68% of the values will lie between
(17 – 3.4) and (17 + 3.4), i.e. 13.6 and 20.4
 Approximately 95% of the values will lie between
(17 – 2 × 3.4) and (17 + 2 × 3.4), i.e. 10.2 and 23.8
 Approximately 99.7% of the values will lie between
(17 – 3 × 3.4) and (17 + 3 × 3.4), i.e. 6.8 and 27.2
● A value of 2.1 (less than 6.8) and a value of 33.2
(greater than 27.2) would both be very unusual
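The interval arithmetic above can be sketched as follows. `empirical_rule_intervals` is a hypothetical helper name, and rounding to one decimal place is an assumption that suits this example.

```python
def empirical_rule_intervals(mean, sd):
    """Map each approximate coverage percentage to its (low, high) interval."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}  # k standard deviations -> approx. percent
    return {
        pct: (round(mean - k * sd, 1), round(mean + k * sd, 1))
        for k, pct in coverage.items()
    }

intervals = empirical_rule_intervals(17, 3.4)
for pct, (low, high) in intervals.items():
    print(f"about {pct}% of values between {low} and {high}")

# Values outside the 3-standard-deviation interval are flagged as unusual
low3, high3 = intervals[99.7]
for value in (2.1, 33.2):
    print(value, "unusual" if value < low3 or value > high3 else "not unusual")
```

The percentages apply only when the distribution is roughly bell shaped, as the empirical rule requires.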
Chapter 3 – Section 2
● Chebyshev’s inequality gives a lower bound on
the percentage of observations that lie within k
standard deviations of the mean (where k > 1)
● This lower bound is
 A guaranteed minimum, not an exact percentage
 The actual percentage for any variable cannot be
lower than this number
● Therefore the actual percentage must be this
value or higher
Chapter 3 – Section 2
● Chebyshev’s inequality
● For any data set, at least
$$\left(1 - \frac{1}{k^2}\right) \cdot 100\%$$
of the observations will lie within k standard
deviations of the mean, where k is any number
greater than 1
Chapter 3 – Section 2
● How much of the data lies within 1.5 standard
deviations of the mean?
● From Chebyshev’s inequality
$$\left(1 - \frac{1}{1.5^2}\right) \cdot 100\% \approx 55.6\%$$
so that at least 55.6% of the data will lie within
1.5 standard deviations of the mean
Chapter 3 – Section 2
● If the mean is equal to 20 and the standard
deviation is equal to 4, how much of the data lies
between 14 and 26?
● 14 and 26 are each 6 / 4 = 1.5 standard deviations from 20
$$\left(1 - \frac{1}{1.5^2}\right) \cdot 100\% \approx 55.6\%$$
so that at least 55.6% of the data will lie between
14 and 26
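The two Chebyshev calculations can be reproduced with a short sketch. `chebyshev_lower_bound` is an illustrative name, and the code assumes the interval is symmetric about the mean, as it is in this example.

```python
def chebyshev_lower_bound(k):
    """Minimum percentage of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return (1 - 1 / k**2) * 100

mean, sd = 20, 4
lower, upper = 14, 26

# Both endpoints are the same distance from the mean, so one k covers the interval
k = (upper - mean) / sd
bound = chebyshev_lower_bound(k)

print(k)                # prints 1.5
print(round(bound, 1))  # prints 55.6
```

Because the bound holds for any data set, it is weaker than the empirical rule: for k = 2 it guarantees only 75%, compared with roughly 95% for bell-shaped data.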
Summary: Chapter 3 – Section 2
● Range
 The maximum minus the minimum
 Not a resistant measurement
● Variance and standard deviation
 Measure deviations from the mean
 Not resistant measurements
● Empirical rule
 About 68% of the data is within 1 standard deviation
 About 95% of the data is within 2 standard deviations