Measures of Dispersion
Week 4
Dispersion
• Two groups of three students
  Group 1: 4, 7, 10
  Group 2: 7, 7, 7
• Mean mark
  Group 1: (4 + 7 + 10)/3 = 21/3 = 7
  Group 2: (7 + 7 + 7)/3 = 21/3 = 7
• Same mean mark, but Group 1’s marks are
widely spread, Group 2’s are all the same
• The following diagram reinforces this point
Range
• The absolute difference between the
highest and lowest value of the raw data
• Group of students: 4, 7, 10
• Range = Maximum – Minimum = 10 – 4 = 6
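The range calculation can be sketched in a few lines of Python. The function name `value_range` is only an illustrative choice, not something from the slides.

```python
# Marks for the group of students on the slide.
marks = [4, 7, 10]

def value_range(values):
    """Return the maximum minus the minimum of a non-empty sequence."""
    return max(values) - min(values)

print(value_range(marks))  # 6
```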
Interquartile Range
• This is the absolute difference between
the upper and lower quartiles of the
distribution.
• Interquartile Range = Upper Quartile – Lower Quartile
• See the next slides for estimating quartiles
Quartiles (1)
• Upper quartile: that value for which 25%
of the distribution is above it and 75%
below
• Lower quartile: that value for which 75%
of the distribution is above it and 25%
below
Quartiles (2)
• If the data is ungrouped, then put the data
in order in an array
• Find the quartile position, then estimate
its value, as previously for the median
• Upper quartile (Q3): position = 3(n + 1)/4
• Lower quartile (Q1): position = (n + 1)/4
Quartiles (3)
Example: ungrouped data:
3, 5, 6, 9, 15, 27, 30, 35, 37
• Lower quartile: position = (n + 1)/4 = (9 + 1)/4 = 2.5th
  Lower quartile: value = 5.5
  (mid-way between 2nd and 3rd number in array)
• Upper quartile: position = 3(n + 1)/4 = 3(9 + 1)/4 = 7.5th
  Upper quartile: value = 32.5
  (mid-way between 7th and 8th number in array)
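The position rule for ungrouped data can be tried out with a short Python sketch. The helper names `value_at_position` and `quartiles` are hypothetical. The slides only show the mid-way case, so the use of linear interpolation for other fractional positions is an assumption here. Very small datasets, where a position would fall outside the array, are not handled.

```python
import math

def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position in an ordered list.

    A fractional position is resolved by linear interpolation between the
    two neighbouring values (an assumption; position 2.5 lands mid-way
    between the 2nd and 3rd values).
    """
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)

def quartiles(values):
    """Return (Q1, Q3) using positions (n + 1)/4 and 3(n + 1)/4."""
    ordered = sorted(values)  # put the data in order first
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3

data = [3, 5, 6, 9, 15, 27, 30, 35, 37]
print(quartiles(data))  # (5.5, 32.5)
```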
Quartiles (4)
• Grouped data: use the same approach as
for estimating the median for grouped data
in week 4, except this time use the
quartile positions
Semi-Interquartile Range
• This is half the interquartile range. It is
sometimes called the Quartile Deviation
• Semi-Interquartile Range
= (Upper Quartile – Lower Quartile)/2
Example
Using previous ungrouped data
Interquartile range
= UQ - LQ
= 32.5 – 5.5 = 27
Semi-interquartile range = (UQ – LQ)/2
= (32.5 – 5.5)/2
= 27/2 = 13.5
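As a cross-check, Python's standard `statistics` module can reproduce these figures. Its `quantiles` function with `method="exclusive"` places the quartiles at the (n + 1)-based positions used in the example. The variable names are illustrative only.

```python
import statistics

data = [3, 5, 6, 9, 15, 27, 30, 35, 37]

# method="exclusive" uses positions (n + 1)/4, 2(n + 1)/4 and 3(n + 1)/4,
# the same rule as in the worked example.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

iqr = q3 - q1        # interquartile range
semi_iqr = iqr / 2   # semi-interquartile range (quartile deviation)
print(q1, q3, iqr, semi_iqr)  # 5.5 32.5 27.0 13.5
```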
Mean Deviation
• Average of the absolute deviations from
the arithmetic mean (ignoring the sign)
• When two straight lines (rather than
curved brackets) surround a number or
variable it is referred to as the modulus
and we ignore the sign
Mean Deviation of ungrouped data
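• X1 = 2, X2 = 4, X3 = 3, so the mean is $\bar{X} = (2 + 4 + 3)/3 = 3$
• $MD = \dfrac{|X_1 - \bar{X}| + |X_2 - \bar{X}| + |X_3 - \bar{X}|}{n}$
• $MD = \dfrac{|2 - 3| + |4 - 3| + |3 - 3|}{3} = \dfrac{1 + 1 + 0}{3} = \dfrac{2}{3}$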
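A minimal Python sketch of the same calculation, using the built-in `abs` for the modulus. The function name `mean_deviation` is a hypothetical choice.

```python
def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    # abs() plays the role of the modulus: it ignores the sign.
    return sum(abs(x - mean) for x in values) / n

print(mean_deviation([2, 4, 3]))  # about 0.667, i.e. 2/3
```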
Variance
• If we square all the deviations from the
arithmetic mean, then we no longer need
to bother with dropping the signs since all
the values will be positive.
• We can then replace the straight line
brackets (modulus) for the Mean Deviation
with the more usual round brackets.
• Variance is the average of the squared
deviations from the arithmetic mean
Variance: ungrouped data (1)
• Variance = $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$
• To calculate the variance
1. Calculate the mean value $\bar{X}$
2. Subtract the mean from each value in turn,
that is, find $X_i - \bar{X}$
3. Square each answer to get $(X_i - \bar{X})^2$
Variance: ungrouped data (2)
4. Add up all these squared values to get $\sum_{i=1}^{n} (X_i - \bar{X})^2$
5. Divide the result by n to get $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}$
6. You now have the average of the squared deviations
from the mean (in square units)
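The numbered steps map directly onto a short Python function. This is only a sketch: the name `population_variance` is an assumption, and the marks 4, 7, 10 are reused from the first group of students as sample input.

```python
def population_variance(values):
    """Variance as the average squared deviation from the mean (divide by n)."""
    n = len(values)
    mean = sum(values) / n                    # step 1: mean value
    deviations = [x - mean for x in values]   # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]    # step 3: square each answer
    total = sum(squared)                      # step 4: add up the squares
    return total / n                          # step 5: divide by n

print(population_variance([4, 7, 10]))  # 6.0
```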
Standard deviation (SD)
• This is simply the square root of the
variance
• An advantage is that we avoid the square
units of the variance
• The larger the SD, the larger the average
dispersion of the data from the mean
• The smaller the SD, the smaller the average
dispersion of the data from the mean
Example 1: variance/standard deviation
$x_i$      $x_i - \bar{x}$      $(x_i - \bar{x})^2$
4          4 – 7 = –3           (–3)² = 9
7          7 – 7 = 0            0² = 0
10         10 – 7 = 3           3² = 9
Total                           18
Solutions
Variance = $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \dfrac{18}{3} = 6$ square units
Standard deviation is the square root of 6
= 2.449 units
Example 2: variance/standard deviation
$x_i$      $x_i - \bar{x}$      $(x_i - \bar{x})^2$
7          7 – 7 = 0            0² = 0
7          7 – 7 = 0            0² = 0
7          7 – 7 = 0            0² = 0
Total                           0
Solution
Variance = $\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n} = \dfrac{0}{3} = 0$ square units
Standard deviation is the square root of 0 = 0
i.e. there is no spread of values
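Both worked examples can be checked with Python's standard `statistics` module. `pvariance` and `pstdev` divide by n, as in the formula used here, whereas `variance` and `stdev` divide by n – 1 and would give different answers. The variable names are illustrative.

```python
import statistics

group_1 = [4, 7, 10]
group_2 = [7, 7, 7]

for name, marks in (("Group 1", group_1), ("Group 2", group_2)):
    variance = statistics.pvariance(marks)  # population variance (divide by n)
    sd = statistics.pstdev(marks)           # square root of that variance
    print(f"{name}: variance = {variance:.3f}, SD = {sd:.3f}")

# Group 1: variance = 6.000, SD = 2.449
# Group 2: variance = 0.000, SD = 0.000
```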
Variance of grouped data
$S^2 = \dfrac{\sum_{i=1}^{j} F_i X_i^2}{\sum_{i=1}^{j} F_i} - \left( \dfrac{\sum_{i=1}^{j} F_i X_i}{\sum_{i=1}^{j} F_i} \right)^2$
where $F_i$ = frequency of the ith class interval
$X_i$ = mid point of the ith class interval
j = number of class intervals
Price of item (£)     No of items sold
LCB      UCB          Fi        Xi      FiXi     FiXi^2
1.5      2.5          15        2       30       60
2.5      3.5          2         3       6        18
3.5      4.5          19        4       76       304
4.5      5.5          10        5       50       250
5.5      6.5          14        6       84       504
Total                 60                246      1136
$S^2 = \dfrac{1136}{60} - \left( \dfrac{246}{60} \right)^2$
$S^2 = 18.93 - 4.1^2$
$S^2 = 18.93 - 16.81$
$S^2 = 2.12$ (in £ squared)
$S = \sqrt{2.12} \approx 1.46$, i.e. about £1.46
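The grouped-data calculation can be reproduced with a short Python sketch that takes the frequencies and mid points from the table. The name `grouped_variance` and the list layout are assumptions for illustration. Unrounded, the variance is about 2.123 and the standard deviation about 1.457.

```python
import math

# (frequency F_i, mid point X_i) for each class interval in the table
classes = [(15, 2), (2, 3), (19, 4), (10, 5), (14, 6)]

def grouped_variance(classes):
    """S^2 = sum(F*X^2)/sum(F) - (sum(F*X)/sum(F))^2 for grouped data."""
    total_f = sum(f for f, _ in classes)            # 60
    sum_fx = sum(f * x for f, x in classes)         # 246
    sum_fx2 = sum(f * x ** 2 for f, x in classes)   # 1136
    mean = sum_fx / total_f
    return sum_fx2 / total_f - mean ** 2

variance = grouped_variance(classes)
sd = math.sqrt(variance)
print(f"{variance:.2f} {sd:.2f}")  # 2.12 1.46
```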
Co-efficient of variation (C of V)
• A measure of relative dispersion
• Given by $S / \bar{X}$, i.e. the standard
deviation divided by the arithmetic
mean of the data.
• Data sets with a higher co-efficient of
variation have higher relative
dispersion
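A small Python sketch of the coefficient of variation, assuming the population standard deviation (divide by n) is the one intended. The function name and the zero-mean guard are additions for illustration, and the two groups of marks from the start of the lecture serve as input.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the arithmetic mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        # Guard added for this sketch: the ratio is undefined for a zero mean.
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / mean

print(round(coefficient_of_variation([4, 7, 10]), 3))  # 0.35
print(coefficient_of_variation([7, 7, 7]))             # 0.0
```

The first group has the higher coefficient, so its marks are more dispersed relative to the shared mean of 7.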