Rules of Data Dispersion
• By using the mean and standard deviation, we
can find the percentage of all observations
that fall within a given interval about the
mean.
Rules of Data Dispersion
• Empirical Rule
• Chebyshev’s Theorem
(IMPORTANT TERM: AT LEAST)
Empirical Rule
Applicable to a symmetric, bell-shaped
distribution (normal distribution).
There are 3 rules:
i. About 68% of the observations lie in the interval
(mean ± SD)
ii. About 95% of the observations lie in the interval
(mean ± 2SD)
iii. About 99.7% of the observations lie in the interval
(mean ± 3SD)
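The three percentages can be checked numerically. The following is a minimal sketch using the `NormalDist` class from Python's standard library; the helper name `coverage_within` is an illustrative assumption, not part of the original slides.

```python
from statistics import NormalDist

def coverage_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # P(-k < Z < k) = CDF(k) - CDF(-k) for a standard normal variable Z.
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"mean ± {k}SD: {coverage_within(k):.4f}")
```

The printed fractions round to the 68%, 95%, and 99.7% figures quoted in the rule.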
Empirical Rule
• Example: 95% of students at a school are
between 1.1 m and 1.7 m tall. Assuming these
data are normally distributed, calculate
the mean and standard deviation.
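One way to work this example, sketched under the assumption that the 95% interval is exactly (mean ± 2SD): the mean is the midpoint of the interval, and the interval spans 4 standard deviations. The function name `mean_sd_from_95_interval` is hypothetical.

```python
from typing import Tuple

def mean_sd_from_95_interval(lower: float, upper: float) -> Tuple[float, float]:
    """Recover mean and SD from an interval assumed to equal mean ± 2SD."""
    mean = (lower + upper) / 2   # midpoint of the interval
    sd = (upper - lower) / 4     # the interval is 4 SD wide (2 SD on each side)
    return mean, sd

mean, sd = mean_sd_from_95_interval(1.1, 1.7)
print(f"mean = {mean:.2f} m, SD = {sd:.2f} m")
```

For the heights above, this gives a mean of 1.40 m and a standard deviation of 0.15 m.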
Empirical Rule
• Example: The age distribution of a sample of 5000
persons is bell-shaped with a mean of 40 years
and a standard deviation of 12 years. Determine
the approximate percentage of people who
are 16 to 64 years old.
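A possible way to solve this is to express both endpoints as a number of standard deviations from the mean and then look up the matching empirical-rule percentage. The sketch below assumes the interval is symmetric about the mean; the names `EMPIRICAL_RULE` and `empirical_percentage` are illustrative only.

```python
# Empirical-rule percentages for whole-number multiples of the SD.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

def empirical_percentage(mean, sd, lower, upper):
    """Approximate percentage of a bell-shaped distribution between lower and upper.

    Only handles intervals of the form mean ± k*SD with k in {1, 2, 3}.
    """
    k_below = (mean - lower) / sd   # SDs from the mean down to the lower endpoint
    k_above = (upper - mean) / sd   # SDs from the mean up to the upper endpoint
    if k_below != k_above or k_below not in EMPIRICAL_RULE:
        raise ValueError("interval is not mean ± k*SD for k = 1, 2, or 3")
    return EMPIRICAL_RULE[k_below]

pct = empirical_percentage(mean=40, sd=12, lower=16, upper=64)
print(f"About {pct}% of the people are 16 to 64 years old")
```

Here 16 and 64 are each 2 standard deviations from 40, so the second rule applies and about 95% of the people fall in that age range.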
Chebyshev’s Theorem
• Applicable to any distribution, including
distributions that are not normal
• At least $\left(1 - \frac{1}{k^2}\right)$ of the observations will be
within k standard deviations of the mean,
where k is any number greater than 1
(k > 1).
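The bound 1 − 1/k² is easy to tabulate. This is a minimal sketch; the function name `chebyshev_lower_bound` and the sample values of k are assumptions chosen for illustration.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of observations")
```

Because the theorem gives only a lower bound, the result should always be read as "at least" that fraction, whatever the shape of the distribution.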
Chebyshev’s Theorem
• Example
Assuming that the weights of students in this
class are not normally distributed, find the minimum
percentage of students whose weights fall within 2SD of the mean.
Chebyshev’s Theorem
• Consider a distribution of test scores that is
badly skewed to the right, with a sample
mean of 80 and a sample standard deviation
of 5. If k = 2, at least what percentage of the data
falls within k standard deviations of the mean?
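A worked version of this example, as a sketch: the symbols $\bar{x}$ (sample mean) and $s$ (sample standard deviation) are notational assumptions not used in the slides.

```latex
1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = 0.75 = 75\%
\qquad
\bar{x} \pm k s = 80 \pm 2(5) = (70,\ 90)
```

So at least 75% of the test scores lie between 70 and 90, even though the distribution is skewed.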
Measures of Position
Measures of position describe the relative position of a
data value within the entire set of data.
• z-scores
• Percentiles
• Quartiles
• Outliers
Quartiles
• Divide data sets into fourths or four equal
parts.
Boxplot
IQR = Q3 − Q1
Lower Fence = Q1 − 1.5(IQR)
Upper Fence = Q3 + 1.5(IQR)
Outliers
• Extreme observations
• Can occur because of errors in the
measurement of a variable, in data entry,
or in sampling.
Outliers
Checking for outliers by using quartiles
Step 1: Determine the first and third quartiles of the data.
Step 2: Compute the interquartile range (IQR).
IQR = Q3 − Q1
Step 3: Determine the fences. Fences serve as cutoff
points for determining outliers.
Lower Fence = Q1 − 1.5(IQR)
Upper Fence = Q3 + 1.5(IQR)
Step 4: If a data value is less than the lower fence or
greater than the upper fence, it is considered an outlier.
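The four steps can be sketched in Python as follows. This is only an illustration: the sample data are made up, the function name `find_outliers` is hypothetical, and the quartiles come from `statistics.quantiles` with `method="inclusive"`, which is one of several quartile conventions and may differ slightly from the one used in a given textbook.

```python
from statistics import quantiles

def find_outliers(data):
    """Return (lower_fence, upper_fence, outliers) using the 1.5 * IQR rule."""
    # Step 1: first and third quartiles (quartile convention is an assumption).
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    # Step 2: interquartile range.
    iqr = q3 - q1
    # Step 3: fences.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Step 4: values outside the fences are flagged as outliers.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

# Made-up sample data for illustration only.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 45]
low, high, flagged = find_outliers(sample)
print(f"fences: ({low}, {high}), outliers: {flagged}")
```

In this made-up sample, only the value 45 lies beyond the upper fence, so it is the single flagged outlier.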