EDA Answer Keys
EDA Activities #1-3 Answers
1. Women’s Ages
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Women Age (years) 40 33.225 12.453992 31.5 12 59 23 41 18
This data describes the ages of 40 women in years. The shape of the data set was
slightly skewed right. The best center was the median of 31.5 years. So the
average age of these women is 31.5 years old. The best measure of spread is the
IQR of 18 years. So typical women’s ages were 18 years from each other. In fact,
typical ages were in between 23 years old (Q1) and 41 years old (Q3). There were
no outliers or unusual ages in the data set. The youngest girl was 12 years old and
the oldest woman was 59 years old, but neither was unusual.
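The quartile-based description above can be reproduced with a few lines of Python. This is a minimal sketch using only the Q1 and Q3 values from the summary table; the variable names are illustrative and not part of the answer key.

```python
# Quartiles copied from the Women Age summary table above.
q1, q3 = 23, 41

# The IQR is the distance between the quartiles,
# i.e. the width of the middle 50% of the data.
iqr = q3 - q1

print(f"IQR = {iqr} years")
print(f"Typical ages: {q1} to {q3} years")
```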
2. Men’s Ages
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Men Age (years) 40 35.475 13.926524 32.5 17 73 25.5 45 19.5
This data describes the ages of 40 men in years. The shape of the data set was
slightly skewed right. The best center was the median of 32.5 years. So the
average age of these men is 32.5 years old. The best measure of spread is the IQR
of 19.5 years. So typical men’s ages were 19.5 years from each other. In fact,
typical ages were in between 25.5 years old (Q1) and 45 years old (Q3). There
were no outliers or unusual ages in the data set. The youngest man was 17 years
old and the oldest man was 73 years old, but neither was unusual.
3. Women’s Heights in inches
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Women Ht (in) 40 63.195 2.7412284 63.35 57 68 61.35 64.9 3.55
Mean = 63.20
Standard Deviation = 2.74
63.20 - 2.74 < Typical < 63.20 + 2.74
60.46 < Typical < 65.94
The data describes the heights of 40 women in inches. The data set was bell
shaped. The best center was the mean of 63.20 inches. So the average height for
the women was 63.20 inches. The best measure of spread was the standard
deviation of 2.74 inches. So typical women’s heights were 2.74 inches from the
mean. So typical heights were between 60.46 inches and 65.94 inches. There
were no outliers (no unusual heights). The tallest woman was 68 inches and the
shortest was 57 inches, but neither was unusual.
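For bell-shaped data, the typical range is one standard deviation on either side of the mean. A minimal sketch of that arithmetic, using the rounded mean and standard deviation from the summary above (variable names are illustrative):

```python
# Rounded values from the Women Ht summary above.
mean, sd = 63.20, 2.74

# Typical values lie within one standard deviation of the mean.
low = round(mean - sd, 2)
high = round(mean + sd, 2)

print(f"{low} < Typical < {high}")
```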
4. Men’s Heights in inches
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Men Ht (in) 40 68.335 3.0195559 68.3 61.3 76.2 66.3 70.15 3.85
Mean = 68.34
Standard Deviation = 3.02
68.34 - 3.02 < Typical < 68.34 + 3.02
65.32 < Typical < 71.36
The data describes the heights of 40 men in inches. The data set was bell shaped.
The best center was the mean of 68.34 inches. So the average height for the men
was 68.34 inches. The best measure of spread was the standard deviation of 3.02
inches. So typical men’s heights were 3.02 inches from the mean. So typical
heights were between 65.32 inches and 71.36 inches. The tallest man was 76.2
inches and this was an outlier. He was considered unusually tall compared to the
other men in the data. The shortest man was 61.3 inches, but this was not
considered unusual.
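The answer key does not say which rule it used to flag outliers. One common choice, assumed here, is the boxplot rule: a value is an outlier if it lies more than 1.5 × IQR below Q1 or above Q3. Under that assumption, the sketch below agrees with the conclusion above for the men's heights. The function name `fences` is hypothetical.

```python
def fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier fences for the assumed k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles, minimum, and maximum from the Men Ht summary table above.
q1, q3 = 66.3, 70.15
shortest, tallest = 61.3, 76.2

lower, upper = fences(q1, q3)

print(f"Fences: {lower:.3f} to {upper:.3f} inches")
print("Tallest is an outlier:", tallest > upper)
print("Shortest is an outlier:", shortest < lower)
```

The same check can be applied to any of the other summary tables by substituting their Q1, Q3, minimum, and maximum.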
5. Women’s weights in pounds
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Women Wt (Lbs) 40 146.22 37.62104 135.8 94.3 255.9 116.9 162.95 46.05
This data describes the weights of 40 women in pounds. The shape of the data
set was skewed right. The best center was the median of 135.8 pounds. So the
average weight of these women is 135.8 pounds. The best measure of spread is
the IQR of 46.05 pounds. So typical women’s weights were 46.05 pounds from
each other. In fact, typical weights were in between 116.9 pounds (Q1) and
162.95 pounds (Q3). There were two outliers in the data set at 238.4 pounds and
255.9 pounds. The heaviest woman, at 255.9 pounds, was considered unusual. The
lightest woman was 94.3 pounds, but this was not unusual.
6. Men’s Weights in Pounds
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Men Wt (Lbs) 40 172.55 26.327163 169.95 119.5 237.1 152.2 190.1 37.9
Mean = 172.55
Standard Deviation = 26.33
172.55 – 26.33 <Typical < 172.55 + 26.33
146.22 < Typical < 198.88
The data describes the weights of 40 men in pounds. The data set was bell
shaped. The best center was the mean of 172.55 pounds. So the average weight
for the men was 172.55 pounds. The best measure of spread was the standard
deviation of 26.33 pounds. So typical men’s weights were 26.33 pounds from the
mean. So typical weights were between 146.22 pounds and 198.88 pounds.
There were no outliers in the data set. The heaviest man was 237.1 pounds and
the lightest man was 119.5 pounds but neither of these was unusual.
7. Women’s Pulse Rates
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Women Pulse (Beats per min) 40 76.3 12.498615 74 60 124 68 80 12
This data describes the pulse rates of 40 women in beats per minute (bpm). The
shape of the data set was skewed right. The best center was the median of 74
bpm. So the average pulse rate of these women is 74 bpm. The best measure of
spread is the IQR of 12 bpm. So typical women’s pulse rates were 12 bpm from
each other. In fact, typical pulse rates were in between 68 bpm (Q1) and 80 bpm
(Q3). There were two outliers in the data set at 104 bpm and 124 bpm. The
highest pulse rate was 124 bpm and was considered unusual. The lowest pulse
rate was 60 bpm, but this was not unusual.
8. Men’s pulse rates
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Men Pulse (BPM) 40 69.4 11.297379 66 56 96 60 76 16
This data describes the pulse rates of 40 men in beats per minute (bpm). The
shape of the data set was skewed right. The best center was the median of 66
bpm. So the average pulse rate of these men is 66 bpm. The best measure of
spread is the IQR of 16 bpm. So typical men’s pulse rates were 16 bpm from each
other. In fact, typical pulse rates were in between 60 bpm (Q1) and 76 bpm (Q3).
There were no outliers in the data set. The highest pulse rate was 96 bpm and the
lowest pulse rate was 56 bpm, but neither of these was unusual.
9. Women’s Body Mass Index (BMI)
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Women BMI 40 25.74 6.1655702 23.9 17.7 44.9 20.95 29.4 8.45
This data describes the body mass index of 40 women. The shape of the data set
was skewed right. The best center was the median of 23.9. So the average BMI
for these women is 23.9. The best measure of spread is the IQR of 8.45. So typical
women’s BMI scores were 8.45 BMI points from each other. In fact, typical BMI scores
were in between 20.95 (Q1) and 29.4 (Q3). The highest BMI of 44.9 was an
outlier and unusually high, but the lowest BMI of 17.7 was not unusual.
10. Men’s Body Mass Index (BMI)
Summary statistics:
Column n Mean Std. dev. Median Min Max Q1 Q3 IQR
Men BMI 40 25.9975 3.4307424 26.2 19.6 33.2 23.65 27.6 3.95
Mean = 26.00
Standard Deviation = 3.43
26.00 – 3.43 <Typical < 26.00 + 3.43
22.57 < Typical < 29.43
The data describes the body mass index (BMI) of 40 men. The data set was bell
shaped. The best center was the mean of 26.00. So the average BMI for the men
was 26.00. The best measure of spread was the standard deviation of 3.43. So
typical men’s BMI scores were 3.43 BMI points from the mean. So typical BMI scores
were between 22.57 and 29.43. There were no outliers in the data set. The
highest BMI score was 33.2 and the lowest BMI score was 19.6, but neither of
these was unusual.
EDA Activity 4 Answers
1. For each of the following sample statistics, classify it as a measure of spread
(variability), a measure of center (average), or a measure of position. Then write
a sentence describing what the statistic tells us.
a) Mean: A measure of center or average. It is the balancing point in the
data set in terms of distances. It is only accurate when the data is bell shaped.
b) Standard Deviation: A measure of spread. It is how far typical values in
the data are from the mean. It is only accurate when the data is bell shaped.
c) Minimum: A measure of position. The smallest number in the data set.
d) Range: A measure of spread. Does not represent typical values in the
data and is influenced by outliers.
e) Median: A measure of center or average. This is the true center of the
data when the values are put in order. Approximately 50% of the numbers in the
data set will be less than the median and 50% of the numbers will be higher than
the median. A very accurate measure of center. Often used when the data set is
skewed.
f) Quartile 3 (Q3): A measure of position. Approximately 75% of the data
set is less than Q3. Often used as part of the typical range for skewed data sets.
g) Interquartile Range (IQR): A measure of spread. IQR is a highly accurate
measure of typical spread. It is how far typical values are from each other. It also
measures the middle 50% of the data values and is the length of the box in a
boxplot.
h) Maximum: A measure of position. The largest number in the data set.
i) Quartile 1 (Q1): A measure of position. Approximately 25% of the data
set is less than Q1. Often used as part of the typical range for skewed data sets.
j) Mode: A measure of center. This is the number that occurs most often.
It is a useful statistic in finance and sales. It is often a good measure of center for
bi-modal data sets.
k) Variance: A measure of spread. The Variance is the standard deviation
squared and is a vital statistic in ANOVA testing, but it is only accurate if the data
set is bell shaped.
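Each statistic in the list above can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical data set, made up for illustration and not taken from the activities. Quartile conventions differ between software packages, so the Q1 and Q3 values from `statistics.quantiles` may not match what StatCrunch reports for the same data.

```python
import math
import statistics

# Hypothetical data set, made up for illustration (not from the answer key).
data = [2, 4, 4, 5, 7, 9, 11]

# statistics.quantiles with n=4 returns the three quartile cut points.
# It uses the "exclusive" method by default; other tools may differ.
q1, q2, q3 = statistics.quantiles(data, n=4)

summary = {
    "n": len(data),                           # sample size
    "mean": statistics.mean(data),            # center
    "median": statistics.median(data),        # center
    "mode": statistics.mode(data),            # center
    "std dev": statistics.stdev(data),        # spread (sample)
    "variance": statistics.variance(data),    # spread (sample)
    "range": max(data) - min(data),           # spread
    "IQR": q3 - q1,                           # spread
    "min": min(data),                         # position
    "Q1": q1,                                 # position
    "Q3": q3,                                 # position
    "max": max(data),                         # position
}

for name, value in summary.items():
    print(f"{name}: {value}")
```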
2. List all the measures of center. Which is the most accurate for bell shaped
(normal) data sets? Which is the most accurate for skewed data sets?
Measures of Center: Mean, Median, Mode
Bell shape: The mean and median are both accurate, but it is customary to use the
mean as your center and average when bell shaped.
Skewed: The mean is not very accurate and should not be used. The median is
the most accurate center and average for skewed data sets.
3. List all the measures of spread. Which is the most accurate for bell shaped
(normal) data sets? Which is the most accurate for skewed data sets?
Measures of Spread: Range, Standard Deviation, Variance, IQR
Bell shaped: We like to use the standard deviation when the data set is bell
shaped.
Skewed: We should not use the standard deviation or variance when the data set
is skewed. Since quartiles are not affected by outliers or skew, we should use
IQR as the most accurate measure of spread when a data set is skewed.
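A small experiment shows why the median and IQR are preferred for skewed data. The sketch below uses two hypothetical data sets, made up for illustration, that differ only in their largest value. The mean and standard deviation shift with the extreme value, while the median and IQR stay the same.

```python
import statistics

def iqr(values):
    """IQR from statistics.quantiles (default 'exclusive' method)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical data: the second list replaces the largest value with an extreme one.
symmetric = [10, 11, 12, 13, 14, 15, 16, 17, 18]
skewed = [10, 11, 12, 13, 14, 15, 16, 17, 90]

for label, values in [("symmetric", symmetric), ("skewed", skewed)]:
    print(
        label,
        "mean:", statistics.mean(values),
        "median:", statistics.median(values),
        "std dev:", round(statistics.stdev(values), 2),
        "IQR:", iqr(values),
    )
```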
4. List all the measures of position.
Min, Max, Q1, Q3
5. A very important statistic that is not a center, spread or position, is the
frequency or sample size. Write a sentence describing the meaning of the sample
size.
The sample size or frequency (n) counts how many numbers are in the data set.
6. Use StatCrunch and the Bear data to find all of the summary statistics we
discussed for the bears’ weights. You need to give the name of the statistic, the
number and the units. All the weights are in pounds.
Summary statistics:
Column n Mean Variance Std. dev. Median
Weight (Lbs) 54 182.88889 14835.535 121.80121 150
Summary statistics:
Column Range Min Max Q1 Q3 IQR Mode
Weight (Lbs) 488 26 514 86 236 150 No unique
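Several of the bear statistics are related to one another, so the output can be checked for internal consistency. A minimal sketch, using only the numbers reported in the two tables above (variable names are illustrative):

```python
import math

# Values copied from the bear weight summary tables above (pounds).
variance = 14835.535
std_dev = 121.80121
minimum, maximum = 26, 514
q1, q3 = 86, 236

# The standard deviation is the square root of the variance.
sd_matches = math.isclose(math.sqrt(variance), std_dev, rel_tol=1e-6)

# Range = Max - Min, and IQR = Q3 - Q1.
data_range = maximum - minimum
iqr = q3 - q1

print("Std. dev. matches sqrt(variance):", sd_matches)
print("Range:", data_range, "pounds")
print("IQR:", iqr, "pounds")
```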