Univariate data analysis
• Measures of central tendency
– Mean, median, mode
• Measures of spread
– Variance, standard deviation, standard error
• Measures of skew
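As a rough illustration of the three measures of central tendency listed above, the sketch below uses Python's standard `statistics` module. The `ages` list is invented for the example and is not from the slides.

```python
import statistics

# Hypothetical sample of respondent ages, invented for illustration.
ages = [22, 25, 25, 31, 38, 44, 52]

mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value

print(mean_age, median_age, mode_age)
```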
Variance and standard deviation
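• The variance (σ²) of a distribution is the mean of
the squares of the deviations of each observation
from the mean; for a sample, the sum of squares is
divided by n − 1 rather than n

$$\sigma^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

• The standard deviation (σ) of a distribution is the
square root of the variance

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$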
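The two formulas can be sketched in a few lines of Python. This is a minimal illustration, assuming the n − 1 denominator used for the standard deviation. The function names and the `data` list are hypothetical. The slides list the standard error among the measures of spread but do not define it, so the sketch assumes the common definition of the standard error of the mean, σ / √n.

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(xs))

def standard_error(xs):
    """Standard error of the mean, assumed here to be sd / sqrt(n)."""
    return sample_sd(xs) / math.sqrt(len(xs))

# Hypothetical observations, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_variance(data), sample_sd(data), standard_error(data))
```

For this data the mean is 5 and the squared deviations sum to 32, so the variance is 32 / 7.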
Step 1 – Look at the data
• Examine the distribution
– histogram
– scatter plot
• Identify outliers
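The slides do not say how outliers should be identified. One common convention, assumed here, flags values more than 1.5 interquartile ranges beyond the quartiles. The function name `iqr_outliers` and the `ages` list are hypothetical.

```python
import statistics

def iqr_outliers(xs, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if x < lower or x > upper]

# Hypothetical ages with one suspicious entry, invented for illustration.
ages = [21, 23, 24, 26, 27, 29, 30, 32, 95]

print(iqr_outliers(ages))
```

A flagged value is a prompt to check the data, not proof of an error.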
Step 2 – Describe the data
• What is the most effective way to present
the information contained in the data?
– Table, histogram, line graph, box plot, etc.
• Median vs. mean
– Income and age at marriage
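The choice between median and mean matters most when the distribution is skewed, as income typically is. The sketch below uses invented income figures to show how a single very high value moves the mean far more than the median.

```python
import statistics

# Hypothetical annual incomes, invented for illustration.
incomes = [28_000, 31_000, 35_000, 40_000, 46_000]
with_top_earner = incomes + [900_000]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)

mean_after = statistics.mean(with_top_earner)
median_after = statistics.median(with_top_earner)

print(mean_before, median_before)  # mean and median are close together
print(mean_after, median_after)    # the mean jumps, the median barely moves
```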
Skewed distributions
Bimodal distributions
Box plot: a convenient way to describe the central
tendency and spread of a distribution
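A box plot is usually drawn from a five-number summary: minimum, lower quartile, median, upper quartile, and maximum. The sketch below computes those values, assuming the "inclusive" quartile method of Python's `statistics.quantiles`. Other quartile conventions give slightly different results. The function name and the `ages` list are hypothetical.

```python
import statistics

def five_number_summary(xs):
    """Values a basic box plot displays: min, Q1, median, Q3, max."""
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {"min": min(xs), "q1": q1, "median": median, "q3": q3, "max": max(xs)}

# Hypothetical ages, invented for illustration.
ages = [18, 22, 25, 31, 38, 44, 52, 60, 71]

summary = five_number_summary(ages)
print(summary)
```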
Descriptive tables
Sandberg, J. 2005. "The Influence of
Network Mortality Experience on
Nonnumeric Response Concerning
Expected Family Size: Evidence From a
Nepalese Mountain Village."
Demography 42:737-756.
Recoding
[Figure: histogram of AGE OF RESPONDENT as a continuous variable (ages 20 to 80); vertical axis: Percent]
Easier to interpret?
[Figure: bar chart of AGE OF RESPONDENT recoded into groups (18-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89); vertical axis: Percent]
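Recoding a continuous age variable into the groups shown on the chart axis might look like the sketch below. The group boundaries come from the axis labels. The function names and the `ages` list are hypothetical.

```python
from collections import Counter

# Age groups taken from the axis labels of the recoded chart.
AGE_GROUPS = [(18, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]

def recode_age(age):
    """Map an age in years to its group label, or None if out of range."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None

def percent_by_group(ages):
    """Percent of in-range respondents falling in each age group."""
    labels = [recode_age(a) for a in ages]
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no ages fall inside the defined groups")
    # Counter returns 0 for groups with no respondents.
    return {f"{lo}-{hi}": 100 * counts[f"{lo}-{hi}"] / total for lo, hi in AGE_GROUPS}

# Hypothetical respondent ages, invented for illustration.
ages = [19, 23, 28, 34, 37, 45, 52, 58, 66, 81]

table = percent_by_group(ages)
for group, pct in table.items():
    print(f"{group}: {pct:.0f}%")
```

Empty groups are kept in the output so that gaps in the distribution stay visible in a chart built from it.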
Too much formatting
[Figure: bar chart of Percent by AGE OF RESPONDENT group (18-29 through 80-89)]
Way too much formatting
[Figure: bar chart of Percent by AGE OF RESPONDENT group (18-29 through 80-89)]