Part F- UNDERSTANDING STATISTICS
(Descriptive and Inferential Statistics)

Descriptive statistics - summarize data in order to improve
understanding
 A frequency table (with percents) presents data so that it is easier to grasp than
a list of all scores

Inferential statistics - based on calculations on a sample with
inferences made to the population
 Sampling error is possible, so results are often reported with a margin of error (how far
the sample result is likely to be from the population value)
 Significance tests help decide whether sample results are reliable or could be due to sampling error alone
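
As a rough illustration of a margin of error, the sketch below uses the common 95% approximation for a sample proportion, 1.96 × sqrt(p(1 - p) / n). The formula, the function name, and the poll numbers are assumptions added for illustration; they do not come from the text.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    z=1.96 corresponds to roughly 95% confidence (an assumed convention).
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: 52% of a sample of 400 people answer "yes".
moe = margin_of_error(0.52, 400)
print(f"52% plus or minus {moe * 100:.1f} percentage points")

# A larger sample gives a smaller margin of error.
moe_large = margin_of_error(0.52, 1600)
print(f"52% plus or minus {moe_large * 100:.1f} percentage points")
```

Quadrupling the sample size cuts the margin of error in half, which is one reason larger samples are treated as more trustworthy.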

Populations are described using Parameters while Samples are
described with Statistics
Review questions



What is the difference between descriptive and inferential
statistics?
What is the difference between parameters and statistics?
How do you calculate a percentage?
Scales of Measurement
(N→O→I→R)
Lower → → → → → → Higher

Nominal - named categories with no order
 Examples: Gender; race; marital status

Ordinal - categories in order high/low or more/less
 Examples: Finish in race; ranks

Interval - equal distance between scale points
 Arbitrary zero, can have negative numbers
 Examples: Test scores; temperature scales (F & C)

Ratio - equal distance between scale points
 Absolute zero, no negative numbers
 Examples: Number of children; income in dollars
→ Which to use? Highest level possible; the higher the level, the more
powerful the analysis that is possible
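
One way to see why the highest possible level is preferred: ratio data can always be collapsed into ordered categories later, but categories cannot be turned back into exact values. A minimal sketch, with made-up ages and a hypothetical `age_bracket` helper:

```python
# Hypothetical ages recorded at the ratio level (exact years).
ages = [18, 25, 33, 47, 52, 68]

def age_bracket(age):
    """Collapse an exact age into an ordered category (ordinal level)."""
    if age <= 20:
        return "0-20"
    if age <= 40:
        return "21-40"
    if age <= 60:
        return "41-60"
    return "61+"

brackets = [age_bracket(a) for a in ages]
print(brackets)

# The mean can be computed from the ratio-level data...
mean_age = sum(ages) / len(ages)
print(mean_age)  # 40.5
# ...but the brackets alone cannot be converted back into exact ages.
```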
Review questions


What is the most precise level of data?
What level of data do you get with:
Class rank? (freshman, sophomore, junior, senior)
Number of minutes late to class?
What county you live in?
Your age in categories: 0-20, 21-40, 41-60, 61+?
Your annual income in dollars?
Descriptions of Nominal Data


Names, not quantities, even if numbers are used as name tags (1=male;
2=female)
Give frequencies (f), the number of cases in each group, along with the total number of cases (N for
population; n for sample)
 Generally also give percentages (part ÷ whole × 100) in addition to frequencies
(more intuitive)
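
A frequency table with percentages can be sketched in a few lines. The marital-status data below are invented for illustration only.

```python
from collections import Counter

# Hypothetical nominal data: marital status for a sample of n = 10.
statuses = ["single", "married", "married", "divorced", "single",
            "married", "single", "married", "widowed", "married"]

counts = Counter(statuses)   # frequency (f) of each category
n = len(statuses)            # number of cases in the sample

# Percentage = part / whole * 100
for category, f in counts.most_common():
    print(f"{category:<10} f={f:>2}  {f / n * 100:.0f}%")
```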

Univariate analysis looks at a single variable
 See sample frequency table in text

Bivariate analysis looks at the relationship between two variables
 See sample table in text - percentages are better for comparisons,
especially when the groups are of unequal size
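
The sketch below shows why percentages matter when groups differ in size. The counts are invented: Group B has more "yes" answers in raw numbers, yet a smaller share of Group B said yes.

```python
# Hypothetical counts: answers to a yes/no question in two groups
# of unequal size.
table = {
    "Group A": {"yes": 30, "no": 20},    # 50 people
    "Group B": {"yes": 45, "no": 105},   # 150 people
}

def row_percentages(row):
    """Convert one group's frequencies to percentages of that group."""
    total = sum(row.values())
    return {answer: f / total * 100 for answer, f in row.items()}

percents = {group: row_percentages(row) for group, row in table.items()}

for group, row in percents.items():
    print(group, {answer: f"{p:.0f}%" for answer, p in row.items()})
```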
Review questions



What percent of this class is male?
Why do we report percents?
What is the difference between univariate and bivariate
statistics?
SHAPES OF DISTRIBUTIONS
(curves)


Describe the distribution of quantitative data with a frequency
polygon, then smooth it to see its shape
Many distributions’ curves are normal, meaning bell shaped and
symmetrical
 Examples are heights, weights, average rainfall, IQ
 Some distributions are not normal; a few scores in one direction (high or
low) create a tail
 Skewed to the right is called a positive skew (a few high incomes on the right side of
the curve create a tail on the right)
 Skewed to the left is called a negative skew (a few low scores on the left side of the
curve create a tail on the left)
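
The direction of a skew can also be checked numerically. The sketch below uses one common skewness measure, the average cubed z-score; this measure and the data are assumptions added for illustration and are not taken from the text.

```python
import statistics

def skewness(scores):
    """Average cubed z-score: positive for a right tail, negative for a left tail."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    return sum(((x - mean) / sd) ** 3 for x in scores) / len(scores)

# Hypothetical incomes (in thousands): one very high value creates a right tail.
incomes = [30, 32, 35, 36, 38, 40, 150]
# Hypothetical test scores: one very low value creates a left tail.
scores = [20, 85, 88, 90, 91, 93, 95]

print(skewness(incomes) > 0)   # True: positive skew
print(skewness(scores) < 0)    # True: negative skew
```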
Review questions

What are the characteristics of normal curves?
What causes a negative skew?

What causes a positive skew?

THE MEAN, MEDIAN, AND MODE
(Measures of Central Tendency)

Mean is the most frequently used measure of average
 It’s the balance point (positive and negative deviations from the mean
sum to zero); Calculate: ∑x ÷ N
 M or μ = mean of population; m = mean of sample (or X̄)
 Major drawback is effect of extreme scores in skewed distribution (pull
mean up or down)
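
A short sketch of the calculation, the balance-point property, and the pull of an extreme score. The scores are made up.

```python
scores = [2, 4, 6, 8, 10]   # hypothetical scores

mean = sum(scores) / len(scores)          # sum of x divided by N
deviations = [x - mean for x in scores]

print(mean)             # 6.0
print(sum(deviations))  # 0.0: deviations above and below the mean cancel

# Replace the top score with an extreme one: the mean is pulled upward.
with_outlier = [2, 4, 6, 8, 100]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(mean_with_outlier)  # 24.0
```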

Use median (midpoint) if the distribution is skewed
 Put scores in order, high to low; if there is an odd number of scores, it’s the middle
one; if even, average the two middle ones
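
The rule for odd and even numbers of scores can be sketched as a small function. The function and the scores are illustrative only.

```python
def median(scores):
    """Middle score of the ordered list; average of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 3, 9, 1, 5]))       # 5
print(median([7, 3, 9, 1, 5, 11]))   # 6.0

# Unlike the mean, the median barely reacts to one extreme score.
print(median([2, 4, 6, 8, 100]))     # 6
```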


Mode is most frequently occurring score
Interval/ratio: use all three measures; ordinal: use only median or
mode; nominal: use mode only
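
The pairing of measures with levels of data can be written as a lookup table, followed by a mode calculation on nominal data. The dictionary name and the county labels are hypothetical.

```python
import statistics

# Which measures of central tendency suit each level of measurement,
# following the rule stated above.
MEASURES_BY_LEVEL = {
    "nominal": ["mode"],
    "ordinal": ["median", "mode"],
    "interval": ["mean", "median", "mode"],
    "ratio": ["mean", "median", "mode"],
}

# Hypothetical nominal data: the mode is the only usable "average".
counties = ["County A", "County B", "County A", "County C", "County A", "County B"]
print(statistics.mode(counties))               # County A

# A distribution can have more than one mode.
print(statistics.multimode([1, 2, 2, 3, 3]))   # [2, 3]
```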
Review questions



How do you calculate the mean?
How do you calculate the median?
What is the mode?
The Mean and
Standard Deviation

Two statistics (center and variation) are used to describe a
distribution of interval/ratio values
Mean (average) tells the center of scores
Standard deviation tells how spread out scores are
 Use SD, S, or σ (sigma) for populations; sd, s for samples
 Text example: Group 1: M=15, S=10; Group 2: M=15, S=.93; Group 3: M=15, S=0
→ The group with the larger S has more variation
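
To see how groups can share a mean but differ in spread, the sketch below uses three invented groups. They are not the groups from the text, so the S values differ from those quoted above.

```python
import statistics

# Three hypothetical groups with the same mean (15) but different spread.
groups = {
    "Group 1": [5, 10, 15, 20, 25],
    "Group 2": [14, 15, 15, 15, 16],
    "Group 3": [15, 15, 15, 15, 15],
}

for name, scores in groups.items():
    m = statistics.fmean(scores)
    s = statistics.pstdev(scores)   # treats the scores as a whole population
    print(f"{name}: M={m:.0f}, S={s:.2f}")
```

When the scores are a sample rather than a whole population, `statistics.stdev` (which divides by n - 1) is the usual choice instead of `statistics.pstdev`.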

SD has special relationship to the normal curve
 About 68% of cases fall within plus or minus one SD of the mean (±1SD)
 About 95% fall within ±2SD and 99.7% within ±3SD
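
These percentages can be checked against the standard normal curve. A minimal sketch using Python's built-in `statistics.NormalDist`:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of cases within k standard deviations of the mean.
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within +/-{k} SD: {share * 100:.1f}%")
```

The exact figures are close to 68.3%, 95.4%, and 99.7%, which is why the rounded values 68, 95, and 99.7 are normally quoted.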
Review questions



Mean, median and mode are measures of what?
Range and standard deviation are measures of what?
What percent of scores are within one SD of the mean?
The Median and IQR



Median - measure of center used with skewed data and with
ordinal data (mean not normally used)
Range (high score – low score) - measure of variation used with
median (SD not used with medians)
Because range is highly affected by extreme scores, inter-quartile
range (IQR) is often used
 It is the range of the middle 50% of scores
 It is the distance between the 25th and 75th percentile scores (see text examples)
 Lower median scores indicate lower average
 Lower IQR scores indicate less variation
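
The sketch below compares the range and the IQR on invented scores. Software packages use slightly different rules for locating percentiles; the `method="inclusive"` option of `statistics.quantiles` is one such rule, chosen here as an assumption.

```python
import statistics

def spread(scores):
    """Return (range, IQR) for a list of scores."""
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return max(scores) - min(scores), q3 - q1

# Hypothetical scores.
scores = [50, 60, 65, 70, 75, 80, 85, 90, 100]
score_range, iqr = spread(scores)
print(score_range, iqr)   # 50 20.0

# One extreme score changes the range a lot but leaves the IQR alone.
with_outlier = [50, 60, 65, 70, 75, 80, 85, 90, 200]
outlier_range, outlier_iqr = spread(with_outlier)
print(outlier_range, outlier_iqr)   # 150 20.0
```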
Review questions

Medians and ranges can be used with what levels of data?


How do you determine the 25th percentile?
What does the IQR tell you?
Conclude Part F