Lecture 9:
One Way ANOVA
Between Subjects
Laura McAvinue
School of Psychology
Trinity College Dublin
Analysis of Variance
• A statistical technique for testing for differences
between the means of several groups
– One of the most widely used statistical tests
• T-Test
– Compare the means of two groups
• Independent samples
• Paired samples
• ANOVA
– No restriction on the number of groups
T-test
[Diagram: Group 1 and Group 2, each with its mean]
Is the mean of one group significantly different to the
mean of the other group?
• t-test: H0: µ1 = µ2
H1: µ1 ≠ µ2
F-test
[Diagram: Group 1, Group 2 and Group 3, each with its mean]
Is the mean of one group significantly different to the
means of the other groups?
Analysis of Variance
• One way ANOVA
– One independent variable
– Between subjects: different participants
– Repeated measures / within subjects: same participants
• Factorial ANOVA
– More than one independent variable
– Two way, three way, four way
A few examples…
• Between subjects one way ANOVA
– The effect of one independent variable with three or
more levels on a dependent variable
• What are the independent & dependent variables
in each of the following studies?
– The effect of three drugs on reaction time
– The effect of five styles of teaching on exam results
– The effect of age (old, middle, young) on recall
– The effect of gender (male, female) on hostility
Rationale
• Let’s say you have three groups and you want to
see if they are significantly different…
• Recall inferential statistics
– Sample → Population
• Your question:
– Are these 3 groups representative of the same
population or of different populations?
[Diagram]
Population
Draw 3 samples: 1, 2, 3
Manipulate the samples: Drug 1, Drug 2, Drug 3
Measure effect of manipulation on a DV: µ1, µ2, µ3
Did the manipulation alter the samples to such an extent that they now represent different populations?
Recall sampling error & the sampling distribution of
the mean…
The means of samples drawn from the
same population will differ a little due to
random sampling error
When comparing the means of a
number of groups, your task …
• Difference due to a true difference
between the samples (representative of
different populations)?
• Difference due to random sampling
error (representative of the same
population)?
If a true difference exists, this is due to
your manipulation, the independent
variable
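A minimal sketch of random sampling error, assuming a hypothetical normally distributed population (mean 100, SD 15) and samples of 10. None of these values come from the lecture. Three samples are drawn from the same population with no manipulation, and their means still differ a little.

```python
import random
import statistics

# Hypothetical population parameters, chosen only for illustration.
POP_MEAN, POP_SD, N_PER_SAMPLE = 100, 15, 10

random.seed(9)  # fixed seed so the run is repeatable

# Draw three samples from the SAME population (no treatment applied).
samples = [
    [random.gauss(POP_MEAN, POP_SD) for _ in range(N_PER_SAMPLE)]
    for _ in range(3)
]
sample_means = [statistics.mean(s) for s in samples]

for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.2f}")
```

Any spread among the printed means here is sampling error alone, which is the baseline that ANOVA compares group differences against.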
Steps of NHST
1. Specify the alternative / research hypothesis
At least one mean is significantly different from the others
At least one group is representative of a separate
population
2. Set up the null hypothesis
The hypothesis that all population means are equal
All groups are representative of the same population
Omnibus Ho: µ1= µ2 = µ3
Steps of NHST
3. Collect your data
4. Run the appropriate statistical test
Between subjects one way ANOVA
5. Obtain the test statistic & associated p-value
F statistic
Compare the F statistic you obtained with the distribution of F
when Ho is true
Determine the probability of obtaining such an F value when Ho
is true
Steps of NHST
6. Decide whether to reject or fail to reject Ho on the
basis of the p value
If the p value is very small (< .05), reject Ho…
Conclude that at least one sample mean is significantly
different to the other means…
Not all groups are representative of the same population
How is ANOVA done?
• Assume Ho is true
• Assume that all three groups are representative of the
same population
• Make two estimates of the variance of this
population
• If Ho is true, then these two estimates should be
about the same
• If Ho is false, these two estimates should be different
Two estimates of population variance
• Within group variance
– Pooled variability among participants in each treatment
group
• Between group variance
– Variability among group means
If Ho is true…
Between Groups Variance / Within Groups Variance ≈ 1
If Ho is false…
Between Groups Variance / Within Groups Variance > 1
Calculations
• Step 1: Sum of squares
• Step 2: Degrees of freedom
• Step 3: Mean square
• Step 4: F ratio
• Step 5: p value
Total variance in data (SStotal)
= Between groups variance (SSbetween)
+ Within groups variance (SSwithin)
SStotal
• $SS_{total} = \sum (x_{ij} - \text{Grand Mean})^2$
• Based on the difference between each score
and the grand mean
• The sum of squared deviations of all
observations, regardless of group membership,
from the grand mean
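The total sum of squares can be sketched in a few lines. The scores below are hypothetical (three drug groups of three participants each), invented purely so the arithmetic is easy to follow by hand.

```python
# Hypothetical reaction-time scores, three participants per group.
groups = {
    "Drug 1": [4, 5, 6],
    "Drug 2": [6, 7, 8],
    "Drug 3": [8, 9, 10],
}

# Pool every observation, ignoring group membership.
all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

# Sum of squared deviations of every score from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print(grand_mean, ss_total)  # 7.0 30.0
```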
SSbetween
• $SS_{between} = n \sum (\text{Group Mean}_j - \text{Grand Mean})^2$
• Based on the differences between groups
• Related to the variance of the group means
• The sum of squared deviations of the group
means from the grand mean, multiplied by the
number of observations in each group
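A short sketch of the between groups sum of squares, assuming equal group sizes as the formula does. The scores are hypothetical and chosen so the group means are 5, 7 and 9.

```python
# Hypothetical scores, three participants per group (equal n assumed).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

n = len(groups[0])                      # observations per group
k = len(groups)                         # number of groups
grand_mean = sum(sum(g) for g in groups) / (k * n)
group_means = [sum(g) / n for g in groups]

# Squared deviation of each group mean from the grand mean, times n.
ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)

print(group_means, ss_between)  # [5.0, 7.0, 9.0] 24.0
```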
SSwithin
• $SS_{within} = \sum (x_{ij} - \text{Group Mean}_j)^2$
• Based on the variability within each group
• Calculate SS within each group & add
• The sum of squared deviations within each
group … or …
• $SS_{within} = SS_{total} - SS_{between}$
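Both routes to the within groups sum of squares can be checked against each other. This is a sketch on hypothetical scores with equal group sizes, not data from the lecture.

```python
# Hypothetical scores, three participants per group (equal n assumed).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

n = len(groups[0])
all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)
group_means = [sum(g) / n for g in groups]

# Route 1: SS inside each group, then add the groups together.
ss_within = sum(
    (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
)

# Route 2: what is left of SStotal once SSbetween is removed.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
ss_within_by_subtraction = ss_total - ss_between

print(ss_within, ss_within_by_subtraction)  # 6.0 6.0
```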
Degrees of Freedom
• Total variance
– N – 1
– Total no. of observations – 1
• Between groups variance
– k – 1
– No. of groups – 1
• Within groups variance
– k(n – 1)
– No. of groups × (no. in each sample – 1)
– What’s left over!
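The three degrees of freedom can be sketched as a small helper. The function name is hypothetical, and equal group sizes are assumed so that N = k × n.

```python
def degrees_of_freedom(k, n):
    """Return (df_total, df_between, df_within) for k groups of n scores.

    Hypothetical helper; assumes every group has the same size n.
    """
    df_total = k * n - 1        # N - 1
    df_between = k - 1          # no. of groups - 1
    df_within = k * (n - 1)     # what is left over
    return df_total, df_between, df_within


df_total, df_between, df_within = degrees_of_freedom(k=3, n=3)
print(df_total, df_between, df_within)  # 8 2 6
```

The between and within values always add up to the total, which is why the within term can be described as "what's left over".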
Mean Square
• MS = SS / df
• The average variance between or within
groups
• An estimate of the population variance
• MSbetween
– SSbetween / dfbetween
• MSwithin
– SSwithin / dfwithin
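As a sketch, each mean square is just a sum of squares divided by its degrees of freedom. The SS and df values below are hypothetical (they correspond to three groups of three scores) and the helper name is illustrative.

```python
def mean_square(sum_of_squares, degrees_of_freedom):
    """Average variability: SS divided by its df (hypothetical helper)."""
    return sum_of_squares / degrees_of_freedom


# Hypothetical inputs for three groups of three scores each.
ms_between = mean_square(24.0, 2)   # SSbetween / (k - 1)
ms_within = mean_square(6.0, 6)     # SSwithin / k(n - 1)

print(ms_between, ms_within)  # 12.0 1.0
```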
F Ratio
$F = MS_{between} / MS_{within}$
If Ho is true, F ≈ 1
If Ho is false, F > 1
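Putting the steps together, a minimal sketch of the F ratio for equal-sized groups might look like this. The function name and the scores are hypothetical. Unequal group sizes would need a weighted SSbetween, which this sketch does not attempt.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for equal-sized groups."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")

    grand_mean = sum(sum(g) for g in groups) / (k * n)
    group_means = [sum(g) / n for g in groups]

    # Step 1: sums of squares.
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    # Step 2: degrees of freedom.
    df_between, df_within = k - 1, k * (n - 1)
    # Steps 3 and 4: mean squares and their ratio.
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_ratio, df_b, df_w = one_way_f([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(f_ratio, df_b, df_w)  # 12.0 2 6
```

With identical group means the numerator is zero, so the same function returns F = 0 for `[[4, 5, 6], [4, 5, 6], [4, 5, 6]]`.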
$F = \dfrac{MS_{between}}{MS_{within}} = \dfrac{\text{Treatment effect} + \text{Differences due to chance}}{\text{Differences due to chance}}$
If treatment has no effect…
$F = \dfrac{0 + \text{Differences due to chance}}{\text{Differences due to chance}} = 1$
If treatment has effect…
$F = \dfrac{(\text{Effect} > 0) + \text{Differences due to chance}}{\text{Differences due to chance}} > 1$
MSBG / MSWG
• Variance within groups > variance between groups
• F < 1
• Fail to reject Ho
• If there is more variance within the groups, then
any difference observed is due to chance
MSBG / MSWG
• Variance within groups = variance between groups
• F = 1
• Fail to reject Ho
• If both sources of variance are the same, then
any difference observed is due to chance
MSBG / MSWG
• Variance within groups < variance between groups
• F > 1
• Reject Ho
• The more the group means differ relative to each other
the more likely it is that the differences are not
due to chance.
Size of F
• How much greater than 1 does F have to be to reject
Ho?
• Compare the obtained F statistic to the distribution of F
when Ho is true
• Calculate the probability of obtaining this F value
when Ho is true
– p value
• If p < .05, reject Ho
• Conclude that at least one of your groups is
significantly different from the others
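For the special case of three groups (2 between groups degrees of freedom), the tail probability of the F distribution has a simple closed form, so the p value can be sketched without a statistics library. This shortcut is an assumption of the example and is not valid for other numbers of groups. The F value and df below are hypothetical.

```python
def f_pvalue_three_groups(f_stat, df_within):
    """P(F >= f_stat) when Ho is true, valid ONLY for df_between == 2.

    Uses the closed form (1 + 2F / df_within) ** (-df_within / 2),
    which holds for an F distribution with 2 numerator df.
    """
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# Hypothetical result: F = 12 with 2 and 6 degrees of freedom.
p_value = f_pvalue_three_groups(12.0, 6)
reject_h0 = p_value < 0.05

print(round(p_value, 3), reject_h0)  # 0.008 True
```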
ANOVA table
Source of variation | SS | df | MS | F | p
Between groups | $n \sum (\text{Group Mean}_j - \text{Grand Mean})^2$ | k – 1 | SSBG / dfBG | MSBetween / MSWithin | Prob. of observing F-value when Ho is true
Within groups | $\sum (x_{ij} - \text{Group Mean}_j)^2$ | k(n – 1) | SSWG / dfWG | |
Total | $\sum (x_{ij} - \text{Grand Mean})^2$ | N – 1 | | |
A few assumptions…
• Data in each group should be…
– Interval scale
– Normally distributed (check with histograms, box plots)
• Homogeneity of variance
– Variance of groups should be roughly equal
• Independence of observations
– Each person should be in only one group
– Participants should be randomly assigned to groups
Multiple Comparison Procedures
• Obtain a significant F statistic
• Reject Ho & conclude that at least one sample mean is
significantly different from the others
• But which one?
– H1: µ1 ≠ µ2 ≠ µ3
– H2: µ1 = µ2 ≠ µ3
– H3: µ1 ≠ µ2 = µ3
• Necessary to run a series of multiple comparisons to
compare groups and see where the significant
differences lie
Problem with Multiple Comparisons
• Making multiple comparisons leads to a higher
probability of making a Type I error
• The more comparisons you make, the higher
the probability of making a Type I error
• Familywise error rate
– The probability that a family of comparisons contains
at least one Type I error
Problem with Multiple Comparisons
– $\alpha_{familywise} = 1 - (1 - \alpha)^c$
c = number of comparisons
– Four comparisons run at α = .05
$\alpha_{familywise} = 1 - (1 - .05)^4$
= 1 - .8145
= .19
– You think you are working at α = .05, but you’re
actually working at α = .19
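The familywise calculation is easy to reproduce. This sketch uses a hypothetical helper name and assumes the comparisons are independent, as the formula does.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Probability of at least one Type I error across the family."""
    return 1 - (1 - alpha) ** n_comparisons


rate = familywise_error_rate(alpha=0.05, n_comparisons=4)
print(round(rate, 4))  # 0.1855
```

Trying larger values of `n_comparisons` shows how quickly the rate climbs as comparisons are added.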
Post hoc tests
• Bonferroni Procedure
– α / c
– Divide your significance level by the number of
comparisons you plan on making and use this more
conservative value as your level of significance
• Four comparisons at α = .05
– .05 / 4 = .0125
– Reject Ho if p < .0125
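A minimal sketch of the Bonferroni decision rule. The four p values and their labels are hypothetical, made up to show one comparison that passes at .05 but fails at the corrected level.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level: alpha divided by c."""
    return alpha / n_comparisons


# Hypothetical p values from four planned comparisons.
p_values = {
    "Comparison 1": 0.030,   # below .05 but above the corrected level
    "Comparison 2": 0.004,
    "Comparison 3": 0.200,
    "Comparison 4": 0.012,
}

threshold = bonferroni_alpha(0.05, len(p_values))
rejected = [name for name, p in p_values.items() if p < threshold]

print(threshold, rejected)
```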
Post hoc tests
• Note: Restrict the number of comparisons to the
ones you are most interested in
• Tukey
– Compares each mean with each other mean in a way
that keeps the maximum familywise error rate to .05
– Computes a single value that represents the
minimum difference between group means that is
necessary for significance
Effect Size
• A statistically significant difference might not mean
anything in the real world
Eta squared
$\eta^2 = SS_{between} / SS_{total}$
Percentage of variability among
observations that can be
attributed to the differences
between the groups
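Eta squared follows directly from two sums of squares. The values below are hypothetical and the helper name is illustrative.

```python
def eta_squared(ss_between, ss_total):
    """Share of total variability attributable to group differences."""
    return ss_between / ss_total


# Hypothetical sums of squares.
eta_sq = eta_squared(ss_between=24.0, ss_total=30.0)
print(f"{eta_sq:.2f} ({eta_sq:.0%} of the variability)")
```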
A little less biased…
Omega squared
$\omega^2 = \dfrac{SS_{between} - (k - 1) MS_{within}}{SS_{total} + MS_{within}}$
How big is big? Similar to correlation coefficient
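Omega squared can be sketched the same way, using the standard formula (SSbetween − (k − 1)MSwithin) / (SStotal + MSwithin). The input values are hypothetical. With SSbetween = 24 and SStotal = 30, eta squared is .80, so omega squared should come out somewhat smaller.

```python
def omega_squared(ss_between, ss_total, ms_within, k):
    """Less biased effect size estimate for k groups."""
    return (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)


# Hypothetical values for three groups.
omega_sq = omega_squared(ss_between=24.0, ss_total=30.0, ms_within=1.0, k=3)
print(round(omega_sq, 2))  # 0.71
```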
Cohen’s d
$d = \dfrac{\text{Mean}_{treat} - \text{Mean}_{control}}{SD_{control}}$
When comparing two groups
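A sketch of this version of d, which standardises the mean difference by the control group's standard deviation. The scores are hypothetical, and the sample standard deviation (n − 1 in the denominator) is assumed.

```python
import statistics


def cohens_d(treatment, control):
    """Mean difference in units of the control group's sample SD."""
    mean_diff = statistics.mean(treatment) - statistics.mean(control)
    return mean_diff / statistics.stdev(control)


# Hypothetical scores for two groups.
control_scores = [4, 5, 6]
treatment_scores = [8, 9, 10]

d = cohens_d(treatment_scores, control_scores)
print(d)  # 4.0
```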