Cross-tabs, relationships

Cross-Tabulations
We have been looking at these for some
time already.
An arrangement of two categorical
variables into rows and columns.
Row variable
 Column variable

Tells about relationships between two
categorical variables
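As a minimal sketch of how such an arrangement is built, the snippet below counts each (row, column) combination in a handful of records. The records and the labels in them are hypothetical, made up only to illustrate the counting step, and `crosstab` is a helper name introduced here.

```python
from collections import Counter

# Hypothetical raw records: (row variable, column variable) for each respondent.
records = [
    ("no baby", "not depressed"),
    ("no baby", "depressed"),
    ("baby", "not depressed"),
    ("no baby", "not depressed"),
    ("baby", "depressed"),
    ("baby", "depressed"),
]

def crosstab(pairs):
    """Count how often each (row, column) combination occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = crosstab(records)
for row_label, cells in table.items():
    print(row_label, cells)
```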
Depression and a new baby, Fathers

           |       depress
      baby |         0          1 |     Total
-----------+----------------------+----------
         0 |        92         59 |       151
           |     60.93      39.07 |    100.00
           |     75.41      71.95 |     74.02
-----------+----------------------+----------
         1 |        30         23 |        53
           |     56.60      43.40 |    100.00
           |     24.59      28.05 |     25.98
-----------+----------------------+----------
     Total |       122         82 |       204
           |     59.80      40.20 |    100.00
           |    100.00     100.00 |    100.00
Stress and social class

           |              class
    stress |       Low     Middle      Upper |     Total
-----------+---------------------------------+----------
       Low |       246         90         55 |       391
           |     62.92      23.02      14.07 |    100.00
           |     59.42      64.75      80.88 |     62.96
-----------+---------------------------------+----------
      High |       168         49         13 |       230
           |     73.04      21.30       5.65 |    100.00
           |     40.58      35.25      19.12 |     37.04
-----------+---------------------------------+----------
     Total |       414        139         68 |       621
           |     66.67      22.38      10.95 |    100.00
           |    100.00     100.00     100.00 |    100.00
What goes into the cells?
Frequencies
Cell
 Margin
 Total

Row percentages
Column percentages
Total percentages
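The cell contents listed above can be sketched in a few lines. The frequencies are the ones from the stress and social class table; the function names `margins` and `percentages` are introduced here for illustration only.

```python
# Observed frequencies from the stress-by-class table (rows: stress, columns: class).
observed = {
    "Low":  {"Low": 246, "Middle": 90, "Upper": 55},
    "High": {"Low": 168, "Middle": 49, "Upper": 13},
}

def margins(table):
    """Return row totals, column totals, and the grand total."""
    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for c, n in cells.items():
            col_totals[c] = col_totals.get(c, 0) + n
    return row_totals, col_totals, sum(row_totals.values())

def percentages(table):
    """Frequency plus row, column, and total percentage for every cell."""
    row_totals, col_totals, total = margins(table)
    stats = {}
    for r, cells in table.items():
        for c, n in cells.items():
            stats[(r, c)] = {
                "freq": n,
                "row_pct": 100 * n / row_totals[r],
                "col_pct": 100 * n / col_totals[c],
                "total_pct": 100 * n / total,
            }
    return stats

cell_stats = percentages(observed)
for (stress, cls), s in cell_stats.items():
    print(f"stress={stress:<4} class={cls:<6} n={s['freq']:>3} "
          f"row%={s['row_pct']:6.2f} col%={s['col_pct']:6.2f} "
          f"total%={s['total_pct']:6.2f}")
```

The row and column percentages printed here should match the second and third lines of each cell in the table; the total percentage is not shown in that output.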
Percentages
Independent variable - suspected cause
Dependent variable - suspected effect
Percentages should be based on the
independent or causal variable
Stress and social class

           |              class
    stress |       Low     Middle      Upper |     Total
-----------+---------------------------------+----------
       Low |       246         90         55 |       391
           |     62.92      23.02      14.07 |    100.00
           |     59.42      64.75      80.88 |     62.96
-----------+---------------------------------+----------
      High |       168         49         13 |       230
           |     73.04      21.30       5.65 |    100.00
           |     40.58      35.25      19.12 |     37.04
-----------+---------------------------------+----------
     Total |       414        139         68 |       621
           |     66.67      22.38      10.95 |    100.00
           |    100.00     100.00     100.00 |    100.00
Make comparisons
Compare categories of the independent
variable
To see effect on proportion in one
category of the dependent variable
To make comparisons we must be sure
the comparisons make sense -- are of
the same thing: not apples with
oranges!
Independence
Two variables, A and B, are
independent if p(A) = p(A|B)
p(Stress) = .37, p(Stress|Hi class) = .19
Also, note

p(s|low) = .41 p(s|mid) = .35 p(s|hi) = .19
Also note, these come from the column
percentages, which are the appropriate
ones, since class is the suspected cause
of stress.
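The comparison of p(s) with p(s|class) can be checked directly from the frequencies. This is a small sketch using the high-stress row and the column totals of the stress and social class table; the variable names are introduced here.

```python
# High-stress frequencies and column totals from the stress-by-class table.
high_stress = {"Low": 168, "Middle": 49, "Upper": 13}
class_totals = {"Low": 414, "Middle": 139, "Upper": 68}

# Unconditional probability of high stress: row margin over grand total.
p_stress = sum(high_stress.values()) / sum(class_totals.values())

# Conditional probability of high stress within each class (column proportions).
p_stress_given_class = {c: high_stress[c] / class_totals[c] for c in class_totals}

print(f"p(s) = {p_stress:.2f}")
for c, p in p_stress_given_class.items():
    print(f"p(s|{c}) = {p:.2f}  difference from p(s): {p - p_stress:+.2f}")
```

If the two variables were independent, every difference printed would be close to zero.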
Independence
If there is independence, then

p(s) = p(s|lo) = p(s|mid) = p(s|hi)
What would the frequencies be if the
two variables were independent?
p(s) = .37 = p(s|lo) = p(s|mid) = p(s|hi)
 This .37 is taken from the margin
(unconditional probability of stress)
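One way to answer the question above is to apply the marginal proportion of each stress level to every class total, which is the same as row total times column total divided by the grand total. The sketch below uses unrounded proportions, so its results can differ in the second decimal from figures worked out with a rounded value such as .37.

```python
# Margins of the stress-by-class table.
row_totals = {"Low": 391, "High": 230}                 # stress
col_totals = {"Low": 414, "Middle": 139, "Upper": 68}  # class
n = 621                                                # grand total

# Under independence every class has the same stress distribution as the margin,
# so the expected frequency is (row total / n) * column total.
expected = {
    r: {c: row_totals[r] * col_totals[c] / n for c in col_totals}
    for r in row_totals
}

for r, cells in expected.items():
    print(r, {c: round(e, 2) for c, e in cells.items()})
```

The expected frequencies in each column add up to that column's observed total, so the margins of the table are unchanged.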

Apply this

Each cell shows the observed frequency, the column percentage expected under independence, and the expected frequency.

           |              class
    stress |       Low     Middle      Upper |     Total
-----------+---------------------------------+----------
       Low |       246         90         55 |       391
           |     62.96      62.96      62.96 |     62.96
           |    260.65      87.52      42.81 |
-----------+---------------------------------+----------
      High |       168         49         13 |       230
           |     37.04      37.04      37.04 |     37.04
           |    153.35      51.48      25.19 |
-----------+---------------------------------+----------
     Total |       414        139         68 |       621
           |    100.00     100.00     100.00 |    100.00
Observed and Expected
Are they the same?

Then p(s) = p(s|class) -- Independence
Are they different?

Then p(s) ≠ p(s|class) -- Relationship
How can we tell?

(Obs - Exp)^2 / Exp
Look at parts of formula

(Obs - Exp)^2 / Exp
What if we just sum the differences without squaring?
How big is a difference of 5 points?
What happens when there are lots of cells
in the table we are looking at?
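The questions above can be explored numerically. The sketch below recomputes the expected frequencies for the stress and social class table, sums the raw differences, and then sums the squared differences scaled by the expected frequency. Summing that quantity over all cells gives what is usually called the Pearson chi-square statistic; the slides do not name it, so that label is an assumption here.

```python
# Observed frequencies: rows are stress (Low, High), columns are class (Low, Middle, Upper).
observed = [
    [246, 90, 55],
    [168, 49, 13],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequencies under independence.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Summing raw differences: positive and negative gaps cancel out.
raw_sum = sum(
    o - e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Squaring removes the sign; dividing by Exp puts each gap on the scale
# of the frequency it is being compared with.
components = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
statistic = sum(sum(row) for row in components)

print(f"sum of (Obs - Exp): {raw_sum:.6f}")
for row in components:
    print([round(x, 3) for x in row])
print(f"sum of (Obs - Exp)^2 / Exp: {statistic:.3f}")
```

Because each cell adds a non-negative term, a table with more cells tends to produce a larger sum even when the individual gaps are small.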