Sociology 5811:
Lecture 9: CI / Hypothesis Tests
Copyright © 2005 by Evan Schofer
Do not copy or distribute without
permission
Announcements
• Problem Set #3 Due next week
• Problem set posted on course website
• We are a bit ahead of reading assignments in
Knoke book
• Try to keep up; read ahead if necessary
Review: Confidence Intervals
• General formula for Confidence Interval:
C.I.: Ȳ ± Z_(α/2) · (σ_Ȳ)
• Where:
• Ȳ (Y-bar) is the sample mean
• σ_Ȳ (sigma sub-Y-bar) is the standard error of the mean
• Z_(α/2) is the critical Z-value for a given level of confidence
– If you want 90% confidence, look up the Z that leaves α/2 = .05 in each tail (i.e., 45% of the area between the mean and Z)
– See Knoke, Figure 3.5 on page 87 for info
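Not part of the lecture: a minimal Python sketch (assuming the scipy library) of the same lookup done in software rather than in a Z-table.

```python
# A minimal sketch: find the critical Z for a given confidence level
# with scipy instead of a printed Z-table.
from scipy.stats import norm

confidence = 0.90
alpha = 1 - confidence
z_crit = norm.ppf(1 - alpha / 2)   # Z that leaves alpha/2 in the upper tail
print(round(z_crit, 3))            # 1.645 for 90% confidence (1.960 for 95%)
```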
Small N Confidence Intervals
• Issue: What if N is not large?
• The sampling distribution may not be normal
• Z-distribution probabilities don’t apply…
• Standard CI formula doesn’t work
• Solution: Use the “T-Distribution”
• A different curve that accurately approximates the shape of
the sampling distribution for small N
• Result: We can look up values in a “t-table” to determine
probabilities associated with a # of standard deviations from
the mean.
Confidence Intervals for Small N
• Small N C. I. Formula:
• Yields accurate results, even if N is not large
C.I.: Ȳ ± t_(α/2) · (σ̂_Ȳ)
• Again, the standard error can be estimated by the
sample standard deviation:
C.I.: Ȳ ± t_(α/2) · (s / √N)
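As a supplement, here is a small Python sketch (scipy assumed; the sample numbers are made up purely for illustration) of the small-N confidence interval formula above.

```python
# A minimal sketch of the small-N interval: C.I. = Ybar ± t_(α/2) · s/√N.
# The data below (n, ybar, s) are made-up illustrative values.
from math import sqrt
from scipy.stats import t

n, ybar, s = 25, 50.0, 10.0             # sample size, sample mean, sample std. dev.
alpha = 0.05                            # for a 95% confidence interval
se = s / sqrt(n)                        # estimated standard error of the mean
t_crit = t.ppf(1 - alpha / 2, df=n - 1) # 2.064 for df = 24
lower, upper = ybar - t_crit * se, ybar + t_crit * se
print(round(lower, 2), round(upper, 2))
```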
T-Distributions
• The T-distribution is a “family” of distributions
• In a T-Distribution table, you’ll find many T-distributions to
choose from
– Basically, the shape of the sampling distribution varies with the size of your sample
• You need a specific t-distribution depending on sample size
• One t-distribution for each “degree of freedom”
– Also called “df” or “DofF”
• Which T-distribution should you use?
• For confidence intervals: Use T-distribution for
df = N - 1
• Ex: If N = 15, then look at T-distribution for df = 14.
Looking Up T-Tables
[Figure: excerpt of a t-table]
• Choose the desired probability for α/2
• Choose the correct df (N-1)
• Find the t-value in the correct row and column
• Interpretation is just like a Z-score: 2.145 (df = 14) = number of standard errors for the C.I.!
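A quick software check of that table lookup, assuming scipy is available:

```python
# A sketch of the same lookup done in software: critical t for a 95% C.I.
# with df = 14.
from scipy.stats import t

print(round(t.ppf(0.975, df=14), 3))   # 2.145, the value read from the t-table
```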
Answering Questions…
• Knowledge of the standard error allows us to
begin answering questions about populations
• Example: National educational standard requires
all schools to maintain a test score average of 60
• You observe that a sample (N=16, s=6) has a mean of 62
• Question: Are you confident that the school
population is above the national standard?
• We know Y-bar for the sample, but what about m for the
whole school?
• Are we confident that m > 60?
Question: Is m > 60?
• Strategy 1: Construct a confidence interval
around Y-bar
• And, see if the bounds fall above 60
• Visually: Confident that m > 60:
Y
58 59 60 61 62 63 64 65
• Visually: m might be 60 or less
66
Y
58 59 60 61 62 63 64 65
66
Question: Is m > 60?
• Strategy 1: Construct a confidence interval
around Y-bar
– Let’s choose a desired confidence level of .95
– N of 16 is “small”… we must use the t-distribution,
not the Z-distribution
– Look up the t-value for 15 degrees of freedom (N-1).
Looking Up T-Tables
[Figure: excerpt of a t-table]
• Choose the desired probability for α/2 (.025)
• Choose the correct df: (N-1) = 15
• Find the t-value in the correct row and column
• Result: t = 2.131
Question: Is m > 60?
• Strategy 1: Construct a confidence interval around Y-bar
C.I.: Ȳ ± t_(α/2) · (σ̂_Ȳ) = Ȳ ± t_(α/2) · (s / √N)
62 ± 2.131 · (6 / √16) = 62 ± 2.131 · 1.5 = 62 ± 3.20
• CI is 58.80 to 65.20! We aren’t confident m > 60
[Figure: the interval from 58.8 to 65.2 on a number line from 58 to 66; 60 falls inside it]
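The same interval in a short Python sketch (scipy assumed), using the N = 16, s = 6, Y-bar = 62 figures from the example:

```python
# A sketch of the C.I. just computed: N = 16, s = 6, Ybar = 62.
from math import sqrt
from scipy.stats import t

n, ybar, s = 16, 62.0, 6.0
se = s / sqrt(n)                          # 6 / 4 = 1.5
t_crit = t.ppf(0.975, df=n - 1)           # 2.131 for df = 15
print(round(ybar - t_crit * se, 2),       # lower bound, about 58.8
      round(ybar + t_crit * se, 2))       # upper bound, about 65.2 -- 60 is inside
```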
Question: Is m > 60?
• Note #1: Results would change if we used a
different confidence level
• A 95% CI and a 50% CI can yield different conclusions:
[Figure: two confidence intervals of different widths around Y-bar on a number line from 58 to 66; the wider interval crosses 60, the narrower one does not]
• Idea: Wouldn’t it be nice to know exactly which
CI would describe the distance from Y-bar to m?
• i.e., to calculate the exact probability of Y-bar falling a
certain distance from m?
Question: Is m > 60?
• Note #2: We typically draw CIs around Y-bar
– But, we can also get the same result focusing on our
comparison point (Y = 60)
• Example: If 60 is outside of CI around Y-bar
[Figure: CI around Y-bar on a number line from 58 to 66; 60 lies outside the interval]
• Then, Y-bar is outside of the CI around 60
[Figure: CI of the same width centered on 60; Y-bar lies outside it]
Question: Is m > 60?
• The critical issue is: how large is the distance
between Y-bar and 60?
– Is it “far” compared to the width of the sampling
distribution?
• Ex: Y-bar is more than 2 Standard Errors from 60?
• In which case, the school probably exceeds the standard
– Or, is it relatively close?
• Ex: Y-bar is only .5 Standard Errors from 60
• In which case we aren’t confident…
– Note: If we know the sampling distribution is normal
(or t-distributed), we can convert SE’s to a probability
Question: Is m > 60?
• Strategy 2: Determine the probability of observing Y-bar =
62 (or higher), if m is really 60 or less
• Procedure:
– 1. Use Y=60 as a reference point
– 2. Determine how far Y-bar is from 60, measured in
Standard Errors
• Which we can convert to a probability
– 3. Issue: Is it likely to observe a Y-bar as high as 62?
• If this is common to observe, even when m = 60 (or less),
then we can’t be confident that m > 60!
• But, if that is a rare event, we can be confident that m > 60!
Question: Is m > 60?
• Strategy 2: Look at sampling distribution
• Confident that m is not 60 or less: m is unlikely to really be 60… because Y-bar usually falls near the center of the sampling distribution!
[Figure: sampling distribution on a number line from 58 to 66; Y-bar = 62 sits far from 60 relative to its width]
• Visually: m might easily be < 60: in this case, it is common to get Y-bars of 62 or even higher
[Figure: wider sampling distribution on the same number line; Y-bar = 62 is not unusual]
Question: Is m > 60?
• Issue: How do we tell where Y-bar falls within
the sampling distribution?
• Strategy: Compute a Z-score
• Recall: Z-scores help locate the position of a case
within a distribution
• It can tell us how far a Y-bar falls from the center of the
sampling distribution
• In units of “standard errors”!
• Probability can be determined from a Z-table
• Note: for small N, we call it a t-score, look up in a t-table.
Question: Is m > 60?
• Note: We use a slightly modified Z formula
(Yi  Y )
(Y  μ)
Zi 

sY
σY
• “Old” formula calculates # standard deviations a
case falls from the sample mean
• From Y-sub-i to Y-bar
• New formula tells the number of standard errors a
mean estimate falls from the population mean m
• Distance from Y-bar to m in the sampling distribution
• In this case we compare to hypothetical m = 60.
Question: Is m > 60?
• Let’s calculate how far Y-bar falls from m
– Since N is small, we call it a “t-score” or “t-value”
(Y  μ) (62  60)
t

 2 / σ̂ Y
σY
σ̂ Y
s
6
σ̂ Y 
  1.5
N 4
t  2 / σ̂ Y  2 / 1.5  1.333
• Y-bar is 1.33 standard errors above 60!
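The same arithmetic as a small Python sketch (nothing beyond the standard library is needed here):

```python
# A sketch of the t-score computation: how many standard errors is Ybar from 60?
from math import sqrt

ybar, mu0, s, n = 62.0, 60.0, 6.0, 16
se = s / sqrt(n)              # 6 / 4 = 1.5
t_obs = (ybar - mu0) / se     # 2 / 1.5 = 1.333
print(round(t_obs, 3))        # 1.333 standard errors above 60
```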
Question: Is m > 60?
• Question: What is the probability of t > 1.33?
• (i.e., of Y-bar falling 1.333 or more standard errors above m)
[Figure: sampling distribution centered at 60 on a number line from 58 to 66; the shaded area beyond Y-bar = 62 reflects the probability]
• Result: p = about .105
• Note: Knoke t-table doesn’t contain this range… have to
look it up elsewhere or use SPSS to calculate probability.
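As an alternative to SPSS, a Python sketch (scipy assumed) can supply the probability directly:

```python
# A sketch of the probability lookup in software: P(t > 1.333) with df = 15.
from scipy.stats import t

p = t.sf(1.333, df=15)        # upper-tail (survival) probability
print(round(p, 3))            # roughly 0.10, in line with the value cited above
```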
Question: Is m > 60?
• Result: p = .105
• In other words, if m = 60, we will observe Y-bar
of 62 or greater about 10% of the time
• Conclusion: It is plausible that m is 60 or lower
• We are not 95% confident that m > 60
• Conclusion matches result from confidence interval
• We have just tested a claim using inferential
statistics!
Hypothesis Testing
• Hypothesis Testing:
• A formal language and method for examining
claims using inferential statistics
– Designed for use with probabilistic empirical
assessments
• Because of the probabilistic nature of inferential
statistics, we cannot draw conclusions with
absolute certainty
– We cannot “prove” our claims are “true”
– However improbable, we will occasionally draw an
unrepresentative sample, even if it is random
Hypothesis Testing
• The logic of hypothesis testing:
• We cannot “prove” anything
• Instead, we will cast doubt on other claims, thus
indirectly supporting our own
• Strategy:
• 1. We first state an “opposing” claim
• The opposite of what we want to claim
• 2. If we can cast sufficient doubt on it, we are
forced (grudgingly) to accept our own claim.
Hypothesis Testing
• Example: Suppose we wish to argue that our
school is above the national standard
• First we state the opposite:
• “Our school is not above the national standard”
• Next we state our alternative:
• “Our school is above the national standard”
• If our statistical analysis shows that the first claim
is highly improbable, we can “reject” it, in favor
of the second claim
• …“accepting” the claim that our school is doing well.
Hypothesis Testing: Jargon
• Hypotheses: Claims we wish to test
• Typically, these are stated in a manner specific
enough to test directly with statistical tools
– We typically do not test hypotheses such as “Marx
was right” / “Marx was wrong”
– Rather: The mean years of education for Americans
is/is not above 18 years.
Hypothesis Testing: Jargon
• The hypothesis we hope to find support for is
referred to as the alternate hypothesis
• The hypothesis counter to our argument is
referred to as the null hypothesis
• Null and alternative hypotheses are denoted as:
• H0: School does not exceed the national standard
• H-zero indicates null hypothesis
• H1: School does exceed national standard
• H-1 indicates the alternate hypothesis
• Sometimes called: “Ha”
Hypothesis Testing: More Jargon
• If evidence suggests that the null hypothesis is
highly improbable, we “reject” it
• Instead, we “accept” the alternative hypothesis
• So, typically we:
• Reject H0, accept H1
– Or:
• Fail to reject H0, do not find support for H1
• That was what happened in our example earlier today…
Hypothesis Testing
• In order to conduct a test to evaluate hypotheses,
we need two things:
• 1. A statistical test that reflects on how probable our
observed result would be if H0 (rather than H1) were true
• Here, we used a z-score/t-score to determine that
probability
• 2. A pre-determined level of probability below
which we feel safe in rejecting H0 (a)
• In the example, we wanted to be 95% confident… a =.05
• But, the probability was .10, so we couldn’t conclude that
the school met the national standard!
Hypothesis Test for the Mean
• Example: Laundry Detergent
• Suppose we work at the Tide factory
• We know the “cleaning power” of Tide detergent
exactly: it is 73 on a continuous scale.
• “Cleaning Power” of Tide = 73
• You conduct a study of a competitor. You buy 50
bottles of generic detergent and observe a mean
cleaning power of 65
• H0: Tide is no better than competitor (m >= 73)
• H1: Tide is better than competitor (m < 73)
Hypothesis Test: Example
• It looks like Tide is better:
• Cleaning power is 73, versus 65 for a sample of
the competition
• Question: Can we reject the null hypothesis and
accept the alternate hypothesis?
• Answer: No! It is possible that we just drew an
atypical sample of generic detergent. The true
population mean for generics may be higher.
Hypothesis Test: Example
• We need to use our statistical knowledge to
determine:
• What is the probability of drawing a sample
(N=50) with mean of 65 from a population of
mean 73 (the mean for Tide)
• If that is a probable event, we can’t draw very
strong conclusions…
• But, if the event is very improbable, it is hard to
believe that the population of generics is as high
as that of Tide…
• We have grounds for rejecting the null hypothesis.
Hypothesis Test: Example
• How would we determine the probability (given
an observed mean of 65) that the population mean
of generic detergent is really 73?
• Answer: We apply the Central Limit Theorem to
determine the shape of the sampling distribution
• And then calculate a Z-value or T-value based on it
• Suppose we choose an alpha (a) of .05:
• If we observe a t-value with probability of only
.0023, then we can reject the null hypothesis.
• If we observe a t-value with probability of .361,
we cannot reject the null hypothesis
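A sketch of this test in Python (scipy assumed). The slide gives no sample standard deviation, so s = 10 below is a made-up value, used only to illustrate the mechanics:

```python
# A sketch of the detergent test; s = 10 is a made-up illustrative value
# (the lecture does not give one).
from math import sqrt
from scipy.stats import t

n, ybar, mu0, s = 50, 65.0, 73.0, 10.0     # generic-detergent sample vs. Tide's 73
se = s / sqrt(n)
t_obs = (ybar - mu0) / se                  # about -5.7 with these numbers
p = t.cdf(t_obs, df=n - 1)                 # lower-tail p for H1: mu < 73
print(round(t_obs, 2), p < 0.05)           # True -> reject H0 at alpha = .05
```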
Hypothesis Test: Steps
• 1. State the research hypothesis (the “alternate
hypothesis”), H1
• 2. State the null hypothesis, H0
• 3. Choose an a-level (alpha-level)
– Typically .05, sometimes .10 or .01
• 4. Look up value of test statistic corresponding to
the a-level (called the “critical value”)
• Example: find the “critical” t-value associated with a=.05
Hypothesis Test: Steps
• 5. Use statistics to calculate a relevant test
statistic.
– T-value or Z-value
– Soon we will learn additional ones
• 6. Compare test statistic to “critical value”
– If test statistic is greater, we reject H0
– If it is smaller, we cannot reject H0
Hypothesis Test: Steps
• Alternate steps:
• 3. Choose an alpha-level
• 4. Get software to conduct relevant statistical
test.
– Software will compute test statistic and provide a
probability… the probability of observing a test
statistic of a given size.
– If this is lower than alpha, reject H0
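A rough Python sketch of these software-based steps (scipy assumed; the helper function and its name are mine, not from the lecture):

```python
# A sketch of the alternate steps: compute the test statistic and its
# probability from summary statistics, then compare that probability to alpha.
from math import sqrt
from scipy.stats import t

def one_sample_t(ybar, s, n, mu0, alpha=0.05, tails=2):
    """Return (t, p, reject H0?) for a one-sample test of the mean."""
    t_obs = (ybar - mu0) / (s / sqrt(n))
    if tails == 2:
        p = 2 * t.sf(abs(t_obs), df=n - 1)
    else:
        # one-tailed p, assuming the result lies in the hypothesized direction
        p = t.sf(abs(t_obs), df=n - 1)
    return t_obs, p, p < alpha

# The school example from earlier: t = 1.33, p is about .10, so we do not reject H0
print(one_sample_t(62, 6, 16, 60, alpha=0.05, tails=1))
```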
Hypothesis Test: Errors
• Due to the probabilistic nature of such tests, there
will be periodic errors.
• Sometimes the null hypothesis will be true, but
we will reject it
– Our alpha-level determines the probability of this
• Sometimes we do not reject the null hypothesis,
even though it is false
Hypothesis Test: Errors
• When we falsely reject H0, it is called a Type I
error
• When we falsely fail to reject H0, it is called a
Type II error
• In general, we are most concerned about Type I
errors… we try to be conservative.
Hypothesis Tests About a Mean
• What sorts of hypothesis tests can one do?
• 1. Test the hypothesis that a population mean is
NOT equal to a certain value
– Null hypothesis is that the mean is equal to that value.
• 2. Population mean is higher than a value
– Null hypothesis: mean is equal or less than a value
• 3. Population mean is lower than a value
– Null hypothesis: mean is equal or greater than a value
• Question: What are examples of each?
Hypothesis Tests About Means
• Example: Bohrnstedt & Knoke, section 3.93, pp.
108-110. N = 1015, Y-bar = 2.91, s=1.45
• H0: Population mean m = 4
• H1: Population mean m not = 4
• Strategy:
• 1. Choose Alpha (let’s use .001)
• 2. Determine the Standard Error
• 3. Use the S.E. to determine the range in which
sample means (Y-bar) are likely to fall 99.9% of the
time, IF the population mean is 4.
• 4. If observed mean is outside range, reject H0
Example: Is m =4?
• Let’s determine how far Y-bar is from hypothetical m=4
• In units of standard errors
(Y  μ) (2.91  4)  1.09
t


σY
σ̂ Y
σ̂ Y
sY
1.45
σ̂ Y 

 .046
N
1015
t  1.09 / .046  24.0
• Y-bar is 24 standard errors below 4.0!
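The same arithmetic as a short Python sketch:

```python
# A sketch reproducing the Bohrnstedt & Knoke example arithmetic.
from math import sqrt

n, ybar, s, mu0 = 1015, 2.91, 1.45, 4.0
se = s / sqrt(n)              # about .046
t_obs = (ybar - mu0) / se     # about -24
print(round(se, 3), round(t_obs, 1))
```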
Hypothesis Tests About a Mean
• A Z-table (if N is large) or a T-table will tell us
probabilities of Y-bar falling Z (or T) standard
deviations from m
• In this example, the desired a = .001
• Which corresponds to t=3.3 (taken from t-table)
– That is: .001 (i.e., .1%) of samples (of size 1015) fall
beyond 3.29 standard errors of the population mean
– 99.9% fall within 3.29 S.E.’s.
Hypothesis Tests About a Mean
• There are two ways to finish the “test”
• 1. Compare “critical t” to “observed t”
– Critical t is 3.3, observed t = -24
• We reject H0: t of +/-24 is HUGE, very improbable
• It is highly unlikely that m = 4
• 2. Actually calculate the probability of observing
a t-value of 24, compare to pre-determined a
• If observed probability is below a, reject H0
– In this case, the probability of t = ±24 is .0000000000000…
• Very improbable. Reject H0!
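Both ways of finishing the test, sketched in Python (scipy assumed, using the t just computed):

```python
# A sketch of the two ways to finish the test: compare the observed t to the
# critical t, or compare its probability to alpha = .001.
from scipy.stats import t

df, t_obs, alpha = 1014, -24.0, 0.001
t_crit = t.ppf(1 - alpha / 2, df)          # about 3.3 for a two-tailed test
p = 2 * t.sf(abs(t_obs), df)               # two-tailed probability, essentially 0
print(abs(t_obs) > t_crit, p < alpha)      # True, True -> reject H0 either way
```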
Two-Tail Tests
• Visually: Most Y-bars should fall near m
• 99.9% CI: –3.3 < t < 3.3, or 3.85 to 4.15
[Figure: sampling distribution of the mean centered at 4, with cutoffs at 3.85 (Z = −3.3) and 4.15 (Z = +3.3); the observed mean of 2.91 (t = −24) is far into the red rejection area, beyond the edge of the graph]
Hypothesis Tests About a Mean
• Note: This test was set up as a “two-tailed test”
• Meaning, that we reject H0 if observed Y-bar falls in either
tail of the sampling distribution
• Ex: Very high Y-bar or very low Y-bar means reject H0
– Not all tests are done that way… Sometimes you only
reject H0 if Y-bar falls in one particular tail.
Hypothesis Testing
• Definition: Two-tailed test: A hypothesis test in
which the a-area of interest falls in both tails of a
Z or T distribution.
• Example: H0: m = 4; H1: m ≠ 4
• Definition: One-tailed test: A hypothesis test in
which the a-area of interest falls in just one tail
of a Z or T distribution.
• Example: H0: m ≥ 4; H1: m < 4
• This is called a “directional” hypothesis test.
Hypothesis Tests About Means
• A one-tailed test: H1: m < 4
• Entire a-area is on left, as opposed to half (a/2)
on each side. Also, critical t-value changes.
[Figure: sampling distribution centered at 4 with the entire α-area shaded in the left tail]
Hypothesis Tests About Means
• The critical t-value changes because the alpha area (e.g., 5%)
is all concentrated in one side of the distribution,
rather than split half and half.
[Figure: one-tailed test with α = .05 entirely in one tail vs. two-tailed test with α/2 = .025 in each tail]
Hypothesis Tests About Means
• Use one-tailed tests when you have a directional
hypothesis
– e.g., m > 5
• Otherwise, use 2-tailed tests
• Note: In many instances, you are more likely to
reject the null hypothesis when using a one-tailed test
– Concentrating the alpha area in one tail reduces the
critical t-value needed to reject H0
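A small Python sketch (scipy assumed) of why the one-tailed critical value is smaller:

```python
# A sketch comparing one-tailed and two-tailed critical t-values: with all of
# alpha in a single tail, the cutoff needed to reject H0 is smaller.
from scipy.stats import t

df, alpha = 15, 0.05
print(round(t.ppf(1 - alpha, df), 3))       # one-tailed critical t, about 1.753
print(round(t.ppf(1 - alpha / 2, df), 3))   # two-tailed critical t, about 2.131
```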
Tests for Differences in Means
• A more useful and interesting application of these
same ideas…
• Hypothesis tests about the means of two
different groups
– Up until now, we’ve focused on a single mean for a
homogeneous group
– It is more interesting to begin to compare groups
– Are they the same? Different?
• We’ll do that next class!