Download Hypothesis Testing - Sys

Inference
• Confidence Intervals: Estimating a population parameter.
• Tests of significance: To assess the evidence provided by data about
some claim on the population.
Test of Significance: A formal procedure for comparing observed data
with a claim (called a hypothesis) whose truth we want to assess.
• The hypothesis is a statement about a parameter, such as a mean.
• We express the results of a significance test in terms of a probability
that measures how well the data and the hypothesis agree.
Hypotheses
There are two types of hypotheses:
1. The Null Hypothesis states that there is no difference between a
parameter and a specific value. The Null Hypothesis is denoted by
the symbol Ho and is what we want to disprove.
2. The Alternative Hypothesis states that there is a difference
between a parameter and a specific value. The alternative
hypothesis is denoted by the symbol H1.
Ho Hypotheses: Examples
1. Fourth graders at a school perform equally well in math compared to fourth graders at
another school.
2. Babies born in the US are on average the same weight at birth compared to babies
born in the UK.
3. Two groups of nematode worms are treated differently; both sets of worms appear to
have the same life span.
4. Artists are no more likely to be left-handed than people in the general population.
5. On average, the dose of aspirin in a single tablet is 200 mg.
6. Women and men are equally likely to be vegetarian.
7. People are equally likely to lose weight whether they are on a protein or carbohydrate diet.
8. The percentage tip left at a family or fine dining restaurant is the same.
9. Age has no effect on mathematical ability.
10. There is no difference in pain relief after chewing willow bark versus taking a placebo.
H1 Hypotheses: Examples
Take the opposite of the previous ones.
Example
A pharmaceutical company buys raw material from Joe's Cheap Chemicals
in bags that are on average 1 kg in weight. The pharmaceutical company is
suspicious that the bags it gets are consistently lighter than 1 kg, which
allows Joe's Cheap Chemicals to make more money by packing less
material.
The company weighs 100 consecutive bags of raw material and finds
that the mean weight is 0.95 kg with a standard deviation of 0.15 kg.
Is this good evidence that the pharmaceutical company is being cheated?
Example
We want to know how likely it is to draw a sample with a mean weight
as low as 0.95 kg when the true mean is 1 kg.
Is it quite likely or unlikely?
Example
The Null Hypothesis, Ho: Joe's Cheap Chemicals fills its bags to the stated weight:
Ho: μ = 1 kg
The Alternative Hypothesis, H1: Joe's Cheap Chemicals packages its
bags lighter than they should be:
H1: μ < 1 kg
One-tailed and Two-tailed Tests
One-tailed test: Indicates that the null hypothesis should be rejected
when the test value is in the critical region on one side of the parameter
being tested.
Ho: p = po
H1: p < po or
H1: p > po
Two-tailed test: Indicates that the null hypothesis should be rejected
when the test value is in either of two critical regions.
Ho: p = po
H1: p ≠ po
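The critical regions can be sketched in code. This is a minimal sketch, assuming the test value is a z statistic that follows a standard normal distribution when Ho is true (the slide does not name the statistic). The function name `in_critical_region` and the default significance level of 0.05 are illustrative choices, not part of the slides.

```python
from statistics import NormalDist


def in_critical_region(z, alpha=0.05, tail="two"):
    """Return True when the test value z lies in the critical region.

    Assumes z is standard normal under Ho (an assumption of this sketch).
    tail is "left" (H1: <), "right" (H1: >) or "two" (H1: not equal).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        return z < std_normal.inv_cdf(alpha)
    if tail == "right":
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # alpha is split evenly between the two tails
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right' or 'two'")


# The same test value can be significant one-tailed but not two-tailed.
print(in_critical_region(-1.8, tail="left"))  # cutoff is about -1.645
print(in_critical_region(-1.8, tail="two"))   # cutoffs are about +/-1.96
```

A one-tailed test puts the whole significance level in one tail, so its cutoff is closer to zero than either cutoff of the two-tailed test.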
Example
The Null Hypothesis, Ho: Joe's Cheap Chemicals fills its bags to the stated weight:
Ho: μ = 1 kg
The Alternative Hypothesis, H1: Joe's Cheap Chemicals packages its
bags lighter than they should be:
H1: μ < 1 kg
This is a one-tailed test.
Example
What is the probability that the sample mean could, by chance,
be as low as 0.95 kg if the true mean is 1 kg?
First compute the standard error of the sample mean:
SE = s / √n = 0.15 / √100 = 0.015 kg
Example
Let us standardize the value of the sample mean, x̄, with respect
to the population mean. This gives the distance between the sample
mean and the population mean in units of the standard error:
z = (x̄ - μ) / SE = (0.95 - 1) / 0.015 ≈ -3.333
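The two steps so far (standard error, then standardizing) can be checked with a few lines of Python. This is a minimal sketch using the numbers from the example; the variable names are illustrative.

```python
import math

n = 100        # number of bags weighed
x_bar = 0.95   # sample mean weight, kg
s = 0.15       # standard deviation, kg
mu_0 = 1.0     # mean weight claimed under Ho, kg

# Standard error of the sample mean
standard_error = s / math.sqrt(n)

# Distance of the sample mean from mu_0, in standard errors
z = (x_bar - mu_0) / standard_error

print(f"standard error = {standard_error:.3f} kg")
print(f"z = {z:.3f}")
```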
Example
Find the area below -3.333 by looking up the z table.
The area is very small. That is, it is highly unlikely that
the 0.95 kg could have happened
by chance alone. Therefore we reject
the null hypothesis.
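Instead of a printed z table, the area can be computed from the standard normal distribution. This is a minimal sketch using `statistics.NormalDist` from the Python standard library. The computed area comes out slightly below the 0.0005 quoted on the following slides, which is probably a rounded table value.

```python
from statistics import NormalDist

z = -3.333  # standardized sample mean from the example

# Area under the standard normal curve to the left of z
p_value = NormalDist().cdf(z)

print(f"area below z = {z}: {p_value:.5f}")
```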
p-values
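The p-value is a measure of the strength
of the evidence against the null hypothesis.
It is the probability, computed assuming Ho is true, of getting
a result at least as extreme as the one observed.
p-values are between 0 and 1.
With small p-values we reject Ho.
But how small?
If the p-value is < 0.05 this is considered
statistically significant. The 0.0005 from the example clearly is!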
p-values
p-value                  Evidence against Ho
> 0.1                    Very weak or none
Between 0.1 and 0.05     Weak
Between 0.05 and 0.01    Strong
< 0.01                   Very strong
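The table can be written as a small lookup function. This is a sketch only: the function name `evidence_against_h0` is hypothetical, and the handling of values exactly on a boundary (for example p = 0.05) is an assumption, since the table does not say which row they belong to.

```python
def evidence_against_h0(p_value):
    """Map a p-value to the verbal scale in the table above.

    Boundary values (exactly 0.1, 0.05 or 0.01) are assigned to the
    stronger category; that choice is an assumption of this sketch.
    """
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value > 0.1:
        return "very weak or none"
    if p_value > 0.05:
        return "weak"
    if p_value > 0.01:
        return "strong"
    return "very strong"


print(evidence_against_h0(0.0005))  # the p-value from the bag example
```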
p-values
When testing a hypothesis we normally set a threshold (the significance level) to determine significance.
Common thresholds are 0.10, 0.05 and, more rarely, 0.01, which correspond to confidence levels of 90%, 95% and 99%.
If the p-value is below the chosen threshold then Ho is rejected.
Making Mistakes
Low p-values don't prove anything; they just suggest that the null
hypothesis is unlikely. The p-value of 0.0005 means that, if the true mean
were 1 kg, the chance of getting a sample mean of 0.95 kg or less is 1 in 2000.
It is therefore possible, though unlikely, that the population of bags does
indeed have a mean weight of 1 kg. If this happens, rejecting Ho is a mistake called a
Type I error.
A Type I error is committed when a true null hypothesis is
rejected.
Ho is true but we reject it
Making Mistakes
Type II error
A Type II error is committed when a false null hypothesis is
accepted when in fact it should have been rejected.
Ho is false but we accept it
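A small simulation makes the Type I error concrete. The sketch below assumes that bag weights are normally distributed with a true mean of exactly 1 kg (so Ho is true), reuses the standard deviation and sample size from the bag example, and runs a left-tailed z test at a significance level of 0.05. The normality assumption, the number of trials and the seed are illustrative choices, not part of the slides.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

MU_0 = 1.0     # true mean weight in kg, so Ho is true
SIGMA = 0.15   # standard deviation in kg
N = 100        # bags per sample
ALPHA = 0.05   # significance level

cutoff = NormalDist().inv_cdf(ALPHA)  # left-tail critical value
standard_error = SIGMA / math.sqrt(N)

trials = 2000
rejections = 0
for _ in range(trials):
    sample = [random.gauss(MU_0, SIGMA) for _ in range(N)]
    z = (fmean(sample) - MU_0) / standard_error
    if z < cutoff:
        rejections += 1  # a true Ho was rejected: Type I error

type_i_rate = rejections / trials
print(f"rejected a true Ho in {type_i_rate:.1%} of trials")
```

The rejection rate should land near the significance level, because the significance level is the probability of a Type I error when Ho is true.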
Exercise 1
The standard bag of M&Ms candies weighs 47.9 grams.
14 bags are picked at random and weighed. The sample mean is 48.07 grams
and the standard deviation of all M&M bags is known to be 0.22 grams.
The sample is on the small side for this test but we'll continue anyway.
Determine whether the M&Ms bags contain the claimed
amount of 47.9 grams, at the 0.05 significance level.
Exercise 1
State Ho and H1
Ho: μ = 47.9 grams
H1: μ ≠ 47.9 grams
Exercise 1
Compute the standard error:
SE = σ / √n = 0.22 / √14 ≈ 0.0588
Exercise 1
Compute the z statistic:
z = (x̄ - μ) / SE = (48.07 - 47.9) / 0.0588 ≈ 2.89
Exercise 1
Is the null hypothesis rejected?
Exercise 1
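Note that this is a two-tailed test at the 0.05 significance level, therefore we
split that into 0.025 for each tail.
The area above 2.89 (or below -2.89) is 0.0019. For the two-tailed test we double this
to yield 0.0038.
This value is less than 0.05, therefore the null hypothesis is rejected.
The bags do not contain the claimed 47.9 grams. Since the sample mean is above
the claim, we have not been shortchanged by M&M: the bags appear to hold slightly more.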
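The whole of Exercise 1 can be checked in one short script. This is a minimal sketch using the numbers from the exercise; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_0 = 47.9    # claimed mean weight, grams (Ho)
x_bar = 48.07  # sample mean, grams
sigma = 0.22   # population standard deviation, grams
n = 14         # bags sampled
alpha = 0.05   # significance level

standard_error = sigma / math.sqrt(n)
z = (x_bar - mu_0) / standard_error

# Two-tailed test: take the area beyond |z| in one tail, then double it
one_tail_area = 1 - NormalDist().cdf(abs(z))
p_value = 2 * one_tail_area
reject_h0 = p_value < alpha

print(f"z = {z:.2f}")
print(f"two-tailed p-value = {p_value:.4f}")
print("reject Ho" if reject_h0 else "do not reject Ho")
```

Comparing the doubled area with 0.05 is equivalent to comparing the one-tail area with 0.025, as described above.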
Exercise 2
1,500 cows were fed a special high-protein grain for a month.
A random sample of 29 were weighed and had gained an average of 6.7 pounds.
If the standard deviation of weight gain for the entire herd is 7.1 pounds, test
the hypothesis that the average weight gain per cow for the month was
more than 5 pounds.
State Ho and H1. One-tailed or two-tailed test?
Exercise 3
In national use, a vocabulary test is known to have a mean score of 68
and a standard deviation of 13. A class of 19 students takes the test and
has a mean score of 65.
Is the class typical of others who have taken the test? Assume a
significance level of 0.05.
State Ho and H1. One-tailed or two-tailed test?
Exercise 4
A manager claims that average sales for her shop are $1800 a day during winter
months. 10 winter days are selected at random, and the mean of the sales is
$1830. The standard deviation of the population is $200.
Can one reject the claim at a significance level of 0.05?
State Ho and H1. One-tailed or two-tailed test?
Exercise 5
In the population, the average IQ is 100 with a standard deviation of 15.
A team of scientists wants to test a new medication to see if it has either a
positive or negative effect on intelligence, or no effect at all. A sample of
30 participants who have taken the medication has a mean of 140. Did
the medication affect intelligence, using a 0.05 significance level?
What are Ho and H1? One-tailed or two-tailed test?