Download Lecture 6 - Inferential Statistics

PPA 501 – Analytical Methods in Administration
Lecture 6a – Normal Curve, Z-Scores, and Estimation

Normal Curve
- The normal curve is central to the theory that underlies inferential statistics.
- The normal curve is a theoretical model:
  - A frequency polygon that is perfectly symmetrical and smooth.
  - Bell shaped, unimodal, with infinite tails.
- Crucial point: distances along the horizontal axis, when measured in standard deviations, always mark off the same proportion of the area under the curve.

Computing Z-Scores
- To find the percentage of the total area (or number of cases) above, below, or between scores in an empirical distribution, the original scores must be expressed in units of the standard deviation, that is, converted into Z-scores.

$$Z = \frac{X_i - \bar{X}}{s}$$

Computing Z-Scores – Mean ideology of House delegation by state

Computing Z-Scores: Examples
- What percentage of the cases fall between -0.5 and 0.0 on the ideology scale?

$$Z = \frac{X_i - \bar{X}}{s} = \frac{-0.5 - 0.01}{0.205} = \frac{-0.51}{0.205} = -2.488$$

$$Z = \frac{X_i - \bar{X}}{s} = \frac{0.00 - 0.01}{0.205} = \frac{-0.01}{0.205} = -0.049$$

- From Excel, =standardize(-0.5, 0.01, 0.205) gives Z = -2.4878; =normsdist(-2.4878) gives p = 0.006427.
- From Excel, =standardize(0.0, 0.01, 0.205) gives Z = -0.04878; =normsdist(-0.04878) gives p = 0.480547.
- P(-0.5 to 0.0) = 0.480547 - 0.006427 = 0.474120. About 47.4% of the distribution lies between -0.5 and 0.0 on the ideology scale.
- What percentage of the House delegations from 1953 to 2005 have more conservative scores than 0.5? (1 - 0.992 = 0.008, or 0.8%.)
- What percentage have more liberal scores than -0.25? (10.2%.)

| Xi | Xmean | s | Z | p |
| 0.5 | 0.01 | 0.205 | 2.390244 | 0.991581 |
| -0.25 | 0.01 | 0.205 | -1.268293 | 0.102347 |

Computing Z-scores: Rules
- If you want the area between a score and the mean: subtract the probability from 0.5 if Z is negative; subtract 0.5 from the probability if Z is positive.
- If you want the area beyond a score that is lower than the mean (that is, less than that score), use the probability from Excel directly.
- If you want the area beyond a score that is higher than the mean (that is, more than that score), subtract the Excel probability from 1.
- If you want the area between two scores, neither of which is the mean: calculate Z for each score, identify the appropriate probability, and subtract the smaller probability from the larger.

Estimation Procedures
- Bias – does the mean of the sampling distribution equal the mean of the population?
- Efficiency – how closely does the sampling distribution cluster around the mean? You can improve efficiency by increasing sample size.
- Point estimate – draw a sample, calculate a proportion or mean, and estimate that the population has the same value as the sample. There is always some probability of error.
- Confidence interval – a range around the sample mean.
  - First step: determine a confidence level, that is, how much error you are willing to tolerate. The common standard is 5% or .05: you are willing to be wrong 5% of the time in estimating populations. This figure is known as alpha or α. If an infinite number of confidence intervals are constructed, 95% will contain the population mean and 5% won't.
  - We now work in reverse on the normal curve. Divide the probability of error between the upper and lower tails of the curve (so that the 95% is in the middle), and find the Z-score that leaves 2.5% of the area under the curve in each tail. That Z-score is ±1.96.
  - The corresponding Z-scores for 90% (alpha = .10), 99% (alpha = .01), and 99.9% (alpha = .001) are ±1.65, ±2.58, and ±3.29.

$$\text{c.i.} = \bar{X} \pm Z\left(\frac{\sigma}{\sqrt{N}}\right)$$

where c.i. = confidence interval; $\bar{X}$ = the sample mean; Z = the Z score as determined by the alpha level; $\sigma/\sqrt{N}$ = the population standard error of the mean.

Estimation Procedures – Sample Mean

$$\text{c.i.} = \bar{X} \pm Z\left(\frac{s}{\sqrt{n - 1}}\right)$$

where c.i. = confidence interval; $\bar{X}$ = the sample mean; Z = the Z score as determined by the alpha level; $s/\sqrt{n - 1}$ = the standard error of the mean.

- Only use this formula if the sample is 100 or greater.
- You can control the width of the confidence intervals by adjusting the confidence level (alpha) or by adjusting sample size.

Confidence Interval Examples
Mean House ideology for presidential disaster requests (1953 to 2005) with 90%, 95%, and 99% confidence intervals.

Confidence Interval Examples from Presidential Disaster Decisions, 1953 to 2005

| Variable | Mean | Std. Deviation | No. of cases | Std. Error | Cnfd. Int. | Lower Bound | Upper Bound |
| Mean Ideology of House Delegation by State | 0.006 | 0.205 | 2493 | 0.004 | 90% | 0.000 | 0.013 |
| Mean Ideology of House Delegation by State | 0.006 | 0.205 | 2493 | 0.004 | 95% | -0.002 | 0.015 |
| Mean Ideology of House Delegation by State | 0.006 | 0.205 | 2493 | 0.004 | 99% | -0.004 | 0.017 |
| Mean Ideology of Senate Delegation by State | -0.022 | 0.300 | 2493 | 0.006 | 90% | -0.031 | -0.012 |
| Mean Ideology of Senate Delegation by State | -0.022 | 0.300 | 2493 | 0.006 | 95% | -0.033 | -0.010 |
| Mean Ideology of Senate Delegation by State | -0.022 | 0.300 | 2493 | 0.006 | 99% | -0.037 | -0.006 |

PPA 501 – Analytical Methods in Administration
Lecture 6b – One-Sample and Two-Sample Tests

Five-step Model of Hypothesis Testing
- Step 1. Making assumptions and meeting test requirements.
- Step 2. Stating the null hypothesis.
- Step 3. Selecting the sampling distribution and establishing the critical region.
- Step 4. Computing the test statistic.
- Step 5. Making a decision and interpreting the results of the test.

Five-step Model of Hypothesis Testing – One-sample Z Scores

Step 1. Making assumptions.
- Model: random sampling.
- Interval-ratio measurement.
- Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.
$$H_0: \mu_1 = \mu$$

$$H_1: \mu_1 \neq \mu \ \text{(two-tailed test)}; \quad \mu_1 > \mu \ \text{or} \ \mu_1 < \mu \ \text{(one-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = Z distribution.
- α = 0.05.
- Z(critical) = ±1.96 (two-tailed); +1.65 or -1.65 (one-tailed).

Step 4. Computing the test statistic.
- Use the Z formula.

Step 5. Making a decision.
- Compare Z(critical) to Z(obtained). If Z(obtained) is greater in magnitude than Z(critical), reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

Five-Step Model: Critical Choices
- Choice of alpha level: .05, .01, .001.
- Selection of research hypothesis.
  - Two-tailed test: the research hypothesis simply states that the means of the sample and the population are different.
  - One-tailed test: the mean of the sample is larger or smaller than the mean of the population.
- Type of error to minimize: Type I or Type II.
  - Type I – rejecting a null hypothesis that is true.
  - Type II – failing to reject a null hypothesis that is false.

Five-step Model: Example
- Is the average age of voters in the 2000 National Election Study different from the average age of all adults in the U.S. population?

Five-step Model of Hypothesis Testing – Large-sample Z Scores

Step 1. Making assumptions.
- Model: random sampling.
- Interval-ratio measurement.
- Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$$H_0: \mu_1 = \mu = 45.24$$

$$H_1: \mu_1 \neq \mu \ \text{(two-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = Z distribution.
- α = 0.05.
- Z(critical) = ±1.96 (two-tailed).

Step 4. Computing the test statistic.
$$Z(\text{obtained}) = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}} = \frac{47.21 - 45.24}{17.88 / \sqrt{1798}} = \frac{1.97}{0.4217} = 4.67$$

Step 5. Making a decision.
- Z(obtained) > Z(critical): 4.67 > 1.96.
- Reject the null hypothesis of no difference. The sample is significantly older than the voting age population.

Five-Step Model: Small Sample T-test (One Sample)

Formula:

$$t(\text{obtained}) = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}}$$

Step 1. Making assumptions.
- Random sampling.
- Interval-ratio measurement.
- Normal sampling distribution.

Step 2. Stating the null hypothesis.

$$H_0: \mu_1 = \mu$$

$$H_1: \mu_1 \neq \mu \ \text{(two-tailed test)}; \quad \mu_1 > \mu \ \text{or} \ \mu_1 < \mu \ \text{(one-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = t distribution.
- α = 0.05.
- df = N - 1.
- t(critical) from Appendix A, Table B in Agresti and Franklin.

Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}}$$

Step 5. Making a decision.
- Compare t-critical to t-obtained. If t-obtained is greater in magnitude than t-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

Five-Step Model: Small Sample T-test (One Sample) – JCHA 2000
- Is the average age of individuals in the JCHA 2000 sample survey older than the national average age for all adults? (One-tailed.)

Step 1. Making assumptions.
- Random sampling.
- Interval-ratio measurement.
- Normal sampling distribution.

Step 2. Stating the null hypothesis.

$$H_0: \mu_1 = \mu = 45.24$$

$$H_1: \mu_1 > \mu \ \text{(one-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = t distribution.
- α = 0.05.
- df = 41 - 1 = 40.
- t(critical) = 1.684.

Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X} - \mu}{s / \sqrt{N - 1}} = \frac{52.78 - 45.24}{20.866 / \sqrt{40}} = \frac{7.54}{3.299} = 2.29$$

Step 5. Making a decision.
- t(obtained) > t(critical). Therefore, reject the null hypothesis. The sample of residents from the Jefferson County Housing Authority is significantly older than the adult population of the United States.

Two-Sample Models – Large Samples
- Most of the time we do not have the population means or proportions. All we can do is compare the means or proportions of population subsamples.
- This adds the additional assumption of independent random samples.

Formula:

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{N_1 - 1} + \frac{s_2^2}{N_2 - 1}}$$

Five-Step Model – Large Two-Sample Tests (Z Distribution)

Step 1. Making assumptions.
- Model: Independent random samples.
- Interval-ratio measurement.
- Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$$H_0: \mu_1 = \mu_2$$

$$H_1: \mu_1 \neq \mu_2 \ \text{(two-tailed test)}; \quad \mu_1 > \mu_2 \ \text{or} \ \mu_1 < \mu_2 \ \text{(one-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = Z distribution.
- α = 0.05.
- Z(critical) = ±1.96 (two-tailed); +1.65 or -1.65 (one-tailed).

Step 4. Computing the test statistic.

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1 - 1} + \dfrac{s_2^2}{N_2 - 1}}}$$

Step 5. Making a decision.
- Compare Z(critical) to Z(obtained). If Z(obtained) is greater in magnitude than Z(critical), reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

Five-Step Model – Large Two-Sample Tests (Fair Housing)
- Do non-white citizens of Birmingham, Alabama, believe that discrimination is more of a problem than white citizens?

Step 1. Making assumptions.
- Model: Independent random samples.
- Interval-ratio measurement.
- Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$$H_0: \mu_1 = \mu_2$$

$$H_1: \mu_1 > \mu_2 \ \text{(one-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = Z distribution.
- α = 0.05.
- Z(critical) = +1.65 (one-tailed).

Step 4. Computing the test statistic.

$$Z(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1 - 1} + \dfrac{s_2^2}{N_2 - 1}}} = \frac{2.70 - 2.14}{\sqrt{\dfrac{1.058^2}{141} + \dfrac{0.966^2}{42}}} = \frac{0.56}{\sqrt{0.008 + 0.022}} = \frac{0.56}{0.173} = 3.224$$

Step 5. Making a decision.
- Z(obtained) is greater than Z(critical); therefore reject the null hypothesis of no difference. Non-whites believe that discrimination is more of a problem in Birmingham.

Five-Step Model – Small Two-Sample Tests
- If N1 + N2 < 100, use this formula.

$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}, \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\frac{N_1 + N_2}{N_1 N_2}}$$

Five-Step Model – Small Two-Sample Tests (t Distribution)

Step 1. Making assumptions.
- Model: Independent random samples.
- Interval-ratio measurement.
- Equal population variances: $\sigma_1^2 = \sigma_2^2$.
- Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$$H_0: \mu_1 = \mu_2$$

$$H_1: \mu_1 \neq \mu_2 \ \text{(two-tailed test)}; \quad \mu_1 > \mu_2 \ \text{or} \ \mu_1 < \mu_2 \ \text{(one-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = t distribution.
- α = 0.05.
- df = N1 + N2 - 2.
- t(critical): see Appendix A, Table B.

Step 4. Computing the test statistic.

$$t(\text{obtained}) = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\dfrac{N_1 + N_2}{N_1 N_2}}}$$

Step 5. Making a decision.
- Compare t-critical to t-obtained.
- If t-obtained is greater in magnitude than t-critical, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.

Five-Step Model – Small Two-Sample Tests (JCHA 2000)
- Did white and nonwhite residents of the Jefferson County Housing Authority have significantly different lengths of residence in 2000?

Step 1. Making assumptions.
- Model: Independent random samples.
- Interval-ratio measurement.
- Equal population variances: $\sigma_1^2 = \sigma_2^2$.
- Normal sampling distribution.

Step 2. Stating the null hypothesis (no difference) and the research hypothesis.

$$H_0: \mu_1 = \mu_2$$

$$H_1: \mu_1 \neq \mu_2 \ \text{(two-tailed test)}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = t distribution.
- α = 0.05, two-tailed.
- df = N1 + N2 - 2 = 14 + 25 - 2 = 37.
- t(critical) from Appendix B = 2.042.

Step 4. Computing the test statistic.

$$
\begin{aligned}
t(\text{obtained}) &= \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2}} \sqrt{\dfrac{N_1 + N_2}{N_1 N_2}}} \\
&= \frac{70.21 - 82.84}{\sqrt{\dfrac{14(56.337)^2 + 25(93.744)^2}{25 + 14 - 2}} \sqrt{\dfrac{25 + 14}{25(14)}}} \\
&= \frac{-12.63}{\sqrt{7138.7147}\,\sqrt{0.1114}} = \frac{-12.63}{84.4909(0.3338)} = \frac{-12.63}{28.2002} = -0.448
\end{aligned}
$$

Step 5. Making a decision.
- t(obtained) is less than t(critical) in magnitude. Fail to reject the null hypothesis. Whites and nonwhites in the JCHA 2000 survey do not have significantly different lengths of residence in public housing.

PPA 501 – Analytical Methods in Administration
Lecture 6c – Analysis of Variance

Introduction
- Analysis of variance (ANOVA) can be considered an extension of the t-test.
- The t-test assumes that the independent variable has only two categories.
- ANOVA assumes that the nominal or ordinal independent variable has two or more categories.
- The null hypothesis is that the populations from which each of the samples (categories) is drawn are equal on the characteristic measured (usually a mean or proportion).
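The statement that ANOVA extends the t-test can be checked numerically. The sketch below reuses the JCHA 2000 length-of-residence summary statistics from the small two-sample example (means 70.21 and 82.84, standard deviations 56.337 and 93.744, N = 14 and 25) and shows that, with only two categories, the F ratio equals the square of the pooled t statistic. The function names are hypothetical, and the standard deviations are assumed to enter as N·s², following this lecture's two-sample formula; the sums of squares are the ones defined in the Computation of ANOVA slides.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Small-sample two-sample t, using the lecture's pooled formula."""
    pooled_sd = math.sqrt((n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2))
    std_error = pooled_sd * math.sqrt((n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / std_error


def two_group_f(mean1, s1, n1, mean2, s2, n2):
    """F ratio for two categories, built from the same summary statistics."""
    grand_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    # Variation between categories.
    ssb = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    # Variation within categories (assumption: N * s^2, as in the t formula).
    ssw = n1 * s1**2 + n2 * s2**2
    dfb = 2 - 1
    dfw = n1 + n2 - 2
    return (ssb / dfb) / (ssw / dfw)


# JCHA 2000 length of residence, values taken from the worked example.
t_obtained = pooled_t(70.21, 56.337, 14, 82.84, 93.744, 25)
f_obtained = two_group_f(70.21, 56.337, 14, 82.84, 93.744, 25)

print(f"t(obtained) = {t_obtained:.3f}")
print(f"t squared   = {t_obtained ** 2:.4f}")
print(f"F(obtained) = {f_obtained:.4f}")
```

With two categories the two tests lead to the same decision; ANOVA becomes necessary only when the independent variable has three or more categories.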
- If the null hypothesis is correct, the means for the dependent variable within each category of the independent variable should be roughly equal.
- ANOVA proceeds by making comparisons across the categories of the independent variable.

Computation of ANOVA
- The computation of ANOVA compares the amount of variation within each category (SSW) to the amount of variation between categories (SSB).
- Total sum of squares:

$$SST = \sum (X_i - \bar{X})^2$$

$$SST = \sum X^2 - N\bar{X}^2 \quad \text{(computational formula)}$$

$$SST = SSB + SSW$$

- Sum of squares within (variation within categories):

$$SSW = \sum (X_i - \bar{X}_k)^2$$

where SSW = the sum of the squares within the categories; $\bar{X}_k$ = the mean of a category.

- Sum of squares between (variation between categories):

$$SSB = \sum N_k (\bar{X}_k - \bar{X})^2$$

where SSB = the sum of squares between the categories; $N_k$ = the number of cases in a category; $\bar{X}_k$ = the mean of a category.

- Degrees of freedom:

$$dfw = N - k, \qquad dfb = k - 1$$

where dfw = degrees of freedom associated with SSW; dfb = degrees of freedom associated with SSB; N = number of cases; k = number of categories.

- Mean square estimates:

$$\text{Mean square within} = \frac{SSW}{dfw}, \qquad \text{Mean square between} = \frac{SSB}{dfb}, \qquad F = \frac{\text{Mean square between}}{\text{Mean square within}}$$

- Computational steps for the shortcut:
  - Find SST using the computational formula.
  - Find SSB.
  - Find SSW by subtraction.
  - Calculate degrees of freedom.
  - Construct the mean square estimates.
  - Compute the F-ratio.

Five-Step Hypothesis Test for ANOVA

Step 1. Making assumptions.
- Independent random samples.
- Interval-ratio measurement.
- Normally distributed populations.
- Equal population variances.

Step 2. Stating the null hypothesis.

$$H_0: \mu_1 = \mu_2 = \dots = \mu_k$$

$$H_1: \text{at least one of the means is different}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = F distribution.
- Alpha = .05 (or .01 or . . .).
- Degrees of freedom within = N – k.
- Degrees of freedom between = k – 1.
- F-critical: use Appendix D, p. 499-500.

Step 4. Computing the test statistic.
- Use the procedure outlined above.

Step 5. Making a decision.
- If F(obtained) is greater than F(critical), reject the null hypothesis of no difference. At least one population mean is different from the others.

ANOVA – Example 1 – JCHA 2000
- What impact does marital status have on respondents' rating of JCHA services?
- Sum of Rating Squared is 615.

Report: JCHA Program Rating

| Marital Status | Mean | N | Std. Deviation |
| Married | 3.0313 | 8 | 1.70837 |
| Separated | 4.5000 | 2 | .70711 |
| Widowed | 4.6667 | 6 | .81650 |
| Never Married | 4.0556 | 9 | .79822 |
| Divorced | 3.6731 | 13 | 1.20927 |
| Total | 3.8289 | 38 | 1.25082 |

Step 1. Making assumptions.
- Independent random samples.
- Interval-ratio measurement.
- Normally distributed populations.
- Equal population variances.

Step 2. Stating the null hypothesis.

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$$

$$H_1: \text{at least one of the means is different}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = F distribution.
- Alpha = .05.
- Degrees of freedom within = N – k = 38 – 5 = 33.
- Degrees of freedom between = k – 1 = 5 – 1 = 4.
- F-critical = 2.69.

Step 4. Computing the test statistic.

ANOVA Table: JCHA Program Rating * Marital Status

| Source | Sum of Squares | df | Mean Square | F | Sig. |
| Between Groups (Combined) | 10.980 | 4 | 2.745 | 1.931 | .128 |
| Within Groups | 46.908 | 33 | 1.421 | | |
| Total | 57.888 | 37 | | | |

The same result by hand:

$$SST = \sum X^2 - N\bar{X}^2 = 615 - 38(3.8289)^2 = 615 - 557.0981 = 57.9019$$

$$
\begin{aligned}
SSB &= \sum N_k (\bar{X}_k - \bar{X})^2 \\
&= 8(3.0313 - 3.8289)^2 + 2(4.5 - 3.8289)^2 + 6(4.6667 - 3.8289)^2 + 9(4.0556 - 3.8289)^2 + 13(3.6731 - 3.8289)^2 \\
&= 5.0893 + 0.9008 + 4.2115 + 0.4625 + 0.3156 = 10.9797
\end{aligned}
$$

$$SSW = SST - SSB = 57.9019 - 10.9797 = 46.9222$$

$$dfw = N - k = 38 - 5 = 33, \qquad dfb = k - 1 = 5 - 1 = 4$$

$$\text{Mean square within} = \frac{SSW}{dfw} = \frac{46.9222}{33} = 1.4219$$

$$\text{Mean square between} = \frac{SSB}{dfb} = \frac{10.9797}{4} = 2.7449$$

$$F = \frac{\text{Mean square between}}{\text{Mean square within}} = \frac{2.7449}{1.4219} = 1.9304$$

Step 5. Making a decision.
- F(obtained) is 1.93. F(critical) is 2.69. F(obtained) < F(critical). Therefore, we fail to reject the null hypothesis of no difference. Approval of JCHA services does not vary significantly by marital status.

ANOVA – Example 2 – Presidential Disaster Data Set
- What impact does presidential administration have on the president's recommendation of disaster assistance?

Step 1. Making assumptions.
- Independent random samples.
- Interval-ratio measurement.
- Normally distributed populations.
- Equal population variances.

Step 2. Stating the null hypothesis.

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6 = \mu_7 = \mu_8 = \mu_9 = \mu_{10}$$

$$H_1: \text{at least one of the means is different}$$

Step 3. Selecting the sampling distribution and establishing the critical region.
- Sampling distribution = F distribution.
- Alpha = .05.
- Degrees of freedom within = N – k = 2642 – 10 = 2632.
- Degrees of freedom between = k – 1 = 10 – 1 = 9.
- F-critical = 1.883.

Step 4. Computing the test statistic.

Step 5. Making a decision.
- F(obtained) is 12.863. F(critical) is 1.883. F(obtained) > F(critical). Therefore, we can reject the null hypothesis of no difference.
Approval of federal disaster assistance does vary by presidential administration.
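
The shortcut procedure for ANOVA can also be sketched in code. The example below is a minimal illustration, not part of the original lecture: it reproduces the hand computation for Example 1 (JCHA program rating by marital status) from the reported category means, category sizes, the sum of squared ratings (615), and the overall mean (3.8289). The function name `anova_from_summary` and the dictionary keys are hypothetical, and the critical value 2.69 is the one quoted in the example rather than computed.

```python
def anova_from_summary(group_means, group_sizes, sum_of_squared_scores, grand_mean):
    """One-way ANOVA via the computational shortcut (SSW found by subtraction)."""
    n_total = sum(group_sizes)
    k = len(group_means)
    # Total sum of squares, computational formula: sum(X^2) - N * mean^2.
    sst = sum_of_squared_scores - n_total * grand_mean**2
    # Sum of squares between categories.
    ssb = sum(n * (m - grand_mean) ** 2 for m, n in zip(group_means, group_sizes))
    # Sum of squares within categories, by subtraction.
    ssw = sst - ssb
    dfb = k - 1
    dfw = n_total - k
    msb = ssb / dfb
    msw = ssw / dfw
    return {"SST": sst, "SSB": ssb, "SSW": ssw,
            "dfb": dfb, "dfw": dfw, "MSB": msb, "MSW": msw, "F": msb / msw}


# Example 1 values as reported in the lecture (Married, Separated, Widowed,
# Never Married, Divorced).
means = [3.0313, 4.5000, 4.6667, 4.0556, 3.6731]
sizes = [8, 2, 6, 9, 13]
result = anova_from_summary(means, sizes, sum_of_squared_scores=615, grand_mean=3.8289)

F_CRITICAL = 2.69  # from the lecture's Step 3 for df = (4, 33), alpha = .05

for name, value in result.items():
    print(f"{name}: {value:.4f}")
print("Reject H0" if result["F"] > F_CRITICAL else "Fail to reject H0")
```

Small differences from the software table in Example 1 (for instance SST of 57.90 versus 57.888) come from rounding the overall mean to four decimal places before squaring.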
