Chapter 9: Estimation Using a Single Sample
Confidence Intervals

Inferential Statistics
• Our study of confidence intervals begins our study of inferential statistics
• In inferential statistics, our objective is to learn about a population from a sample of data
• We use a sample of data to decrease our uncertainty about the population the sample was drawn from
• More specifically, we’ll be using samples of data to estimate unknown population parameters like $\mu$ and $\pi$

Point Estimates
• A point estimate is a single number derived from a sample of data (a statistic) that represents a plausible value for a population parameter
• First, we decide which statistic is appropriate. We then collect a random sample of data. The computed statistic is our point estimate
  – Example: $\bar{X}$ as a point estimate for $\mu$

More than one choice
• Suppose we are interested in the proportion of American voters who support gay marriage
• The natural statistic is the sample proportion
  – $p$ as an estimate for $\pi$
• Sometimes there’s more than one choice
• Sample mean, trimmed mean or median as a point estimate for the population mean
• How do you choose? Choose the statistic that tends, on average, to be closest to the true value.

Biased and Unbiased Statistics
• When there’s more than one choice, we want to choose the statistic that is most accurate
• Sampling distributions of statistics give us information about how accurate a statistic is for estimating a population parameter
• Statistics whose sampling distributions are centered on the parameter we’re trying to estimate are called unbiased
• The two unbiased statistics we’ll be studying are the sample mean and the sample proportion

Accuracy of Point Estimates
• Even though we might select an unbiased statistic, how accurate is the single number that we calculate?
• Remember sampling variability?
• Example – samples of 50 from a normal distribution
• Using an unbiased statistic with a small standard deviation guarantees no systematic tendency to underestimate or overestimate the parameter, and the estimates will tend to be relatively close to the true value

Confidence Intervals
• How accurate a point estimate is depends on which sample you happen to draw from the population
• While the point estimate from an unbiased statistic may be our best single-number guess, it’s not the only plausible estimate
• An alternative to a single-number estimate is to provide a range of values, or an interval, that we feel very confident the true value will fall into
• We call this type of estimate a confidence interval

Definition of Confidence Interval
• An interval of plausible values for the characteristic. It is constructed so that, with a chosen degree of confidence, the value of the characteristic (the parameter) will be captured in the interval

Confidence Interval
• Confidence Interval = Statistic ± Critical Value × Statistic Std Dev
• Statistic
• Standard Deviation of Sampling Distribution
• Critical Value
• Associated confidence level
  – How much confidence we have in the method used to construct the CI
  – Not our confidence in any particular interval

Basic Concept of CI
• We start with the sampling distribution of the statistic we are using
• We will be using sampling distributions that are well approximated by a normal distribution
• We take a sample and calculate a point estimate, an (unbiased) statistic, from that sample

Continuing …
• From what we know about normal distributions, about 95% of the statistics calculated from random samples will fall within about 2 sd of the mean of the sampling distribution.
• The sampling distribution of an unbiased statistic is centered on the population parameter
• If the statistic is within approx 2 sd of the sampling distribution’s mean 95% of the time, then the interval Statistic ± Critical Value × Statistic Std Dev will capture the mean of the sampling distribution 95% of the time

More …
• The width of the interval is adjusted by selecting a different confidence level
• Typical confidence levels are 90%, 95% and 99%
• The endpoints are found by adding to and subtracting from the statistic the product of the critical value (which is determined by the confidence level) and the standard deviation of the sampling distribution (sd of the statistic)

Large Sample Confidence Interval for a Population Proportion
• Parameter of interest is the population proportion $\pi$
• Statistic used is the sample proportion $p$
• Why a "large sample" CI? From last chapter, when the sample is large, the sampling distribution of $p$ is approximately normal
• How large is large? $n\pi \ge 10$ and $n(1-\pi) \ge 10$
• We know $\mu_p = \pi$ and $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$

Large Sample Confidence Interval for a Population Proportion
• Calculate a sample proportion from a random sample
  – $p = \dfrac{\text{number in sample that have the characteristic}}{n}$
• Estimate the standard deviation of $p$
  – $\sqrt{\dfrac{p(1-p)}{n}}$ (the standard error)
• Choose a confidence level – let’s say 95%
• Determine the critical value
  – Use the standard normal table
  – 1.96
• Calculate your confidence interval
  – Confidence Interval = Statistic ± Critical Value × Statistic Std Dev

Let’s do an example
• Pg 453 Problem # 9.14

In summary
• The Large Sample Confidence Interval for $\pi$
  – $p$ is the sample proportion from a random sample
  – The sample size, $n$, is large: $np \ge 10$ and $n(1-p) \ge 10$
  – The CI is $p \pm z^* \sqrt{\dfrac{p(1-p)}{n}}$
  – The desired confidence level determines which critical value is used
  – Note: This method is not appropriate for small samples

Choosing the Sample Size
• Terminology: Bound
• Confidence Interval = Statistic ± Critical Value × Statistic Std Dev
• Consider the statistic an estimate of the parameter
• Consider ‘critical value × standard deviation’ the bound on the error of your estimate
• In the case of population proportions the interval is $p \pm z^* \sqrt{\dfrac{p(1-p)}{n}}$, so the bound is $B = z^* \sqrt{\dfrac{p(1-p)}{n}}$

Finding appropriate sample size
• Consider that before you do a study, you may be asked to estimate a particular parameter to a certain degree of accuracy
• The question now is: how big a sample should I take to get a specific degree of accuracy at a certain confidence level?
• We use the ‘bound’ to determine sample size: $n = \pi(1-\pi)\left(\dfrac{z^*}{B}\right)^2$
• But the population parameter is unknown so we make a reasonable estimate – or use .5 as a conservative estimate for $\pi$
• Example – pg 454, 9.25

Confidence Interval for Population Mean
• We’ll look at these cases:
  – Population standard deviation is known
    • $n \ge 30$
    • Small sample but population is approx normal
  – Population standard deviation is unknown
    • $n \ge 30$
    • Small sample but population is approx normal

Sampling Distribution of the Sample Mean
• $\mu_{\bar{X}} = \mu$
• $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
• When the population is normal, the sampling distribution is normal regardless of sample size
• When the population is not normal, the sampling distribution is approximately normal if the sample size is large (CLT).
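The large-sample interval for a proportion and the sample-size formula from the slides above can be sketched in a few lines of Python. This is only an illustration: the function names and the voter counts are made up for the example, and the critical value is computed with the standard library's `statistics.NormalDist` instead of being read from the standard normal table.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value z* (about 1.96 for 0.95)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample interval p +/- z* sqrt(p(1-p)/n) for a population proportion."""
    p = successes / n
    # Large-sample condition from the slides, checked with p in place of pi.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("sample too small for the large-sample interval")
    bound = z_critical(confidence) * math.sqrt(p * (1 - p) / n)
    return p - bound, p + bound


def sample_size_for_proportion(bound, confidence=0.95, pi_guess=0.5):
    """n = pi(1-pi)(z*/B)^2, rounded up; pi_guess=0.5 is the conservative choice."""
    z_star = z_critical(confidence)
    return math.ceil(pi_guess * (1 - pi_guess) * (z_star / bound) ** 2)


# Made-up data: 520 of 1000 sampled voters support a measure.
low, high = proportion_ci(520, 1000)
print(f"95% CI for pi: ({low:.3f}, {high:.3f})")
print("n needed for B = 0.03:", sample_size_for_proportion(0.03))
```

With these made-up numbers the interval runs from roughly 0.489 to 0.551, and estimating a proportion to within 0.03 at 95% confidence calls for 1068 observations when .5 is used as the conservative guess for the proportion.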
Confidence Interval for Population Mean, $\sigma$ Known
• $\bar{X}$ is the sample mean from a random sample
• Sample size is large or population is approximately normal
• Population standard deviation $\sigma$ is known
• CI is: $\bar{X} \pm z^* \dfrac{\sigma}{\sqrt{n}}$

Sampling Distribution of the Sample Mean, $\sigma$ Unknown
• $\mu_{\bar{X}} = \mu$
• $s_{\bar{X}} = \dfrac{s}{\sqrt{n}}$ (the estimated standard deviation of $\bar{X}$)
• When the population is normal, the sampling distribution is normal regardless of sample size
• When the population is not normal, the sampling distribution is approximately normal if the sample size is large (CLT)

Confidence Interval for Population Mean, $\sigma$ Unknown
• $\bar{X}$ is the sample mean from a random sample
• Sample size is large or population is approximately normal
• Population standard deviation is unknown and is estimated by the sample standard deviation $s$
• CI is: $\bar{X} \pm t^* \dfrac{s}{\sqrt{n}}$, where $t^*$ is based on $n - 1$ df

Student’s t-Distribution
• Recall that a standard normal distribution is a bell-shaped distribution with parameters $\mu = 0$ and $\sigma = 1$
• The t-distribution is bell-shaped and centered on 0.
• There are many t-distributions, differentiated by their degrees of freedom, which here is $n - 1$
• Each t-curve is a little more spread out than the z-curve, but as n gets larger and larger, the t-distribution approaches the z-curve.
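The two intervals for a population mean can be compared side by side in a short Python sketch. The sample values, the function names, and the "known" value of 0.25 for the population standard deviation are all invented for illustration. The Python standard library has no t-distribution, so the t critical value is passed in as a number read from a t table; 2.262 is the tabled 95% value for 9 df.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci_known_sigma(data, sigma, confidence=0.95):
    """Interval x-bar +/- z* sigma/sqrt(n), for use when sigma is known."""
    n = len(data)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    bound = z_star * sigma / math.sqrt(n)
    x_bar = mean(data)
    return x_bar - bound, x_bar + bound


def mean_ci_unknown_sigma(data, t_star):
    """Interval x-bar +/- t* s/sqrt(n); t_star comes from a t table at n - 1 df."""
    n = len(data)
    bound = t_star * stdev(data) / math.sqrt(n)  # stdev uses the n - 1 divisor
    x_bar = mean(data)
    return x_bar - bound, x_bar + bound


# Made-up sample of 10 measurements (so df = 9).
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]

print(mean_ci_unknown_sigma(sample, t_star=2.262))  # sigma unknown, t interval
print(mean_ci_known_sigma(sample, sigma=0.25))      # sigma assumed known, z interval
```

For this sample the t interval is roughly (9.815, 10.185) and the z interval roughly (9.845, 10.155). The t interval is wider both because 2.262 exceeds 1.96 and because the sample standard deviation here happens to be slightly above the assumed 0.25.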
Student’s t-Distribution
• Recall from our study of sampling distributions the properties of the sampling distribution of $\bar{X}$
• When the population standard deviation is not known and is replaced by $s$, the standardized statistic $\dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ follows a t-distribution with $n - 1$ df (exactly when the population is normal, approximately when the sample is large)
• This distribution gives critical values a little larger than those of the normal distribution, since we don’t know the value of the population standard deviation, which introduces a little more uncertainty

t-Distribution Table
• Appendix III in the back of your textbook

Choosing the Sample Size
• When estimating the population mean using a large sample or a small sample from a normal population, the bound on the error of estimation associated with a 95% confidence level is $B = 1.96 \dfrac{\sigma}{\sqrt{n}}$
• Solving for the sample size gives $n = \left(\dfrac{1.96\,\sigma}{B}\right)^2$
• Since the population standard deviation is usually unknown we can
  – Make a best guess
  – Divide the range by 4

Degrees of Freedom
• The number of independent pieces of information that go into the estimate of a parameter
• The number of values in the calculation of a statistic that are free to vary
• The number of independent pieces of information that go into an estimate minus the number of parameters estimated as intermediate steps
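The sample-size calculation for a mean can be sketched in Python as follows. The function names and the planning values (measurements expected to fall between 20 and 60, mean to be estimated to within 2 units) are hypothetical, and the critical value is computed with `statistics.NormalDist` instead of being fixed at 1.96, so other confidence levels can be tried.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(bound, sigma_guess, confidence=0.95):
    """n = (z* sigma / B)^2, rounded up to the next whole observation."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star * sigma_guess / bound) ** 2)


def sigma_from_range(smallest, largest):
    """Rough guess for sigma from the slides: the range divided by 4."""
    return (largest - smallest) / 4


# Hypothetical planning values: data expected between 20 and 60,
# and the mean should be estimated to within B = 2 units.
sigma_guess = sigma_from_range(20, 60)
print("guess for sigma:", sigma_guess)
print("n needed:", sample_size_for_mean(bound=2, sigma_guess=sigma_guess))
```

Here the range rule gives a guess of 10 for the standard deviation, and the required sample size comes out to 97. Rounding up, not to the nearest integer, keeps the bound at or below the target.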
