Chapter 7 Point Estimation of Parameters

7-1 INTRODUCTION

- If X is a random variable with probability distribution f(x), characterized by the unknown parameter θ, and if X₁, X₂, ..., Xₙ is a random sample of size n from X, the statistic Θ̂ = h(X₁, X₂, ..., Xₙ) is called a point estimator of θ.

Definition
A point estimate of some population parameter θ is a single numerical value θ̂ of a statistic Θ̂. The statistic Θ̂ is called the point estimator.

- As an example, suppose that the random variable X is normally distributed with an unknown mean µ. The sample mean is a point estimator of the unknown population mean µ; that is, µ̂ = X̄.
- After the sample has been selected, the numerical value x̄ is the point estimate of µ. Thus, if x₁ = 25, x₂ = 30, x₃ = 29, and x₄ = 31, the point estimate of µ is

  x̄ = (25 + 30 + 29 + 31) / 4 = 28.75

- Similarly, if the population variance σ² is also unknown, a point estimator for σ² is the sample variance S².

We often need to estimate:
- The mean µ of a single population.
- The variance σ² (or standard deviation σ) of a single population.
- The proportion p of items in a population that belong to a class of interest.
- The difference in means of two populations, µ₁ − µ₂.
- The difference in two population proportions, p₁ − p₂.

Reasonable point estimates of these parameters are as follows:
- For µ, the estimate is µ̂ = x̄, the sample mean.
- For σ², the estimate is σ̂² = s², the sample variance.
- For p, the estimate is p̂ = x/n, the sample proportion, where x is the number of items in a random sample of size n that belong to the class of interest.
- For µ₁ − µ₂, the estimate is µ̂₁ − µ̂₂ = x̄₁ − x̄₂, the difference between the sample means of two independent random samples.
- For p₁ − p₂, the estimate is p̂₁ − p̂₂, the difference between two sample proportions computed from two independent random samples.

7-2 GENERAL CONCEPTS OF POINT ESTIMATION

7-2.1 Unbiased Estimators

Definition
The point estimator Θ̂ is an unbiased estimator for the parameter θ if

  E(Θ̂) = θ    (7-1)

If the estimator is not unbiased, then the difference

  E(Θ̂) − θ    (7-2)

is called the bias of the estimator Θ̂.

EXAMPLE 7-1
- X is a random variable with mean µ and variance σ².
- Let X₁, X₂, ..., Xₙ be a random sample of size n from the population represented by X.
- Show that the sample mean X̄ and the sample variance S² are unbiased estimators of µ and σ², respectively.
- First consider the sample mean. In Equation 5-40a in Chapter 5, we showed that E(X̄) = µ. Therefore, the sample mean X̄ is an unbiased estimator of the population mean µ.
- Now consider the sample variance. With all sums running from i = 1 to n, we have

  E(S²) = E[ Σᵢ (Xᵢ − X̄)² / (n − 1) ]
        = (1/(n − 1)) E[ Σᵢ (Xᵢ² + X̄² − 2X̄Xᵢ) ]
        = (1/(n − 1)) E[ Σᵢ Xᵢ² + nX̄² − 2nX̄·X̄ ]
        = (1/(n − 1)) E[ Σᵢ Xᵢ² − nX̄² ]
        = (1/(n − 1)) [ Σᵢ E(Xᵢ²) − nE(X̄²) ]

- The last equality follows from Equation 5-37 in Chapter 5. Since E(Xᵢ²) = µ² + σ² and E(X̄²) = µ² + σ²/n (see Equation 5-40b), we have

  E(S²) = (1/(n − 1)) [ Σᵢ (µ² + σ²) − n(µ² + σ²/n) ]
        = (1/(n − 1)) (nµ² + nσ² − nµ² − σ²)
        = σ²

- Therefore, the sample variance S² is an unbiased estimator of the population variance σ².
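As a quick numerical illustration of Example 7-1 (not part of the original notes, and assuming NumPy is available), the sketch below repeatedly draws samples from a normal population with known µ and σ² and averages the sample mean and the sample variance over many replications; both averages settle near the true parameter values, while the divisor-n variance does not.

```python
import numpy as np

# Monte Carlo check of Example 7-1: the sample mean and the sample
# variance (divisor n-1) are unbiased for mu and sigma^2.
rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)                 # sample mean of each replication
s2 = samples.var(axis=1, ddof=1)            # sample variance with n-1 divisor

print(xbar.mean())                          # close to mu = 10.0
print(s2.mean())                            # close to sigma^2 = 4.0
print(samples.var(axis=1, ddof=0).mean())   # n divisor: biased, near (n-1)/n * sigma^2 = 3.2
```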
- Sometimes there are several unbiased estimators of the same population parameter. For example, suppose we take a random sample of size n = 10 from a normal population and obtain the data x₁ = 12.8, x₂ = 9.4, x₃ = 8.7, x₄ = 11.6, x₅ = 13.1, x₆ = 9.8, x₇ = 14.1, x₈ = 8.5, x₉ = 12.1, x₁₀ = 10.3.
- The sample mean is

  x̄ = (12.8 + 9.4 + 8.7 + 11.6 + 13.1 + 9.8 + 14.1 + 8.5 + 12.1 + 10.3) / 10 = 11.04

- The sample median is

  x̃ = (10.3 + 11.6) / 2 = 10.95

- The 10% trimmed mean (obtained by discarding the smallest and largest 10% of the sample before averaging) is

  x̄_tr(10) = (8.7 + 9.4 + 9.8 + 10.3 + 11.6 + 12.1 + 12.8 + 13.1) / 8 = 10.98

- We can show that all of these are unbiased estimates of µ.

7-2.3 Variance of a Point Estimator

Definition
If we consider all unbiased estimators of θ, the one with the smallest variance is called the minimum variance unbiased estimator (MVUE).

Theorem 7-1
If X₁, X₂, ..., Xₙ is a random sample of size n from a normal distribution with mean µ and variance σ², the sample mean X̄ is the MVUE for µ.

7-2.4 Standard Error: Reporting a Point Estimate

Definition
The standard error of an estimator Θ̂ is its standard deviation, given by σ_Θ̂ = √V(Θ̂). If the standard error involves unknown parameters that can be estimated, substituting those estimates into σ_Θ̂ produces an estimated standard error, denoted by σ̂_Θ̂.

- The estimated standard error is also denoted by s_Θ̂ or se(Θ̂).
- Suppose we are sampling from a normal distribution with mean µ and variance σ². The distribution of X̄ is normal with mean µ and variance σ²/n, so the standard error of X̄ is

  σ_X̄ = σ / √n

- If we did not know σ but substituted the sample standard deviation S into the above equation, the estimated standard error of X̄ would be

  σ̂_X̄ = S / √n

EXAMPLE 7-2
- Using a temperature of 100°F and a power input of 550 watts, the following 10 measurements of thermal conductivity (in Btu/hr-ft-°F) were obtained:

  41.60, 41.48, 42.34, 41.95, 41.86, 42.18, 41.72, 42.26, 41.81, 42.04

- A point estimate of the mean thermal conductivity at 100°F and 550 watts is the sample mean, x̄ = 41.924 Btu/hr-ft-°F.
- The standard error of the sample mean is σ_X̄ = σ/√n. Since σ is unknown, we may replace it by the sample standard deviation s = 0.284 to obtain the estimated standard error of X̄ as

  σ̂_X̄ = s / √n = 0.284 / √10 = 0.0898

  (See the short computational check at the end of this section.)

7-2.6 Mean Square Error of an Estimator
- The mean square error of an estimator Θ̂ is the expected squared difference between Θ̂ and θ.

Definition
The mean square error of an estimator Θ̂ of the parameter θ is defined as

  MSE(Θ̂) = E(Θ̂ − θ)²    (7-3)

- The mean square error can be rewritten as follows:

  MSE(Θ̂) = E[Θ̂ − E(Θ̂)]² + [θ − E(Θ̂)]² = V(Θ̂) + (bias)²

Exercise 7-2: 7-11, 7-13, 7-17
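The short sketch below redoes the arithmetic of Example 7-2 with Python's standard library; it is only an illustrative check of x̄, s, and the estimated standard error s/√n, not part of the original notes.

```python
import math
import statistics

# Thermal conductivity measurements from Example 7-2 (Btu/hr-ft-°F)
data = [41.60, 41.48, 42.34, 41.95, 41.86, 42.18, 41.72, 42.26, 41.81, 42.04]

n = len(data)
xbar = statistics.mean(data)     # point estimate of the mean
s = statistics.stdev(data)       # sample standard deviation (n-1 divisor)
se = s / math.sqrt(n)            # estimated standard error of the sample mean

print(xbar, s, se)               # roughly 41.924, 0.284, 0.0898
```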
7-3 METHODS OF POINT ESTIMATION

7-3.1 Method of Moments

Definition
Let X₁, X₂, ..., Xₙ be a random sample from the probability distribution f(x), where f(x) can be a discrete probability mass function or a continuous probability density function. The kth population moment (or distribution moment) is E(Xᵏ), k = 1, 2, .... The corresponding kth sample moment is (1/n) Σᵢ Xᵢᵏ, k = 1, 2, ....

- To illustrate, the first population moment is E(X) = µ, and the first sample moment is (1/n) Σᵢ Xᵢ = X̄. The sample mean is the moment estimator of the population mean.

Definition
Let X₁, X₂, ..., Xₙ be a random sample from either a probability mass function or a probability density function with m unknown parameters θ₁, θ₂, ..., θₘ. The moment estimators Θ̂₁, Θ̂₂, ..., Θ̂ₘ are found by equating the first m population moments to the first m sample moments and solving the resulting equations for the unknown parameters.

EXAMPLE 7-3
- Suppose that X₁, X₂, ..., Xₙ is a random sample from an exponential distribution with parameter λ.
- Now there is only one parameter to estimate, so we must equate E(X) to X̄.
- For the exponential distribution, E(X) = 1/λ.
- Therefore E(X) = X̄ results in 1/λ = X̄, so λ̂ = 1/X̄ is the moment estimator of λ.

Ex:
- Suppose that the time to failure of an electronic module used in an automobile engine controller is tested at an elevated temperature to accelerate the failure mechanism.
- The time to failure is exponentially distributed.
- Eight units are randomly selected and tested, resulting in the following failure times (in hours): x₁ = 11.96, x₂ = 5.03, x₃ = 67.40, x₄ = 16.07, x₅ = 31.50, x₆ = 7.73, x₇ = 11.10, and x₈ = 22.38.
- Because x̄ = 21.65, the moment estimate of λ is λ̂ = 1/x̄ = 1/21.65 = 0.0462.

7-4 SAMPLING DISTRIBUTIONS

Definition
The probability distribution of a statistic is called a sampling distribution.

7-5 SAMPLING DISTRIBUTIONS OF MEANS
- The sample mean

  X̄ = (X₁ + X₂ + ... + Xₙ) / n

  has a normal distribution with mean

  µ_X̄ = (µ + µ + ... + µ) / n = µ

  and variance

  σ²_X̄ = (σ² + σ² + ... + σ²) / n² = σ² / n

- If we are sampling from a population that has an unknown probability distribution, the sampling distribution of the sample mean will still be approximately normal with mean µ and variance σ²/n if the sample size n is large.
- This is one of the most useful theorems in statistics, called the central limit theorem.

Theorem 7-2: The Central Limit Theorem
If X₁, X₂, ..., Xₙ is a random sample of size n taken from a population (either finite or infinite) with mean µ and finite variance σ², and if X̄ is the sample mean, the limiting form of the distribution of

  Z = (X̄ − µ) / (σ/√n)    (7-6)

as n → ∞ is the standard normal distribution.

- The normal approximation for X̄ depends on the sample size n.
- Figure 7-6(a) shows the distribution obtained for throws of a single, six-sided true die. The probabilities are equal (1/6) for each of the values 1, 2, 3, 4, 5, or 6.
- Figure 7-6(b) shows the distribution of the average score obtained when tossing two dice, and Figures 7-6(c), 7-6(d), and 7-6(e) show the distributions of average scores obtained when tossing three, five, and ten dice, respectively.

Figure 7-6 Distributions of average scores from throwing dice. [Adapted with permission from Box, Hunter, and Hunter (1978).]

- In many cases of practical interest, if n ≥ 30, the normal approximation will be satisfactory regardless of the shape of the population.
- If n < 30, the central limit theorem will work if the distribution of the population is not severely nonnormal.

EXAMPLE 7-13
- An electronics company manufactures resistors that have a mean resistance of 100 ohms and a standard deviation of 10 ohms.
- The distribution of resistance is normal.
- Find the probability that a random sample of n = 25 resistors will have an average resistance less than 95 ohms.
- The sampling distribution of X̄ is normal, with mean µ_X̄ = 100 and standard deviation

  σ_X̄ = σ / √n = 10 / √25 = 2

- The desired probability corresponds to the shaded area in Fig. 7-7. Standardizing the point X̄ = 95 in Fig. 7-7, we find that

  z = (95 − 100) / 2 = −2.5

  and therefore

  P(X̄ < 95) = P(Z < −2.5) = 0.0062
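The probability computed in Example 7-13 can be verified with a few lines of Python; this is an illustrative sketch (not from the notes) that evaluates the standard normal CDF through math.erf rather than a statistics table.

```python
import math

def phi(z):
    """Standard normal CDF evaluated through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 100.0, 10.0, 25
se = sigma / math.sqrt(n)     # standard deviation of X-bar = 2
z = (95.0 - mu) / se          # standardized value = -2.5

print(phi(z))                 # P(X-bar < 95) ≈ 0.0062
```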
EXAMPLE 7-14
- Suppose that a random variable X has the continuous uniform distribution

  f(x) = 1/2 for 4 ≤ x ≤ 6, and f(x) = 0 otherwise

- Find the distribution of the sample mean of a random sample of size n = 40.
- The mean and variance of X are µ = 5 and σ² = (6 − 4)²/12 = 1/3.
- The central limit theorem indicates that the distribution of X̄ is approximately normal with mean µ_X̄ = 5 and variance σ²_X̄ = σ²/n = 1/[3(40)] = 1/120.
- The distributions of X and X̄ are shown in Fig. 7-8.

Figure 7-7 Probability for Example 7-13.
Figure 7-8 The distributions of X and X̄ for Example 7-14.

- Now suppose we have two independent populations. Let the first population have mean µ₁ and variance σ₁² and the second population have mean µ₂ and variance σ₂².
- Suppose that both populations are normally distributed.
- The sampling distribution of X̄₁ − X̄₂ is normal with mean

  µ_(X̄₁−X̄₂) = µ_X̄₁ − µ_X̄₂ = µ₁ − µ₂

  and variance

  σ²_(X̄₁−X̄₂) = σ²_X̄₁ + σ²_X̄₂ = σ₁²/n₁ + σ₂²/n₂

Definition
If we have two independent populations with means µ₁ and µ₂ and variances σ₁² and σ₂², and if X̄₁ and X̄₂ are the sample means of two independent random samples of sizes n₁ and n₂ from these populations, then the sampling distribution of

  Z = [X̄₁ − X̄₂ − (µ₁ − µ₂)] / √(σ₁²/n₁ + σ₂²/n₂)    (7-9)

is approximately standard normal if the conditions of the central limit theorem apply. If the two populations are normal, the sampling distribution of Z is exactly standard normal.

EXAMPLE 7-15
- The effective life of a component used in a jet-turbine aircraft engine is a random variable with mean 5000 hours and standard deviation 40 hours.
- An improvement in the manufacturing process for this component increases the mean life to 5050 hours and decreases the standard deviation to 30 hours.
- A random sample of n₁ = 16 components is selected from the "old" process and a random sample of n₂ = 25 components is selected from the "improved" process.
- What is the probability that the difference in the two sample means X̄₂ − X̄₁ is at least 25 hours?

  Old process:      µ₁ = 5000, σ₁ = 40, n₁ = 16, so µ_X̄₁ = 5000 and σ_X̄₁ = 40/√16 = 10
  Improved process: µ₂ = 5050, σ₂ = 30, n₂ = 25, so µ_X̄₂ = 5050 and σ_X̄₂ = 30/√25 = 6

  Therefore

  µ_(X̄₂−X̄₁) = µ_X̄₂ − µ_X̄₁ = 5050 − 5000 = 50
  σ²_(X̄₂−X̄₁) = σ²_X̄₁ + σ²_X̄₂ = 10² + 6² = 136

- Corresponding to the value X̄₂ − X̄₁ = 25 in Fig. 7-9, we find that

  z = (25 − 50) / √136 = −2.14

  and therefore

  P(X̄₂ − X̄₁ ≥ 25) = P(Z ≥ −2.14) = 0.9838

Figure 7-9 The sampling distribution of X̄₂ − X̄₁ in Example 7-15.

Exercise 7-5: 7-33, 7-35, 7-37, 7-39, 7-41, 7-43, 7-49, 7-55
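Finally, the same approach checks Example 7-15; the sketch below (illustrative only, not part of the notes) standardizes X̄₂ − X̄₁ and evaluates the upper-tail probability with the standard normal CDF.

```python
import math

def phi(z):
    """Standard normal CDF evaluated through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu1, sigma1, n1 = 5000.0, 40.0, 16      # "old" process
mu2, sigma2, n2 = 5050.0, 30.0, 25      # "improved" process

mean_diff = mu2 - mu1                                   # 50 hours
sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # sqrt(136)

z = (25.0 - mean_diff) / sd_diff        # about -2.14
print(1.0 - phi(z))                     # P(X2bar - X1bar >= 25) ≈ 0.9838
```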