LECTURE 3

Chapter 7.3.1: The expectation and variance of the sample mean

We will denote the sample size by n (where n ≤ N) and the values of the sample members by X1, X2, . . . , Xn. The Xi's are considered to be random variables. Note that Xi is not the same as xi: Xi is the value of the ith member of the sample, while xi is that of the ith member of the population; xi is fixed and not random.

Note: We observe from the definition of s.r.s. (simple random sampling) that X1 and X2 are not independent random variables.

The sample mean is defined to be
\[
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i .
\]
X̄ is the usual estimate of the population mean µ. Likewise, the usual estimate of the population total τ is T = N X̄.

Since X̄ is a random variable, it has a probability distribution. This distribution is called the sampling distribution of X̄. The sampling distribution of X̄ determines how accurately X̄ estimates µ.

Lemma A
Let the members of the population of interest be x1, x2, . . . , xN. Denote the distinct values assumed by the population members by ζ1, ζ2, . . . , ζm, and denote the number of population members that have the value ζj by nj, j = 1, . . . , m. Then Xi is a discrete random variable with probability mass function (pmf)
\[
P(X_i = \zeta_j) = \frac{n_j}{N} .
\]
Also E(Xi) = µ and Var(Xi) = σ².

Proof
The only possible values that Xi can take are ζ1, . . . , ζm. Since Xi is the ith member of a s.r.s., the probability that Xi = ζj is nj/N, i.e.
\[
P(X_i = \zeta_j) = \frac{n_j}{N} .
\]
The expected value of Xi is given by
\[
E(X_i) = \sum_{j=1}^{m} \zeta_j P(X_i = \zeta_j)
       = \frac{1}{N} \sum_{j=1}^{m} n_j \zeta_j
       = \frac{1}{N} \sum_{i=1}^{N} x_i
       = \mu .
\]
Finally,
\[
\operatorname{Var}(X_i) = E(X_i^2) - [E(X_i)]^2
  = \frac{1}{N} \sum_{j=1}^{m} n_j \zeta_j^2 - \mu^2
  = \frac{1}{N} \sum_{i=1}^{N} x_i^2 - \mu^2
  = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
  = \sigma^2 .
\]
This proves Lemma A.

As a measure of the center of the sampling distribution of X̄, we will use E(X̄). As a measure of the dispersion of the sampling distribution about this center, we will use the standard deviation of X̄.

Written assignment 1: #6.4.8. Due Monday 1 February 2010.

Theorem A
With s.r.s., E(X̄) = µ.
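Lemma A and Theorem A can be checked numerically. The Python sketch below is only an illustration: the six-value population and the sample size are made-up assumptions, not values from the notes. It draws many simple random samples without replacement and compares the empirical averages of X1 and X̄ with µ.

```python
import random
import statistics

# Hypothetical population (an assumption for illustration), N = 6
population = [2, 3, 3, 5, 7, 10]
N = len(population)
mu = sum(population) / N          # population mean µ = 5.0
n = 3                             # sample size (n <= N)

random.seed(0)
trials = 200_000
first_elems = []                  # X1 across repeated samples
xbars = []                        # X̄ across repeated samples
for _ in range(trials):
    s = random.sample(population, n)   # s.r.s. without replacement
    first_elems.append(s[0])
    xbars.append(statistics.fmean(s))

# Lemma A: E(X1) = µ; Theorem A: E(X̄) = µ (both hold approximately here)
print(mu, round(statistics.fmean(first_elems), 2), round(statistics.fmean(xbars), 2))
```

Both empirical averages come out close to µ, consistent with the unbiasedness results above.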
Proof
From Lemma A, we have E(Xi) = µ. Thus
\[
E(\bar{X}) = E\Bigl( \frac{1}{n} \sum_{i=1}^{n} X_i \Bigr)
           = \frac{1}{n} \sum_{i=1}^{n} E(X_i)
           = \mu .
\]
This proves Theorem A.

Corollary A
With s.r.s., E(T) = τ.

Proof
We observe that
\[
E(T) = E(N\bar{X}) = N E(\bar{X}) = N\mu = \tau .
\]
This proves Corollary A.

The result that E(X̄) = µ can be interpreted to imply that "on the average" X̄ = µ. In general, if we wish to estimate a population parameter, θ say, by an estimator θ̂ and E(θ̂) = θ, whatever the value θ may be, we say that θ̂ is unbiased. Consequently, X̄ and T are unbiased estimates of µ and τ respectively.

We shall next evaluate Var(X̄). From Chapter 4.3 of the text, we have
\[
\operatorname{Var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{Cov}(X_i, X_j) .
\]
Notice that Cov(Xi, Xj) ≠ 0 since Xi and Xj are not independent.

Lemma B
For s.r.s. (i.e. simple random sampling without replacement),
\[
\operatorname{Cov}(X_i, X_j) = -\frac{\sigma^2}{N-1} , \quad \text{if } i \neq j .
\]

Proof
We observe from Chapter 4.3 that
\[
\operatorname{Cov}(X_i, X_j) = E(X_i X_j) - E(X_i) E(X_j) , \qquad (1)
\]
and
\[
E(X_i X_j) = \sum_{k=1}^{m} \sum_{l=1}^{m} \zeta_k \zeta_l \, P(X_i = \zeta_k, X_j = \zeta_l)
           = \sum_{k=1}^{m} \zeta_k P(X_i = \zeta_k) \sum_{l=1}^{m} \zeta_l P(X_j = \zeta_l \mid X_i = \zeta_k) . \qquad (2)
\]
Now,
\[
P(X_j = \zeta_l \mid X_i = \zeta_k) =
\begin{cases}
n_l/(N-1) & \text{if } k \neq l , \\
(n_l - 1)/(N-1) & \text{if } k = l .
\end{cases}
\]
Thus
\[
\sum_{l=1}^{m} \zeta_l P(X_j = \zeta_l \mid X_i = \zeta_k)
 = \sum_{l \neq k} \zeta_l \frac{n_l}{N-1} + \zeta_k \frac{n_k - 1}{N-1}
 = \sum_{l=1}^{m} \zeta_l \frac{n_l}{N-1} - \frac{\zeta_k}{N-1} .
\]
Thus it follows from (2) that (using \(\sum_l n_l \zeta_l = \tau = N\mu\))
\[
\begin{aligned}
E(X_i X_j)
 &= \sum_{k=1}^{m} \zeta_k \frac{n_k}{N} \Bigl( \sum_{l=1}^{m} \zeta_l \frac{n_l}{N-1} - \frac{\zeta_k}{N-1} \Bigr) \\
 &= \frac{1}{N(N-1)} \sum_{k=1}^{m} n_k \zeta_k (\tau - \zeta_k) \\
 &= \frac{\tau^2}{N(N-1)} - \frac{1}{N(N-1)} \sum_{k=1}^{m} \zeta_k^2 n_k \\
 &= \frac{N\mu^2}{N-1} - \frac{1}{N(N-1)} \sum_{i=1}^{N} x_i^2 \\
 &= \frac{N\mu^2}{N-1} - \frac{\mu^2 + \sigma^2}{N-1} \\
 &= \mu^2 - \frac{\sigma^2}{N-1} \\
 &= E(X_i) E(X_j) - \frac{\sigma^2}{N-1} . \qquad (3)
\end{aligned}
\]
We conclude from (1) and (3) that
\[
\operatorname{Cov}(X_i, X_j) = -\frac{\sigma^2}{N-1} \quad \text{if } i \neq j .
\]
This proves Lemma B.

The following theorem evaluates Var(X̄).

Theorem B
With s.r.s.,
\[
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} \Bigl( \frac{N-n}{N-1} \Bigr)
                            = \frac{\sigma^2}{n} \Bigl( 1 - \frac{n-1}{N-1} \Bigr) .
\]

Proof
From Corollary A of Chapter 4.3, we have
\[
\begin{aligned}
\operatorname{Var}(\bar{X})
 &= \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{Cov}(X_i, X_j) \\
 &= \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j \neq i} \operatorname{Cov}(X_i, X_j) \\
 &= \frac{\sigma^2}{n} - \frac{1}{n^2} \, n(n-1) \frac{\sigma^2}{N-1} .
\end{aligned}
\]
The last equality uses Lemmas A and B. This proves Theorem B.
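Lemma B and Theorem B can be sanity-checked by simulation. The sketch below is a rough illustration under assumed values (the population and sample size are hypothetical): it compares the empirical covariance of the first two sample members and the empirical variance of X̄ against the formulas −σ²/(N−1) and (σ²/n)(N−n)/(N−1).

```python
import random
import statistics

# Hypothetical population and sample size (assumptions for illustration)
population = [2, 3, 3, 5, 7, 10]
N = len(population)
n = 3
sigma2 = statistics.pvariance(population)      # σ² = (1/N) Σ (x_i − µ)²

cov_theory = -sigma2 / (N - 1)                 # Lemma B: Cov(Xi, Xj), i ≠ j
var_theory = (sigma2 / n) * (N - n) / (N - 1)  # Theorem B: Var(X̄)

random.seed(1)
trials = 400_000
x1, x2, xbars = [], [], []
for _ in range(trials):
    s = random.sample(population, n)           # s.r.s. without replacement
    x1.append(s[0])
    x2.append(s[1])
    xbars.append(statistics.fmean(s))

m1, m2 = statistics.fmean(x1), statistics.fmean(x2)
cov_emp = statistics.fmean((a - m1) * (b - m2) for a, b in zip(x1, x2))
var_emp = statistics.pvariance(xbars)

print(round(cov_theory, 3), round(cov_emp, 3))
print(round(var_theory, 3), round(var_emp, 3))
```

The empirical values land close to the theoretical ones; note in particular that the covariance is negative, as Lemma B requires.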
Note
If the sample is a simple random sample with replacement, then Cov(Xi, Xj) = 0 if i ≠ j, and
\[
\operatorname{Var}(\bar{X}) = \frac{\sigma^2}{n} .
\]

In Theorem B, the factor
\[
1 - \frac{n-1}{N-1}
\]
is called the finite population correction factor. The ratio n/N is usually known as the sampling fraction.

Consequently, if the sampling fraction is very small, sampling with replacement and sampling without replacement essentially give the same results. That is,
\[
\operatorname{Var}(\bar{X}) \approx \frac{\sigma^2}{n} .
\]

In the case of T = N X̄, we have

Corollary B
With s.r.s.,
\[
\operatorname{Var}(T) = N^2 \, \frac{\sigma^2}{n} \Bigl( \frac{N-n}{N-1} \Bigr) .
\]

Proof
We observe that T = N X̄. Hence it follows from Theorem B that
\[
\operatorname{Var}(T) = N^2 \operatorname{Var}(\bar{X})
                      = N^2 \, \frac{\sigma^2}{n} \Bigl( \frac{N-n}{N-1} \Bigr) .
\]
This proves Corollary B.
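A quick numeric illustration of the finite population correction and Corollary B (the population size, distribution, and sample size below are assumptions chosen for illustration): when the sampling fraction n/N is small, the with- and without-replacement variances of X̄ nearly coincide, and Var(T) is just N² times Var(X̄).

```python
import random
import statistics

# Hypothetical population of N = 1000 values (an assumption for illustration)
random.seed(2)
population = [random.gauss(50, 10) for _ in range(1000)]
N = len(population)
sigma2 = statistics.pvariance(population)

n = 20                                     # sampling fraction n/N = 0.02
fpc = 1 - (n - 1) / (N - 1)                # finite population correction factor
var_without = (sigma2 / n) * fpc           # Theorem B (without replacement)
var_with = sigma2 / n                      # with replacement
var_T = N ** 2 * var_without               # Corollary B: Var(T) = N² Var(X̄)

print(round(fpc, 4))                       # close to 1 for a small sampling fraction
print(round(var_without, 3), round(var_with, 3))
```

With n/N = 0.02 the two variances differ by under two percent, which is the sense in which sampling with and without replacement "essentially give the same results" here.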