LECTURE 7
8. Hypothesis Testing
8.1 Introduction
Exercise 8.1 In 1000 tosses of a coin, 560 heads and 440
tails appear. Is it reasonable to assume that the coin is fair?
We need a model of the experiment. Let X ∼ Bin(1000, p)
denote the number of heads in 1000 throws, where p is the
probability of obtaining a head in one throw.
Then we need a hypothesis about p corresponding to the
coin being fair.
Definition 8.1.1 A hypothesis is a statement
about a population parameter.
The goal of a hypothesis test is to decide which of two
complementary hypotheses is true.
Definition 8.1.2 The two complementary
hypotheses in a hypothesis testing problem are
called the null hypothesis and the alternative
hypothesis, denoted by H0 and H1, respectively.
In our example we put
H0: p = 1/2
H1: p > 1/2
How do we decide which hypothesis to hold true?
Definition 8.1.3 A hypothesis test is a rule that
specifies:
i. For which sample values the decision is made to
accept H0 as true.
ii. For which sample values H0 is rejected and H1 is
accepted as true.
The subset of the sample space for which H0 will be
rejected is called the rejection region or critical
region. The complement of the rejection region is
called the acceptance region.
A hypothesis test is specified in terms of a test statistic
W(X1,…,Xn) = W(X), a function of the sample.
In our example, such a test statistic could be the
probability of the observed number of heads, and more extreme
numbers, i.e.
P(X ≥ 560) = Σ_{x=560}^{1000} (1000 choose x) (1/2)^x (1/2)^(1000−x) ≈ 0.0000825.
Using the normal approximation we have X ∼ n(500, 250), giving
P(X ≥ 560) = P((X − 500)/√250 ≥ (559.5 − 500)/√250) ≈ P(Z ≥ 3.763) ≈ 0.0000839.
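Both the exact binomial tail and its normal approximation can be reproduced directly; a minimal sketch in Python (the numerical values are the ones quoted above):

```python
from math import comb, erf, sqrt

# Exact tail probability P(X >= 560) for X ~ Bin(1000, 1/2)
p_exact = sum(comb(1000, x) for x in range(560, 1001)) / 2**1000
print(p_exact)  # ≈ 0.0000825

# Normal approximation with continuity correction: X ~ n(500, 250)
z = (559.5 - 500) / sqrt(250)          # ≈ 3.763
p_norm = 0.5 * (1 - erf(z / sqrt(2)))  # upper tail 1 - Phi(z)
print(p_norm)  # ≈ 0.0000839
```

Both probabilities are tiny, which speaks strongly against the coin being fair.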
[Figure: histogram of 1000 simulated observations from Bin(1000, 0.5)]
[Figure: histogram of 100000 simulated observations from Bin(1000, 0.5)]
8.2 Methods of Finding Tests
8.2.1 Likelihood Ratio Tests
We are finding a test statistic using the ratio between the
value of the likelihood function when maximized under the
null hypothesis and maximized under no restriction with
respect to the parameter θ.
If this ratio is very small it means that the likelihood under
H0 is much smaller than the likelihood under H1 and we
would reject our null hypothesis. The critical limit is
determined by specifying the significance level of the test.
Definition 8.2.1 The likelihood ratio test statistic
for testing H0: θ ∈ Θ0 versus H1: θ ∈ Θ0ᶜ is
λ(x) = sup_{θ∈Θ0} L(θ|x) / sup_{θ∈Θ} L(θ|x).
A likelihood ratio test (LRT) is any test that has
a rejection region of the form {x: λ(x) ≤ c},
where c is any number satisfying 0 ≤ c ≤ 1.
“If the upper bound M of a set E belongs to that set then M is called
the maximum element” (answers.yahoo.com)
Example 8.2.2 (Normal LRT)
Given: X1,…,Xn random sample from a n(θ, 1) population.
Task: Test H0: θ = θ0 versus H1: θ ≠ θ0.
Solution: The numerator equals L(θ0|x).
The unrestricted MLE of θ is x̄, the sample mean.
The denominator is L(x̄|x).
The LRT statistic is
λ(x) = [(2π)^(−n/2) exp(−Σ_{i=1}^n (xi − θ0)²/2)] / [(2π)^(−n/2) exp(−Σ_{i=1}^n (xi − x̄)²/2)]
     = exp(−[Σ_{i=1}^n (xi − θ0)² − Σ_{i=1}^n (xi − x̄)²]/2)
     = exp(−n(x̄ − θ0)²/2).
We reject H0 for small values of λ(x), or, equivalently, for
large values of |x̄ − θ0|.
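The equivalence between small λ(x) and large |x̄ − θ0| can be checked numerically; a small sketch in Python, where θ0, n, the true mean, and the cutoff c are illustrative assumptions, not values from the text:

```python
import random, math

random.seed(1)
theta0, n = 0.0, 25
x = [random.gauss(0.3, 1.0) for _ in range(n)]  # sample from n(0.3, 1), so H0 is false

xbar = sum(x) / n
lam = math.exp(-n * (xbar - theta0) ** 2 / 2)   # LRT statistic lambda(x)
print(lam)

# Rejecting for lambda <= c is the same as |xbar - theta0| >= sqrt(-2*log(c)/n):
c = 0.1
assert (lam <= c) == (abs(xbar - theta0) >= math.sqrt(-2 * math.log(c) / n))
```

Solving exp(−n d²/2) ≤ c for d = |x̄ − θ0| gives exactly the cutoff used in the assertion.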
Example 8.2.3 (Exponential LRT)
Given: X1,…,Xn random sample from an exponential population with density
f(x|θ) = e^(−(x−θ)) for x ≥ θ, and 0 for x < θ, where −∞ < θ < ∞.
The likelihood function is
L(θ|x) = e^(−Σ_{i=1}^n xi + nθ) for θ ≤ x(1), and 0 for θ > x(1)    (x(1) = min xi).
Consider testing H0: θ ≤ θ0 versus H1: θ > θ0.
The denominator in the likelihood ratio is maximized
when θ is as large as possible in the interval −∞ < θ ≤ x(1).
This gives L(x(1)|x) = e^(−Σ xi + n·x(1)).
If x(1) ≤ θ0, then the numerator of λ(x) is also L(x(1)|x).
If x(1) > θ0, then the numerator of λ(x) is L(θ0|x).
Therefore we obtain
λ(x) = 1 for x(1) ≤ θ0, and λ(x) = e^(−n(x(1) − θ0)) for x(1) > θ0.
We reject H0 for small values of λ(x), which implies a
rejection region {x: x(1) ≥ θ0 − log(c)/n}.
Note that the rejection region depends on the sample only
through the sufficient statistic X(1).
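A minimal sketch of this statistic in Python (the sample values and θ0 are illustrative assumptions):

```python
import math

# lambda(x) from Example 8.2.3: 1 if x(1) <= theta0, else exp(-n*(x(1) - theta0))
def exp_lrt(x, theta0):
    x1 = min(x)   # the sufficient statistic x(1)
    n = len(x)
    return 1.0 if x1 <= theta0 else math.exp(-n * (x1 - theta0))

print(exp_lrt([0.5, 2.0, 1.1], 1.0))  # x(1) = 0.5 <= 1.0, so lambda = 1
print(exp_lrt([1.2, 1.5, 2.0], 1.0))  # x(1) = 1.2 > 1.0, so lambda = exp(-3*0.2)
```

Note that the function looks at the data only through min(x), exactly as the text observes.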
If there is a sufficient statistic T(X) for θ, then we can
determine an LRT based on T and its likelihood function
L*( θ|t) = g(t|θ) rather than L(θ|x).
Given that all information about θ is in T(X) such a test
should be as good as the test based on X. In fact it is!
Theorem 8.2.4 If T(X) is a sufficient statistic for
θ and λ*(t) and λ(x) are the LRT statistics based
on T and X, respectively, then λ*(T(x)) = λ(x)
for every x in the sample space.
Proof:
The Factorization Theorem ⇒ f(x|θ) = g(T(x)|θ)h(x).
sup f ( x |  )
sup L( | x)
λ(x) =
0
sup L( | x)

=
0
sup f ( x |  )

0
sup g (T ( x) |  )


0
sup g (T ( x) |  )h( x)

sup L * ( | T ( x))
sup g (T ( x) |  )
=
sup g (T ( x) |  )h( x)

0
sup L * ( | T ( x))

7
  * (T ( x))
Example 8.2.5 (LRT and sufficiency)
Suppose X1,…,Xn is a random sample from n(θ, 1). Then X̄ is sufficient for θ, and X̄ ∼ n(θ, 1/n). An
LRT based on this statistic would be obtained from
λ*(t) = [√(n/(2π)) exp(−n(x̄ − θ0)²/2)] / [√(n/(2π)) exp(−n(x̄ − x̄)²/2)] = exp(−n(x̄ − θ0)²/2)
and H0: θ = θ0 is rejected for small values of λ*(t), which is
equivalent to large values of |x̄ − θ0|.
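Theorem 8.2.4 can be verified numerically for this example; a sketch with an assumed illustrative sample, comparing the full-sample statistic of Example 8.2.2 with the statistic based on x̄:

```python
import math

x = [0.8, 1.3, 0.1, 1.9, 0.5]   # assumed illustrative sample
n, theta0 = len(x), 0.0
xbar = sum(x) / n

# lambda(x) from Example 8.2.2, based on the full sample:
lam_x = math.exp(-(sum((xi - theta0) ** 2 for xi in x)
                   - sum((xi - xbar) ** 2 for xi in x)) / 2)

# lambda*(t) from Example 8.2.5, based on xbar ~ n(theta, 1/n):
lam_t = math.exp(-n * (xbar - theta0) ** 2 / 2)

print(abs(lam_x - lam_t) < 1e-12)  # the two statistics agree
```

The agreement reflects the algebraic identity Σ(xi − θ0)² − Σ(xi − x̄)² = n(x̄ − θ0)².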
In statistics, a nuisance parameter is any parameter which is not of
immediate interest but which must be accounted for in the analysis
of those parameters which are of interest. The classic example of a
nuisance parameter is the variance, σ2, of a normal distribution,
when the mean, μ, is of primary interest. (Wikipedia)
Example 8.2.6 (Normal LRT with unknown variance)
Given: X1,…,Xn with Xi ∼ n(μ,σ2)
Hypotheses: H0: μ≤ μ0 versus H1: μ > μ0.
σ2 is a nuisance parameter.
L(  , 2 | x)
max
λ(x) =
{ , 2:    0 , 2  0}
L(  , | x)
2
max
=
{ , 2:     , 2  0}
if ˆ   0
1

=  L(  0 ,ˆ 02 | x)
if ˆ   0
 L( ˆ ,ˆ 2 | x)

This leads to a test equivalent to Student’s t statistic test.
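For reference, the t statistic that this LRT reduces to can be sketched as follows (the sample values and μ0 are assumptions for illustration):

```python
import math

# One-sample t statistic: t = (xbar - mu0) / (s / sqrt(n))
def t_statistic(x, mu0):
    n = len(x)
    xbar = sum(x) / n
    s2 = sum((xi - xbar) ** 2 for xi in x) / (n - 1)  # sample variance
    return (xbar - mu0) / math.sqrt(s2 / n)

print(t_statistic([5.1, 4.8, 5.6, 5.3, 4.9], 5.0))  # compare with a t(n-1) cutoff
```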
8.2.3 Union-Intersection and Intersection-Union Tests
Example 8.2.8 (Normal union-intersection test)
Given: X1,…,Xn from n(μ,σ2).
Test:
H0: μ=μ0 versus H1: μ≠μ0 .
Now we can write H0 as the intersection of two sets:
H0: {μ:μ≤ μ0} ⋂ {μ:μ≥ μ0}.
The LRT of H0L: μ ≤ μ0 versus H1L: μ > μ0 is:
reject H0L if (X̄ − μ0)/(S/√n) ≥ tL.
The LRT of H0U: μ ≥ μ0 versus H1U: μ < μ0 is:
reject H0U if (X̄ − μ0)/(S/√n) ≤ tU.
Combining the two tests, the union-intersection test of H0:
μ = μ0 versus H1: μ ≠ μ0 is:
reject H0 if (X̄ − μ0)/(S/√n) ≥ tL or (X̄ − μ0)/(S/√n) ≤ tU.
So, in general, if our null hypothesis can be written as the
intersection H0: θ ∈ ⋂_{γ∈Γ} Θγ, and tests with rejection regions
{x: Tγ(x) ∈ Rγ} are available for each H0γ: θ ∈ Θγ, then the
rejection region for the union-intersection test is
⋃_{γ∈Γ} {x: Tγ(x) ∈ Rγ}.
Alternatively, if our null hypothesis can be written as the
union of separate hypotheses, H0: θ ∈ ⋃_{γ∈Γ} Θγ, the rejection
region of the intersection-union test is given by
⋂_{γ∈Γ} {x: Tγ(x) ∈ Rγ}.
Example 8.2.9 (Acceptance sampling)
Considering the quality of upholstery, two properties are
of importance, measured by ϴ1 = mean breaking strength,
and ϴ2 = probability of passing a flammability test.
H0: {ϴ1 ≤ 50 or ϴ2 ≤ .95} versus H1: {ϴ1 > 50 and ϴ2 > .95}
A batch of material is acceptable only if H0 is rejected.
Obs. on breaking strength: X1,…,Xn assumed to be n(ϴ1,σ2)
The LRT of H01: ϴ1 ≤ 50 is rejected if (x̄ − 50)/(s/√n) ≥ t.
Obs. on flammability tests: Y1,…,Ym with Yi = 1 if it passes
the test, Yi = 0 otherwise. Each Yi is modeled Bernoulli(ϴ2).
The LRT of H02: ϴ2 ≤ .95 is rejected if Σ_{i=1}^m yi ≥ b.
Combining the two tests, the intersection-union test has
rejection region
{(x, y): (x̄ − 50)/(s/√n) ≥ t and Σ_{i=1}^m yi ≥ b}.
Exercise 8.2
No of yearly traffic accidents in a city, X, is assumed to be
Poisson(λ).
Average number in past years, λ = 15.
This year X = 10.
Does it indicate a drop in the accident rate?
Find the probability of {X≤10| λ = 15}!
15i 15
P{X≤10| λ = 15} = 
e ≈.11846.
i  0 i!
10
Using the normal approximation X ≈ n(15,15) gives
 X  15 10.5  15 
P{X≤10}=P 

 =Φ(-1.1619) =.12264
15 
 15
Exercise 8.6
Two independent samples:
X1,…,Xn ∼ exponential(θ) and Y1,…,Ym ∼ exponential(μ)
(a) Find the LRT of H0: θ = μ versus H1: θ ≠ μ.
The likelihood function under H0 is given by
L(θ|x,y) = ∏_{i=1}^n (1/θ)e^(−xi/θ) · ∏_{i=1}^m (1/θ)e^(−yi/θ) = (1/θ)^(n+m) e^(−(Σxi + Σyi)/θ).
Taking logarithms and differentiating gives the MLE
θ̂0 = (Σxi + Σyi)/(n + m),
which we insert into the likelihood function:
L(θ̂0|x,y) = (n + m)^(n+m) / (Σxi + Σyi)^(n+m) · e^(−(n+m)).
The likelihood function under H1 is given by
L(θ,μ|x,y) = ∏_{i=1}^n (1/θ)e^(−xi/θ) · ∏_{i=1}^m (1/μ)e^(−yi/μ) = (1/θ)^n (1/μ)^m e^(−Σxi/θ − Σyi/μ).
Taking logarithms and differentiating gives the MLEs
θ̂ = Σxi/n and μ̂ = Σyi/m,
which we insert into the likelihood function:
L(θ̂, μ̂|x,y) = n^n m^m / [(Σxi)^n (Σyi)^m] · e^(−(n+m)).
The likelihood ratio is
λ(x,y) = (n + m)^(n+m) (Σxi)^n (Σyi)^m / [n^n m^m (Σxi + Σyi)^(n+m)].
(b) λ(x,y) = (n + m)^(n+m)/(n^n m^m) · (Σxi/(Σxi + Σyi))^n (Σyi/(Σxi + Σyi))^m
           = (n + m)^(n+m)/(n^n m^m) · T^n (1 − T)^m, where T = Σxi/(Σxi + Σyi).
(c) The sum of n independent exponential(θ) variables Xi is
gamma(n, θ), and under H0 the sum of the m variables Yi is
gamma(m, θ). So, under H0, T is beta(n, m).
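The distributional claim in (c) can be checked by simulation; a sketch assuming θ = μ = 2 (a value chosen for illustration), using the fact that a beta(n, m) variable has mean n/(n + m):

```python
import random

random.seed(3)
n, m, theta = 8, 5, 2.0
reps = 5000

# Simulate T = sum(x) / (sum(x) + sum(y)) under H0 (common mean theta):
ts = []
for _ in range(reps):
    sx = sum(random.expovariate(1 / theta) for _ in range(n))
    sy = sum(random.expovariate(1 / theta) for _ in range(m))
    ts.append(sx / (sx + sy))

print(sum(ts) / reps)  # ≈ n/(n+m) = 8/13, the beta(n, m) mean
```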
Derived from other distributions (Wikipedia)
• The kth order statistic of a sample of size n from the uniform distribution is a beta random variable, U(k) ∼ B(k, n + 1 − k).
• A gamma(k, θ) random variable has E(X) = kθ and Var(X) = kθ².
• The gamma(1, 1/λ) distribution is the exponential distribution with rate parameter λ.
• The gamma(ν/2, 2) distribution is identical to χ²(ν), the chi-squared distribution with ν degrees of freedom.
• If k is an integer, the gamma distribution is an Erlang distribution and is the probability distribution of the waiting time until the k-th "arrival" in a one-dimensional Poisson process with intensity 1/θ.
(Wikipedia)