MAS2317/3317
NEWCASTLE UNIVERSITY
SCHOOL OF MATHEMATICS & STATISTICS
SEMESTER 2 2014–2015
MAS2317/3317
Introduction to Bayesian Statistics: Mid–semester test
Time allowed: 50 minutes
Candidates should attempt all questions. Marks for each question are indicated.
There are SIX questions on this paper.
Answers to questions should be entered directly on this question paper in the spaces provided.
This question paper must be handed in at the end of the test.
Name:................................................................................................................
A single A4 sheet of your own notes may be used during the test.
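Beta distribution: If X ∼ Be(α, β) then it has density
\[ f(x \mid \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha, \beta)}, \quad 0 < x < 1, \quad \alpha > 0,\ \beta > 0. \]
Also, $E(X) = \alpha/(\alpha+\beta)$, $\mathrm{Var}(X) = \alpha\beta/\{(\alpha+\beta)^2(\alpha+\beta+1)\}$,
and $\mathrm{Mode}(X) = (\alpha-1)/(\alpha+\beta-2)$.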
Exponential distribution: If X ∼ Exp(λ) then it has density
\[ f(x \mid \lambda) = \lambda e^{-\lambda x}, \quad x > 0, \quad \lambda > 0. \]
Also, $E(X) = 1/\lambda$ and $\mathrm{Var}(X) = 1/\lambda^2$.
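Gamma distribution: If X ∼ Ga(α, λ) then it has density
\[ f(x \mid \alpha, \lambda) = \frac{\lambda^{\alpha} x^{\alpha-1} e^{-\lambda x}}{\Gamma(\alpha)}, \quad x > 0, \quad \alpha > 0,\ \lambda > 0. \]
Also, $E(X) = \alpha/\lambda$ and $\mathrm{Var}(X) = \alpha/\lambda^2$.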
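Normal distribution: If X ∼ N(µ, 1/τ) then it has density
\[ f(x \mid \mu, \tau) = \left(\frac{\tau}{2\pi}\right)^{1/2} \exp\left\{-\frac{\tau}{2}(x-\mu)^2\right\}, \quad -\infty < x < \infty, \quad -\infty < \mu < \infty,\ \tau > 0. \]
Also, $E(X) = \mu$ and $\mathrm{Var}(X) = 1/\tau$.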
Note: If X ∼ N(0, 1), then the following quantiles apply:
x            1.2816   1.6449   1.9600   2.3263   2.5758
Pr(X < x)    0.9      0.95     0.975    0.99     0.995
Poisson distribution: If X ∼ Po(λ) then it has probability function
\[ f(x \mid \lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, \ldots, \quad \lambda > 0. \]
Also, $E(X) = \lambda$ and $\mathrm{Var}(X) = \lambda$.
1. Which of the following statements are true and which are false? Delete as appropriate.
A. In Bayesian statistics, the unknown parameter θ in our statistical model is a fixed
but unknown constant which we can estimate with the posterior distribution.
TRUE / FALSE
B. In frequentist statistics, a 95% confidence interval for some unknown parameter θ
“captures” the true value of θ with probability 0.95.
TRUE / FALSE
C. In frequentist statistics, around 95 out of 100 95% confidence intervals for some
unknown parameter θ will “capture” the true value of θ.
TRUE / FALSE
D. In Bayesian statistics, we can always represent prior ignorance for our unknown
parameter θ by assuming that θ ∼ U (0, 1).
TRUE / FALSE
E. If we assume $X_i \mid \theta \sim Ga(k, \theta)$, and θ ∼ Ga(g, h), the posterior for θ will also be
gamma.
TRUE / FALSE
F. The Bayes linear rule tells us that the posterior mean for θ is a weighted sum of
the prior mean for θ and the maximum likelihood estimate for θ.
TRUE / FALSE
[Total Q1: 6 marks]
2. A major earthquake (exceeding 6.7 on the Richter scale) occurs immediately after
one of three geological activities: a degassing burst (DB), an episode of intense air
ionisation (IAO), or a continuous wave of magnetic pulsations (MP), all of which can
be considered disjoint events.
The table below shows the number of recent major earthquakes in San Francisco
that have occurred following each of these three geological activities:
Geological activity      DB    IAO    MP
Number of earthquakes    6     3      1
Past studies show that a major earthquake in San Francisco will occur with probability
0.5, 0.3 and 0.2 if DB, IAO and MP are observed (respectively).
(a) Find the probability that a major earthquake will occur in San Francisco.
Answer:
Let M: Major earthquake. Then
\[ \begin{aligned} P(M) &= P(M \mid DB)P(DB) + P(M \mid IAO)P(IAO) + P(M \mid MP)P(MP) \\ &= 0.5 \times 0.6 + 0.3 \times 0.3 + 0.2 \times 0.1 \\ &= \frac{41}{100}. \end{aligned} \]
(3 marks)
(b) Suppose an earthquake of magnitude 7.1 occurs in San Francisco. Find the posterior
probability distribution for the geological activities. Express your answers in the
form a/41, b/41 and c/41, where a, b and c are to be found.
Answer:
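We have:
\[ P(DB \mid M) = \frac{P(M \mid DB)P(DB)}{P(M)} = \frac{\frac{5}{10} \times \frac{6}{10}}{\frac{41}{100}} = \frac{30}{41}. \]
(2 marks)
Similarly,
\[ P(IAO \mid M) = \frac{P(M \mid IAO)P(IAO)}{P(M)} = \frac{\frac{3}{10} \times \frac{3}{10}}{\frac{41}{100}} = \frac{9}{41}. \]
(2 marks)
Finally,
\[ P(MP \mid M) = 1 - (30/41 + 9/41) = \frac{2}{41}. \]
(1 mark)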
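The two parts of Question 2 can be checked with exact fractions. The sketch below is only an illustration: the dictionary names are our own, and the prior probabilities are assumed to be the relative frequencies 6/10, 3/10 and 1/10 taken from the table, as in the answer.

```python
from fractions import Fraction

# Prior probabilities of each geological activity, assumed to be the
# relative frequencies from the table of counts (6, 3, 1).
counts = {"DB": 6, "IAO": 3, "MP": 1}
total = sum(counts.values())
prior = {k: Fraction(v, total) for k, v in counts.items()}

# P(M | activity), as quoted in the question.
likelihood = {
    "DB": Fraction(5, 10),
    "IAO": Fraction(3, 10),
    "MP": Fraction(2, 10),
}

# Part (a): law of total probability.
p_m = sum(likelihood[k] * prior[k] for k in prior)

# Part (b): Bayes' theorem for each activity.
posterior = {k: likelihood[k] * prior[k] / p_m for k in prior}

print("P(M) =", p_m)
for k, v in posterior.items():
    print(f"P({k} | M) = {v}")
```

Working with `Fraction` rather than floats keeps the answers in the a/41 form the question asks for.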
[Total Q2: 8 marks]
3. The number of fatal shootings in the state of North Carolina, USA, is assumed Poisson
with rate µ per month.
(a) Assuming a random sample $x = (x_1, x_2, \ldots, x_n)^T$, show that
\[ f(x \mid \mu) \propto \mu^{n\bar{x}} e^{-n\mu}, \quad \mu > 0. \]
Answer:
We have
\[ f(x \mid \mu) = \prod_{i=1}^{n} \frac{\mu^{x_i} e^{-\mu}}{x_i!} \propto \mu^{\sum_{i=1}^{n} x_i} e^{-n\mu} = \mu^{n\bar{x}} e^{-n\mu}, \]
as required. (2 marks)
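The proportionality in part (a) can be illustrated numerically: on the log scale, the full Poisson likelihood and the kernel should differ by the same constant for every value of µ. This is a minimal sketch with hypothetical counts `xs` (not data from the test) and function names of our own choosing.

```python
import math

def log_likelihood(mu, xs):
    """Full Poisson log-likelihood, including the -log(x_i!) terms."""
    return sum(x * math.log(mu) - mu - math.lgamma(x + 1) for x in xs)

def log_kernel(mu, xs):
    """Log of the kernel mu^(n * xbar) * exp(-n * mu); note n * xbar = sum(xs)."""
    n = len(xs)
    return sum(xs) * math.log(mu) - n * mu

xs = [3, 5, 4, 6]  # hypothetical monthly counts, for illustration only

# The difference should not depend on mu: it equals -sum(log(x_i!)).
diffs = [log_likelihood(mu, xs) - log_kernel(mu, xs) for mu in (0.5, 2.0, 7.5)]
print(diffs)
```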
(b) Discussions with a criminologist regarding µ led to the screenshot shown overleaf,
taken from the MATCH online elicitation tool.
(i) Name the elicitation procedure that has been used.
Answer:
Trial roulette method. (1 mark)
(ii) Write down the fitted distribution for µ, and give the associated density, prior
mean and prior standard deviation.
Answer:
µ ∼ Ga(3.7, 0.2) (1 mark)
\[ \pi(\mu) = \frac{0.2^{3.7} \mu^{2.7} e^{-0.2\mu}}{\Gamma(3.7)}, \quad \mu > 0 \]
(1 mark)
\[ E(\mu) = 3.7/0.2 = 18.5 \]
(1 mark)
\[ SD(\mu) = \sqrt{3.7/0.2^2} = 9.62 \]
(1 mark)
(iii) Find Pr(µ > 46.6) and interpret this in plain English as part of the feedback
and refinement process with the criminologist.
Answer:
Pr(µ > 46.6) = 0.01, suggesting that one month in a hundred will see the
number of fatal shootings exceed around 46 or 47. (3 marks)
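As a rough numerical check of the prior summaries and the tail probability quoted above, the sketch below uses only the Python standard library. The helper names and the Simpson's-rule integration are illustrative choices, not the method used by the MATCH tool, and the rounded parameters Ga(3.7, 0.2) are assumed to be the fitted ones, so the tail probability should come out close to, but not necessarily exactly, 0.01.

```python
import math

def gamma_pdf(x, shape, rate):
    """Density of Ga(shape, rate) at x, computed on the log scale."""
    if x <= 0:
        return 0.0  # valid at x = 0 because shape > 1 here
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(x)
               - rate * x - math.lgamma(shape))
    return math.exp(log_pdf)

def gamma_cdf(x, shape, rate, steps=10_000):
    """Pr(X <= x) by composite Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = gamma_pdf(0.0, shape, rate) + gamma_pdf(x, shape, rate)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * gamma_pdf(i * h, shape, rate)
    return total * h / 3

shape, rate = 3.7, 0.2                 # fitted prior from part (b)(ii)
prior_mean = shape / rate              # alpha / lambda
prior_sd = math.sqrt(shape) / rate     # sqrt(alpha / lambda^2)
tail = 1 - gamma_cdf(46.6, shape, rate)

print(f"mean = {prior_mean:.2f}, sd = {prior_sd:.2f}, Pr(mu > 46.6) = {tail:.4f}")
```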
(c) The number of fatal shootings in North Carolina was recorded for each month in
2014. Combining the associated likelihood in part (a), with the prior identified in
part (b), results in the following posterior distribution for µ:
µ|x ∼ Ga(57.7, 12.2).
(i) Find x̄, the mean number of fatal shootings observed in North Carolina in 2014.
Answer:
We know that
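\[ \text{Posterior} \propto \text{Prior} \times \text{Likelihood} \]
\[ \mu^{56.7} e^{-12.2\mu} \propto \mu^{2.7} e^{-0.2\mu} \times \mu^{12\bar{x}} e^{-12\mu}. \]
Thus, $12\bar{x} = 56.7 - 2.7 = 54$, giving $\bar{x} = 4.5$. (4 marks)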
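The same bookkeeping can be written as a conjugate gamma-Poisson update and then inverted to recover the sample mean. This is a small sketch; the function name `gamma_poisson_update` is our own.

```python
def gamma_poisson_update(g, h, n, xbar):
    """Ga(g, h) prior plus n Poisson counts with mean xbar gives
    a Ga(g + n * xbar, h + n) posterior."""
    return g + n * xbar, h + n

prior_shape, prior_rate = 3.7, 0.2    # prior from part (b)
post_shape, post_rate = 57.7, 12.2    # posterior given in part (c)

# The rate parameter increases by n, the shape by n * xbar.
n = round(post_rate - prior_rate)         # 12 months of data
xbar = (post_shape - prior_shape) / n     # sample mean

print("n =", n, "xbar =", round(xbar, 4))
print(gamma_poisson_update(prior_shape, prior_rate, n, xbar))
```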
(ii) How do the criminologist’s beliefs compare to the observed data in our sample?
Answer:
The criminologist’s prior suggests around 18 or 19 fatal shootings per month (prior mean 18.5), in
stark contrast to the observed data of 4.5 fatal shootings per month. (2 marks)
(iii) Does the posterior give more weight to the prior or the likelihood? Explain your
answer.
Answer:
The posterior mean is 57.7/12.2 = 4.73 – much closer to the sample mean than
the prior mean. Hence, the posterior favours the likelihood. (2 marks)
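One way to see why the posterior sits so close to the data is to write the posterior mean (g + n x̄)/(h + n) as a weighted average of the prior mean and the sample mean, with weight h/(h + n) on the prior. The sketch below assumes the values from parts (b) and (c); the variable names are illustrative.

```python
g, h = 3.7, 0.2        # prior Ga(g, h)
n, xbar = 12, 4.5      # 12 monthly counts with sample mean 4.5

prior_mean = g / h
weight_prior = h / (h + n)          # weight on the prior mean
weight_data = n / (h + n)           # weight on the sample mean

posterior_mean = weight_prior * prior_mean + weight_data * xbar

print(f"weight on prior = {weight_prior:.4f}, weight on data = {weight_data:.4f}")
print(f"posterior mean  = {posterior_mean:.2f}")
```

With these numbers the prior carries under 2% of the weight, which is consistent with the conclusion that the posterior favours the likelihood.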
[Total Q3: 18 marks]
4. A sports scientist measures the rate of oxygen consumption, X litres per minute, of
10 randomly chosen athletes immediately after exercise. The sample mean is x̄ = 2.25
litres per minute. She assumes a Normal distribution for these measurements with a
standard deviation of $1/\sqrt{6}$ litres per minute.
A prior study suggests that µ ∼ N (2.6, 0.025).
(a) Show that µ|x ∼ N (2.39, 0.01).
Answer:
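Results in the lecture notes give µ|x ∼ N(B, 1/D), where
\[ D = d + n\tau = 1/0.025 + 10 \times 6 = 100, \]
and
\[ B = \frac{db + n\tau\bar{x}}{D} = \frac{1/0.025 \times 2.6 + 10 \times 6 \times 2.25}{100} = 2.39. \]
Here b = 2.6 and d = 1/0.025 = 40 are the prior mean and prior precision, and τ = 6 is the precision of each measurement.
Thus, the posterior variance is 1/100 = 0.01, giving µ|x ∼ N(2.39, 0.01), as required. (3 marks)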
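The update in part (a) can be sketched as a short function. The name `normal_update` and its argument names are our own; the formulas are the ones used in the answer, with known data precision τ.

```python
def normal_update(prior_mean, prior_var, n, xbar, tau):
    """Posterior mean and variance for a Normal mean with known precision tau,
    given a N(prior_mean, prior_var) prior and n observations with mean xbar."""
    d = 1 / prior_var                          # prior precision
    D = d + n * tau                            # posterior precision
    B = (d * prior_mean + n * tau * xbar) / D  # posterior mean
    return B, 1 / D

# Values from Question 4: sd = 1/sqrt(6), so tau = 6.
post_mean, post_var = normal_update(prior_mean=2.6, prior_var=0.025,
                                    n=10, xbar=2.25, tau=6)
print(round(post_mean, 4), round(post_var, 4))
```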
(b) Find the posterior probability that the mean rate of oxygen consumption is greater
than 2.518 litres per minute. [Hint: Use the tables on the inside front cover of this
test paper to help you]
Answer:
We require Pr(µ > 2.518 | x), giving
\[ \Pr\left(Z > \frac{2.518 - 2.39}{\sqrt{0.01}}\right) = 1 - \Pr(Z < 1.28) = 1 - 0.9 = 0.1. \]
(2 marks)
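Without tables, the same tail probability can be approximated with `statistics.NormalDist` from the Python standard library. This is only a cross-check of the table lookup; it assumes the posterior N(2.39, 0.01) from part (a).

```python
from statistics import NormalDist

# Posterior from part (a): mean 2.39, variance 0.01 (so sd = 0.1).
posterior = NormalDist(mu=2.39, sigma=0.01 ** 0.5)

# Pr(mu > 2.518 | x) is the upper tail beyond 2.518.
p_exceed = 1 - posterior.cdf(2.518)
print(round(p_exceed, 3))
```

The result differs slightly from 0.1 in later decimal places because the table quantile is 1.2816, while (2.518 − 2.39)/0.1 is exactly 1.28.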
(c) Find the minimum sample size required in order to reduce the posterior standard
deviation to at most 0.05.
Answer:
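We need
\[ \begin{aligned} \frac{1}{\sqrt{d + n\tau}} &\le 0.05 \\ \frac{1}{(1/0.025) + 6n} &\le 0.0025 \\ 6n + 1/0.025 &\ge \frac{1}{0.0025} \\ 6n &\ge \frac{1}{0.0025} - 1/0.025 \\ n &\ge 60, \end{aligned} \]
i.e. the sample size must be at least n = 60. (5 marks)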
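The inequality 1/√(d + nτ) ≤ 0.05 can also be solved programmatically. The sketch below uses exact fractions so that the boundary case n = 60 is not affected by floating-point rounding; the function name `min_sample_size` is hypothetical.

```python
import math
from fractions import Fraction

def min_sample_size(prior_var, tau, target_sd):
    """Smallest n with posterior sd 1/sqrt(d + n*tau) <= target_sd,
    where d = 1/prior_var is the prior precision."""
    required_precision = 1 / target_sd ** 2   # need d + n*tau >= this
    prior_precision = 1 / prior_var
    return max(0, math.ceil((required_precision - prior_precision) / tau))

n_min = min_sample_size(prior_var=Fraction("0.025"), tau=6,
                        target_sd=Fraction("0.05"))
print(n_min)
```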
[Total Q4: 10 marks]
5. Suppose we have a random sample Xi , i = 1, 2, . . . , 50, from a distribution with the
following probability density function:
\[ f(x \mid \theta) = \frac{\theta}{x^2} e^{-\theta/x}, \quad x, \theta > 0. \]
Assuming a gamma prior for θ, that is, θ ∼ Ga(g, h), show that
\[ E(\theta \mid x) = \frac{50 + g}{h + \sum_{i=1}^{50} x_i^{-1}}. \]
Answer:
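The likelihood is
\[ f(x \mid \theta) = \prod_{i=1}^{50} \frac{\theta}{x_i^2} e^{-\theta/x_i} = \left(\prod_{i=1}^{50} x_i^{-2}\right) \theta^{50} \exp\left\{-\theta \sum_{i=1}^{50} x_i^{-1}\right\}. \]
The prior is
\[ \pi(\theta) = \frac{h^{g} \theta^{g-1} e^{-h\theta}}{\Gamma(g)}. \]
Thus the posterior is
\[ \begin{aligned} \pi(\theta \mid x) &\propto \theta^{g-1} \exp\{-h\theta\} \times \theta^{50} \exp\left\{-\theta \sum_{i=1}^{50} x_i^{-1}\right\} \\ &= \theta^{50+g-1} \exp\left\{-\left(h + \sum_{i=1}^{50} x_i^{-1}\right)\theta\right\}, \end{aligned} \]
and so
\[ \theta \mid x \sim Ga\left(50 + g,\ h + \sum_{i=1}^{50} x_i^{-1}\right), \]
giving the posterior mean stated.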
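The posterior parameters derived above depend on the data only through the sample size and the sum of reciprocals. A minimal sketch follows, using a small hypothetical sample rather than the 50 observations in the question; the function name and the prior values g = 2, h = 1 are assumptions for illustration.

```python
def posterior_params(xs, g, h):
    """Ga(g, h) prior with density f(x | theta) = theta / x^2 * exp(-theta / x)
    gives a Ga(n + g, h + sum(1 / x_i)) posterior."""
    n = len(xs)
    return n + g, h + sum(1 / x for x in xs)

xs = [0.5, 2.0, 4.0, 1.0]   # hypothetical observations, all positive
shape, rate = posterior_params(xs, g=2.0, h=1.0)
posterior_mean = shape / rate   # (n + g) / (h + sum of reciprocals)

print(shape, rate, round(posterior_mean, 4))
```

With n = 50 the shape becomes 50 + g, matching the expression for E(θ|x) in the question.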
[Total Q5: 8 marks]
THE END