Proceedings of Spatial Accuracy 2016
[ISBN: 978-2-9105-4510-5]
Modeling spatial risk of the Foot-Mouth-Disease epidemic in South Korea
EunHye Yoo1 and JiYoung Lee2
1 Department of Geography, University of Buffalo (SUNY), USA
2 Department of Geoinformatics, Seoul University, S. Korea
*Corresponding author: eunhye@buffalo.edu
Abstract: The 2010/2011 foot and mouth disease (FMD) epidemic in South Korea spread nationwide and caused substantial economic damage. Its rapid transmission throughout the country despite the government's control policy argues for the need to improve our understanding of its spatial dimensions. Spatial point pattern analysis revealed that FMD incidences formed statistically significant spatial clusters during the early phase of the epidemic, but that the significance and spatial extent of the clustering evolved over the course of the epidemic. We developed a log-Gaussian Cox process (LGCP) model to quantify and make inferences on the spatial distribution of FMD risk while accounting for both known risk factors and unexplained variation. The FMD risk surface estimated from the LGCP model captured the spatial heterogeneity of FMD risk driven by environmental and landscape factors, as well as unknown risk variations at both local and regional scales. This study demonstrates that point pattern analyses and statistical models can improve our understanding of the spatial dynamics of FMD, which may provide useful information for decision makers designing efficient and effective control strategies for future outbreaks.
Keywords: foot and mouth disease, spatial point pattern analysis, log-Gaussian Cox process
I INTRODUCTION
Foot and mouth disease (FMD) is considered by far the most serious infection among livestock because of its rapid transmission (Keeling, 2005). In South Korea (S. Korea), the FMD outbreaks between November 2010 and April 2011 spread nationwide and resulted in a total of 3.48 million susceptible animals being culled (Park et al., 2013). Spatial pattern analysis of infectious diseases may reveal the spatial structure of a disease and guide the control of its future spread. However, most studies on the recent FMD incidences in S. Korea have overlooked the potential association between the observed FMD cases and spatial risk factors, and have focused merely on descriptive statistics (Moutou and Durand, 1994; Alexandersen et al., 2003). In this study, we aim to fill this gap in the FMD literature by examining the spatial patterns of infection risk and their dynamic changes. We make inferences on the biological processes underlying the observed FMD cases using a log-Gaussian Cox process (LGCP) model.
II MATERIAL AND METHODS
2.1 Data
We used the official public health records of the FMD epidemic in S. Korea, which include the location of each farm on which FMD infection was confirmed, the reported date of infection, and the type of livestock infected. The infected livestock included cattle, pigs, deer, and goats, but we focused on cattle and pigs in this paper. Assuming that the spatial variation of FMD prevalence is affected by both environmental factors and the spatial configuration of
livestock farms, we incorporated 2010 national census data on the number of domestic animals and farms. Figure 1(a) shows the elevation; Figures 1(b) and 1(c) show the spatial distribution of livestock density and the average temperature over December 2010 and January 2011, respectively. The driving distance from each grid cell, at the same resolution, to the index farm in Andong is illustrated in Figure 1(d). The index farm was considered the initial seed of infection because hardly any FMD cases had been reported in S. Korea prior to this epidemic in 2010 (Park et al., 2013).
Figure 1: Spatial distributions of risk factors; (a) Elevation (in meters), (b) Density of livestock farm holdings per non-overlapping 5 × 5 km² area, (c) Average temperature in winter months (in degrees Fahrenheit), (d) Driving distance from the point source of infection (in kilometers).
2.2 Spatial point pattern analysis
The K function describes the extent of spatial dependence in the arrangement of georeferenced event locations in the study area (Gatrell et al., 1996). It can be estimated from an observed event distribution and compared with its expected behavior under a reference process. One standard reference is the homogeneous Poisson model. Given that FMD is an infectious disease that forms spatial clusters, the homogeneous model is a poor reference for the observed FMD cases. As an alternative, we estimated the inhomogeneous K function $\hat{K}_{\mathrm{inhom}}^{\mathrm{iso}}(r)$ (Baddeley et al., 2000), in which the spatially varying intensity is estimated from the observed FMD cases. To make inferences about the observed point patterns while allowing for non-stationarity, we simulated envelopes for the inhomogeneous K function. An observed process is judged to differ from the null process model if the estimated K function at a specific distance lies outside the envelope (Møller and Waagepetersen, 2003).
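The estimator and envelope test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in the paper: the intensity is supplied as a known function rather than estimated from the cases, no edge correction is applied, and all function names (`k_inhom`, `simulate_poisson`, `envelope`) are hypothetical.

```python
import math
import random


def k_inhom(points, intensity, area, radii):
    """Inhomogeneous K estimate, no edge correction (a simplifying assumption).

    K(r) = (1/|A|) * sum over ordered pairs i != j of
           1[d_ij <= r] / (lambda(x_i) * lambda(x_j))
    """
    lam = [intensity(x, y) for x, y in points]
    k = [0.0] * len(radii)
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            w = 1.0 / (lam[i] * lam[j])
            for m, r in enumerate(radii):
                if d <= r:
                    k[m] += w
    return [v / area for v in k]


def poisson_draw(mean, rng):
    """Poisson variate: count unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_poisson(intensity, lam_max, width, height, rng):
    """Inhomogeneous Poisson pattern on a rectangle, generated by thinning."""
    pts = []
    for _ in range(poisson_draw(lam_max * width * height, rng)):
        x, y = rng.uniform(0.0, width), rng.uniform(0.0, height)
        # Keep each candidate with probability intensity / lam_max.
        if rng.random() < intensity(x, y) / lam_max:
            pts.append((x, y))
    return pts


def envelope(intensity, lam_max, width, height, radii, nsim, rng):
    """Pointwise min/max of K over `nsim` simulations of the null process."""
    sims = [
        k_inhom(simulate_poisson(intensity, lam_max, width, height, rng),
                intensity, width * height, radii)
        for _ in range(nsim)
    ]
    lo = [min(s[m] for s in sims) for m in range(len(radii))]
    hi = [max(s[m] for s in sims) for m in range(len(radii))]
    return lo, hi
```

An observed pattern would be flagged as clustered at distance r when its `k_inhom` value exceeds the upper envelope at that r, and as regular when it falls below the lower envelope.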
2.3 A log-Gaussian Cox Process model
A log-Gaussian Cox process (LGCP) is a special case of a Cox process in which the log-intensity is a Gaussian process (Møller et al., 1998; Diggle, 2013). We modeled the spread of FMD in S. Korea as a Poisson process that was partially driven by observed risk factors. This conceptualization enabled us to link the observed global heterogeneity in the dispersion of FMD to the spatial patterns of environmental factors. We viewed the remaining, unexplained spatial variability of the point process as the outcome of infectious cases transmitting the disease to nearby susceptible animals. Assuming that FMD incidence locations $\{s_i, i = 1, \dots, n\}$ are independently distributed conditional on a Gaussian spatial random field $U(s)$, the LGCP model for FMD is written as
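\begin{align}
\{s_i;\ i = 1, \dots, n\} \mid U(s) &\sim \mathrm{Poisson}(\Lambda(s)) \tag{1} \\
\log \Lambda(s) &= \mu + X(s)\beta + U(s) \tag{2} \\
\mathrm{Cov}\{U(s), U(s')\} &= \sigma^2 \rho\left[|s - s'|/\theta\right] \tag{3}
\end{align}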
where the vector of covariates $X(s) = [X_1(s), \dots, X_4(s)]$ consists of the elevation $X_1$, the density of livestock holdings $X_2$, the average temperature of the winter months $X_3$ at the event location $s$, and the driving distance from the index farm $X_4$. We included $X_4$ as a risk factor because the initial FMD case on November 28th, 2010 served as a source of infection for subsequent cases, and the risk of infection decreases as a susceptible farm lies farther from that source. Based on related studies (Ferguson et al., 2001; Grubman and Baxt, 2004), which showed that cold temperatures help the virus persist outside the host for longer, we incorporated the spatial variability of winter temperature as a known risk factor in our LGCP model. The unexplained spatial variation $U(s)$ in FMD risk is a stochastic component (the latent Gaussian process) whose spatial structure was modeled as a function of the separation distance between any two locations, with a correlation function $\rho$ specified by two parameters: a range $\theta$ and a sill $\sigma^2$. In geostatistical terms, the range is the maximum distance at which spatial dependence exists and the sill is the maximum variance of the process.
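To make the covariance structure in Equation 3 concrete, the sketch below builds the covariance matrix of $U(s)$ at a handful of sites and draws one realization of the latent field. It is an illustration only: the exponential form $\rho(h) = e^{-h}$ is an assumption chosen for simplicity (the paper works with a Matérn correlation), $\theta$ is treated as a plain distance scale, and the helper names are hypothetical.

```python
import math
import random


def exp_cov(sites, sigma2, theta):
    """Cov{U(s), U(s')} = sigma2 * rho(|s - s'| / theta), with rho(h) = exp(-h).

    The exponential correlation is an assumption made for this sketch.
    """
    return [[sigma2 * math.exp(-math.dist(a, b) / theta) for b in sites]
            for a in sites]


def cholesky(a):
    """Lower-triangular factor L of a positive-definite matrix, a = L L^T."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_field(sites, sigma2, theta, rng):
    """One draw of the zero-mean Gaussian field U at the given sites."""
    low = cholesky(exp_cov(sites, sigma2, theta))
    z = [rng.gauss(0.0, 1.0) for _ in sites]
    return [sum(low[i][k] * z[k] for k in range(i + 1))
            for i in range(len(sites))]


# Illustrative call: centers of a 4 x 4 block of 5 km cells (coordinates in km).
# theta and sigma are set to the posterior means reported in Table 1, purely
# as example inputs.
sites = [(2.5 + 5.0 * i, 2.5 + 5.0 * j) for i in range(4) for j in range(4)]
u = sample_field(sites, sigma2=1.12 ** 2, theta=14.49, rng=random.Random(0))
```

Exponentiating $\mu + X(s)\beta + U(s)$ at each site would then give one realization of the intensity $\Lambda(s)$ in Equation 2.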
The targets of inference for the LGCP model in Equations 2 and 3 are $\{\beta, \theta, \sigma, U(s)\}$. Møller et al. (1998) originally used the method of minimum contrast estimation, which is often computationally formidable (Diggle, 2013). We addressed this computational challenge by using two approximations: the Gaussian Markov random field (GMRF) approximation to the Matérn correlation (Banerjee et al., 2004) and the integrated nested Laplace approximation (INLA) for the marginal posterior distributions. More specifically, we used the lgcp function implemented in the geostatsp library (Brown, 2015), in which a computational grid of 5 × 5 km² cells is superimposed over the study region $A$. The FMD risk surface $\Lambda(\cdot)$ was defined over a collection of cells $g_l$, $l = 1, \dots, L$, that form a disjoint partition of $A$. Let $Y_l = Y(g_l)$ be the case count within the $l$th cell. The count follows a Poisson distribution whose mean is the product of an offset $\delta_l$ and the intensity (FMD risk) $\Lambda_l$, which is a function of known risk factors and a latent Gaussian process:
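\begin{align}
Y_l &\sim \mathrm{Poisson}(\delta_l \Lambda_l) \tag{4} \\
\log(\Lambda_l) &= \mu + X_{1l}\beta_1 + X_{2l}\beta_2 + X_{3l}\beta_3 + X_{4l}\beta_4 + U_l \tag{5}
\end{align}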
The spatial random effect $U(s)$ for $s \in g_l$ is approximated by $U_l$ using a GMRF, with priors specified as a zero-mean Gaussian, $\theta \in [5, 20]$ km for the range of the spatial random effect, and $\sigma \in [1, 4]$ for its standard deviation.
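The gridded count model in Equations 4 and 5 can be written out directly. The sketch below computes the expected count of a cell and the Poisson log-likelihood of a set of counts; it only illustrates the model structure and is not the INLA fitting procedure. The function names and all numeric inputs are hypothetical.

```python
import math


def expected_count(offset, mu, x_row, beta, u):
    """E[Y_l] = delta_l * Lambda_l, where
    log(Lambda_l) = mu + sum_k X_kl * beta_k + U_l.
    """
    log_lam = mu + sum(x * b for x, b in zip(x_row, beta)) + u
    return offset * math.exp(log_lam)


def poisson_loglik(counts, means):
    """Sum over cells of log Poisson pmf: y*log(m) - m - log(y!)."""
    return sum(y * math.log(m) - m - math.lgamma(y + 1)
               for y, m in zip(counts, means))


# Hypothetical two-cell example with four covariates per cell.
beta = [0.1, 0.3, -0.5, -0.2]          # illustrative coefficients
cells = [([0.0, 1.0, 0.5, 2.0], 0.2),  # (covariates X_1..X_4, random effect U_l)
         ([1.0, 0.0, 1.5, 0.5], -0.4)]
means = [expected_count(10.0, -2.0, x, beta, u) for x, u in cells]
loglik = poisson_loglik([1, 0], means)
```

The offset plays the role of the population at risk: doubling $\delta_l$ doubles the expected count in a cell without changing the underlying risk $\Lambda_l$.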
III RESULTS
During the 2010/2011 FMD epidemic, more than 90% of FMD cases occurred in five provinces, and over 60% of all livestock farms in S. Korea are located within these five provinces. We further examined this relationship in the LGCP model by treating the total number of farms in each town as the population at risk. The observed FMD epidemic curve in Figure 2(a) shows major peaks at week 4 and week 7.
(a) Distribution of FMD cases over time during the epidemic (axes: Week, Proportion). Three peaks of the epidemic are denoted by circles at weeks 1, 4, and 7.
(b) FMD incidences over four time periods (Periods 1 to 4).
Figure 2: Temporal and spatial profile of FMD incidences in S. Korea
We defined four time periods to represent the natural phases of the epidemic: weeks 1-2 were a period of rapid spread followed by a sharp decrease (Period 1); weeks 3-5 saw medium- to long-distance spread combined with localized transmission from newly affected farms (Period 2); in weeks 6-8 the epidemic peaked (Period 3); and in weeks 9-22 eradication was achieved (Period 4). The locations of FMD incidences are shown in the map of Figure 2(b) with a different symbol for each period. We characterized their spatial patterns using the inhomogeneous K function; the results are summarized in Figure 3. Spatial clustering was found during the first three periods, although the spatial extent and statistical significance of the clustering differed among them. The K function for Period 1 in Figure 3(A), for example, shows statistically significant clustering across all distances, whereas the clustering extends only up to 20 km in Period 2. The K function estimate for Period 3 (Figure 3(C)) shows statistically significant spatial clustering at medium to long distances. Lastly, the K function for the last period in Figure 3(D) shows regularity without statistical significance.
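The period definition above amounts to a simple binning rule on the epidemic week, which could be applied to the case records before running a per-period K function analysis. A minimal sketch, with a hypothetical function name:

```python
def epidemic_period(week):
    """Map an epidemic week (1-22) to the period defined in the text."""
    if not 1 <= week <= 22:
        raise ValueError("week must be between 1 and 22")
    if week <= 2:
        return 1  # rapid spread followed by a sharp decrease
    if week <= 5:
        return 2  # medium- to long-distance spread plus local transmission
    if week <= 8:
        return 3  # epidemic peak
    return 4      # eradication phase
```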
Figure 3: Inhomogeneous K function for four periods; (A) Period 1, (B) Period 2, (C) Period 3, and (D) Period 4. Each panel plots $K_{\mathrm{inhom}}(r)$ against distance $r$ (km). The solid and dotted lines represent, respectively, the inhomogeneous K function estimate and its theoretical expected value under the null process. The upper and lower simulation envelopes are shaded in light grey.
Bayesian inference on the LGCP model parameters was conducted using the R-INLA package in the statistical computing language R (R Core Team, 2015). The posterior distributions of the spatial intensity of FMD and of the spatial random effects are shown in Figure 4: the 0.025 quantile, mean, and 0.975 quantile of the spatial intensity of FMD cases are shown in Figures 4(A), (B), and (C), respectively, and the corresponding summaries of the spatial random effects are shown in Figures 4(D), (E), and (F). The posterior distributions of the parameters are also summarized in Table 1. The posterior means of the parameter estimates $\hat{\beta}$ are similar to $\hat{\beta}^{*}$, except for the coefficient of elevation $\hat{\beta}_1$, which was not statistically significant. The influence of temperature also appears to be relatively substantial compared with the other risk factors in the LGCP model. Both the FMD risk map in Figure 4 and the non-zero estimate of the intercept $\hat{\mu}$ indicate that the risk of FMD was not constant over the study region but varied spatially. Lastly, the posterior mean of the range parameter ($\hat{\theta}$) is approximately 14.49 km, with lower and upper bounds of 9.94 km and 20.12 km, respectively.
Table 1: Posterior distribution of LGCP model parameters

Parameter             Mean        Std. Dev.   0.025 quantile   0.975 quantile
(Intercept)           12.25       2.99        6.38             18.15
Elevation             -0.08       0.08        -0.23            0.07
Livestock holdings    0.30        0.08        0.14             0.47
Average temperature   -3.84       0.83        -5.48            -2.21
Driving distance      -1.10       0.23        -1.56            -0.66
Range (m)             14,492.42   2,602.11    9,944.41         20,120.24
Std. dev.             1.12        n/a         0.94             1.32
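One way to read Table 1 is to check which 95% credible intervals exclude zero and to convert each coefficient into a multiplicative effect on the intensity, since the model is log-linear. The sketch below does both with values copied from Table 1. Treating "statistically significant" as "interval excludes zero" is an interpretation assumed here, the covariate units and scaling are not stated in the text, and the function names are hypothetical.

```python
import math

# (posterior mean, 0.025 quantile, 0.975 quantile), copied from Table 1.
TABLE1 = {
    "Elevation": (-0.08, -0.23, 0.07),
    "Livestock holdings": (0.30, 0.14, 0.47),
    "Average temperature": (-3.84, -5.48, -2.21),
    "Driving distance": (-1.10, -1.56, -0.66),
}


def interval_excludes_zero(lower, upper):
    """True when the credible interval lies entirely on one side of zero."""
    return lower > 0.0 or upper < 0.0


def intensity_ratio(beta):
    """Multiplicative change in intensity per unit increase in the covariate.

    Follows from log(Lambda) being linear in the covariates; the meaning of
    "one unit" depends on covariate scaling, which is not given in the text.
    """
    return math.exp(beta)


summary = {
    name: (interval_excludes_zero(lo, hi), intensity_ratio(mean))
    for name, (mean, lo, hi) in TABLE1.items()
}
```

Under this reading, elevation is the only covariate whose interval spans zero, which matches the statement in the text that its coefficient was not statistically significant.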
IV DISCUSSION AND CONCLUSIONS
We described the FMD transmission process in S. Korea during 2010-2011 using spatial point pattern analyses. The application of the inhomogeneous K function to the farm-level FMD cases across four time periods suggested that the spatial patterns of FMD transmission evolved dynamically over the course of the epidemic. We found that clusters at short scales during Periods 2 and 3 were not statistically significant, which contrasts with the common assumption that contagious diseases form spatial clusters. The absence of short-scale clusters (less than 5 km) in both periods might be due to the enforcement of the government's control policy, including culling and movement restrictions (Park et al., 2013).
Figure 4: Posterior distribution of the spatial LGCP prediction; (A) 0.025 quantile, (B) mean, and (C) 0.975 quantile of FMD risk (intensity) $E\{\Lambda \mid Y\}$; (D) 0.025 quantile, (E) mean, and (F) 0.975 quantile of spatial random effects $E\{U \mid Y\}$.
Inhomogeneous K function analysis enabled us to account for the non-stationarity of the disease spread, but the spatial intensity was based purely on the observed FMD cases. To investigate FMD transmission further with respect to the behavior of the intensity function, we developed an LGCP model in which the spatially varying intensity of the disease was estimated via a log-linear function of environmental and socioeconomic factors. We also took a Bayesian approach to account explicitly for parameter uncertainty. We found that LGCP models for FMD can be used as a tool for improving our understanding of the spread of FMD and for optimizing disease control. For instance, our analyses demonstrated that temperature plays a critical role in determining the spread of FMD, probably because the virus tends to persist at lower temperatures. As shown in the maps of FMD intensity in Figure 4(A)-(C), the area of high intensity is centered on the source of infection, where the FMD epidemic started. In contrast, the maps of unexplained risk in Figure 4(D)-(F) show areas of high to low risk whose shapes and sizes vary over the study area.
The Bayesian estimates of the LGCP model parameters and the prediction of random effects enabled us to assess the spatial variability of FMD incidence risk and to identify spatial clustering under explicit consideration of the population at risk. The non-zero estimate of spatial variability confirmed that FMD incidences showed a tendency to cluster together even after the spatially inhomogeneous livestock farm density was taken into account. LGCP may not be the only means of testing the hypothesis of spatial independence or interaction between events, as variants of the K function can provide similar outcomes. However, LGCP models provide statistically rigorous inference and estimation and allow investigators to estimate the second-order properties of the underlying process, that is, the spatial dependence of events. Furthermore, LGCP inference using the INLA approximation has relatively tractable moment properties, which allowed a simple method of parameter estimation and a flexible specification of the space-time covariance structure. We simplified the mean structure to a set of spatial covariates, but this approach extends naturally to incorporate additional covariate information.
The proposed analysis and modeling framework is general and can be applied to other infectious diseases or to FMD cases in other countries. The proposed LGCP model can account for the temporal aspect of FMD incidences by incorporating relevant temporal covariates and specifying a spatio-temporal covariance structure (Diggle, 2013). In doing so, for example, we might have been able to take into account the changes in the susceptible population due to the culling policy. We expect that a longer time series of FMD incidence will improve our understanding and allow us to evaluate a wide range of control strategies using the model.
References
Alexandersen S., Zhang Z., Donaldson A., Garland A. (2003). The pathogenesis and diagnosis of foot-and-mouth disease. Journal of Comparative Pathology 129(1), 1–36.
Baddeley A. J., Møller J., Waagepetersen R. (2000). Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica 54(3), 329–350.
Banerjee S., Carlin B. P., Gelfand A. E. (2004). Hierarchical Modeling and Analysis for Spatial Data. Boca Raton, FL: Chapman and Hall/CRC.
Brown P. E. (2015). Model-based geostatistics the easy way. Journal of Statistical Software 63(12), 1–24.
Diggle P. J. (2013). Statistical Analysis of Spatial and Spatio-temporal Point Patterns. Boca Raton, FL: CRC Press.
Ferguson N. M., Donnelly C. A., Anderson R. M. (2001). The foot-and-mouth epidemic in Great Britain: pattern of spread and impact of interventions. Science 292(5519), 1155–1160.
Gatrell A. C., Bailey T. C., Diggle P. J., Rowlingson B. S. (1996). Spatial point pattern analysis and its application in geographical epidemiology. Transactions of the Institute of British Geographers, 256–274.
Grubman M. J., Baxt B. (2004). Foot-and-mouth disease. Clinical Microbiology Reviews 17(2), 465–493.
Keeling M. J. (2005). Models of foot-and-mouth disease. Proceedings of the Royal Society of London B: Biological Sciences 272(1569), 1195–1202.
Møller J., Syversveen A. R., Waagepetersen R. P. (1998). Log Gaussian Cox processes. Scandinavian Journal of Statistics 25(3), 451–482.
Møller J., Waagepetersen R. P. (2003). Statistical Inference and Simulation for Spatial Point Processes. Boca Raton, FL: CRC Press.
Moutou F., Durand B. (1994). Modelling the spread of foot-and-mouth disease virus. Veterinary Research 25(2), 279–284.
Park J.-H., Lee K.-N., Ko Y.-J., Kim S.-M., Lee H.-S., Shin Y.-K., Sohn H.-J., Park J.-Y., Yeh J.-Y., Lee Y.-H., et al. (2013). Control of foot-and-mouth disease during 2010–2011 epidemic, South Korea. Emerging Infectious Diseases 19(4), 655.
R Core Team (2015). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.