Vector Generalized Additive Models and applications to extreme value analysis

Olivier Mestre (1,2)
(1) Météo-France, Ecole Nationale de la Météorologie, Toulouse, France
(2) Université Paul Sabatier, LSP, Toulouse, France

Based on previous studies carried out in collaboration with: Stéphane Hallegatte (CIRED, Météo-France), Sébastien Denvil (LMD)

SMOOTHER
« Smoother = tool for summarizing the trend of a response measurement Y as a function of predictors » (Hastie & Tibshirani)
An estimate of the trend that is less variable than Y itself.
• Smoothing matrix S: $Y^* = SY$
  The equivalent degrees of freedom (df) of the smoother S is the trace of S. This allows comparison with parametric models.
• Pointwise standard error bands: $\mathrm{Cov}(Y^*) = V = S S^{\mathsf{T}} \sigma^2$
  Given an estimate of $\sigma^2$, this allows approximate confidence intervals: $Y^* \pm 2\sqrt{\mathrm{diag}(V)}$.

SCATTERPLOT SMOOTHING EXAMPLE
• Data: wind farm production vs numerical windspeed forecasts

SMOOTHING
• Problems raised by smoothers
  How should the response values be averaged in each neighborhood?
  How large should the neighborhoods be?
• Tradeoff between bias and variance of Y*

SMOOTHING: POLYNOMIAL (parametric)
• Linear and cubic parametric least squares fits: MODEL DRIVEN APPROACHES

SMOOTHING: BIN SMOOTHER
• In this example, optimum intervals are determined by means of a regression tree

SMOOTHING: RUNNING LINE
• Running line

KERNEL SMOOTHER
• Nadaraya-Watson

SMOOTHING: LOESS
• The smooth at the target point is the value of a locally weighted linear fit (tricube weights)

CUBIC SMOOTHING SPLINES
• This smoother is the solution of the following optimization problem: among all functions f(x) with two continuous derivatives, choose the one that minimizes the penalized sum of squares

$$\sum_{i=1}^{n} \bigl(Y_i - f(X_i)\bigr)^2 + \lambda \int_a^b \bigl(f''(x)\bigr)^2 \, dx$$

  The first term measures closeness to the data; the second penalizes the curvature of f.
  It can be shown that the unique solution to this problem is a natural cubic spline with knots at the unique values $x_i$.
  The parameter $\lambda$ can be set by means of cross-validation.

CUBIC SMOOTHING SPLINES
• Cubic smoothing splines with equivalent df=5 and 10

Additive models
• Gaussian Linear Model: $\mathbb{E}[Y] = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
• Gaussian Additive model: $\mathbb{E}[Y] = S_1(X_1) + S_2(X_2)$
  $S_1$, $S_2$: smooth functions of the predictors $X_1$, $X_2$, usually LOESS or splines
  Estimation of $S_1$, $S_2$: « Backfitting Algorithm »

PRINCIPLE OF THE BACKFITTING ALGORITHM
Y = S1(X1) + e → estimation S1*
Y - S1*(X1) = S2(X2) + e → estimation S2*
Y - S2*(X2) = S1(X1) + e → estimation S1**
Y - S1**(X1) = S2(X2) + e → estimation S2**
Y - S2**(X2) = S1(X1) + e → estimation S1***
Etc… until convergence

Additive models
• Additive models: one efficient way to perform non-linear regression, but…
• Crucial point: ADAPTED WHEN THERE ARE ONLY A FEW PREDICTORS (2 or 3 predictors at most)

Additive models
• Philosophy: DATA DRIVEN APPROACHES RATHER THAN MODEL DRIVEN APPROACHES. USEFUL AS EXPLORATORY TOOLS
• Approximate inference tests are possible, but full inferences are better assessed by means of parametric models

Generalized Additive models (GAM)
• Extension to non-normal dependent variables
• Generalized additive models: additive modelling of the natural parameter of exponential family laws (Poisson, Binomial, Gamma, Gauss…): $g[\mu] = \eta = S_1(X_1) + S_2(X_2)$
• Vector Generalized Additive Models (VGAM): one step beyond…

Example 1
Annual number and maximum integrated intensity (PDI) of hurricane tracks over the North Atlantic

Number of Hurricanes
• Number of hurricanes in the North Atlantic ~ Poisson distribution

Factors influencing the number of hurricanes
• GAM applied to the number of hurricanes (YEAR, SST, SOI, NAO)

GAM model
• $\log(\lambda) = \beta_0 + S_1(\mathrm{SST}) + S_2(\mathrm{SOI})$

PARAMETRIC model
• “Broken stick model” (with continuity constraint) in SOI, revealed by the GAM analysis

$$\log(\lambda) = \begin{cases} \beta_0 + \beta_{SOI}^{(1)}\,\mathrm{SOI} + \beta_{SST}\,\mathrm{SST}, & \mathrm{SOI} < K \\ \beta_0 + \beta_{SOI}^{(1)}\,\mathrm{SOI} + \beta_{SOI}^{(2)}\,(\mathrm{SOI} - K) + \beta_{SST}\,\mathrm{SST}, & \mathrm{SOI} \ge K \end{cases}$$

• The best fit is obtained for the SOI value K=1: log-likelihood=-316.16, to be compared with -318.71 (linearity). A standard deviance test allows linearity to be rejected (p value=0.02)
• The expectation $\lambda$ of the hurricane number is then straightforwardly computed as a function of SOI and SST

EXPECTATION OF HURRICANE NUMBERS
OBSERVED vs EXPECTED: r=0.6
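The deviance test quoted for the broken stick model can be checked by hand. The sketch below is illustrative only: it assumes the test has 1 degree of freedom (one extra SOI slope, with the breakpoint K treated as fixed), which the slides do not state, and the function names and the coefficients in the last lines are hypothetical, not fitted values. Only the two log-likelihoods come from the slides.

```python
import math


def broken_stick_log_mean(soi, sst, b0, b_soi1, b_soi2, b_sst, k):
    """log(lambda) for the continuous broken stick model: the SOI slope
    changes by b_soi2 once SOI reaches the breakpoint k."""
    hinge = max(soi - k, 0.0)  # zero below k, so both branches meet at k
    return b0 + b_soi1 * soi + b_soi2 * hinge + b_sst * sst


def deviance_test(loglik_full, loglik_reduced):
    """Likelihood-ratio (deviance) test, assuming 1 degree of freedom.
    For 1 df the chi-square survival function is erfc(sqrt(x / 2))."""
    deviance = 2.0 * (loglik_full - loglik_reduced)
    return deviance, math.erfc(math.sqrt(deviance / 2.0))


# Log-likelihoods quoted on the slide (broken stick vs linear in SOI).
deviance, p_value = deviance_test(-316.16, -318.71)
print(f"deviance = {deviance:.2f}, p = {p_value:.3f}")

# Hypothetical coefficients, for illustration only (not fitted values).
lam = math.exp(broken_stick_log_mean(soi=1.5, sst=0.2, b0=1.0,
                                     b_soi1=0.1, b_soi2=0.3, b_sst=0.5, k=1.0))
print(f"expected number of hurricanes (illustrative): {lam:.2f}")
```

Under the 1 df assumption the p value falls between 0.02 and 0.03, in line with the rounded value on the slide. Writing the second branch with a hinge term max(SOI - K, 0) is what enforces the continuity constraint at SOI = K.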