Metamorphosis of Europe
9-10 December 2015
Vienna, Austria
Session: Econometrics-Merging with Political Processes and
going Global
Professor El Thalassinos
University of Piraeus, Greece
Editor ERSJ: www.ersj.eu
“The dramatic expansion of econometric and
quantitative-modeling techniques has been one of the
most significant trends throughout the social sciences
since the 1990s”.
The bigbadwolf in internet
Metamorphosis of Europe, December 2015
Vienna, Austria
The evolution of models
•Rational-choice framework of neo-classical economics,
•Mathematical models of risk-analysis,
•Game theory models,
have now spread into the ‘political’ domains of military
conflict, state forms and ethno-linguistic identities.
The resulting discipline—part economics, part statistics, part
quantitative political science—now plays a central role not
only in scholarship, but in formulating policy options for global
institutions.
The fundamental objection
Econometrics rests on prior assumptions and post-hoc
hypotheses which remain systematically unexamined.
The recent state-of-the-art development theory centers:
•firstly, on mobilizing researchers to reduce complex social
phenomena to quantifiable, comparable series of data—the
process of reduction itself usually involving value judgments
which are scarcely questioned;
•secondly, on the models themselves.
Critical attention
Critical attention is given to the theoretical assumptions
underpinning the ‘hypothetical leap’ between the statistical
result and the researcher’s ultimate explanation of it.
The impact in social sciences
•Social, historical and political determinants have been
reduced to a set of numbers at the beginning of the process.
•At its end-point they return, in disembedded form; or
embedded only in the commonsense—which is to say,
ideological—assumptions of the researcher.
The use in social sciences
•Econometrics must recognize its place, as a lower-order set
of tools which may generate correlations or discrepancies
whose elucidation requires more richly theorized—more
conceptually and empirically developed—forms of enquiry.
•Econometric analysis is often used without regard to its process:
anyone can take any piece of information and manipulate it
to their own ends.
•But used correctly and presented properly, it is a useful tool.
(+) or (-) tool of analysis
Alternatives:
•Other techniques not related to econometrics;
•Market analysis;
•Experiment;
•Social models applied to groups;
•None of the above.
Laws and Limits of Econometrics I
•“General weaknesses and limitations of the econometric
approach.
•A template from sociology is used to formulate six laws that
characterize mainstream activities of econometrics and their
scientific limits.
•Proximity theorems that quantify by explicit bounds how
close we can get to the generating mechanism of the data
and the optimal forecasts of next period observations using a
finite number of observations”.
PETER C. B. PHILLIPS, COWLES FOUNDATION PAPER NO. 1081
Laws and Limits of Econometrics II
•The magnitude of the bound depends on the characteristics
of the model and trajectory of the data.
•The future of econometrics may lie in using advanced econometric
methods interactively with a web browser.
•A fundamental issue that bears on all practical economic
analysis is the extent to which we can expect to understand
economic phenomena by the process of developing a theory,
taking observations and fitting a model.
•An especially relevant question in practice is whether there
are limits on how well we can predict future observations
using empirical models that are obtained by such processes.
Laws and Limits of Econometrics III
•The true model for any given dataset is unknown.
•Even if the formulated model were correct, it would still depend on
parameters that need to be estimated from data.
•Data are often scarce relative to the number of parameters that
need to be estimated.
•Some models have a functional representation that
necessitates the use of nonparametric or semi-parametric
methods.
•In such cases the empirical limitations on modeling are greater than in
finite parameter models.
Laws and Limits of Econometrics IV
•All models are wrong. The models developed in economic
theory are metaphors of reality, sometimes amounting to a
very basic set of relations that are easily rejected by the data.
•Models continue to be used, often because they contain a
kernel of truth that is perceived as an underlying ‘economic
law’.
•It is advantageous to use this information in crafting an
empirical model even though it is at best only approximately
true.
•It is better than using an entirely unrestricted system or an
arbitrarily restricted one.
Laws and Limits of Econometrics V
From the Sargan Lecture, The Economic Journal, 113, March
2003, “Laws of Econometrics”
•These laws of econometrics are not intended as universal
truths.
•They purport to express the essence of what is being done
in econometrics and to characterize some of the difficulties
that the econometric approach encounters in explaining and
predicting economic phenomena.
•Related views about modeling have been suggested in
Cartwright (1999) and Hoover (2001).
Laws and Limits of Econometrics VI
Cartwright:
•Models can be interpreted as machines that generate laws
(so-called nomological machines).
•The laws that may emerge from modeling are analogous to
the morals that we draw from storytelling fables.
Hoover:
•Economic modeling is useful to the extent that it sheds light
on empirical relationships.
•Formal laws seem to do nothing for economics – ‘even
accumulated falsifications or anomalies do not cause
scientists to abandon an approach unless there is the
prospect of a better approach on offer’.
Laws and Limits of Econometrics VII
Rissanen (1986, 1989):
•Argues against the concept of a true model and sees
statistics as a ‘language for expressing the regular features of
the data’.
•Some proximity theorems measure how close an
empirical model can get (in terms of its likelihood ratio) to
the true model in some parametric family.
Laws and Limits of Econometrics VIII
Ploberger and Phillips (2001, 2002):
•The bounds in these proximity theorems depend on the data as
well as on the model being used.
•The bounds are greater for trending data than when the data are
stationary.
•These theorems allow for finite parameter families and families
with local misspecifications.
•Modeling algorithms allow for gross misspecification within family
groups.
•Proximity theorems for prediction are also provided in this
approach, quantifying limits on empirical forecasting capability that
are relevant in empirical work where specification is suspect.
The Six Laws of Econometrics I
•The six laws of econometrics are not intended as universal
scientific truths.
•They are laws that characterize the activities and limitations
of econometrics.
First law
C29 Laws and Limits of Econometrics, the Royal Economic Society 2003
1. Some methods work, some don’t
•Econometrics has in large part been concerned with the
development of statistical machinery that is appropriate for
economic models and economic data.
•The process occurs because sometimes the usual statistical methods
work well and sometimes they do not.
•The process is well illustrated by the steady progression of modeling
practice and econometric methodology.
(Fisher, 1907; Koopmans, 1937; Tinbergen, 1939; Sargan, 1958 and
1959; Hansen, 1982; Johansen, 1988; Phillips and Hansen, 1990;
Jeganathan, 1997; Kim and Phillips, 1999; Robinson and Marinucci,
1998 and 2001; Shimotsu and Phillips, 2002 and more)
Second law
2. It’s different on infinite dimensional spaces
•Econometrics is about trying to achieve generality wherever that is
possible with regard to aspects of a model about which there is little
prior knowledge.
•On the other hand, where a model connects most closely with some
underlying economic hypothesis, we often seek to retain specificity
through direct parameterization.
•These considerations have led to a flowering of work on nonparametric
and semiparametric estimation.
(Hardle and Linton, 1994; Baillie, 1996; Linton, 1996; Horowitz, 1998;
Xiao and Phillips 1998 and 2002; Bandi and Phillips 2002 and more)
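As an illustration of the nonparametric side of this law, here is a minimal sketch of a Nadaraya-Watson kernel regression. It is not taken from the cited papers: the Gaussian kernel, the bandwidth of 0.1 and the sine-shaped data are assumptions chosen only to show that a local weighted average can recover a nonlinear relation without a parametric form.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted local average of ys, evaluated at the point x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations near x0; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical data: a smooth nonlinear relation on an evenly spaced grid.
xs = [i / 100.0 for i in range(301)]  # 0.00, 0.01, ..., 3.00
ys = [math.sin(x) for x in xs]

estimate = nadaraya_watson(1.5, xs, ys, bandwidth=0.1)
print(round(estimate, 3), round(math.sin(1.5), 3))
```

The bandwidth plays the role that the parameter count plays in a finite-dimensional model: a smaller value reduces bias but leaves fewer effective observations behind each estimate.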
Third law
3. Unit roots always cause trouble
•Unit roots are the new hill people of econometrics.
•Unit roots inevitably cause trouble because of the nonstandard
limit distributions and the discontinuities that arise in the
limit theory as the autoregressive parameter passes through unity.
(Chan and Wei, 1987; Park and Phillips, 1988 and 1989; Sims
and Uhlig, 1991; Kim, 1994; Phillips and Ploberger, 1996; Phillips
and Xiao, 1998; Phillips et al., 2001; Phillips and Sul, 2002 and
more)
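The nonstandard behaviour can be seen in a small Monte Carlo sketch. The setup is an assumption made for illustration (a driftless random walk with standard normal shocks, 100 observations, 500 replications, OLS without an intercept). It shows that the normalized estimation error n(rho_hat - 1) is skewed to the left rather than centred on zero.

```python
import random


def ols_ar1(y):
    """OLS estimate of rho in y_t = rho * y_{t-1} + e_t (no intercept)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


def simulate_ar1(n, rho, rng):
    """Generate n observations of an AR(1) process started at zero."""
    y = [0.0]
    for _ in range(n):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y[1:]


def normalized_errors(n, rho, reps, seed=0):
    """Return n * (rho_hat - rho) across Monte Carlo replications."""
    rng = random.Random(seed)
    return [n * (ols_ar1(simulate_ar1(n, rho, rng)) - rho) for _ in range(reps)]


errors = normalized_errors(n=100, rho=1.0, reps=500)
share_below = sum(e < 0 for e in errors) / len(errors)
mean_error = sum(errors) / len(errors)
print("share of estimates below one:", share_below)
print("mean of n * (rho_hat - 1):", mean_error)
```

Under a stationary parameter such as rho = 0.5 the same statistic would be scaled by the square root of n and be roughly symmetric, which is the discontinuity the slide refers to.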
Fourth law
4. Cross section dependence also causes trouble
•It is convenient, and has long been common econometric practice, to assume
cross section independence in panel modeling up to a time specific effect.
•Cross section dependence is often to be expected in microeconomic
applications of firm and individual behaviour.
•It is almost always present in regional or cross country macroeconomic panels.
•It is widely acknowledged as a major characteristic of financial panels.
•Cross section dependence is a rapidly growing field of study in panel data
analysis.
•There are many limitations to the models being used and unresolved difficulties
for empirical workers.
•A primary difficulty arises because there is no natural ordering of cross section
data, making it hard to characterise and model dependence across sections.
(Maddala, 1993; Stock and Watson, 1998 and 1999; Bai and Ng, 2001; Moon
and Perron, 2001; Phillips and Sul, 2002 and more)
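A rough way to see the problem is to compare the average pairwise correlation across units in two simulated panels, one with a common shock and one without. This is a sketch under stated assumptions (10 units, 200 periods, a single common factor with equal loadings); it is a descriptive diagnostic, not one of the tests developed in the cited papers.

```python
import random


def correlation(a, b):
    """Sample correlation between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def average_pairwise_correlation(panel):
    """Mean correlation over all distinct pairs of cross-section units."""
    n_units = len(panel)
    pairs = [(i, j) for i in range(n_units) for j in range(i + 1, n_units)]
    return sum(correlation(panel[i], panel[j]) for i, j in pairs) / len(pairs)


def simulate_panel(n_units, n_periods, loading, seed):
    """Each unit is loading * common shock + its own idiosyncratic noise."""
    rng = random.Random(seed)
    common = [rng.gauss(0.0, 1.0) for _ in range(n_periods)]
    return [
        [loading * f + rng.gauss(0.0, 1.0) for f in common]
        for _ in range(n_units)
    ]


independent = average_pairwise_correlation(simulate_panel(10, 200, 0.0, seed=2))
dependent = average_pairwise_correlation(simulate_panel(10, 200, 1.0, seed=2))
print("no common shock:", round(independent, 3))
print("common shock:", round(dependent, 3))
```

Note that the calculation treats all pairs alike, precisely because cross section data have no natural ordering along which dependence could decay.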
Fifth law
5. No one understands trends
•In spite of the importance of trends in macroeconomic research,
particularly in the study of economic growth and growth convergence,
economic theory provides little guidance for empirical research on the
formulation of trend functions.
•This partly explains the rather impoverished class of trend formulations
that are in use in econometrics.
•Most commonly, these are polynomial time trends, simple trend break
polynomials, and stochastic trends, which include unit root models, near
unit root models and fractional processes.
•When the focus is on trend elimination (for instance, in the extraction of
the cyclical component of a series for studying business cycles),
smoothing methods are popular.
(Whittaker, 1923; Schoenberg, 1964; Wahba, 1978; Hodrick and Prescott,
1980; Baxter and King, 1999; Corbae et al., 2002 and more)
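As one example of the smoothing methods mentioned above, the sketch below computes a Hodrick-Prescott trend by solving the first-order conditions of the penalized least squares problem directly. The dense Gaussian elimination and the sample series are assumptions made to keep the example self-contained; the smoothing parameter of 1600 is the conventional choice for quarterly data. Production code would use a banded solver.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Both arguments are modified in place, so pass copies if needed.
    """
    n = len(b)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            if factor == 0.0:
                continue
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (b[r] - tail) / a[r][r]
    return x


def hp_trend(y, lam=1600.0):
    """Trend minimizing sum (y - trend)^2 + lam * sum (second differences)^2."""
    n = len(y)
    if n < 3:
        return list(y)
    # Build A = I + lam * K'K, where K is the (n-2) x n second-difference matrix.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        coeffs = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, ci in coeffs.items():
            for j, cj in coeffs.items():
                a[i][j] += lam * ci * cj
    return solve(a, list(y))


# Hypothetical series: a linear trend plus a regular four-period cycle.
series = [float(t) + (1.0 if t % 4 < 2 else -1.0) for t in range(16)]
trend = hp_trend(series)
cycle = [obs - tr for obs, tr in zip(series, trend)]
print([round(c, 2) for c in cycle])
```

A purely linear series passes through the filter unchanged, because its second differences are zero and the penalty term vanishes.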
Sixth law
6. The term ‘spurious regression’ has become universal and carries a
pejorative connotation that generally makes empirical researchers
anxious to show that their relationships are validated by some
procedure such as a test for cointegration.
•The deterministic trend functions can be used as a coordinate
system for measuring the trend behaviour of an observed variable.
•A set of functions can be used as a coordinate basis for studying
another function.
•Continuous stochastic processes such as Brownian motion and
diffusions also have representations in terms of functions with
coefficients that are random variables rather than constant Fourier
coefficients.
•In a similar way, we can write trending data in terms of
coordinates comprised of other trends, like time polynomials,
random walks or other observed trends.
(Hendry, 1980; Phillips, 1998; Clements and Hendry, 1999 and
2001; Phillips, 2002 and more)
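The phenomenon behind this law can be reproduced with a short simulation. The design is an assumption chosen for illustration (pairs of independent series of length 100, 300 replications, a nominal 5% two-sided t-test on the slope). It compares how often the test rejects for independent random walks and for independent white noise.

```python
import random


def t_stat_slope(x, y):
    """t-statistic of the slope in an OLS regression of y on x with intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_error = (rss / (n - 2) / sxx) ** 0.5
    return slope / std_error


def random_walk(n, rng):
    """Cumulated standard normal shocks: a stochastic trend."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(n, rng):
    """Independent standard normal draws: no trend at all."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def rejection_rate(generator, n=100, reps=300, seed=1):
    """Share of replications in which two independent series look related."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x, y = generator(n, rng), generator(n, rng)
        if abs(t_stat_slope(x, y)) > 1.96:
            hits += 1
    return hits / reps


walk_rate = rejection_rate(random_walk)
noise_rate = rejection_rate(white_noise)
print("independent random walks:", walk_rate)
print("independent white noise:", noise_rate)
```

The series in each pair are unrelated by construction, so any gap between the two rejection rates comes from the trending behaviour alone.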
Concluding remarks
•Be aware of problems and limitations in Econometrics.
•Use the best methodology available for your case.
•Compare and refer to other similar studies.
•Be careful in the interpretation and explanation of results.
•Every science has its own gray areas.
•Simple is beautiful.
•Data collection is a very difficult task.
Metamorphosis of Europe
9-10 December 2015
Vienna, Austria
Session: Data collection
Professor El Thalassinos
University of Piraeus, Greece
Editor ERSJ: www.ersj.eu
“I define statistical data editing (SDE) as those methods that are used to edit
(i.e., clean-up) and impute (fill-in) missing or contradictory data. The end result of
SDE is data that can be used for intended analytic purposes. These include
primary purposes such as estimation of totals and subtotals for publications
that are free of self-contradictory information”.
STATE OF STATISTICAL DATA EDITING AND CURRENT RESEARCH
PROBLEMS, William E. Winkler
Initial problems for “big data”
A set of doubts regarding “big data” collection:
•“Big data” is bringing statistics out into the mainstream (even if they
don’t call it statistics) and it is creating lots of opportunities for people
with statistics training.
•Problems with respect to hardware infrastructure and algorithm
design.
•“Small big data” is the dataset that is collected by an individual
whose data collection skills are far superior to his/her data analysis
skills.
•The person who collected the data is often not qualified/prepared to
analyze it. If the data collector didn’t arrange beforehand to have
someone analyze the data, then they’re often stuck.
•The grant that paid for the data collection didn’t budget (enough) for
the analysis of the data.
Statistical Data Editing (SDE)
SDE can be used in all phases of survey processing:
•frame development;
•form design;
•proposed analytic purposes for which the data are collected;
•quality assurance.
The main goal of SDE might be improved procedures and
greater automation to enhance the ability of survey
managers and analysts to provide accurate published
estimates and micro-data.
SDE subcategories
1. Fellegi-Holt (FH) methods and systems:
•FH systems are based on the Fellegi-Holt model (1976) of
editing and typically add various options for imputation.
•Data files are edited using custom software that incorporates
if-then-else rules developed by subject-matter specialists.
•If the specialists are unable to develop the full logic needed for
the edit rules, then the subsequent edit software can be in
error.
2. General methods and systems:
General methods are all other methods.
FH model and systems I
Fellegi and Holt provided the theoretical basis for such a system,
which has three goals:
1. The data in each record should be made to satisfy all edits
by changing the fewest possible variables (fields).
2. Imputation rules should derive automatically from edit
rules.
3. When imputation is necessary, it should maintain the joint
distribution of variables.
The key to the FH approach is to understand the underpinnings of goal 1.
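Goal 1 can be illustrated with a brute-force search over sets of fields, smallest first. This is only a sketch: actual Fellegi-Holt systems work with implied edits and set-covering routines rather than exhaustive enumeration, and the field names, domains and edits below are hypothetical.

```python
from itertools import combinations, product

# Hypothetical field domains, invented for illustration only.
DOMAINS = {
    "age": ["child", "adult"],
    "marital": ["single", "married"],
    "employment": ["not_employed", "employed"],
}

# Each edit lists a combination of values that is NOT allowed.
EDITS = [
    {"age": "child", "marital": "married"},
    {"age": "child", "employment": "employed"},
]


def fails(record, edit):
    """A record fails an edit when it matches every value the edit forbids."""
    return all(record[field] == value for field, value in edit.items())


def passes_all(record, edits):
    """True when the record fails none of the edits."""
    return not any(fails(record, edit) for edit in edits)


def minimal_change(record, domains, edits):
    """Return (fields_changed, repaired_record) using the fewest fields."""
    fields = list(domains)
    for size in range(len(fields) + 1):
        for subset in combinations(fields, size):
            for values in product(*(domains[f] for f in subset)):
                candidate = dict(record)
                candidate.update(zip(subset, values))
                if passes_all(candidate, edits):
                    return subset, candidate
    return None, None


record = {"age": "child", "marital": "married", "employment": "employed"}
changed, repaired = minimal_change(record, DOMAINS, EDITS)
print(changed, repaired)
```

In this example the record fails both edits, yet changing the single field they share repairs it; changing the marital and employment fields would also work but touches two fields instead of one.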
FH model and systems II
The FH ideas give formal ways of development that greatly
facilitate creating sets of edits. The key features of a Fellegi-Holt
system are:
1. Edit restraints reside in easily modified tables.
2. The logical consistency of the entire edit system can be checked
prior to the receipt of data.
3. The main logic resides in reusable mathematical routines.
4. In one pass through the system, records satisfy edits.
Implementations of FH systems have typically either been for
discrete data (e.g., categorical) to which arbitrary edits are
applied or for continuous data to which ratio or linear
inequality edits are applied.
Misspecifications in data gathering
Three research areas have arisen in recent years that depend
heavily on record linkage ideas.
The first is microdata confidentiality and associated re-identification methods.
The second is analytic linking as introduced by Scheuren and
Winkler (1993, 1997). Analytic linking refers to the merging
and proper analysis of data (quantitative and discrete) taken
from two or more files. The analysis is intended to adjust for
the biases due to linkage error.
The third presents some of the methods of information retrieval
and machine learning as used by computer scientists in web
search engines and data mining applications.
Two sets of data
Basic research problems in comparison and computing
weights:
•The basic research problems have been open since the work
of Newcombe et al. (1959) and Fellegi and Sunter (1969).
•Partial progress in solving the problems has occurred.
•The major difficulties in all situations have been determining
how identifying information can be used and what the relative
value of a field is in matching in comparison with other fields.
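The relative value of a field is usually expressed as a log-likelihood-ratio weight in the Fellegi and Sunter framework. The sketch below computes agreement and disagreement weights under conditional independence. The m- and u-probabilities are hypothetical numbers chosen for illustration, not estimates from any file.

```python
import math

# Hypothetical probabilities, for illustration only:
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "sex": (0.98, 0.50),
}


def agreement_weight(m, u):
    """Weight added to the score when the field agrees."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Weight added (usually negative) when the field disagrees."""
    return math.log2((1.0 - m) / (1.0 - u))


def match_score(comparison, field_probs):
    """Sum the field weights; comparison maps field name to True/False."""
    score = 0.0
    for field, agrees in comparison.items():
        m, u = field_probs[field]
        if agrees:
            score += agreement_weight(m, u)
        else:
            score += disagreement_weight(m, u)
    return score


all_agree = match_score(
    {"surname": True, "birth_year": True, "sex": True}, FIELD_PROBS
)
surname_differs = match_score(
    {"surname": False, "birth_year": True, "sex": True}, FIELD_PROBS
)
print(round(all_agree, 2), round(surname_differs, 2))
```

With these numbers, agreement on surname carries far more weight than agreement on sex, because two unrelated records agree on sex half the time.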
Basic research problems I
1. When can frequency-based matching improve over
simple agree/disagree matching?
(Newcombe et al., 1959; Fellegi and Sunter, 1969; Winkler,
1988 and 1989)
2. What is the best method for estimating parameters under
conditional independence?
(Winkler, 1988 and 1990; Nigam et al., 1999)
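To make the first question concrete, the sketch below contrasts a single agree/disagree weight with value-specific weights. It rests on a common simplification, stated here as an assumption: both files share the same value distribution, so the weight for agreeing on a value with relative frequency p is log2(m / p). The surnames, their counts and m = 0.95 are hypothetical.

```python
import math
from collections import Counter


def value_frequencies(values):
    """Relative frequency of each distinct value in a field."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def frequency_weight(value, freqs, m=0.95):
    """Agreement weight that depends on how rare the agreeing value is."""
    return math.log2(m / freqs[value])


def simple_weight(freqs, m=0.95):
    """Value-independent weight: u is the chance two unrelated records agree."""
    u = sum(p * p for p in freqs.values())
    return math.log2(m / u)


# Hypothetical surname field with one very common and one very rare value.
surnames = ["Smith"] * 50 + ["Jones"] * 30 + ["Brown"] * 19 + ["Zabrinsky"] * 1
freqs = value_frequencies(surnames)

print("simple:", round(simple_weight(freqs), 2))
print("Smith:", round(frequency_weight("Smith", freqs), 2))
print("Zabrinsky:", round(frequency_weight("Zabrinsky", freqs), 2))
```

Agreement on the rare surname is stronger evidence of a match than the simple weight suggests, and agreement on the common one is weaker; whether that gain survives typographical error in real files is part of the open question.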
Basic research problems II
3. When does accounting for dependencies help in matching?
(Smith and Newcombe, 1975; Winkler, 1993 and 1994;
Thibaudeau, 1993; Armstrong and Mayda, 1993; Larsen and
Rubin, 1999; Gill, 1999)
4. What are (suitable) ways of estimating error rates?
(Winkler and Thibaudeau, 1991; Rubin and Stern, 1993;
Scheuren and Winkler, 1993; Winkler, 1994; Belin and
Rubin, 1995; Larsen and Rubin, 1999)
Advanced research problems I
Methods and underlying models that are closely related to the basic
ideas of record linkage.
1. Confidentiality of microdata is most closely related because record
linkage methods can be used for evaluating the re-identification risk
in public-use files. Since the quantitative data in a public-use file are
typically masked, new metrics for comparing quantitative data can
yield higher re-identification rates.
2. Analytic Linking is the methodology (Scheuren and Winkler 1997) for
using not directly comparable data items to improve matching and
to account for the effect of matching error in analyses.
3. Data mining and some models for information retrieval in computer
science use Bayesian networks for classifying documents using free-form textual information.
Advanced research problems II
1. Confidentiality
There is substantially increased need to supply researchers with
large, general-purpose public-use files that can be used for a
variety of analyses. Balancing the analytic needs are the
requirements that agencies not release individually identifiable
data. If a public-use file is created, then agencies must determine if
the file meets analytic needs and is confidential.
(Kim, 1986 and 1989; Fuller, 1993; Tendick and Matloff, 1994;
Kim and Winkler, 1995; De Waal and Willenborg, 1996 and 1998;
Makov and Sanil, 1997; Winkler, 1998; Sweeney, 1999; Mateo-Sanz and Domingo-Ferrer, 1998; Fienberg, Fienberg, Makov, and
Steele, 1998)
Advanced research problems III
2. Analytic Linking
Researchers often need to analyze
large amounts of data that result from the merger of two or
more administrative files in which unique identifiers are
unavailable.
Scheuren and Winkler (1993) showed how regression
analyses might be adjusted for biases due to linkage
errors. In the simplest situation of two variables, the
dependent variable might be taken from one file and the
independent variable from another file.
(Besag et al., 1974, 1986 and 1995; Geman and Geman,
1984; Belin and Rubin, 1995; Scheuren and Winkler, 1997;
Van Dyk, 1999; Winkler, 1999)
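The two-variable situation can be simulated to show the direction of the bias. This sketch is not the Scheuren and Winkler procedure. It assumes that false links pair a record with a randomly chosen unrelated one and that the mismatch rate (20% here) is known, in which case the naive slope shrinks by roughly that share and a simple rescaling undoes it. All numbers are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of an OLS regression of y on x with intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_linked_slope(n=2000, true_slope=2.0, mismatch_rate=0.2, seed=0):
    """Return (naive_slope, rescaled_slope) after random false links."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # file B variable
    y = [true_slope * xi + rng.gauss(0.0, 1.0) for xi in x]  # file A variable
    # Pick the records whose link is wrong and shuffle x among them.
    wrong = rng.sample(range(n), int(mismatch_rate * n))
    shuffled = wrong[:]
    rng.shuffle(shuffled)
    x_linked = x[:]
    for target, source in zip(wrong, shuffled):
        x_linked[target] = x[source]
    naive = ols_slope(x_linked, y)
    # Crude correction, valid only under the assumptions stated above.
    rescaled = naive / (1.0 - mismatch_rate)
    return naive, rescaled


naive_slope, rescaled_slope = simulate_linked_slope()
print("naive:", round(naive_slope, 2), "rescaled:", round(rescaled_slope, 2))
```

In practice the mismatch rate is itself unknown and false links are not random, which is why estimating error rates appears among the basic research problems.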
Advanced research problems IV
3. Data Mining
Machine learning algorithms that employ
Bayesian networks are tools being applied to classify text into
different groups.
Bayesian networks are one of the standard tools in data
mining. They are also used for information retrieval methods
such as those used in some web search engines.
(Fellegi and Sunter, 1969; Winkler, 1988, 1989 and 1993;
Nigam et al., 1999; Larsen and Rubin, 1999).
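The simplest Bayesian network used for text is the naive Bayes classifier, in which the class label is the only parent of every word. The sketch below is a minimal version with add-one smoothing. The training documents and labels are hypothetical, and it does not reproduce the semi-supervised extensions in the cited work.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Count classes and words from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training carry no information
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set, invented for illustration.
training = [
    ("unit root test for trending series", "econometrics"),
    ("cointegration and spurious regression", "econometrics"),
    ("record linkage matching weights", "data_editing"),
    ("imputation of missing survey data", "data_editing"),
]
model = train(training)
print(classify("spurious regression with unit root", model))
print(classify("survey data imputation", model))
```

The same conditional independence assumption underlies the agree/disagree weights of record linkage, which is why the two literatures share references.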
Conclusion
•One conclusion we can draw is that we need to get more
statisticians out into the field, both helping to analyze the
data and designing good studies so that useful data are collected in
the first place (as opposed to merely “big” data).
•There aren’t enough of us on the planet to fill the demand.
•We need to come up with more creative ways to get the skills out there
without requiring our physical presence.