Spatial and Spatio-Temporal
Data Models for GIS
Bart Kuijpers
Limburgs Universitair Centrum
http://alpha.luc.ac.be/~lucp1265/
bart.kuijpers@luc.ac.be
Overview
• Spatial data models in GIS
– layers
– raster model, vector model
– specific models
• Spatial database systems
• Spatio-temporal data models
– challenges and problems
What is the S in GIS?
• 1980s: Geographic Information Systems
– technology for the acquisition and management of spatial information
– software for professional users, e.g. cartographers
– Example: ESRI Arc/View software
• 1990s: Geographic Information Science
– comprehending the underlying conceptual issues of representing data
and processes in space-time
– the science (or theory and concepts) behind the technology
– Example: design spatial data types and operations for querying
• 1990s: Geographic Information Studies
– understanding the social, legal and ethical issues associated with the
application of GISy and GISc
• 2000s: Geographic Information Services
– Web-sites and service centers for casual users, e.g. travelers
– Service (e.g., GPS, mapquest) for route planning
GIS -- What is it?
No easy answer anymore!
• Geographic Information
– information about places on the earth’s surface
– knowledge about “what is where when”
(Don’t forget time!)
• Geographic Information Technologies
– technologies for dealing with this information
• Global Positioning Systems (GPS)
• Remote Sensing (RS)
• Geographic Information Systems (GIS)
Examples of GIS data
• Urban Planning, Management
– Land acquisition
– Economic development
– Housing renovation programs
– Emergency response
– Crime analysis
• Environmental Sciences
– Monitoring environmental risk
– Modeling stormwater runoff
– Management of watersheds, floodplains, wetlands, forests
– Environmental Impact Analysis
– Hazardous or toxic facility siting
• Political Science
– Analysis of election results
– Predictive modeling
• Civil Engineering/Utility
– Locating underground facilities
– Coordination of infrastructure maintenance
• Business
– Demographic Analysis
– Market Penetration/Share Analysis
• Education Administration
– Enrollment Projections
– School Bus Routing
• Real Estate
– Neighborhood land prices
– Traffic Impact Analysis
• Health Care
– Epidemiology
– Service Inventory
Older definitions of GISy
• The common ground between information processing and
the many fields using spatial analysis techniques.
(Tomlinson, 1972)
• A powerful set of tools for collecting, storing, retrieving,
transforming, and displaying spatial data from the real
world. (Burrough, 1986)
• A computerised database management system for the
capture, storage, retrieval, analysis and display of spatial
(locationally defined) data. (NCGIA, 1987)
• A decision support system involving the integration of
spatially referenced data in a problem solving environment.
(Cowen, 1988)
Definitions of GIS
“A system of hardware, software, and
procedures designed to support the
• capture,
• management,
• manipulation,
• analysis,
• modeling and
• display of
spatially-referenced data (located on the earth’s
surface) for solving complex planning and
management problems.”
GIS Data Models
Intuitive:
a map with a database behind it.
Purpose:
allows the geographic features in real world locations
to be digitally represented and stored in a database so
that they can be abstractly presented in map form,
and can also be worked with and manipulated to address
some problem.
GIS Data Model
based on
data layers
or themes
Examples of layers or themes
• Data is organized by layers, coverages or themes, with each
theme representing a common feature.
• Layers are integrated using explicit location on the earth’s
surface, thus geographic location is the organizing principle.
Digital Elevation
Models
Watersheds
Streams
Waterbodies
An integrated view
• Layers are integrated using explicit location on the earth’s
surface, thus geographic location is the organizing principle.
Example of layers or themes
Here we have three layers or themes:
- roads,
- hydrology (water),
- topography (land elevation)
They can be related because precise geographic coordinates are
recorded for each theme.
How are layers described?
• Layers consist of two data types:
– spatial data, which describes location (where);
stored in a shapefile in ArcView
– attribute data, which specifies what, how much, when;
stored in a database table
GIS systems traditionally maintain
spatial and attribute data separately,
then “join” them for display or
analysis
(for example, in ArcView)
How are layers described?
The spatial component
of a layer may be
represented in two ways:
• in raster (image) format
as pixels
•in vector format as
points and lines and
areas (PLA-model)
1. Raster Model
• area is covered by grid with (usually) equal-
sized cells
• cells often called pixels (picture elements);
raster data often called image data
• attributes are recorded by assigning each cell a
single value based on the majority feature
(attribute) in the cell, such as land use type
Land use types in the example: wheat, fruit, oats, clover, corn
Cell values (row index down the side, column index across the top):
    0 1 2 3 4 5 6 7 8 9
0   1 1 1 1 1 4 4 5 5 5
1   1 1 1 1 1 4 4 5 5 5
2   1 1 1 1 1 4 4 5 5 5
3   1 1 1 1 1 4 4 5 5 5
4   1 1 1 1 1 4 4 5 5 5
5   2 2 2 2 2 2 2 3 3 3
6   2 2 2 2 2 2 2 3 3 3
7   2 2 2 2 2 2 2 3 3 3
8   2 2 4 4 2 2 2 3 3 3
9   2 2 4 4 2 2 2 3 3 3
Raster Model: data structures
Run-length compression (for a single layer)
Full matrix: 162 bytes (9 rows of 18 cells)
111111122222222223
111111122222222233
111111122222222333
111111222222223333
111113333333333333
111113333333333333
111113333333333333
111333333333333333
111333333333333333
Run length (per row): 44 bytes
Each pair gives a cell value and the last column of its run.
1,7,2,17,3,18
1,7,2,16,3,18
1,7,2,15,3,18
1,6,2,14,3,18
1,5,3,18
1,5,3,18
1,5,3,18
1,3,3,18
1,3,3,18
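The row-wise run-length encoding on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the slide author's code: the function name `rle_row` is made up for illustration, columns are assumed to be numbered from 1, and the byte counts assume one byte per stored number, as the slide appears to.

```python
def rle_row(row):
    """Encode one raster row as [value, last_column, value, last_column, ...].

    Columns are numbered from 1, matching the listing on the slide.
    """
    encoded = []
    for col, value in enumerate(row, start=1):
        if encoded and encoded[-2] == value:
            encoded[-1] = col          # extend the current run
        else:
            encoded += [value, col]    # start a new run
    return encoded


# The full matrix from the slide, one string per row.
matrix = [
    "111111122222222223",
    "111111122222222233",
    "111111122222222333",
    "111111222222223333",
    "111113333333333333",
    "111113333333333333",
    "111113333333333333",
    "111333333333333333",
    "111333333333333333",
]

encoded = [rle_row([int(c) for c in line]) for line in matrix]
full_size = sum(len(line) for line in matrix)       # numbers in the full matrix
compressed_size = sum(len(r) for r in encoded)      # numbers after encoding

for r in encoded:
    print(",".join(str(n) for n in r))
print(full_size, compressed_size)
```

The saving depends entirely on how long the runs are; a row in which every cell differs from its neighbour would double in size under this scheme.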
Raster Model: data structures
Quad-Tree Representation (for single layer)
[Figure sequence: the raster is subdivided recursively into quadrants.]
First level, four quadrants numbered:
1 3
0 2
Second level, sixteen cells numbered:
 5  7 13 15
 4  6 12 14
 1  3  9 11
 0  2  8 10
Uniform quadrants are kept as a single leaf, written as [first cell, side length] with its colour:
[0,2] white
[4,1] blue   [5,1] white
[6,1] green  [7,1] red  [8,1] red  [9,1] white  [10,1] white
[11,1] green
[12,2] blue
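The quad-tree construction can be sketched in Python as follows. This is an illustration under stated assumptions rather than the presenter's implementation: cells are assumed to be stored in the quadrant numbering shown in the figures (0 lower left, 1 upper left, 2 lower right, 3 upper right, applied recursively), a leaf is read as [first cell, side length], and the 4×4 colour raster is a hypothetical input chosen to reproduce the leaf list on the slide.

```python
import math


def quadtree_leaves(cells, start=0, side=None):
    """Return (first_cell, side, colour) leaves for a raster stored in quadrant order."""
    if side is None:
        side = math.isqrt(len(cells))
    block = cells[start:start + side * side]
    if all(c == block[0] for c in block):
        return [(start, side, block[0])]      # uniform block: one leaf
    half = side // 2
    leaves = []
    for quadrant in range(4):                 # recurse into the four sub-quadrants
        leaves += quadtree_leaves(cells, start + quadrant * half * half, half)
    return leaves


# Hypothetical 4x4 raster, colours listed by cell number 0..15.
cells = (["white"] * 4
         + ["blue", "white", "green", "red"]
         + ["red", "white", "white", "green"]
         + ["blue"] * 4)

leaves = quadtree_leaves(cells)
for first, side, colour in leaves:
    print(f"[{first},{side}] {colour}")
```

Because a block of side s occupies s·s consecutive cell numbers in this ordering, testing a quadrant for uniformity is a simple slice of the list.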
Raster Model
Raster data are good at representing
continuous phenomena, e.g.,
• Wind speed
• Elevation, slope, aspect
• Chemical concentration
• Likelihood of existence of a certain species
• Electromagnetic reflectance (photographic or satellite imagery)
Raster Model
• much data comes in this form
•images from remote sensing (LANDSAT, SPOT)
•scanned maps
• digital orthophoto
Raster Model
• best for continuous features:
•elevation
•temperature
•soil type
•land use
• digital elevation
model (DEM)
Raster Model: Pros and Cons
[+] Continuous (surface) data represented
easily
[+] Simple data structure, fast indexing
[–] Shape of discrete polygonal features
generalized by cells
[–] Intersection of two lines
Raster Model: tessellations
•Square grid: equal length sides
–4-connected neighborhood (rook’s case)
•all neighboring cells are equidistant
–8-connected neighborhood (queen’s case)
•all neighboring cells not equidistant
•rectangular
•triangular (3-sided) and hexagonal (6-sided)
–all adjacent cells and points are equidistant
•triangulated irregular network (TIN):
–vector model used to represent continuous surfaces (elevation)
–more later under vector
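The equidistance remarks for the square grid can be checked numerically. The sketch below is only an illustration; the names `ROOK`, `QUEEN` and `neighbour_distances` are invented here, and distances are measured between cell centres on a grid with unit cell size.

```python
import math

# Offsets (dx, dy) to neighbouring cells on a square grid.
ROOK = [(0, 1), (1, 0), (0, -1), (-1, 0)]                 # 4-connected
QUEEN = ROOK + [(1, 1), (1, -1), (-1, 1), (-1, -1)]       # 8-connected


def neighbour_distances(offsets, cell_size=1.0):
    """Distinct centre-to-centre distances from a cell to its neighbours."""
    return sorted({round(math.hypot(dx, dy) * cell_size, 6) for dx, dy in offsets})


print(neighbour_distances(ROOK))    # a single distance
print(neighbour_distances(QUEEN))   # two distances: edge and diagonal neighbours
```

In the queen's case the diagonal neighbours are a factor √2 further away, which matters for any analysis that accumulates cost or distance cell by cell.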
2. Vector Model
The fundamental concept of vector GIS is that all
geographic features in the real world can be
represented either as:
• points or dots (nodes): trees, poles, fire plugs,
airports, cities
• lines (arcs): streams, streets, sewers
• areas (polygons): land parcels, cities, counties,
forest, rock type
Example: Because representation depends on shape,
ArcView refers to files containing vector data as shapefiles
Vector Model
Points: represent discrete point features
each point location
has a record in the
table
airports are point features
each point is stored as a
coordinate pair
Vector Model
Lines: represent linear features
each road segment
has a record in the
table
roads are linear features
Vector Model
Lines: fundamental spatial data model
• Lines start and end at nodes
• line #1 goes from node #2 to node #1
• Vertices determine shape of line
• Nodes and vertices are stored as coordinate pairs
[Figure: a line running between two nodes, with vertices along it]
Vector Model
Polygons: represent bounded areas
each bounded
Polygon has a
record in the
table
landforms and water are
polygonal features
Vector Model
Polygons : fundamental spatial data model
• Polygon #2 is bounded by lines 1 & 2
• Line 2 has polygon 1 on left and polygon 2 on right
Vector Model
Polygons: fundamental spatial data model
• complex data model, especially for larger data sets
• “arc-node topology,” only used for ArcInfo data sets
Vector Model
Shapefile polygon spatial data model
• less complex data model
• polygons do not share bounding lines
Vector Model
ArcInfo coverage spatial data model
• Commonly found format
(due to ArcInfo market dominance)
• Coordinate data not editable in ArcView
(but tabular data are editable in ArcView)
• polygons share bounding lines
Vector Model: illustration of polygon
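Data File (polygon ID, then the x y coordinates of each vertex; the first vertex is repeated to close the polygon)
A  3 4,  4 4,  4 2,  3 2,  3 4
B  4 4,  5 4,  5 2,  4 2,  4 4
C  3 2,  4 2,  4 0,  3 0,  3 2
D  4 2,  5 2,  5 0,  4 0,  4 2
E  1 5,  5 5,  5 4,  3 4,  3 0,  1 0,  1 5
[Figure: polygons A, B, C, D and E drawn on a grid with both axes running from 0 to 5]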
Vector Model: illustration of point & polygon
Points File (point ID, x, y)
1   3 4
2   4 4
3   4 2
4   3 2
5   5 4
6   5 2
7   5 0
8   4 0
9   3 0
10  1 0
11  1 5
12  5 5
[Figure: polygons A, B, C, D and E with their corner points numbered 1 to 12, on a grid with both axes running from 0 to 5]
Polygons File
A 1, 2, 3, 4, 1
B 2, 5, 6, 3, 2
C 4, 3, 8, 9, 4
D 3, 6, 7, 8, 3
E 11, 12, 7, 10, 11
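The point of splitting the data into a points file and a polygons file is that shared corners are stored once and referenced by ID. A small Python sketch of that structure follows; it assumes each two-digit entry of the points file is an (x, y) pair (so 34 means x=3, y=4), and the helper `ring_area` is added here purely for illustration.

```python
# Points file: point ID -> (x, y), read from the slide under the assumption
# that "34" means x=3, y=4.
points = {
    1: (3, 4), 2: (4, 4), 3: (4, 2), 4: (3, 2),
    5: (5, 4), 6: (5, 2), 7: (5, 0), 8: (4, 0),
    9: (3, 0), 10: (1, 0), 11: (1, 5), 12: (5, 5),
}

# Polygons file: polygon ID -> closed ring of point IDs.
polygons = {
    "A": [1, 2, 3, 4, 1],
    "B": [2, 5, 6, 3, 2],
    "C": [4, 3, 8, 9, 4],
    "D": [3, 6, 7, 8, 3],
    "E": [11, 12, 7, 10, 11],
}


def ring_area(ring):
    """Area of a closed ring of point IDs (shoelace formula)."""
    coords = [points[i] for i in ring]
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(coords, coords[1:]))
    return abs(twice) / 2


areas = {name: ring_area(ring) for name, ring in polygons.items()}
# Polygons that reference point 3, found without comparing any coordinates.
sharing_point_3 = sorted(name for name, ring in polygons.items() if 3 in ring)
print(areas, sharing_point_3)
```

Moving point 3 now updates every polygon that uses it, which is not the case in the single data file of the previous slide, where the coordinate is repeated per polygon.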
Node/Arc/Polygon and Attribute Data
Relational Representation: DBMS required!
[Figure: parcel A34 (Smith Estate) bounded by arcs I, II, III and IV between nodes 1, 2, 3 and 4, with parcel A35 across arc III; street names Birch and Cherry]

Spatial Data

Node Table
Node ID  Easting  Northing
1        126.5    578.1
2        218.6    581.9
3        224.2    470.4
4        129.1    471.9

Arc Table
Arc ID  From N  To N  L Poly  R Poly
I       4       1             A34
II      1       2             A34
III     2       3     A35     A34
IV      3       4             A34

Polygon Table
Polygon ID  Arc List
A34         I, II, III, IV
A35         III, VI, VII, XI

Attribute Data

Node Feature Attribute Table
Node ID  Control  Crosswalk  ADA?
1        light    yes        yes
2        stop     no         no
3        yield    no         no
4        none     yes        no

Arc Feature Attribute Table
Arc ID  Length  Condition  Lanes  Name
I       106     good       4
II      92      poor       4      Birch
III     111     fair       2
IV      95      fair       2      Cherry

Polygon Feature Attribute Table
Polygon ID  Owner     Address
A34         J. Smith  500 Birch
A35         R. White  200 Main
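Keeping spatial and attribute tables apart means that most questions are answered by joining them on the shared ID. The following sketch loads the arc table and the arc attribute table into an in-memory SQLite database and asks for the arcs bounding parcel A34. The table and column names (`arc`, `arc_attr`, `l_poly`, ...) are chosen here for illustration; only the values come from the slide.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE arc (arc_id TEXT PRIMARY KEY, from_n INTEGER, to_n INTEGER,
                  l_poly TEXT, r_poly TEXT);
CREATE TABLE arc_attr (arc_id TEXT PRIMARY KEY, length INTEGER,
                       condition TEXT, lanes INTEGER, name TEXT);
INSERT INTO arc VALUES
  ('I', 4, 1, NULL, 'A34'), ('II', 1, 2, NULL, 'A34'),
  ('III', 2, 3, 'A35', 'A34'), ('IV', 3, 4, NULL, 'A34');
INSERT INTO arc_attr VALUES
  ('I', 106, 'good', 4, NULL), ('II', 92, 'poor', 4, 'Birch'),
  ('III', 111, 'fair', 2, NULL), ('IV', 95, 'fair', 2, 'Cherry');
""")

# Join topology (which arcs bound A34) with attributes (length, condition).
rows = con.execute("""
    SELECT a.arc_id, t.length, t.condition
    FROM arc a JOIN arc_attr t ON a.arc_id = t.arc_id
    WHERE a.l_poly = 'A34' OR a.r_poly = 'A34'
    ORDER BY a.arc_id
""").fetchall()

perimeter = sum(length for _, length, _ in rows)
print(rows, perimeter)
```

The perimeter of the parcel falls out of the join without touching a single coordinate, which is the practical benefit of storing topology explicitly.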
Raster versus Vector Model
“raster is faster but vector is corrector” (Joseph Berry)
From: Burrough, Peter A. and Rachael A. McDonnell. (1998). Principles of Geographic Information Systems. p 27.
• Raster data model
– location is referenced by a grid cell in a rectangular array (matrix)
– attribute is represented as a single value for that cell
– much data comes in this form
• images from remote sensing (LANDSAT, SPOT)
• scanned maps
• elevation data from USGS
– best for continuous features:
• elevation
• temperature
• soil type
• land use
• Vector data model
– location referenced by x,y coordinates, which can be linked to form lines and polygons
– attributes referenced through unique ID number to tables
– much data comes in this form
• DIME and TIGER files from US Census
• DLG from USGS for streams, roads, etc.
• census data (tabular)
– best for features with discrete boundaries
• property lines
• political boundaries
• transportation
Variety of Vector Models
• Spaghetti model
• Topological model (most common)
• Triangulated irregular network (TIN)
• Dime files and TIGER files
• Network model
• Digital Line Graph (DLG)
• Shapefile (ArcView/ArcGIS; ESRI)
• Others: HPGL, PostScript/ASCII, CAD/.dxf
Vector Model: Spaghetti
• information in n-dim. by m-dim. hyperspaces (m < n)
• POINT; P_LINE; POLYGON; COMPLEX_OBJECT
• Very efficient algorithms to detect properties
• Recursive holes
• Numerous query languages
Source: Lakhan, V. Chris. (1996).
Introductory Geographical Information Systems. p. 54.
Vector Model: Topological
Bernhardsen, Tor. (1999). 2nd Ed. Geographic Information Systems: An Introduction. p. 62. fig. 4.12.
Topological Data Model
The topological data model is used, e.g., by the Census
Bureau of the US: four relations
R1: every line has two endpoints
R2: every line has two areas
R3: every area is surrounded by lines
R4: every point is surrounded by areas and lines
Vector Model: Topological
• connections & relationships between objects are independent of their coordinates
• overcomes major weakness of spaghetti model, allowing for GIS analysis (Overlaying, Network, Contiguity, Connectivity)
• topological invariants
Topological Data Model
A database in the topological data model consists of a finite
number of labeled points, labeled lines and labeled areas in R².
[Figure: example instance with labels p, q, r, s, t, v and A, B, C, D, E, F, H, I, J]
Topological Data Model
The four relations R1 to R4 are not lossless.
Topological Data Model
Give for each labeled point the circular list of all lines and
areas that appear clockwise around it:
List(p)=( H  D  I)
List(q)=( J  F  H)
…
List(s)=( C  B  E)
With these circular lists the representation is lossless.
[Figure: the example instance again; one of the area labels marks the unbounded area]
Vector Model: Dime files and TIGER files
Image Source: Demers, Michael. N. (2000). 2nd Ed. Fundamentals of Geographic Information Systems. p. 113. fig 4.16.
Vector Model: Dime files and TIGER files
Image Source: Clarke, Keith C. (2001). 3rd Ed. Getting Started with Geographic Information Systems. p 92.
Vector Model: DLGs
Digital Line Graphs
Image Source: Clarke, Keith C. (2001). 3rd Ed. Getting Started with Geographic Information Systems. p. 90
Vector Model: Network
Source: Heywood, Ian and Sarah Cornelius and Steve Carver. An Introduction to Geographical Information Systems. p. 60. fi
Vector Model: Topological
• adding semantics
What kind of binary topological relations exist
between spatial objects?
• Based on region’s interior and boundary (and exterior)
• Determines whether interior and boundary have (non-)empty intersection
[Figure: a region A with its interior A° and its boundary ∂A]
Vector Model: Topological
• adding semantics
What kind of binary topological relations exist
between spatial objects?
• Egenhofer and Randell Cohn C…
[Figure: matrix with rows A° and ∂A and columns B° and ∂B, recording for each pair whether the intersection is empty]
Vector Model: Topological
The eight relations between two regions:
• disjoint
• meet
• contains
• inside
• equal
• covers
• coveredBy
• overlap
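In Egenhofer's 4-intersection model, each of these eight relations corresponds to one pattern of empty and non-empty intersections between the boundaries and interiors of two regions A and B. The lookup table below encodes that standard correspondence in Python. It is a sketch for orientation: the function name `classify` and the argument order are choices made here, and the code does not compute the intersections themselves.

```python
# Key: (∂A∩∂B, A°∩B°, ∂A∩B°, A°∩∂B); True means the intersection is non-empty.
RELATIONS = {
    (False, False, False, False): "disjoint",
    (True,  False, False, False): "meet",
    (True,  True,  False, False): "equal",
    (False, True,  True,  False): "inside",      # A inside B
    (True,  True,  True,  False): "coveredBy",   # A covered by B
    (False, True,  False, True):  "contains",    # A contains B
    (True,  True,  False, True):  "covers",      # A covers B
    (True,  True,  True,  True):  "overlap",
}


def classify(bb, ii, bi, ib):
    """Name the relation for a 4-intersection pattern.

    bb: boundaries meet, ii: interiors meet,
    bi: boundary of A meets interior of B, ib: interior of A meets boundary of B.
    Returns None for a pattern that is not one of the eight named relations.
    """
    return RELATIONS.get((bb, ii, bi, ib))


print(classify(True, False, False, False))   # two regions sharing only boundary points
print(classify(False, True, False, True))    # B lies strictly inside A
```

Swapping the roles of A and B swaps the last two entries of the key, which is why inside/contains and coveredBy/covers come in pairs.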
Topological relationships
Disjoint
Point/Point
Line/Line
Polygon/Polygon
Topological relationships
Touches
Point/Line
Line/Polygon
Point/Polygon
Polygon/Polygon
Line/Line
Topological relationships
Crosses
Point/Line
Point/Polygon
Line/Line
Line/Polygon
Topological relationships
Overlap
Point/Point
Line/Line
Polygon/Polygon
Topological relationships
Within/contains
Point/Point
Line/Line
Point/Line
Line/Polygon
Point/Polygon
Polygon/Polygon
Topological relationships
Equals
Point/Point
Line/Line
Polygon/Polygon
Topological relationships
line-line relationships
area-area relationships
adjacency
island
touch
branching off
cross
intersect
area-line relationships
line in an area
line ends at an area
line ends in an area
line is border of an area
line intersects area
line touches area
point-line relationships
point on line
point beside a line
point-area relationships
point in area
point on border of an area
TIN: Triangulated Irregular Network Surface
Elevation points (nodes) chosen based on relief complexity, and
then their 3-D location (x,y,z) determined.

Points
Node #  X     Y     Z
1       0     999   1456
2       525   1437  1437
3       631   886   1423
etc.

Elevation points connected to form a set of triangular
polygons; these then represented in a vector structure.

Polygons
Polygon  Node #s  Topology
A        1,2,4    B,D
B        2,3,4    A,E,C
C        3,4,5    B,F,G
D        1,4,6    A,H
etc.

[Figure: triangles A to H meeting at nodes 1 to 6]

Attribute data associated via relational DBMS (e.g.
slope, aspect, soils, etc.)

Attribute Info. Database
Polygons  Var 1  Var 2
A         1473   15
B         1490   100
C         1533   150
D         1486   270
etc.
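A TIN represents the surface as flat triangular facets, so the elevation at any location inside a triangle follows from its three corner nodes by linear (barycentric) interpolation. The Python sketch below shows the calculation. It uses nodes 1, 2 and 3 from the points table purely as sample numbers; the slide's polygon table does not list a triangle with exactly these three corners, so the triangle itself is hypothetical.

```python
def interpolate_z(p, a, b, c):
    """Elevation at p = (x, y) on the plane through nodes a, b, c = (x, y, z)."""
    x, y = p
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2                     # barycentric weights sum to 1
    return w1 * z1 + w2 * z2 + w3 * z3


# Nodes 1, 2, 3 from the points table (x, y, z); used as a hypothetical triangle.
n1, n2, n3 = (0, 999, 1456), (525, 1437, 1437), (631, 886, 1423)

centroid = ((n1[0] + n2[0] + n3[0]) / 3, (n1[1] + n2[1] + n3[1]) / 3)
z_centroid = interpolate_z(centroid, n1, n2, n3)
z_at_node1 = interpolate_z((n1[0], n1[1]), n1, n2, n3)
print(round(z_centroid, 2), z_at_node1)
```

Slope and aspect per triangle, as mentioned for the attribute table, can be derived from the same plane.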
Vector Model: Shapefile (ArcGIS; ESRI)
This table represents
examples of the shape types
of geographic features in a
data set for a shapefile
Demers, Michael. N. (2000). 2nd Ed. Fundamentals of Geographic Information Systems. p. 114. fig 4.17.
Vector Model:
Others …
HPGL, CAD/.dxf
PostScript/ASCII
Source: Clarke, Keith C. (2001).
3rd Ed. Getting Started with
Geographic Information
Systems. p. 89. fig. 3.12.
Difference between GIS and Spatial Databases
• GIS is software to visualize and analyze spatial data using
spatial analysis functions such as
– Search: thematic search, search by region, classification
– Location analysis: buffer, corridor, overlay
– Terrain analysis: slope/aspect, drainage network
– Flow analysis: connectivity, shortest path
– Distribution: change detection, proximity, nearest neighbor
– Spatial analysis/Statistics: pattern, centrality, autocorrelation, indices of similarity, topology (hole description)
– Measurements: distance, perimeter, shape, adjacency, direction
• GIS uses SDBMS
– to store, search, query, share large spatial data sets
Difference between GIS and Spatial Databases
• SDBMS focusses on
– Efficient storage, querying, sharing of large spatial datasets
– Provides simpler set-based query operations
– Example operations: search by region, overlay, nearest neighbor, distance, adjacency, perimeter etc.
– Uses spatial indices and query optimization to speed up queries over large spatial datasets.
• SDBMS may be used by applications other than GIS
– Astronomy, Genomics, Multimedia information systems, ...
• Would one use a GIS or an SDBMS to answer the following:
– How many neighboring countries does the USA have?
– Which country has the highest number of neighbors?
What is a Spatial DBMS?
• A SDBMS is a software module that
– can work with an underlying DBMS
– supports spatial data models, spatial abstract data types
(ADTs) and a query language from which these ADTs
are callable
– supports spatial indexing, efficient algorithms for
processing spatial operations, and domain specific rules
for query optimization
• Example: Oracle Spatial data cartridge
– can work with Oracle 8i DBMS
– Has spatial data types (e.g. polygon), operations (e.g.
overlap) callable from SQL3 query language
– Has spatial indices, e.g. R-trees
Spatial database systems for GIS
Limitations of Relational Data Model
• Values are atomic, complex objects need to be
unnested (first normal form)
• Only atomic types, no subtyping/inheritance,
no encapsulation of operations with data, no OID
• No support for unstructured/heterogeneous data
• No support for infinite relations (spatial and spatio-temporal data)

SSNr   Name   Salary
12345  Bart   50K
12346  Sofie  35K
12347  Bill   500M
Spatial Data in the Relational Model
• Boundary representation
• No notion of spatial object/type
• Mismatch with query language, e.g. SQL
(no arithmetic,…)
Name       x  y
triangle1  0  1
triangle1  0  0
triangle1  1  0
[Figure: the triangle with corners (0,1), (0,0) and (1,0)]
Extension of the Relational Model:
ADT-approach
• Relational model is augmented with ad hoc spatial
data types: point, polyline, polygon
• SQL is also extended
• Example: ADT Polygon:
– Constructors
– Methods: containment, overlap, …
– Subtyping/Inheritance: Rectangle isa Polygon
• Query language: SQL
• [Güting et al., 1990s]
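What the ADT approach adds, compared with storing corner coordinates in flat tuples, is a type that bundles the geometry with its operations and allows subtyping. The Python classes below give a rough feel for this. They are an illustration only: the class layout, the method name `contains_point` and the ray-casting test are choices made here, not part of any particular spatial DBMS.

```python
class Polygon:
    """A simple polygon given by its corner points in order."""

    def __init__(self, vertices):
        self.vertices = list(vertices)          # constructor

    def contains_point(self, p):
        """Containment test by ray casting (points on the boundary are not handled specially)."""
        x, y = p
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            if (y1 > y) != (y2 > y):            # edge crosses the horizontal line through p
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside


class Rectangle(Polygon):
    """Rectangle isa Polygon: same operations, simpler constructor."""

    def __init__(self, x_min, y_min, x_max, y_max):
        super().__init__([(x_min, y_min), (x_max, y_min),
                          (x_max, y_max), (x_min, y_max)])


r = Rectangle(0, 0, 2, 1)
print(r.contains_point((1, 0.5)), r.contains_point((3, 0.5)))
```

In an extended SQL the same method would be callable inside a WHERE clause, which is what makes the type usable from the query language.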
Beyond Relational: Constraint Databases
[Kanellakis, Kuper and Revesz, PODS1990]
• Constraint tuple: finite combination of atomic constraints
• Constraint relation: finite set of generalized tuples
• Semantics: infinite point sets
Id  Color   geometry
T1  green   0 ≤ x ∧ 0 ≤ y ∧ 1 ≥ x+y
T2  red     x ≤ 0 ∧ 0 ≤ y ∧ 1 ≥ y−x
C1  yellow  (x−1)² + (y−1)² = 1/9
…
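A constraint tuple is a finite description of an infinite point set: membership of a point is decided by evaluating the constraints. The Python sketch below shows this for linear constraints. It assumes the T1 and T2 rows are read as the triangles 0 ≤ x, 0 ≤ y, x + y ≤ 1 and x ≤ 0, 0 ≤ y, y − x ≤ 1; the triple encoding (a, b, c) for a·x + b·y ≤ c and the function names are chosen here for illustration.

```python
from fractions import Fraction

# A constraint tuple is a conjunction of linear constraints a*x + b*y <= c.
T1 = [(-1, 0, 0), (0, -1, 0), (1, 1, 1)]    # 0 <= x, 0 <= y, x + y <= 1
T2 = [(1, 0, 0), (0, -1, 0), (-1, 1, 1)]    # x <= 0, 0 <= y, y - x <= 1


def satisfies(constraint_tuple, x, y):
    """True if the point (x, y) satisfies every constraint of the tuple."""
    return all(a * x + b * y <= c for a, b, c in constraint_tuple)


def in_relation(relation, x, y):
    """A constraint relation is a finite set of tuples; its meaning is their union."""
    return any(satisfies(t, x, y) for t in relation)


quarter = Fraction(1, 4)
print(satisfies(T1, quarter, quarter))       # inside the first triangle
print(satisfies(T1, 1, 1))                   # outside it
print(in_relation([T1, T2], -quarter, quarter))
```

Exact rational arithmetic is used so that points on the boundary are classified without rounding surprises.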
Queries in constraint databases
• In practice only linear constraints [DEDALE, DISCO,
COSMOS]
• Natural query languages: FO+Poly, FO+While,
FO+TC, Datalog with polynomial constraints
• Example (Q selects the points of S that lie on its boundary):
Q(S) = {(x,y) ∈ R² | S(x,y) ∧ (∀ε>0)(∃x’)(∃y’)((x−x’)² + (y−y’)² < ε² ∧ ¬S(x’,y’))}
Input: 0 ≤ x ∧ 0 ≤ y ∧ 1 ≥ x+y
Output of Q: 0 ≤ x ≤ 1 ∧ 0 ≤ y ≤ 1 ∧ (x = 0 ∨ y = 0 ∨ x+y = 1)
Query evaluation in constraint Databases
• Apply Q(S) = {(x,y) ∈ R² | S(x,y) ∧ (∀ε>0)(∃x’)(∃y’)((x−x’)² + (y−y’)² < ε² ∧ ¬S(x’,y’))}
to A given by 0 ≤ x ∧ 0 ≤ y ∧ 1 ≥ x+y
• Plug in: {(x,y) ∈ R² | 0 ≤ x ∧ 0 ≤ y ∧ 1 ≥ x+y ∧ (∀ε>0)(∃x’)(∃y’)((x−x’)² + (y−y’)² < ε² ∧ ¬(0 ≤ x’ ∧ 0 ≤ y’ ∧ 1 ≥ x’+y’))}
• Eliminate the quantifiers to obtain 0 ≤ x ≤ 1 ∧ 0 ≤ y ≤ 1 ∧ (x = 0 ∨ y = 0 ∨ x+y = 1)
• Complexity is huge
• Tarski (1930)
• Collins’ CAD (1975): doubly exponential in #quantifiers
• 1990s: single exponential in #quantifier-alternations
From Spatial Data to Spatio-Temporal Data
• Temporal databases
– 1980s, one of the first special purpose database
applications
• Spatial databases
– well-studied area in GIS,
– in database theory during 1990s
• …
• Spatio-temporal data:
– studied in databases since mid 1990s,
– SSD’99 changes into SSTD’01
– has its specific problems
Spatio-Temporal Data
Examples
• Transportation: truck or ship movement,
airplane flights
• Meteorological: isobaric curves, temperature
• Climate: season, vegetation changes
• Natural disasters: forest fires, oil spills
• Ecology: species migration, vegetation changes,
habitat and land cover changes
• Society and economy: urban growth, land use
changes, epidemics
• Ownership or administrative changes
Spatio-Temporal Data
What is changing where and how?
What?
• 0D points
• 1D lines
• 2D regions
• 3D volumes
Where?
• on 1D line
• in 2D plane
• in 3D space
How?
• continuous evolution
• continuous movement
• discrete evolution
• birth, death, split, merge
In combination with classical
alpha-numerical data.
Abstract Spatio-Temporal Objects
• Time is isomorphic to the reals R
• n-dimensional abstract spatio-temporal object
O ⊆ Rⁿ × R
• It should satisfy
– slice regularity, i.e.,
O_t0 = {(x1,…,xn) | (x1,…,xn,t0) ∈ O} should be a
familiar object (point, square, triangle, polygon)
– f(t) = O_t should be piece-wise continuous
– closure properties
Closure Properties
• A class of spatio-temporal objects C is closed
under an operation op (of arity k)
if for all O1,…, Ok ∈ C, also op(O1,…, Ok) ∈ C
• Relevant operations:
– intersection, union, difference
– temporal/spatial selection
– temporal/spatial projection
• Closure properties are easy to obtain for spatial
objects, more difficult for natural classes of spatio-temporal objects
The ADT-approach [Güting et al., TODS 2000]
• Spatial objects: points, polylines, polygons
• Operations:
– Set theoretic: intersection, union, difference
– Aggregation: min, max, avg, center, area,
volume
– Metrical: distance
– Topological: containment, adjacency, …
The ADT-approach: Temporal Lifting
• Temporal lifting of a spatial type α with domain D_α:
τ(α) consists of partial functions R → D_α
E.g., τ(points), τ(polylines), τ(polygons)
• Temporal lifting of a spatial operation
op: α1 × … × αk → α0
op: τ(α1) × … × τ(αk) → τ(α0)
• E.g. ∩: polygons × polygons → polygons
∩: τ(polygons) × τ(polygons) → τ(polygons)
(p1, p2) ↦ {(t, p1(t) ∩ p2(t)) | t ∈ R}
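Temporal lifting can be read as a higher-order function: it takes an operation on static values and returns the same operation applied instant by instant to time-dependent values. The Python sketch below follows that reading. To stay short it is deliberately simplified: spatial values are sets of integer grid cells instead of polygons, an undefined instant of a partial function is modelled as `None`, and all names (`lift`, `moving_a`, `moving_b`) are invented for the example.

```python
def lift(op):
    """Turn an operation on static values into one on functions of time."""
    def lifted(*moving_objects):
        def at(t):
            values = [m(t) for m in moving_objects]
            if any(v is None for v in values):
                return None                 # undefined where any argument is undefined
            return op(*values)
        return at
    return lifted


def intersection(a, b):
    return a & b


def moving_a(t):
    """A block of three cells sliding to the right, defined for 0 <= t <= 10."""
    return frozenset(range(t, t + 3)) if 0 <= t <= 10 else None


def moving_b(t):
    """A stationary block, defined for 5 <= t <= 20."""
    return frozenset({6, 7, 8}) if 5 <= t <= 20 else None


both = lift(intersection)(moving_a, moving_b)
print(both(2), both(5), both(10))
```

The lifted result is again a partial function of time, defined only where both arguments are, and it may be the empty set at instants where the two objects do not meet.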
Example of SQL3 Query
• “Find all pairs of airplanes that came closer
than 500 meters during their flight”
SELECT A.id, B.id
FROM Flights A, Flights B
WHERE A.id <> B.id
AND minvalue(distance(A.route, B.route)) < 500
• Example of spatio-temporal join
Concrete Representation
[Forlizzi et al., SIGMOD 2000]
• Spatial objects: region
(=finite set of polygons with holes)
• Spatio-temporal (evolving) objects:
– finitely many spatial slices (=time intervals)
– coordinates are linear functions of time on
each slice
– segments may degenerate but cannot
rotate within a slice
– spatio-temporal object is a polyhedron.
The ADT-approach: Conclusion
• Change: both continuous and discrete
• Closure:
– for union: trivial,
– for intersection, difference, selection, projection:
depends on properties of polyhedra.
• Object-orientation:
– object types, operation signatures
– object easy to implement in object-relational DB
– no natural restriction on movement guarantees
polyhedral structure
• Queries:
– good integration with SQL3, e.g., aggregation
– expressiveness: difficult to establish
– notoriously hard to optimize
Spatio-Temporal Data in the Constraint Model
One extra time variable: t
(snapshots shown at t = 0, 1, 2, 3, 4)
Name: MT1
Color: white
geometry:
0x-t  0 y-t 
t+1x+y 0  t  4
Spatio-Temporal Data in the Constraint Model
• Closure guaranteed
• Natural query languages: FO+Poly, FO+While, Datalog with
Constraints
• Theoretically appealing and suited to study e.g.,
expressiveness of query languages
• but few implementations (mostly with linear constraints
[DEDALE, DISCO,COSMOS])
• Representation problem: a plane circles at constant speed
• More restricted models:
– parametric spatio-temporal objects
– implementations (MLPQ, PReSTO [Revesz, SIGMOD
2000])
Spatio-Temporal Data :
Representation problem
An airplane circles at constant speed.
z=constant
x=x0+r.cos t
y=y0+r.sin t
Spatio-Temporal Query Languages
• First-order logic over the reals: relational calculus
extended with polynomial (in)equalities.
“Do plane1 and plane2 collide at 5 o’clock above
position (0,0)?”
(∃x)(∃y)(∃z)(∃t)(Plane1(x,y,z,t) ∧ Plane2(x,y,z,t)
∧ x = 0 ∧ y = 0 ∧ t = 5)
• Extension of FO with a while loop: finite sequence of
statements & while loops.
• Statement: R := {(x,y,t) | φ(x,y,t)};
• While-loop: while φ do P;
Queries in the Constraint Model
• Closure: naturally available in the model
• problems with aggregation, distance, …
• no object-oriented features
• Query languages:
• FO + extensions,
• simple semantics,
• expressiveness well-studied,
• implementation requires constraint engine
Parametric Spatio-Temporal Objects
• An atomic geometric object consists of:
– spatial reference object S (in R²)
– reference time I (interval in R)
– transformation function parameterized by time, i.e.,
f: R² × R → R²: (x,y;t) ↦ (x’,y’) = f(x,y;t)
• A molecular geometric object consists of a finite number of
atomic objects
Example:
S = [0,1] × [0,1],
I = [1,5],
f(x,y;t) = (tx,ty)
[Figure: the object shown at t=0 and t=5]
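The example object can be evaluated directly: at each time t in I, the snapshot is the image of the reference square S under f(·,·;t). The Python sketch below computes that snapshot. It is an illustration under assumptions: the function name `snapshot` and the tuple encoding of S and I are invented here, and mapping only the corners is valid because f is affine in (x, y) for every fixed t.

```python
def snapshot(S, I, f, t):
    """Corners of the image of rectangle S at time t, or None outside the interval I."""
    (x0, x1), (y0, y1) = S
    if not (I[0] <= t <= I[1]):
        return None
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    return [f(x, y, t) for (x, y) in corners]


# The example from the slide: S = [0,1] x [0,1], I = [1,5], f(x,y;t) = (tx, ty).
S = ((0, 1), (0, 1))
I = (1, 5)


def f(x, y, t):
    return (t * x, t * y)


print(snapshot(S, I, f, 1))   # the unit square itself
print(snapshot(S, I, f, 5))   # a square of side 5
print(snapshot(S, I, f, 0))   # None: t = 0 lies outside I
```

Stacking the snapshots for all t in I gives the three-dimensional point set that serves as the semantics of the atomic object.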
Parametric Spatio-Temporal Objects
Example database:
“classical” columns: ID, from, to, dept, arr
spatio-temporal columns: reference object (pict), time interval, route function

ID  from  to  dept  arr   route function
1   A     D   2h00  2h30  f1(x,y,t)
2   B     C   1h30  2h30  f2(x,y,t)
Parametric Spatio-Temporal Objects
Semantics of an atomic geometric object (S,I,f) is
{(x,y;t) | t ∈ I ∧ (∃x’)(∃y’)((x’,y’) ∈ S ∧ (x,y) = f(x’,y’;t))}
Semantics of a molecular geometric object is the union
of the semantics of its atomic objects
3D figure is always a semi-algebraic set
[Figure: an atomic object (reference object, I, f(x,y,t)) and the set it describes, drawn on x, y and t axes]
Reference Objects
arbitrary
Poly
Trax
sides parallel to
coordinate axes
Rect
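Transformation Functions
• Affinities:
$(x,y;t) \mapsto \begin{pmatrix} a(t) & b(t) \\ c(t) & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e(t) \\ f(t) \end{pmatrix}$
observe a passing object under an angle, circling airplane
• Scalings:
$(x,y;t) \mapsto \begin{pmatrix} a(t) & 0 \\ 0 & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e(t) \\ f(t) \end{pmatrix}$
observe an object that you are walking away from/to, oil spill, forest fire
• Translations:
$(x,y;t) \mapsto \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} e(t) \\ f(t) \end{pmatrix}$
wind, transport
• Identity
spatial data
with a(t), b(t), c(t), d(t), e(t), f(t) linear, polynomial or
rational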
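The transformation classes differ only in which coefficient functions are allowed to be non-trivial. The Python sketch below builds an affinity from six functions of time and instantiates two cases. Both instances are illustrations added here, not taken from the slides: a translation with linear coefficients, and a rotation whose entries are rational functions of t (the standard rational parametrisation of the circle), which moves a point around a circle though not at constant speed.

```python
from fractions import Fraction


def affinity(a, b, c, d, e, f):
    """Build (x, y; t) -> (a(t)x + b(t)y + e(t), c(t)x + d(t)y + f(t))."""
    def apply(x, y, t):
        return (a(t) * x + b(t) * y + e(t),
                c(t) * x + d(t) * y + f(t))
    return apply


def zero(t):
    return 0


def one(t):
    return 1


# Translation with linear coefficients: drift 2 units in x and 1 unit in y per time step.
translate = affinity(one, zero, zero, one, lambda t: 2 * t, lambda t: t)


def cos_like(t):
    return (1 - t * t) / (1 + t * t)


def sin_like(t):
    return 2 * t / (1 + t * t)


# Affinity with rational coefficients: a rotation about the origin.
rotate = affinity(cos_like, lambda t: -sin_like(t), sin_like, cos_like, zero, zero)

moved = translate(1, 1, 3)
turned = rotate(1, 0, Fraction(1, 2))
print(moved, turned, turned[0] ** 2 + turned[1] ** 2)
```

A scaling is obtained in the same way by setting b and c to zero and letting a and d vary with time.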
Closure Properties under Boolean Operations
Object O1: (the immediate proximity of) a ship
Object O2: another ship (or a natural hazard)
Object O1 ∩ O2: danger zone
Query: “Is there any danger of collision?”
Closure Properties under Boolean Operations
[Figure sequence: map with the locations A, B, C and D]
Closure Properties under Boolean Operations
Which classes are closed under ∪, ∩, \?
A class (S,F) is closed under ∩ iff the intersection
of (the semantics of) any two molecular geometric
objects from the class (S,F) can be described as a
geometric object from (S,F).
Property: A class (S,F) is closed under ∩ iff it is
closed under ∩ for atomic objects.
Closure Properties under Boolean Operations
Which classes are closed under ∪?

∪     | Affinity      | Scaling       | Translation   | Id
      | Rat  Pol  Lin | Rat  Pol  Lin | Rat  Pol  Lin |
Poly  |  +    +    +  |  +    +    +  |  +    +    +  | +
Tr    |  +    +    +  |  +    +    +  |  +    +    +  | +
TraX  |  +    +    +  |  +    +    +  |  +    +    +  | +
Rect  |  +    +    +  |  +    +    +  |  +    +    +  | +
Closure Properties under Boolean Operations
Which classes are closed under ∩, \?
[Table: one + or - entry for each reference object class (Poly, Tr, TraX, Rect) and each transformation class (Affinity, Scaling and Translation, each with Rat, Pol and Lin coefficients, and Id)]
Property: A class is closed under ∩ iff
it is closed under \.
Closure Properties under Boolean Operations
Example:
A:rectangle
B:rectangle
A: (x,y) ↦ (x−t, y−t)
B: (x,y) ↦ (x, y−t)
Assume that both objects have the same time interval
Closure Properties under Boolean Operations
[Figure sequence: different shapes of A ∩ B over time]
The intersection of two translating objects
is a SCALING object that cannot be described as a union of
translating objects (translation preserves length + area).
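The argument can be checked on a small case: if the area of A ∩ B changes over time, the intersection cannot be a translating object, since translation preserves area. The Python sketch below uses the two translations from the example, A: (x,y) ↦ (x−t, y−t) and B: (x,y) ↦ (x, y−t). The reference rectangles themselves are not given on the slide, so the two rectangles used here are hypothetical, as are the helper names.

```python
def rect_at(rect, move, t):
    """Translate an axis-parallel rectangle (x0, y0, x1, y1) by move(t) = (dx, dy)."""
    x0, y0, x1, y1 = rect
    dx, dy = move(t)
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)


def intersection_area(r, s):
    """Area of the overlap of two axis-parallel rectangles (0 if they do not overlap)."""
    width = min(r[2], s[2]) - max(r[0], s[0])
    height = min(r[3], s[3]) - max(r[1], s[1])
    return max(width, 0) * max(height, 0)


# Hypothetical reference rectangles; only the two movements come from the slide.
A = (2, 2, 4, 4)
B = (0, 2, 2, 4)


def move_a(t):
    return (-t, -t)      # A: (x, y) -> (x - t, y - t)


def move_b(t):
    return (0, -t)       # B: (x, y) -> (x, y - t)


areas = [intersection_area(rect_at(A, move_a, t), rect_at(B, move_b, t))
         for t in (0, 1, 2)]
print(areas)
```

Here the overlap widens as A slides over B while both move down together, so the area grows with t, which only a scaling can describe.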
Closure Properties under Boolean Operations
This result is very intuitive, see example:
The movement of the intersection can be
described by a scaling.
The intersection of two linear scaling rectangles
is a (union of) linear scaling rectangle(s) …
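A small numeric check of this claim for the x-coordinate, as a sketch under stated assumptions: one edge of the intersection (reference coordinate x_ul) is assumed to follow the first rectangle's motion x → a1(t)x + e1(t) and the other edge (x_lr) the second's x → a2(t)x + e2(t). The function name edge_scaling and the sample coefficient functions are hypothetical. A single scaling x → A(t)x + E(t) then reproduces both edges, and A and E are linear combinations of a1, a2, e1, e2.

```python
from fractions import Fraction as F

def edge_scaling(a1, e1, a2, e2, x_ul, x_lr):
    """Return (A, E) such that x -> A*x + E maps x_ul like (a1, e1) and x_lr like (a2, e2)."""
    den = x_ul - x_lr
    A = (a1 * x_ul - a2 * x_lr + e1 - e2) / den
    E = ((a2 - a1) * x_ul * x_lr - e1 * x_lr + e2 * x_ul) / den
    return A, E

x_ul, x_lr = F(1), F(5)  # hypothetical reference edge coordinates
checks = []
for t in range(5):
    # Hypothetical linear coefficient functions evaluated at time t.
    a1, e1 = 1 + F(t, 2), F(3 * t)
    a2, e2 = 2 - F(t, 4), F(-t)
    A, E = edge_scaling(a1, e1, a2, e2, x_ul, x_lr)
    checks.append(A * x_ul + E == a1 * x_ul + e1 and
                  A * x_lr + E == a2 * x_lr + e2)
print(all(checks))
```

Exact rational arithmetic (fractions.Fraction) is used so that the equalities are tested without rounding error.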
Closure Properties under Boolean Operations
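Proof sketch:
It can be shown that if
\[
\begin{pmatrix} x \\ y \end{pmatrix} \mapsto
\begin{pmatrix} a_1(t) & 0 \\ 0 & d_1(t) \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} +
\begin{pmatrix} e_1(t) \\ f_1(t) \end{pmatrix}
\]
and
\[
\begin{pmatrix} x \\ y \end{pmatrix} \mapsto
\begin{pmatrix} a_2(t) & 0 \\ 0 & d_2(t) \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} +
\begin{pmatrix} e_2(t) \\ f_2(t) \end{pmatrix}
\]
are the transformations of the original objects, the
intersection's transformation is described as:
\[
\begin{pmatrix} x \\ y \end{pmatrix} \mapsto
\begin{pmatrix}
\dfrac{a_1(t)\,x_{ul} - a_2(t)\,x_{lr} + e_1(t) - e_2(t)}{x_{ul} - x_{lr}} & 0 \\
0 & \dfrac{d_1(t)\,y_{ul} - d_2(t)\,y_{lr} + f_1(t) - f_2(t)}{y_{ul} - y_{lr}}
\end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix} +
\begin{pmatrix}
\dfrac{(a_2(t) - a_1(t))\,x_{ul}\,x_{lr} - e_1(t)\,x_{lr} + e_2(t)\,x_{ul}}{x_{ul} - x_{lr}} \\
\dfrac{(d_2(t) - d_1(t))\,y_{ul}\,y_{lr} - f_1(t)\,y_{lr} + f_2(t)\,y_{ul}}{y_{ul} - y_{lr}}
\end{pmatrix}
\]
Here (x_ul, y_ul) and (x_lr, y_lr) are the upper-left and lower-right corners of the reference rectangle, assuming the upper-left corner follows the first transformation and the lower-right corner the second. Each entry is a linear combination of a_i, d_i, e_i, f_i, hence linear in t whenever these are linear.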
Closure Properties under Boolean Operations
\
Affinity
Scaling
Translation
Rat Pol Lin Rat Pol Lin Rat
Poly
Tr
TraX
Rect
+
+
+
+
-
- - - - - - - + +
+
-
Pol
Lin
-
-
The only class of transformations that
can be used for Boolean operations.
Id
+
+
+
MLPQ/PReSTO System
[Revesz et al., SIGMOD 2000]
• MLPQ: linear constraints
• GIS: operations special to GIS objects
• PReSTO (STDB): spatio-temporal database system
• RECURSIVE
\
Affinity
Scaling
Translation
Rat Pol Lin Rat Pol Lin Rat
Poly
Tr
TraX
Rect
+
+
+
+
-
- - - - - - - + +
+
Rectangle based system
-
Pol
Lin
-
-
Id
+
+
+
PReSTO Example: Cloud over US
Data definition in PReSTO:
Nebraska() :- i=1, x1=550, y1=340, x2=570, y2=355,
              t>=0, t<=300, p=-1, s=0.
clouds(h) :- i=1, x1 - t = 80, y1 - 0.5t = 100,
             x2 - 1.1t = 220, y2 - 0.6t = 200,
             t>=0, t<=300, p=-1, s=0, h=0.
Available operations/queries in PReSTO:
• Union, intersection, difference, complement
• Projection, selection
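To see what the two rules describe, the constraints can be solved for the corner coordinates and evaluated in Python. This is a sketch outside PReSTO, not PReSTO syntax. It assumes that (x1, y1) and (x2, y2) are opposite corners of an axis-parallel rectangle, and it ignores the attributes i, p, s and h, whose meaning the slide does not state. The helper names cloud and overlaps are hypothetical.

```python
from fractions import Fraction as F

# x1, y1, x2, y2 from the Nebraska() rule (constant for 0 <= t <= 300).
NEBRASKA = (F(550), F(340), F(570), F(355))

def cloud(t):
    """Cloud rectangle at time t, from the clouds(h) constraints solved for x1, y1, x2, y2."""
    if not 0 <= t <= 300:
        raise ValueError("the rule is only defined for 0 <= t <= 300")
    t = F(t)
    return (80 + t, 100 + t / 2, 220 + F(11, 10) * t, 200 + F(3, 5) * t)

def overlaps(r, s):
    """True if two closed axis-parallel rectangles (x1, y1, x2, y2) share a point."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]

print(cloud(0))
print(overlaps(cloud(299), NEBRASKA), overlaps(cloud(300), NEBRASKA))
```

Under these assumptions the cloud's right edge reaches x = 550 exactly at t = 300, so the two rectangles first touch at the end of the time interval. Fractions avoid the rounding that 1.1 * t would introduce in floating point.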
Parametric Spatio-Temporal Objects:
Conclusion
• Closure: for union trivial; for intersection and
difference in some cases
• Solution: restrict to polygons or work with Boolean
combinations of atomic objects (AND-OR-NOT trees,
see CSG)
• Advantage:
- closure in definition
- easier to construct more complicated objects
(polygons with holes…)
[Figure: Boolean combinations of atomic objects a, b, c, d, such as a \ b]
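A minimal sketch of the AND-OR-NOT tree idea, in the spirit of CSG: objects are represented by a point-membership test, and Boolean operations build new tests instead of new geometry, so closure holds by construction. The Node class and the rect helper are hypothetical names, and time is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Point = Tuple[float, float]

@dataclass(frozen=True)
class Node:
    """A node of an AND-OR-NOT tree; `contains` tests whether a point lies in the object."""
    contains: Callable[[Point], bool]

    def __or__(self, other):   # union
        return Node(lambda p: self.contains(p) or other.contains(p))

    def __and__(self, other):  # intersection
        return Node(lambda p: self.contains(p) and other.contains(p))

    def __invert__(self):      # complement
        return Node(lambda p: not self.contains(p))

    def __sub__(self, other):  # difference: a \ b = a AND NOT b
        return self & ~other

def rect(xmin, ymin, xmax, ymax):
    """Atomic object: a closed axis-parallel rectangle."""
    return Node(lambda p: xmin <= p[0] <= xmax and ymin <= p[1] <= ymax)

a = rect(0, 0, 4, 4)
b = rect(1, 1, 3, 3)
ring = a - b  # a square with a hole, i.e. a \ b
print(ring.contains((0.5, 0.5)), ring.contains((2, 2)))
```

The price of this representation is that the result is implicit: listing the boundary of a \ b still requires evaluating the tree.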
Representing Spatio-Temporal Phenomena
using ADTs or Constraints
Problems with real spatio-temporal data:
(A) data comes as discrete observations
• within a snapshot (TINs)
• in different snapshots
(B) data lacks clearly identifiable and delineated
objects
(C) modeled movement/evolution is often irregular
(D) data does not have a regular (polyhedral) 3D
structure
Solution to (A), (C) and (D):
• Convert each snapshot to a set of polygons
• Interpolate/approximate between the snapshots
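The interpolation step could look as follows. This is a sketch under a strong assumption that the slides do not make: both snapshots hold the same polygon with vertices listed in corresponding order, so each vertex can be moved linearly. The function name interpolate_polygon and the sample snapshots are hypothetical.

```python
def interpolate_polygon(poly0, t0, poly1, t1, t):
    """Linearly interpolate between two snapshot polygons with matching vertex order."""
    if len(poly0) != len(poly1):
        raise ValueError("snapshots must have the same number of vertices")
    if t0 == t1 or not t0 <= t <= t1:
        raise ValueError("need t0 < t1 and t0 <= t <= t1")
    w = (t - t0) / (t1 - t0)  # 0 at the first snapshot, 1 at the second
    return [((1 - w) * x0 + w * x1, (1 - w) * y0 + w * y1)
            for (x0, y0), (x1, y1) in zip(poly0, poly1)]

# Two hypothetical snapshots of the same region, observed at t = 0 and t = 10.
snap0 = [(0, 0), (4, 0), (4, 2), (0, 2)]
snap1 = [(2, 2), (8, 2), (8, 6), (2, 6)]
mid = interpolate_polygon(snap0, 0, snap1, 10, 5)
print(mid)
```

Real data rarely comes with such a vertex correspondence, which is part of why problem (B) is not covered by this solution.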
Other Challenges
Data models:
• type system
• representing uncertainty
• integrating different representations
• resolving inconsistencies
Query languages and interfaces:
• multi-dimensional aggregation (ST OLAP)
• visualization
• animation: explicit representations (ADTs) are
more suitable than implicit ones (constraints)
Moving Objects Databases (MOD)
[Wolfson et al., 1997, Su et al., 2001]
Data model:
• point moving in n-dimensional space (e.g., n=2)
• finite chain of infinitely differentiable functions
p: R → R^2
• MOD = finite set of moving objects
Operations:
• vel(p), acc(p)
• moving direction = vel(p)/||vel(p)||
• speed = ||vel(p)||
• distance = dist(p,q) = min_t ||p(t)-q(t)||
• same direction
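For a single linear piece p(t) = p0 + v·t these operations have closed forms, which the sketch below evaluates. It is an illustration, not the data model of the cited papers: the class LinearMotion is a hypothetical name, acc(p) is zero and therefore omitted, and the minimum in dist is restricted to an explicit interval [t0, t1], which the slide leaves unspecified.

```python
import math

class LinearMotion:
    """A point moving as p(t) = p0 + v * t in the plane (one linear piece)."""

    def __init__(self, p0, v):
        self.p0, self.v = p0, v

    def at(self, t):
        return (self.p0[0] + self.v[0] * t, self.p0[1] + self.v[1] * t)

    def vel(self):
        return self.v

    def speed(self):
        return math.hypot(*self.v)

    def direction(self):
        s = self.speed()
        if s == 0:
            raise ValueError("direction is undefined for a stationary point")
        return (self.v[0] / s, self.v[1] / s)

def dist(p, q, t0, t1):
    """min over t in [t0, t1] of ||p(t) - q(t)|| for two linear motions."""
    dx, dy = p.p0[0] - q.p0[0], p.p0[1] - q.p0[1]
    dvx, dvy = p.v[0] - q.v[0], p.v[1] - q.v[1]
    dv2 = dvx * dvx + dvy * dvy
    # Minimiser of the quadratic ||d + dv * t||^2, clamped to [t0, t1].
    t = t0 if dv2 == 0 else min(max(-(dx * dvx + dy * dvy) / dv2, t0), t1)
    return math.hypot(dx + dvx * t, dy + dvy * t)

p = LinearMotion((0, 0), (3, 4))
q = LinearMotion((10, 0), (-2, 4))
print(p.speed(), p.direction(), dist(p, q, 0, 10))
```

For a chain of several pieces, the same computation would be applied per piece and the minimum taken over all pieces.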
Moving Objects Databases (MOD)
[Wolfson et al., 1997, Su et al., 2001]
Query languages:
• SQL3 [Forlizzi et al, SIGMOD 2000]
• relational calculus with built in functions [Su et al.,
SSTD 2001]
• temporal logic [Wolfson et al, ICDE 1997]
Expressiveness:
• vel(p), acc(p), moving direction, speed, distance are
enough to express all temporal and topological
queries
• PTIME languages
• precision/uncertainty/probability
Final Look
Bottom-up:
• adding spatio-temporal constructs to existing
spatial/GIS systems to cope with application demands
• closure problems, interoperability
Top-down:
• develop a general model (e.g., a constraint model)
with clean semantics, properties and query languages
• will it be used?
More information at:
http://alpha.luc.ac.be/~lucp1265/
bart.kuijpers@luc.ac.be