Survey							
                            
		                
		                * Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Trajectory Pattern Mining
Fosca Giannotti
Dino Pedreschi
Mirco Nanni
Fabio Pinelli
Chris Andrews
Georgia Institute of Technology
B.S. Computer Science
5th Year Undergraduate
Concepts
 Analyze trajectory of moving objects
A
3mins
B
5mins
C
10mins
D
 Trajectory Patterns – description of frequent behavior relating
to space and time
 Frequent Sequence Pattern (FSP)
 Determine if trajectory sequence matches any trajectory patterns
in a given set
 Study different methods of preparing a Temporally Annotated
Sequence (TAS) for data mining
Trajectory Patterns (T-Patterns)
 Trajectory Pattern
 sequence of time-stamped locations
 S = { ( x0, y0, t0 ) , … , ( xn, yn, tn ) }
 Temporal Annotation
 set of times relating to trajectories
 A = { a1 , a2, … an }
 Temporally Annotated Sequence
 (S,A) = (x0,y0) a1 (x1,y1) a2 … an (xn,yn)
Neighborhood Function
 Neighborhood Function N : R2 -> P (R2)
 Calculates spatial containment of regions
 Input point to find enclosing Region of Interest
 Defines the necessary proximity to fall into a region
 Parameters:
e – radius or necessary proximity of points
Regions of Interest (RoI)
 Performing these comparisons on points is costly
 A simple preprocessing step can alleviate this
 Utilize the Neighborhood Function NR()
 Translate each set of points into regions
 Timestamp is selected from when the trajectory first entered
the region
 Now compare sequence of regions and timestamps using the
TAS mining algorithm presented in [2].
Static RoI
 Neighborhood Function NR()
 Initially receives set of R disjoint spatial regions
 R regions are predefined based on prior knowledge
 Each represents relevant place for processing
 Static NR() simplifies problem of mining patterns
 Sequence of points become grouped
 Result: sequence of regions
 (x,y) a1 (x’,y’) becomes X a1 Y
Dynamic RoI
 Data sets often do not possess predetermined regions
 Instead need to formulate regions based on criteria of
density of the trajectories
 Preprocessing now must determine set R of popular
regions from the data set
 R is now the set of Region of Interests from used by the
Neighborhood Function NR() to translate points into
Regions of Interest
Popular Regions
Grid G of n x m cells
Each cell with density G(i,j)
Density Threshold d
Set R of popular regions
 Each region in R forms rectangular region
 Sets in R are pair wise distinct
 Dense cells always contained in some region in R
 All regions in R have average density above d
 All regions in R cannot expand without their average
density decreasing below d
Grid Density Preparation
 Split space into n x m grid with small cells
 Increment cells where trajectory passes
 Neighborhood Function NR() determines which surrounding cells
 Regression - increment continuously along trajectory
Popular Regions Algorithm
 Algorithm:
PopularRegions( G, d )
 Complexity: O ( |G| log |G| )
 Iteratively consider each dense cell
 For each:
 Expands in all four directions
 Select expansion that maximizes density
 Repeat until expansion would decrease below density
threshold
Results
Evaluating the T-Patterns
 Compute density of each cell of grid
 Compute set of RoI’s by determining Popular Regions
 Translate the input trajectories into sequence of RoI’s
and timestamps for the transitions
 Input the trajectories and times into TAS mining
algorithm[2]
Experiments
 GPS Data
 Fleet of 273 trucks in Athens, Greece
 112,203 total points recorded
 Running both static & dynamic pattern algorithms
 Various parameter settings
 Performance Analysis
 Synthetic Data by CENTRE synthesizer
 50% random & 50% predetermined
Pattern Mining Results
Static found:
Dynamic found:
A t1 B t2
A t1 B’ t2
B
B’’
Execution Time Results
• Increase linearly with increasing
number of input trajectories (both
algorithms)
• Grow when density threshold
decreases
• Static performs better with extreme
threshold
• Static does not perform with middle
threshold
Additional Results
 Increasing radius of spatial neighborhood obtains irregular
performance and large values lead to poor execution times
 Changing time tolerance (t) obtains results similar to
TAS’s
 Increasing the number of points in each trajectory causes
linear growth of execution times
Works Cited
 [1] Trajectory pattern mining, Fosca Giannotti, Mirco
Nanni, Fabio Pinelli, Dino Pedreschi, Proceedings of the
13th ACM SIGKDD international conference on
Knowledge discovery and data mining KDD. ACM,
2007.
 [2] Efficient Mining of Sequences with Temporal
Annotations. F. Giannotti, M. Nanni, and D. Pedreschi. In
Proc. SIAM Conference on Data Mining, pages 346–357.
SIAM, 2006.