Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) Finding Frequent and Maximal Periodic Patterns in Spatiotemporal Databases towards Big Data O.Obulesu1, A.Rama Mohan Reddy2, M.Thrilok Reddy3 1,3 Asst. Professor, Department of IT, SVEC, JNTUA, A.P. 2 Professor, Dept of CSE, SVUCE, Tirupati The periodicity detection is a process of finding temporal regularities within the Time-Series and the goal of analyzing a Time-Series Database is to find how frequent a periodic pattern (full or partial) is repeated within time intervals. In general, there are three types of periodic patterns can be detected in a time series Database such as Symbol Periodicity, Sequence Periodicity or Partial Periodic Patterns and Segment or Full-Cycle Periodicity [22]. We consider a set of Boolean SpatioTemporal (ST) event types such as crime types and their instances. The Boolean event types are important because primary concern is the occurrence or absence of an event type at a particular location and time. Given such a set, cascading spatio-temporal patterns are partially ordered subsets of event types whose instances are spatial neighbours and occur in a series of stages. It is an example of a Cascading Spatiotemporal Pattern Discovery CSTP [24]. The first event1 in the CSTP is the hurricane event. Subsequent events are represented by events such as heavy rainfall, localized flooding and wind damage creates a CSTP from an urban crime dataset. The first event here is bar-closings (represented by circles), and subsequent events are assaults (represented by triangles) and drunk driving (represented by squares). It shows individual instances of events of bar-closing, assault and drunk driving with their location and time. It shows the locations of all event instances. The problem is 80-90% of data produced today is unstructured and Gartner Predicts 800% data growth over next 5 years [26]. Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Abstract— Data mining used to find hidden knowledge from large amount of Databases. Periodic Pattern Mining is useful in Weather Forecasting, Fraud Detection and GIS Applications. In General, spatio-temporal pattern discovery process finds the partially ordered subsets of the eventtypes whose instances are located together and occur serially for a given collection of Boolean spatio-temporal event-types. Big Data concerns large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data is now rapidly expanding in all science and engineering domains, including physical, biological and bio-medical sciences. In this paper, a new framework is proposed to find spatiotemporal patterns from Big Data. Existing algorithms are well in computation of necessary patterns, but more problematic when they are applied to Big Data. Big Data is a new trend used to analyze the datasets that due to their large size and complexity, Developers cannot manage them with traditional current algorithms or data mining software tools. Big Data mining is the capability of extracting useful information from these large datasets or streams of data, that due to its volume, variety, and velocity, it was not possible before to do it. The Big Data challenge is becoming one of the most exciting opportunities for the next years. This Paper focuses on a broad overview of pattern mining algorithms and significance in Spatiotemporal Databases, its current status, trade-offs, and forecast to the big data pattern mining future. Keywords — Periodicity Detection, Spatial Patterns, Big Data, Cascading Spatiotemporal Pattern Discovery, MapR I. INTRODUCTION Data mining is a useful tool for extracting nontrivial, implicit, previously unknown and potentially useful information or pattern from large databases [28].It is the process of applying the methods like neural networks, clustering, genetic algorithms (1950s), decision trees (1960s) and support vector machines (1980s) to data with the intention of uncovering hidden patterns. A TimeSeries Database is a collection of data values gathered generally at uniform interval of time to reflect certain behavior of an entity. In Real World, There are several examples of Time-Series such as Weather Conditions of a particular location, Spending Patterns, Stock Growth, and Transactions in a Supermarket, Network Delays, Power Consumption, Computer Network Fault Analysis and Security Breach detection, Earthquake Prediction. II. BIG DATA PATTERN MINING This section gives motivation and directions for pattern mining towards Big Data research. The basic three parameters of Big Data are volume, variety, and velocity of information, drives unprecedented complexity and opportunity [25]. The main idea of this paper is to introduce periodic pattern mining in big data to be useful in rare event analysis converted as negative patterns including positive patterns [34]. Data and content increase over coming decade. Most of the data generated from web is unstructured. In 2025, 35 Zettabytes of data to be generated. 266 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) Today the storage levels of data in the forms of Gigabytes, Terabytes, Petabytes, Exabytes, Zettabytes and Yottabytes only. Again in 1995, same authors proposed two new algorithms such as AprioriSome and AprioriAll to mine sequential patterns from a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction [2]. In 1998, Roberto J. Bayardo Jr. Proposed a new pattern mining algorithm called the Max-Miner, that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern [3]. Max-Miner is also easily made to incorporate additional constraints on the set of frequent itemsets identified. Later, Garofalakis et al. in 1999, proposed SPIRIT, a method of mining user-specified sequential patterns by using regular expression constraints [35]. In 2000, J. Han, J. Pei and R. Mao proposed an efficient CLOSET Algorithm, for mining closed itemsets with the help of three techniques such as FP-Tree structure, Single prefix path compression and partition based projection mechanism. It is efficient and scalable over large databases, and faster than the traditional algorithms [4]. In 2000, Yves Bastide, Rafik Taouil, Nicolas Pasquier, Gerd Stumme proposed the algorithm PASCAL which introduces a novel optimization of the well-known algorithm Apriori and it is based on pattern counting inference that relies on the concept of key patterns [5]. PASCAL is better than three algorithms such as Apriori, Close and Max-Miner and it is the most efficient algorithm for mining frequent patterns. Later, Jiawei Han, Jian Pei, and Yiwen Yin proposed a novel frequent pattern tree (FP-tree) structure, which is an extended prefix tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth [6]. Later in 2001, Doug Burdick, Manuel Calimlim, Johannes Gehrke proposed a new algorithm called MAFIA, for mining maximal frequent itemsets from a transactional database [7]. This algorithm outperforms previous work by a factor of three to five. Again in 2001, Ilias Tsoukatos and Dimitrios Gunopulos solved the problem of mining spatiotemporal patterns and finding sequences of events that occur frequently in spatiotemporal datasets with the help of DFS_MINE, a new algorithm for fast mining of frequent spatiotemporal patterns in environmental data, which has the advantage that the amount of space required is minimal [8]. 2.1 Pattern Mining Classification: The following different patterns to be useful in further research in Data Mining domain [28]. (a) Kinds of patterns and rules (b) Mining Methods (c) Extensions and applications (a) Kinds of Patterns and Rules: i) Basic Patterns – Frequent patterns, Association rules and Closed/maximal Patterns etc. ii) Multilevel and Multidimensional Patterns – Uniform and Item-set based High dimensional patterns etc. iii) Extended Patterns – Approximate, Uncertain, Compressed, Rare/negative and Colossal Patterns etc. (b) Mining Methods: i) Basic Mining methods – Candidate Generation, Pattern Growth, Vertical format etc. ii) Mining interesting patterns – Interestingness, Constraint-based, Correlation Rules and Exception Rules etc. iii) Distributed, parallel & Incremental – Stream Patterns, Distributed/Parallel and Incremental Mining etc. (c) Extensions and Applications: i) Extended Data types – Sequential & Time-Series, Structural, Spatial, Temporal and Multimedia patterns. ii) Applications – patterns based Classification & Clustering, Semantic Annotation and Privacy Preserving. III. RELATED WORK In 1994, Rakesh Agrawal and Ramakrishnan Srikanth identified the problem of discovering association rules between items in a large database of sales transactions and proposed two new algorithms for solving this problem that are fundamentally different from the known algorithms such as AprioirHybrid [1]. AprioriHybrid is a combination of two algorithms such as Apriori and AprioriTid also has excellent scale-up properties with respect to the transaction size and the number of items in the database. 267 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) In 2002, Mohammed J. Zaki and Ching-Jui Hsiao, proposed a new algorithm called CHARM, an efficient algorithm for mining all frequent closed itemsets [9]. It also uses a technique called diffsets to reduce the memory footprint of intermediate computations and scalable in the number of transactions. In 2003, William Cheung and Osmar R. Zaïane, proposed a novel data structure called CATS Tree. CATS Tree extends the idea of FPTree to improve storage compression and allow frequent pattern mining without generation of candidate itemsets and allow mining with a single pass over the database as well as efficient insertion or deletion of transactions at any time. Later, Zaki in 2001, proposed SPADE, is an algorithm proposed to find frequent sequences using efficient lattice search techniques and simple joins [30]. Another approach called MEMory Indexing for Sequential Pattern mining (MEMISP) is introduced by Lin and Lee in 2002 [31]. In 2003, Jianyong Wang, Jiawei Han and Jian Pei, proposed a winning algorithm CLOSET+. It integrates the advantages of the previously proposed effective strategies. It is better than existing mining algorithms including CLOSET, CHARM and OP, in terms of runtime, memory usage and scalability [10]. Later Guimei Liu Hongjun Lu Wenwu Lou and Jeffrey Xu Yu, proposed a disk-based data structure, CFP-tree (Condensed Frequent Pattern Tree), for organizing frequent patterns discovered from transactional databases and also developed algorithms to efficiently support two important types of queries, namely queries with minimum support constraints and queries with item constraints, against the stored patterns, as these two types of queries are basic building blocks for complex frequent pattern related mining tasks[11]. In 2005, Jianyong Wang, Jiawei Han, Ying Lu, and Petre Tzvetkov, proposed a mining task: mining top-k frequent closed itemsets of length no less than min_l, where k is the desired number of frequent closed itemsets to be mined, and min_l is the minimal length of each itemset and developed an efficient algorithm, called TFP, is developed for mining such itemsets without min_support [12]. TFP has high performance and linear scalability in terms of the database size. Later Huiping Cao, Nikos Mamoulis, and David W. Cheung, modeled the problem of mining sequential patterns from spatiotemporal data by considering both spatial and temporal information [13]. In 2007, Huiping Cao, Nikos Mamoulis and David W. Cheung, solved the problem of mining periodic patterns in spatiotemporal data and proposed an effective and efficient algorithm called STPMine1, for retrieving maximal periodic patterns [32]. In 2010, Zhenhui Li Bolin Ding, Jiawei Han, Roland Kays and Peter Nye, addressed the problem of mining periodic behaviors for moving objects and solved with a two stage algorithm, Periodica [14]. Later, Juyoung Kang and Hwan-Seung Yong, solved the problem of mining spatio-temporal patterns from trajectory data. This method first finds meaningful spatio-temporal regions and extracts frequent spatiotemporal patterns based on a prefix-projection approach from the sequences of these regions [15]. Later, Nizar R. Mabroukeh and C. I. Ezeife, proposed a taxonomy of Sequential Pattern Mining Algorithms, which is represented in Table1 [16]. Later, Zhenhui Li, Ming Ji, Jae-Gil Lee, Lu-An Tang, Yintao Yu, Jiawei Han and Roland Kays, proposed a system called MoveMine, is designed for sophisticated moving object data mining by integrating several attractive functions including moving object pattern mining and trajectory mining [18], [19]. In 2012, P.Mohan and S.Shekhar proposed the cascading spatiotemporal pattern (CSTP) discovery process finds partially ordered subsets of these event-types whose instances are located together and occur serially for a given collection of Boolean spatiotemporal (ST) eventtypes [24]. Discovering CSTPs from ST data sets is important for application domains such as public safety (e.g., identifying crime attractors and generators) and natural disaster planning, (e.g., preparing for hurricanes). The proposed model is mentioned section 4. Table1 Tabular Taxonomy of Sequential-Pattern Mining Algorithms IV. PROPOSED MODEL The following are the big data big challenges in the future era: i) Data models for managing big data ii) Big Data Testing – ex. generation of optimal test data iii) Real-time streaming data analytics iv) Scalable analytics on large data sets v) Systems architecture for big data management vi) Main memory data management techniques 268 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) vii) Energy-efficient data processing viii) Energy-aware Resource Management ix) Benchmarking big data systems x) Security and Privacy of Big Data xi) Failover and reliability for big data systems Fig. 2. Data Stream Processing Engine V. EXPERIMENTAL RESULTS We performed all experiments on a 2.2GHz Intel Core 2 Duo processor PC machine with 3GB main memory Windows XP OS and all programs are implemented in Java. The results from figure 3(a) to 3(f) show that user input screen, prefix tree generation, frequent item sets and generated CFIs & MFIs respectively. Fig. 1. Architecture for Big Data Pattern Mining Nowadays it is Impossible for current databases, technologies and architectures to store and manage huge data. The above model proposes a solution by analyzing existing pattern mining algorithms which are mentioned in Table.1. In 2009, Ho Jin Woo and Won Suk Lee, proposed estMax algorithm to find closed or maximal frequent item sets over online transactional data streams is not easy due to the requirements of a data stream [20]. This method maintains the set of frequent item sets by a prefix tree and extracts all MFIs without any additional superset/subset checking mechanism. The Performance of the estMax method is comparatively analyzed by a series of experiments to identify its various characteristics. It has faster processing time and better performance than estDec algorithm. The main problem identified in stream data is the processing activity which is mentioned in the below figure. 2. Fig. 3(a) Table with list of transactions Fig. 3(b) List of Transactions in a table 269 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) Fig. 3(c) User Supply Support & Confidence Values Fig. 3(f) Generated CFIs and MFIs The future work from this method is to improve the performance of the proposed method estMax, in terms of memory usage. To achieve this, a more compact frame of a prefix tree should be devised. In 2007, Huiping Cao, Nikos Mamoulis and David W. Cheung, solved the problem of mining periodic patterns in spatiotemporal data and proposed an effective and efficient algorithm called STPMine1, for retrieving maximal periodic patterns [32]. We implemented the same algorithm with few modifications in the parameter settings. The language used was Java and the experiments were performed on a Pentium IV 700MHz workstation with 2 GB of memory, running Windows XP Operating System. The future work from this algorithm include the automatic discovery of the period ‘T’ related to frequent periodic patterns and the discovery of patterns with distorted period lengths. The results represented by figure 4(a) to 4(f), show that Data preparation and generated periodic patterns from a large mobile object movement’s data. Fig. 3(d) Generated Prefix Tree Fig. 3(e) Generated Frequent Item sets Fig. 4(a) Data Preparation stage 270 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) Fig. 4(b) Data set of objects movements Fig. 4(e) Region Sequences within Periodicity Fig. 4(c) Generated Region Sequences Fig. 4(f) Region-Pattern sequences In 2010, Jinlin Chen, proposed a novel data structure, UpDown Directed Acyclic Graph (UDDAG), is invented for efficient sequential pattern mining. UDDAG allows bidirectional pattern growth along both ends of detected patterns. It often outperforms PrefixSpan by almost one order of magnitude in scalability tests. UDDAG is also considerably faster than Spade and LapinSpam. We performed all the experiments on a Windows XP Operating System with 2.2 GHz Intel Core 2 Duo Processor PC Machine and 3 GB memory. We implemented in Java and compared with different algorithms such as PrefixSpan, Spade, and LapinSpam, which were all implemented in C++ by previous authors. The results represented by figure 5(a) to 5(c), show that generated sequential patterns by using proposed algorithm called UDDAG with few modifications. Fig. 4(d) Periodicity Detection (Customized) 271 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) One major feature of UDDAG is that it supports efficient pruning of invalid candidates. In the future, there is a requirement of storage of UDDAG in an efficient way. The future topics include mining with constraints, closed and maximal pattern mining, approximate pattern mining, and domain-specific pattern mining, etc. whenever large searching spaces are involved and pruning of searching spaces is necessary. In 2011, Faraz Rasheed, Mohammed Alshalalfa, and Reda Alhajj, proposed an algorithm called STNR,which can detect symbol, sequence (partial), and segment (full cycle) periodicity in time series [22]. The algorithm uses suffix tree as the underlying data structure. The algorithm is noise resilient; it has been successfully demonstrated to work with replacement, insertion, deletion, or a mixture of these types of noise. We performed all the experiments on a Windows 7 Operating System with 2.3 GHz Intel Core 2 Duo Processor PC Machine and 3 GB memory. The results represented by figure 6 (a) to 6(d), show that finding the whole time series and in a subsection of it to effectively handle different types of noise. We also implemented pruning strategy to remove redundant patterns for a given time-series Database. The conducted study demonstrates the applicability and effectiveness of the proposed algorithm; it is generally more time-efficient and noiseresilient than existing algorithms. The future work includes online construction of suffix tree. This type of approach is used in sensor networks, ubiquitous computing, and DNA sequence mining. Fig. 5(a) User will supply Min_Sup threshold value Fig. 5(b) Generated Sequential Patterns Fig. 6(a) Full Series sequence generation Fig. 5 (c) Sequential Patterns by using UDDAG 272 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) We implemented sanitization algorithm by using java programming language on a Pentium 2.77GHz processor with 40 GB Hard disk, 3 GB RAM Windows Operating System with the back end, MySQL Database. This algorithm finds spatiotemporal trajectories which require hiding of trajectories that are unexploitable from available data.There are different ways to hide sensitive trajectory data considered as a future work from this approach. Recently in 2012, Mr. Pradeep Mohan et al. proposed QSTP discovery process which finds the partially ordered subsets of the Boolean Spatio-Temporal event-types whose instances are located together & occur serially [24].The discovered patterns are represented using Directed Acyclic Graph (DAG). Originally, this algorithm was implemented in MATLAB 7.9 and we implemented this approach with modifications by using PHP with the back end MySQL and performed experiments on windows Operating System with 3 GB Memory. The results from figure 7(a) to 7(c) show that how to get cascading spatiotemporal patterns called CSTPM. The future work includes how to perform expensive significance tests by randomly permuting ST instances and accounting for multiple testing, which is used in weather forecasting. Fig. 6(b) periodicity within a subsection Fig. 6(c) Pruning strategy on Full Series patterns Fig. 7(a) QSTPs used in Crime Detection events Fig. 6(d) Pruning strategy on subsection series In 2010, Osman Abul, Francesco Bonchi, and Fosca Giannotti, proposed the problem of hiding sequential patterns and show it’s NP-hardness [23]. Fig. 7(b) Generated CSTPs 273 International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) [6] [7] Fig. 7(c) Resulted CSTPs [8] VI. CONCLUSIONS AND FUTURE WORK In this paper, we proposed a new framework useful in Big Data pattern mining including the observations of many pattern mining algorithms with different scenarios. The future work contains how to mine frequent patterns and closed frequent patterns, sequential patterns, Periodic patterns with different parameters in Big Data field. The traditional algorithms and techniques may not be suitable for big data due to large, heterogeneous and stream data needs to be processed with in elapsed time [40]. Finally, there is a scope to develop new algorithms and methods works on MapReduce programming model to perform pattern mining, an active area of research in Big Data domain. [9] [10] [11] [12] Acknowledgements [13] The authors appreciate the efforts of UG Students, Sai Vamsi.V, M.Sandeep Reddy, M.Girish Kumar Reddy, P.Keerthi and Sai Avinash and Research Scholar, K.Suresh including the comments from the Professor Dr.A.Rama Mohan Reddy, Department of CSE, S.V.University College of Engineering; their feedback and suggestions improved the quality of the research presented here. The views expressed here are those of the authors and the authors thank the handling associate editors and the anonymous reviewers for their constructive comments. [14] [15] [16] REFERENCES [1] [2] [3] [4] [5] [17] Rakesh Agrawal Ramakrishnan Srikant. Fast Algorithms for Mining Association Rules. Proceedings of the 20th VLDB Conference Santiago, Chile, 1994. R. Srikant, R. Agrawal: "Mining Sequential Patterns: Generalizations and Performance Improvements", Proc. of the Fifth Int'l Conference on Extending Database Technology (EDBT), Avignon, France, March 1996. Expanded version available as IBM Research Report RJ 9994, December 1995. Roberto J. Bayardo Jr. Efficiently Mining Long Patterns from Databases. Proc. of the 1998 ACM-SIGMOD Int’l Conf. on Management of Data, 85- 93. J. Pei, J. Han, and R. Mao. "CLOSET: An efficient algorithm for mining frequent closed itemsets". Proceedings of the 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, Dallas, TX, May, 2000. Yves Bastide, Rafik Taouil, Nicolas Pasquier, Gerd Stumme. Mining Frequent Patterns with Counting Inference. SIGKDD Explorations. ACM SIGKDD, December 2000. Volume 2, Issue 2 - page 66-75. [18] [19] [20] [21] [22] 274 J. Han, J. Pei, Y. Yin and R. Mao, "Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach", Data Mining and Knowledge Discovery, an International Journal, Volume 8, Issue 1, pages 53-87, January 2004, Kluwer Academic Publishers. Doug Burdick, Manuel Calimlim, Johannes Gehrke. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. ICDE, 2001. Ilias Tsoukatos and Dimitrios Gunopulos. Efficient Mining of Spatiotemporal Patterns. C.S. Jensen et al. (Eds.):SSTD 2001, LNCS 2121, pp. 425−442, 2001. Springer-Verlag Berlin Heidelberg 2001. Mohammed J. Zaki and Ching-Jui Hsiao, CHARM: An Efficient Algorithm for Closed Itemset Mining. 2nd {SIAM} International Conference on Data Mining. April 2002. J. Wang, J. Han, and J. Pei, "CLOSET+: Searching for the Best Strategies for Mining Frequent Closed Itemsets", Proc. 2003 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'03), Washington, D.C., August, 2003. Guimei Liu Hongjun Lu Wenwu Lou and Jeffrey Xu Yu. On Computing, Storing and Querying Frequent Patterns. SIGKDD’03, August24-27, 2003, Washington, DC, USA. 2003 ACM 1-58113-737-0/03/0008 ...$5.00. J. Wang, J. Han, Y. Lu, and P. Tzvetkov, "TFP: An Efficient Algorithm for Mining Top-K Frequent Closed Itemsets", IEEE Transactions on Knowledge and Data Engineering, 17(5):652664, 2005. H. Cao, N. Mamoulis, and D. W. Cheung, "Mining Frequent Spatio-temporal Sequential Patterns," Proceedings of the 5th IEEE International Conference on Data Mining (ICDM), pp. 82-89, Houston, Texas, November 2005. Zhenhui Li Bolin Ding, Jiawei Han, Roland Kays and Peter Nye. Mining Periodic Behaviors for Moving Objects. KDD’10, July 25–28, 2010, Washington, DC, USA. 2010 ACM 978-1-45030055-1/10/07 ...$10.00. Juyoung Kang and Hwan-Seung Yong. Mining Spatio-Temporal Patterns in Trajectory Data. Journal of Information Processing Systems, Vol.6, No.4, December 2010 DOI : 10.3745/JIPS.2010.6.4.521. Mabroukeh, N. R. and Ezeife, C. I. 2010. A Taxonomy of sequential pattern mining algorithms. ACM Comput. Surv. 43, 1, Article 3, November 2010, 41 pages. Rakesh Agrawal Sakti Ghosh Tomasz Imielinski, Bala Iyer, Arun Swami. An Interval Classier for Database Mining Applications. Proceedings of the 18th VLDB Conference Vancouver, British Columbia, Canada 1992. Zhenhui Li, Ming Ji, Jae-Gil Lee, Lu-An Tang, Yintao Yu, Jiawei Han and Roland Kays, MoveMine: Mining Moving Object Databases. SIGMOD’10, June 6–11, 2010, Indianapolis, Indiana, USA. 2010 ACM 978-1-4503-0032-2/10/06 ...$10.00. Qiankun Zhao and Sourav S. Bhowmick. Sequential Pattern Mining: A Survey. Technical Report, CAIS, Nanyang Technological University, Singapore, No. 2003118 , 2003. Ho Jin Woo and Won Suk Lee. estMax: Tracing Maximal Frequent Item Sets Instantly over Online Transactional Data Streams. IEEE Transactions on Knowledge and Data Engineering, VOL. 21, NO. 10, OCTOBER 2009. Jinlin Chen. An UpDown Directed Acyclic Graph Approach for Sequential Pattern Mining. IEEE Transactions on Knowledge and Data Engineering, VOL. 22, NO. 7, JULY 2010. Faraz Rasheed, Mohammed Alshalalfa, and Reda Alhajj. Efficient Periodicity Mining in Time Series Databases Using Suffix Trees. IEEE Transactions on Knowledge and Data Engineering, VOL. 23, NO. 1, JANUARY 2011. International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 4, Issue 1, January 2014) [23] Osman Abul, Francesco Bonchi, and Fosca Giannotti, Hiding Sequential and Spatiotemporal Patterns. IEEE Transactions on Knowledge and Data Engineering, VOL. 22, NO. 12, DECEMBER 2010. [24] Pradeep Mohan, Shashi Shekhar, James A. Shine, and James P. Rogers. Cascading Spatio-Temporal Pattern Discovery. IEEE Transactions on Knowledge and Data Engineering, VOL. 24, NO. 11, NOVEMBER 2012. [25] Xixian Han, Jianzhong Li, Donghua Yang, and Jinbao Wang. Efficient Skyline Computation on Big Data. IEEE Transactions on Knowledge and Data Engineering, VOL. 25, NO. 11, November 2013. [26] Manish Bhide and Krithi Ramamritham. Category-Based Infidelity Bounded Queries over Unstructured Data Streams. IEEE Transactions on Knowledge and Data Engineering, VoL. 25, No. 11, November 2013. [27] J.Han, M.Kamber and J.Pei. Data Mining: Concepts and Techniques. Morgan Kauffman Series. 3rd Edition, 2012. [28] Wei Fan, Albert Bifet. Mining Big Data: Current Status, and Forecast to the Future. SIGKDD Explorations Volume 14, Issue 2, Pno.s 1-5. [29] M.N.Garofalakis, Rajeev Rastogi and Kyuseok Shim. SPIRIT: Sequential Pattern Mining with Regular Expression Constraints. Proceedings of VLDB '99 Proceedings of the 25th International Conference on Very Large Databases, Page Nos. 223-234. [30] Mohammed J. Zaki, SPADE: An Efficient Algorithm for Mining Frequent Sequences. Machine Learning Journal, 42(1/2):31-60, 2001. [31] Ming-Yen Lin and Suh-Yin Lee. Fast Discovery of Sequential Patterns through Memory Indexing and Database Partitioning. Springer Berlin Heidelberg, 2002. [32] Huiping Cao, Nikos Mamoulis, David Wai-lok Cheung. Discovery of Periodic Patterns in Spatiotemporal Sequences, IEEE TKDE, Vol. 19, No. 4, April 2007 pp. 453-467. [33] O.Obulesu, A. Rama Mohan Reddy and K. Suresh. Finding Maximal Periodic Patterns and Pruning Strategy in Spatiotemporal Databases. IJARCSSE, Vol.2, Issue No.4, April 2012. ISSN: 2277 128X Page No.s 423-426. 275