Fast discovery of sequential patterns by memory indexing

39Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Mining sequential patterns is an important issue for the complexity of temporal pattern discovering from sequences. Current mining approaches either require many times of database scanning or generate several intermediate databases. As databases may fit into the ever-increasing main memory, efficient memory-based discovery of sequential patterns will become possible. In this paper, we propose a memory indexing approach for fast sequential pattern mining, named MEMISP. During the whole process, MEMISP scans the sequence database only once for reading data sequences into memory. The find-then-index technique recursively finds the items which constitute a frequent sequence and constructs a compact index set which indicates the set of data sequences for further exploration. Through effective index advancing, fewer and shorter data sequences need to be processed in MEMISP as the discovered patterns getting longer. Moreover, the maximum size of total memory required, which is independent of minimum support threshold in MEMISP, can be estimated. The experiments indicates that MEMISP outperforms both GSP and PrefixSpan algorithms. MEMISP also has good linear scalability even with very low minimum support. When the database is too large to fit in memory in a batch, we partition the database, mine patterns in each partition, and validate the true patterns in the second pass of database scanning. Therefore, MEMISP may efficiently mine databases of any size, for any minimum support values. © 2002 Springer-Verlag Berlin Heidelberg.

Cite

CITATION STYLE

APA

Lin, M. Y., & Lee, S. Y. (2002). Fast discovery of sequential patterns by memory indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2454 LNCS, pp. 150–160). Springer Verlag. https://doi.org/10.1007/3-540-46145-0_15

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free