Mining top-k distinguishing sequential patterns with flexible gap constraints


Abstract

Distinguishing sequential pattern (DSP) mining has been widely employed in many applications, such as building classifiers and comparing or analyzing protein families. However, in previous studies on DSP mining, the gap constraints are very rigid: they are predetermined, identical for all discovered patterns, and identical at all positions within each pattern. This paper considers a more flexible way to handle gap constraints, allowing the gap constraints between different pairs of adjacent elements in a pattern to differ and allowing different patterns to use different gap constraints. We call the resulting patterns DSPs with flexible gap constraints. After discussing the importance of specifying and determining gap constraints flexibly in DSP mining, we present GepDSP, a heuristic mining method based on Gene Expression Programming, for mining DSPs with flexible gap constraints. Our empirical study on real-world data sets demonstrates that GepDSP is effective and efficient, and that DSPs with flexible gap constraints are more effective in capturing discriminating sequential patterns.
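To make the idea of per-position gap constraints concrete, the following is a minimal Python sketch. It is not GepDSP and is not taken from the paper. The abstract does not define matching or scoring, so the sketch assumes a common convention: each gap is a (min, max) bound on the number of elements skipped between two adjacent pattern elements, support is the fraction of sequences a pattern matches, and a pattern's contrast is its support in the positive set minus its support in the negative set. The names `matches`, `support`, and `contrast` are hypothetical.

```python
from typing import Sequence, Tuple

# (min_gap, max_gap): assumed to bound the number of skipped elements
# between two adjacent pattern elements.
Gap = Tuple[int, int]


def matches(seq: str, pattern: str, gaps: Sequence[Gap]) -> bool:
    """Return True if `pattern` occurs in `seq` with gaps[i] bounding the
    number of elements skipped between pattern[i] and pattern[i + 1]."""
    if len(gaps) != len(pattern) - 1:
        raise ValueError("need exactly one gap constraint per adjacent pair")

    def extend(pos: int, idx: int) -> bool:
        # pattern[idx] is matched at seq[pos]; try to place pattern[idx + 1].
        if idx == len(pattern) - 1:
            return True
        lo, hi = gaps[idx]
        last = min(pos + 1 + hi, len(seq) - 1)
        for nxt in range(pos + 1 + lo, last + 1):
            if seq[nxt] == pattern[idx + 1] and extend(nxt, idx + 1):
                return True
        return False

    return any(seq[i] == pattern[0] and extend(i, 0) for i in range(len(seq)))


def support(dataset: Sequence[str], pattern: str, gaps: Sequence[Gap]) -> float:
    """Fraction of sequences in `dataset` that contain the pattern."""
    if not dataset:
        return 0.0
    return sum(matches(s, pattern, gaps) for s in dataset) / len(dataset)


def contrast(pos: Sequence[str], neg: Sequence[str],
             pattern: str, gaps: Sequence[Gap]) -> float:
    """Assumed contrast measure: support in `pos` minus support in `neg`."""
    return support(pos, pattern, gaps) - support(neg, pattern, gaps)


if __name__ == "__main__":
    pos = ["ACCCGT", "AGT"]   # toy positive class
    neg = ["AGCCT", "TTTT"]   # toy negative class
    # One shared gap constraint at every position (the rigid setting).
    print(contrast(pos, neg, "AGT", [(0, 3), (0, 3)]))
    # A different constraint at each position (the flexible setting).
    print(contrast(pos, neg, "AGT", [(0, 3), (0, 0)]))
```

On this toy data, tightening only the second gap excludes the negative sequence "AGCCT" while still matching both positive sequences, which is the kind of discrimination a single shared gap constraint cannot express.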

Cite

Gao, C., Duan, L., Dong, G., Zhang, H., Yang, H., & Tang, C. (2016). Mining top-k distinguishing sequential patterns with flexible gap constraints. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9658, pp. 82–94). Springer Verlag. https://doi.org/10.1007/978-3-319-39937-9_7
