Mining closed sequential patterns (CSPs) is an essential data mining task with many applications. CSPs are used to cope with very large databases or low minimum support (minsup) thresholds when mining sequential patterns. However, tuning the minsup value so that the number of CSPs generated matches what users want is difficult and time-consuming. To address this issue, the TSP algorithm for mining top-k CSPs was previously proposed, where k is a user-given parameter. It returns the k CSPs with the highest support values in a database. However, its execution time and memory usage are high. In this paper, an algorithm named TKCS (Top-K Closed Sequences) is proposed to mine the top-k CSPs efficiently. To reduce execution time and memory usage, it represents the data as a vertical bitmap database. It also adopts several strategies during mining: it always chooses the sequential patterns with the greatest support values for generating candidate patterns, and it stores the top-k CSPs in ascending order of support so that the minsup value can be raised more quickly. The empirical results show that TKCS outperforms TSP in discovering the top-k CSPs in terms of both runtime and memory usage.
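The two ideas named in the abstract, a vertical bitmap representation and a minsup threshold that rises as the top-k buffer fills, can be illustrated with a minimal Python sketch. This is not the TKCS algorithm: it ranks single items only, performs no sequence extension or closure checking, and every name in it (`build_vertical_bitmaps`, `top_k_items`, the toy database) is an assumption made for illustration.

```python
import heapq


def build_vertical_bitmaps(database):
    """Map each item to an int whose bit i is set when sequence i contains it."""
    bitmaps = {}
    for sid, sequence in enumerate(database):
        for itemset in sequence:
            for item in itemset:
                bitmaps[item] = bitmaps.get(item, 0) | (1 << sid)
    return bitmaps


def support(bitmap):
    """Support = number of sequences containing the item (count of set bits)."""
    return bin(bitmap).count("1")


def top_k_items(database, k):
    """Keep the k best-supported items, raising minsup as the buffer fills.

    Simplification (assumption): candidates are single items, not closed
    sequential patterns, so no candidate generation or closure check occurs.
    """
    minsup = 1
    heap = []  # min-heap of (support, item); heap[0] is the weakest kept entry
    for item, bitmap in build_vertical_bitmaps(database).items():
        sup = support(bitmap)
        if len(heap) == k and sup <= minsup:
            continue  # pruned: cannot displace anything in the top-k buffer
        heapq.heappush(heap, (sup, item))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the weakest entry
        if len(heap) == k:
            minsup = heap[0][0]  # threshold rises to the weakest kept support
    ranked = sorted(heap, key=lambda entry: (-entry[0], entry[1]))
    return ranked, minsup


# Toy sequence database: each sequence is a list of itemsets.
toy_db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}],
    [{"a", "b"}],
    [{"c"}],
]
result = top_k_items(toy_db, 2)
```

Because the weakest retained support sits at the root of the min-heap, the threshold can be read and raised cheaply. Visiting high-support candidates first would raise it sooner and prune more, which is the intuition behind the strategies the abstract describes.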
CITATION STYLE
Pham, T. T., Do, T., Nguyen, A., Vo, B., & Hong, T. P. (2020). An Efficient Method for Mining Top-K Closed Sequential Patterns. IEEE Access, 8, 118156–118163. https://doi.org/10.1109/ACCESS.2020.3004528