In pattern discovery, there is much interest in finding small sets of patterns that characterize the data well. When each dataset is represented by such a small set of characterizing patterns, an interesting problem is to compare datasets by comparing their representative pattern sets. In this paper, we propose a novel kernel function for measuring the similarity between two sets of patterns. The kernel evaluates the structural similarities between the patterns in the two sets and weights them by the patterns' relative frequencies in the data. We define the kernel for injective serial episodes and for itemsets, and we present an efficient algorithm for computing it. We demonstrate the effectiveness of the kernel in classification and change detection tasks on sequential datasets and transaction databases.
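To make the idea concrete, the following is a minimal sketch of a frequency-weighted kernel between two sets of itemsets. It is not the definition from the paper: the structural similarity used here (the number of non-empty sub-itemsets two itemsets share), the normalization, and all function names and example data are assumptions chosen for illustration only.

```python
from math import sqrt


def itemset_similarity(p, q):
    """Assumed structural similarity: the number of non-empty
    sub-itemsets shared by itemsets p and q."""
    return 2 ** len(p & q) - 1


def pattern_set_kernel(a, b):
    """Sum the pairwise structural similarities, each weighted by the
    product of the two patterns' relative frequencies.

    a, b: dict mapping a frozenset (itemset) to its relative frequency.
    """
    return sum(
        fa * fb * itemset_similarity(p, q)
        for p, fa in a.items()
        for q, fb in b.items()
    )


def normalized_kernel(a, b):
    """Cosine-style normalization so a pattern set compared with itself scores 1."""
    denom = sqrt(pattern_set_kernel(a, a) * pattern_set_kernel(b, b))
    return pattern_set_kernel(a, b) / denom if denom else 0.0


# Hypothetical pattern sets mined from two transaction databases.
set_a = {frozenset({"bread", "milk"}): 0.6, frozenset({"beer"}): 0.4}
set_b = {
    frozenset({"bread", "milk", "eggs"}): 0.5,
    frozenset({"beer", "chips"}): 0.5,
}

if __name__ == "__main__":
    print(pattern_set_kernel(set_a, set_b))
    print(normalized_kernel(set_a, set_b))
```

Under these assumptions, the normalized value could be passed to a kernel-based classifier or tracked over successive data windows as a change signal, in the spirit of the applications mentioned above.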
CITATION STYLE
Ibrahim, A., Sastry, P. S., & Sastry, S. (2016). Analyzing similarities of datasets using a pattern set kernel. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9651, pp. 265–276). Springer Verlag. https://doi.org/10.1007/978-3-319-31753-3_22