Clustering results are more comprehensible and usable when each group is associated with a characteristic description. However, clustering followed by characterization of the resulting clusters does not always produce clusters associated with distinctive features, because the clustering step and the subsequent classification step are performed independently. This calls for a method that combines clustering and classification and executes both simultaneously. In this paper, we focus on itemsets as the feature for characterizing groups and present a technique called "itemset classified clustering," which divides data into groups under the restriction that only divisions expressed using a common itemset are allowed, and computes the optimal itemset maximizing the interclass variance between the groups. Although this optimization problem is generally intractable, we develop techniques that effectively prune the search space and efficiently compute optimal solutions in practice. We remark that itemset classified clusters are likely to be overlooked by traditional clustering algorithms such as two-clustering or k-means, and we demonstrate the scalability of our algorithm with respect to the amount of data by applying our method to real biological datasets. © Springer-Verlag Berlin Heidelberg 2004.
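To make the objective concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes each record pairs a set of items with a numeric vector, that a candidate itemset splits the records into those containing it and those that do not, and that interclass variance is the size-weighted squared distance of each group's centroid from the overall centroid. The search is a plain brute-force enumeration; the pruning techniques mentioned above are not reproduced. The names `best_itemset`, `interclass_variance`, `max_size`, and the toy `records` are all hypothetical.

```python
from itertools import combinations


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def interclass_variance(inside, outside):
    """Assumed definition: sum over both groups of
    |group| * ||centroid(group) - overall centroid||^2.
    A split with an empty side scores 0."""
    if not inside or not outside:
        return 0.0
    overall = centroid(inside + outside)
    return (len(inside) * sq_dist(centroid(inside), overall)
            + len(outside) * sq_dist(centroid(outside), overall))


def best_itemset(records, max_size=2):
    """Brute-force search over itemsets of up to max_size items.

    records: list of (set_of_items, numeric_vector) pairs.
    Returns (score, itemset) for the highest-scoring split.
    """
    items = sorted(set().union(*(s for s, _ in records)))
    best = (0.0, frozenset())
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            # Records containing every item of the itemset form one group.
            inside = [v for s, v in records if itemset <= s]
            outside = [v for s, v in records if not itemset <= s]
            score = interclass_variance(inside, outside)
            if score > best[0]:
                best = (score, itemset)
    return best


# Toy data (made up for illustration): two records share both "a" and "b"
# and lie far from the rest.
records = [
    ({"a", "b"}, [1.0, 1.0]),
    ({"a", "b"}, [1.2, 0.8]),
    ({"a"}, [5.0, 5.0]),
    ({"b"}, [5.0, 5.2]),
    ({"c"}, [4.8, 5.0]),
]
score, itemset = best_itemset(records)
```

On this toy input, neither "a" nor "b" alone separates the two low-valued records from the others as sharply as the pair does, which illustrates why the search is over itemsets rather than single items. The enumeration grows exponentially with `max_size`, which is the intractability the abstract refers to.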
CITATION
Sese, J., & Morishita, S. (2004). Itemset classified clustering. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3202, 398–409. https://doi.org/10.1007/978-3-540-30116-5_37