Clustering is ubiquitous in data analysis, including analysis of time series. It is inherently subjective: different users may prefer different clusterings for a particular dataset. Semi-supervised clustering addresses this by allowing the user to provide examples of instances that should or should not be in the same cluster. This paper studies semi-supervised clustering in the context of time series. We show that COBRAS, a state-of-the-art active semi-supervised clustering method, can be adapted to this setting. We refer to this approach as COBRASTS. An extensive experimental evaluation supports the following claims: (1) COBRASTS far outperforms the current state of the art in semi-supervised clustering for time series, and thus presents a new baseline for the field; (2) COBRASTS can identify clusters with separated components; (3) COBRASTS can identify clusters that are characterized by small local patterns; (4) actively querying a small amount of semi-supervision can greatly improve clustering quality for time series; (5) the choice of the clustering algorithm matters (contrary to earlier claims in the literature).
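To make the notion of user-provided examples concrete, the following minimal Python sketch encodes them as pairs of instance indices, a form commonly called must-link and cannot-link constraints in the semi-supervised clustering literature, and counts how many of them a candidate clustering violates. It is an illustration only and not the COBRAS or COBRASTS algorithm; the function name `count_violations` and the toy labels and pairs are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Pair = Tuple[int, int]


def count_violations(
    labels: Sequence[int],
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
) -> int:
    """Count pairwise constraints that a clustering violates.

    labels[i] is the cluster assigned to instance i.
    A must-link pair is violated when its two instances get different labels;
    a cannot-link pair is violated when they get the same label.
    """
    ml_violations = sum(1 for a, b in must_link if labels[a] != labels[b])
    cl_violations = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return ml_violations + cl_violations


# Hypothetical toy example: five time series, two candidate clusterings.
must_link = [(0, 1), (2, 3)]    # user says: these pairs belong together
cannot_link = [(0, 4), (1, 2)]  # user says: these pairs belong apart

labels_a = [0, 0, 1, 1, 2]  # agrees with every constraint
labels_b = [0, 1, 1, 0, 2]  # disagrees with three of them

violations_a = count_violations(labels_a, must_link, cannot_link)
violations_b = count_violations(labels_b, must_link, cannot_link)
print(violations_a, violations_b)
```

Under this encoding, a user who prefers a different grouping of the same data simply supplies different pairs, which is how the subjectivity described above enters the clustering process.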
CITATION STYLE
Van Craenendonck, T., Meert, W., Dumančić, S., & Blockeel, H. (2018). COBRASTS: A New Approach to Semi-supervised Clustering of Time Series. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11198 LNAI, pp. 179–193). Springer Verlag. https://doi.org/10.1007/978-3-030-01771-2_12