Clustering of Short Texts Based on Dynamic Adjustment for Contrastive Learning

Abstract

Large amounts of unlabeled short text appear on the Internet, and clustering is needed to group these texts by semantic similarity. Recently, combining clustering with contrastive learning has become a focus of clustering research. Thanks to the strong representation learning ability of contrastive learning, clustering achieves better results on short texts, which are high-dimensional and sparse. However, contrastive learning emphasizes general feature representation at the instance level and ignores the semantic-level correlation among data belonging to the same cluster. The inconsistent training objectives of contrastive learning and clustering lead to low-confidence clustering results and a sparse cluster space. To address this problem, we propose a clustering method based on Dynamic Adjustment for Contrastive Learning (DACL). The method smoothly shifts the model's loss weight from contrastive learning to clustering during training and filters negative samples in contrastive learning using the pseudo-labels generated by clustering. To demonstrate its effectiveness, DACL is compared with eight short text clustering models on eight datasets. The results show considerable performance improvements on most datasets compared with state-of-the-art short text clustering methods. In addition, ablation experiments confirm the effectiveness of the smooth loss transition and negative filtering.
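The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual schedule or loss formulas, so everything here is an assumption for illustration only: the cosine ramp in `transition_weight`, the InfoNCE-style loss, the temperature value, and all function names are hypothetical and are not taken from the DACL paper.

```python
import math

def transition_weight(step, total_steps):
    """Weight on the clustering loss, rising smoothly from 0 to 1.

    Assumption: a cosine ramp. The paper's actual schedule may differ.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * (1.0 - math.cos(math.pi * progress))

def combined_loss(contrastive_loss, clustering_loss, step, total_steps):
    """Blend the two objectives; early steps favor contrastive learning."""
    w = transition_weight(step, total_steps)
    return (1.0 - w) * contrastive_loss + w * clustering_loss

def negative_mask(pseudo_labels):
    """mask[i][j] is True when sample j may act as a negative for anchor i,
    i.e. j is a different sample with a different pseudo-label."""
    n = len(pseudo_labels)
    return [[i != j and pseudo_labels[i] != pseudo_labels[j] for j in range(n)]
            for i in range(n)]

def filtered_info_nce(sim, pos_index, mask, temperature=0.5):
    """InfoNCE-style loss where each anchor's negatives are limited by mask.

    sim[i][j] is a similarity score; pos_index[i] is the positive for anchor i.
    """
    losses = []
    for i, row in enumerate(sim):
        pos = math.exp(row[pos_index[i]] / temperature)
        neg = sum(math.exp(row[j] / temperature)
                  for j in range(len(row))
                  if mask[i][j] and j != pos_index[i])
        losses.append(-math.log(pos / (pos + neg)))
    return sum(losses) / len(losses)
```

In this sketch, samples that share a pseudo-label with the anchor are dropped from the denominator, so they are no longer pushed apart as negatives, while `combined_loss` gradually hands control from the contrastive objective to the clustering objective as training proceeds.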

Citation (APA)

Li, R., & Wang, H. (2022). Clustering of Short Texts Based on Dynamic Adjustment for Contrastive Learning. IEEE Access, 10, 76069–76078. https://doi.org/10.1109/ACCESS.2022.3192442
