HSCNN: A hybrid-siamese convolutional neural network for extremely imbalanced multi-label text classification


Abstract

Data imbalance is a crucial issue in multi-label text classification. Some existing works tackle it by replacing the vanilla cross-entropy loss with imbalance-aware loss objectives, but their performance remains limited when the data are extremely imbalanced. We propose a hybrid solution that adapts general networks for the head categories and few-shot techniques for the tail categories. Specifically, we propose a Hybrid-Siamese Convolutional Neural Network (HSCNN) with the following additional technical attributes: a multi-task architecture based on Single and Siamese networks; a category-specific similarity in the Siamese structure; and a specific sampling method for training HSCNN. Results on two benchmark datasets with three loss objectives show that our method can improve the performance of Single networks with diverse loss objectives on the tail categories or on all categories.
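To make the head/tail distinction and the idea of a category-specific similarity more concrete, here is a minimal Python sketch. It is not the authors' implementation: the frequency threshold `min_count`, the function names, and the weighted-cosine form of the similarity are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
import math


def split_head_tail(label_sets, min_count):
    """Partition categories into head and tail by training frequency.

    `min_count` is a hypothetical threshold; the abstract does not say
    how the head/tail boundary is chosen.
    """
    counts = Counter(label for labels in label_sets for label in labels)
    head = sorted(c for c, n in counts.items() if n >= min_count)
    tail = sorted(c for c, n in counts.items() if n < min_count)
    return head, tail


def category_similarity(u, v, weights):
    """Weighted cosine similarity between two feature vectors.

    `weights` stands in for a per-category, non-negative weight vector.
    This functional form is an assumption, not the paper's definition.
    """
    dot = sum(w * a * b for w, a, b in zip(weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(weights, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # no overlap in the dimensions this category attends to
    return dot / (norm_u * norm_v)


# Toy multi-label data: each document carries a set of category labels.
label_sets = [{"sports"}, {"sports", "politics"}, {"sports"}, {"politics", "chess"}]
head, tail = split_head_tail(label_sets, min_count=2)
```

In a hybrid setup of the kind the abstract describes, a standard classifier would handle the categories in `head`, while pairs of documents scored with something like `category_similarity` would serve the few-shot side for the categories in `tail`.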


Yang, W., Li, J., Fukumoto, F., & Ye, Y. (2020). HSCNN: A hybrid-siamese convolutional neural network for extremely imbalanced multi-label text classification. In EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 6716–6722). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.emnlp-main.545
