Abstract
Hierarchical Text Classification (HTC) is an essential and challenging subtask of multi-label text classification with a taxonomic hierarchy. Recent advances in deep learning and pre-trained language models have led to significant breakthroughs in the HTC problem. However, despite their effectiveness, these methods are often restricted by a lack of domain knowledge, which leads them to make mistakes in a variety of situations. Generally, when manually classifying a specific document into the taxonomic hierarchy, experts make inferences based on their prior knowledge and experience. For machines to achieve this capability, we propose a novel Knowledge-enabled Hierarchical Text Classification model (K-HTC), which incorporates knowledge graphs into HTC. Specifically, K-HTC integrates knowledge into both the text representation and the hierarchical label learning process, addressing the knowledge limitations of traditional methods. Additionally, a novel knowledge-aware contrastive learning strategy is proposed to further exploit the information inherent in the data. Extensive experiments on two publicly available HTC datasets show the efficacy of our proposed method and indicate the necessity of incorporating knowledge graphs in HTC tasks.
Cite
Liu, Y., Zhang, K., Huang, Z., Wang, K., Zhang, Y., Liu, Q., & Chen, E. (2023). Enhancing Hierarchical Text Classification through Knowledge Graph Integration. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 5797–5810). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.358