Term based semantic clusters for very short text classification

Jasper Paalman; Shantanu Mullick; Kalliopi Zervanou; Yingqian Zhang

Conference Proceedings (Open Access)

International Conference Recent Advances in Natural Language Processing, RANLP (2019) 2019-September 878-887

DOI: 10.26615/978-954-452-056-4_102

Abstract

Very short texts, such as tweets and invoices, present challenges in classification. Although term occurrences are strong indicators of content, the sparsity of very short texts makes it difficult to capture important semantic relationships. A solution calls for a method that not only considers term occurrence but also handles sparseness well. In this work, we introduce such an approach, Term Based Semantic Clusters (TBSeC), which employs terms to create distinctive semantic concept clusters. These clusters are ranked using a semantic similarity function, which in turn defines a semantic feature space that can be used for text classification. Our method is evaluated on an invoice classification task. Compared to well-known content representation methods, the proposed method performs competitively.
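
The general idea of turning term clusters into a feature space can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the embeddings, the clustering algorithm, or the similarity function, so the toy 2-D vectors (`TOY_VECTORS`), the greedy threshold clustering (`cluster_terms`), and the max-cosine scoring (`semantic_features`) are all hypothetical stand-ins chosen for illustration.

```python
import math

# Hypothetical 2-D "embeddings"; a real system would load pretrained word vectors.
TOY_VECTORS = {
    "invoice": (1.0, 0.1),
    "payment": (0.9, 0.2),
    "bill": (0.95, 0.05),
    "laptop": (0.1, 1.0),
    "printer": (0.2, 0.9),
    "monitor": (0.05, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(component) / n for component in zip(*vectors))


def cluster_terms(terms, threshold=0.8):
    """Greedy clustering (an assumption, not the paper's algorithm):
    a term joins the first cluster whose centroid is similar enough,
    otherwise it starts a new cluster."""
    clusters = []
    for term in terms:
        vec = TOY_VECTORS[term]
        for members in clusters:
            center = centroid([TOY_VECTORS[m] for m in members])
            if cosine(vec, center) >= threshold:
                members.append(term)
                break
        else:
            clusters.append([term])
    return clusters


def semantic_features(text, clusters):
    """One feature per cluster: the best cosine similarity between any
    known token in the text and that cluster's centroid (0.0 if none)."""
    tokens = [t for t in text.lower().split() if t in TOY_VECTORS]
    features = []
    for members in clusters:
        center = centroid([TOY_VECTORS[m] for m in members])
        score = max((cosine(TOY_VECTORS[t], center) for t in tokens), default=0.0)
        features.append(score)
    return features


clusters = cluster_terms(sorted(TOY_VECTORS))
features = semantic_features("Printer payment overdue", clusters)
```

Here `features` holds one similarity score per term cluster, so even a three-word text yields a dense vector that a standard classifier could consume. Tokens without a vector (such as "overdue" in this toy vocabulary) are simply ignored.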

Citation (APA)

Paalman, J., Mullick, S., Zervanou, K., & Zhang, Y. (2019). Term based semantic clusters for very short text classification. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2019-September, pp. 878–887). Incoma Ltd. https://doi.org/10.26615/978-954-452-056-4_102
