Introducing semantics in short text classification

Ameni Bouaziz; Célia da Costa Pereira; Christel Dartigues-Pallez; Frédéric Precioso

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 9624 LNCS, pp. 433–445 (2018)

DOI: 10.1007/978-3-319-75487-1_34

Abstract

To overcome the shortness and sparseness that hamper short text classification, an enrichment process is classically proposed: topics (word clusters) are extracted from external knowledge sources using Latent Dirichlet Allocation, and all the words associated with topics that contain words of the short text are added to its initial content. We propose (i) an explicit representation of a two-level enrichment method, in which enrichment is performed either with respect to each word in the text or with respect to the global semantic meaning of the short text, and (ii) a new kind of semantic Random Forest in which semantic relations between features are taken into account at node level, rather than at tree level as recently proposed in the literature, to avoid potential tree correlation. We demonstrate that our enrichment method is valid not only for Random Forest based methods but also for other methods such as MaxEnt, SVM and Naive Bayes.
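The two enrichment levels described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `TOPICS` dictionary is a hypothetical, hand-written stand-in for topics that would normally come from Latent Dirichlet Allocation, the function names `enrich_word_level` and `enrich_text_level` are invented for illustration, and selecting the single topic with the largest word overlap is only an assumed proxy for "the global semantic meaning of the short text".

```python
from typing import Dict, List

# Hypothetical toy topics standing in for LDA output (topic id -> top words).
TOPICS: Dict[int, List[str]] = {
    0: ["match", "team", "goal", "league"],
    1: ["stock", "market", "shares", "trading"],
    2: ["goal", "plan", "strategy", "market"],
}


def enrich_word_level(tokens: List[str], topics: Dict[int, List[str]]) -> List[str]:
    """Word-level enrichment: for each token, append the words of every
    topic that contains that token (each new word is added only once)."""
    enriched = list(tokens)
    seen = set(tokens)
    for token in tokens:
        for words in topics.values():
            if token in words:
                for word in words:
                    if word not in seen:
                        seen.add(word)
                        enriched.append(word)
    return enriched


def enrich_text_level(tokens: List[str], topics: Dict[int, List[str]]) -> List[str]:
    """Text-level enrichment (assumed proxy): append the words of the one
    topic that shares the most tokens with the whole text."""
    token_set = set(tokens)
    best = max(topics, key=lambda t: len(token_set & set(topics[t])))
    if not token_set & set(topics[best]):
        return list(tokens)  # no topic overlaps the text; leave it unchanged
    return list(tokens) + [w for w in topics[best] if w not in token_set]


short_text = ["team", "goal"]
word_level = enrich_word_level(short_text, TOPICS)
text_level = enrich_text_level(short_text, TOPICS)
print(word_level)
print(text_level)
```

In this toy setting the word-level variant pulls in words from every topic touched by any single token, while the text-level variant adds only the words of the one best-matching topic, so it tends to add fewer off-topic words.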

Cite (APA)

Bouaziz, A., da Costa Pereira, C., Dartigues-Pallez, C., & Precioso, F. (2018). Introducing semantics in short text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9624 LNCS, pp. 433–445). Springer Verlag. https://doi.org/10.1007/978-3-319-75487-1_34
