Semantic word representation is a core building block in many deep learning systems. Most word representation techniques are based on word angles/distances, word analogies, and statistical information. However, popular models ignore word morphology by representing each word with a distinct vector, which limits their ability to represent rare words in languages with large vocabularies. This paper proposes a dynamic model, named SemVec, for representing words as a vector of both domain and semantic features. Depending on the problem domain, semantic features can be added or removed to generate a word representation enriched with domain knowledge. The proposed method is evaluated on adverse drug events (ADR) tweet/text classification. Results show that SemVec improves the precision of ADR detection by 15.28% over other state-of-the-art deep learning methods, with a comparable recall score.
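The abstract does not detail SemVec's construction, but the idea of extending a word vector with domain semantic features can be sketched as a simple concatenation. Everything below is illustrative: the feature names (`is_drug`, `is_symptom`, `is_negation`), the lexicons, and the `enrich` helper are assumptions, not the paper's actual method.

```python
# Hypothetical sketch: append one binary semantic-feature flag per
# domain feature to a base word embedding. Feature names and lexicons
# are invented for illustration; the paper's actual features may differ.

SEMANTIC_FEATURES = ["is_drug", "is_symptom", "is_negation"]

def enrich(word, base_embedding, feature_lexicons):
    """Concatenate the base vector with one 0/1 flag per semantic feature."""
    flags = [1.0 if word in feature_lexicons[feat] else 0.0
             for feat in SEMANTIC_FEATURES]
    return list(base_embedding) + flags  # enriched, domain-aware vector

# Toy lexicons standing in for real domain knowledge sources.
lexicons = {
    "is_drug": {"ibuprofen", "aspirin"},
    "is_symptom": {"nausea", "headache"},
    "is_negation": {"no", "not"},
}

vec = enrich("nausea", [0.2, -0.1, 0.7], lexicons)
# vec == [0.2, -0.1, 0.7, 0.0, 1.0, 0.0]
```

Because features are plain appended dimensions, adding or removing a feature for a new problem domain only changes the vector length, which matches the abstract's claim that the representation can be adapted per domain.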
Citation
Odeh, F., & Taweel, A. (2018). SemVec: Semantic features word vectors based deep learning for improved text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11324 LNCS, pp. 449–459). Springer Verlag. https://doi.org/10.1007/978-3-030-04070-3_35