Semantic word representation is a core building block in many deep learning systems. Most word representation techniques are based on word angles/distances, word analogies, and statistical information. However, popular models ignore word morphology by representing each word with a distinct vector, which limits their ability to represent rare words in languages with large vocabularies. This paper proposes a dynamic model, named SemVec, for representing words as a vector of both domain and semantic features. Depending on the problem domain, semantic features can be added or removed to generate a word representation enriched with domain knowledge. The proposed method is evaluated on adverse drug events (ADR) tweet/text classification. Results show that SemVec improves the precision of ADR detection by 15.28% over other state-of-the-art deep learning methods, with a comparable recall score.
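The abstract does not detail SemVec's construction, but the idea of extending a word vector with domain semantic features can be sketched as a simple concatenation. Everything below is illustrative: the feature names (`is_drug`, `is_symptom`, `is_negation`), the lexicons, and the `enrich` helper are assumptions, not the paper's actual method.

```python
# Hypothetical sketch: append one binary semantic-feature flag per
# domain feature to a base word embedding. Feature names and lexicons
# are invented for illustration; the paper's actual features may differ.

SEMANTIC_FEATURES = ["is_drug", "is_symptom", "is_negation"]

def enrich(word, base_embedding, feature_lexicons):
    """Concatenate the base vector with one 0/1 flag per semantic feature."""
    flags = [1.0 if word in feature_lexicons[feat] else 0.0
             for feat in SEMANTIC_FEATURES]
    return list(base_embedding) + flags  # enriched, domain-aware vector

# Toy lexicons standing in for real domain knowledge sources.
lexicons = {
    "is_drug": {"ibuprofen", "aspirin"},
    "is_symptom": {"nausea", "headache"},
    "is_negation": {"no", "not"},
}

vec = enrich("nausea", [0.2, -0.1, 0.7], lexicons)
# vec == [0.2, -0.1, 0.7, 0.0, 1.0, 0.0]
```

Because features are plain appended dimensions, adding or removing a feature for a new problem domain only changes the vector length, which matches the abstract's claim that the representation can be adapted per domain.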
Citation
Odeh, F., & Taweel, A. (2018). SemVec: Semantic features word vectors based deep learning for improved text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11324 LNCS, pp. 449–459). Springer Verlag. https://doi.org/10.1007/978-3-030-04070-3_35