Text classification research based on improved word2vec and CNN

Abstract

Traditional classification algorithms often suffer from high feature dimensionality and data sparseness when applied to short texts. This paper proposes a text feature representation matrix model that combines the neural network language model word2vec with the document topic model Latent Dirichlet Allocation (LDA). The matrix model not only represents the semantic features of words effectively but also conveys context features, which strengthens the model's feature expression ability. The feature matrix is fed into a convolutional neural network (CNN) for convolution and pooling, and text classification experiments are performed. The experimental results show that the proposed matrix model classifies better than traditional text classification methods based on word2vec and CNN: accuracy, recall, and F1 increase by 8.4%, 8.9%, and 8.6%, respectively.
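The abstract does not say how the word2vec and LDA features are merged, so the following is only a minimal sketch under one possible reading: each word's embedding is concatenated with a per-word topic vector to form one row of the feature matrix, and the matrix is then passed through a single convolution and max-over-time pooling step. The function names, the dimensions, and the random toy vectors standing in for trained word2vec and LDA outputs are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def build_feature_matrix(tokens, word_vecs, word_topics, max_len):
    """One row per token: [embedding | topic vector], zero-padded to max_len.

    Assumption: features are combined by concatenation; the paper may differ.
    """
    emb_dim = len(next(iter(word_vecs.values())))
    n_topics = len(next(iter(word_topics.values())))
    matrix = np.zeros((max_len, emb_dim + n_topics))
    for i, tok in enumerate(tokens[:max_len]):
        # Tokens missing from either lookup keep an all-zero row.
        if tok in word_vecs and tok in word_topics:
            matrix[i] = np.concatenate([word_vecs[tok], word_topics[tok]])
    return matrix

def conv_max_pool(matrix, filters):
    """Slide each filter over the rows, apply ReLU, then max-over-time pooling.

    filters has shape (n_filters, window, feature_dim).
    """
    n_filters, window, _ = filters.shape
    steps = matrix.shape[0] - window + 1
    out = np.empty((n_filters, steps))
    for t in range(steps):
        patch = matrix[t:t + window]
        out[:, t] = np.tensordot(filters, patch, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0).max(axis=1)

# Toy stand-ins for trained models (hypothetical values, not real outputs).
rng = np.random.default_rng(0)
vocab = ["cheap", "flight", "deal"]
word_vecs = {w: rng.normal(size=4) for w in vocab}            # word2vec-like
word_topics = {w: rng.dirichlet(np.ones(3)) for w in vocab}   # LDA-like

features = build_feature_matrix(vocab, word_vecs, word_topics, max_len=5)
filters = rng.normal(size=(2, 2, 7))  # 2 filters, window of 2 words
pooled = conv_max_pool(features, filters)
```

In a full classifier, the pooled vector would feed a dense softmax layer, and the filters would be learned rather than sampled at random.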

Cite

Gao, M., Li, T., & Huang, P. (2019). Text classification research based on improved word2vec and CNN. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11434 LNCS, pp. 126–135). Springer Verlag. https://doi.org/10.1007/978-3-030-17642-6_11
