Distributed document representation for document classification

Abstract

Distributed vector representations learned with deep learning have shown great power in capturing the semantic meaning of words, phrases, and sentences, and multiple NLP applications have benefited from them. Just as words combine to form the meaning of sentences, sentences combine to form the meaning of documents, so representing each document with a dense distributed representation holds promise. In this paper, we propose a supervised framework (Compound RNN) for document classification based on document-level distributed representations learned with a deep learning architecture. Our framework first obtains sentence-level distributed representations by operating on the parse tree structure with a recursive neural network, and then obtains the document-level representation by convolving the sentence vectors with a recurrent neural network. Compound RNN outperforms existing document representations such as bag-of-words and LDA in multiple text classification and regression tasks.
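
The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the toy vocabulary, the hand-written binary parse trees, the random untrained weights, the plain Elman-style recurrence, and every function name below are assumptions made for illustration only.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def tanh_vec(x):
    return [math.tanh(v) for v in x]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def compose_tree(node, embed, W_comp):
    """Recursive NN: a leaf is a word string, an internal node is a (left, right) pair."""
    if isinstance(node, str):
        return embed[node]
    left = compose_tree(node[0], embed, W_comp)
    right = compose_tree(node[1], embed, W_comp)
    # Concatenate the two child vectors and map them back to one parent vector.
    return tanh_vec(matvec(W_comp, left + right))

def encode_document(sentence_vecs, W_in, W_rec, dim):
    """Recurrent NN: fold the sentence vectors, in order, into one hidden state."""
    h = [0.0] * dim
    for s in sentence_vecs:
        a = matvec(W_in, s)
        b = matvec(W_rec, h)
        h = tanh_vec([x + y for x, y in zip(a, b)])
    return h

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

rng = random.Random(0)
DIM, N_CLASSES = 4, 2  # assumed sizes, chosen only to keep the example small

vocab = ["the", "movie", "was", "great", "i", "loved", "it"]
embed = {w: [rng.uniform(-0.1, 0.1) for _ in range(DIM)] for w in vocab}
W_comp = rand_matrix(DIM, 2 * DIM, rng)  # composition weights (recursive stage)
W_in = rand_matrix(DIM, DIM, rng)        # input weights (recurrent stage)
W_rec = rand_matrix(DIM, DIM, rng)       # hidden-to-hidden weights
W_out = rand_matrix(N_CLASSES, DIM, rng) # classifier weights

# A two-sentence document, each sentence given as a hand-written binary parse tree.
doc_trees = [
    (("the", "movie"), ("was", "great")),
    ("i", ("loved", "it")),
]

sentence_vecs = [compose_tree(t, embed, W_comp) for t in doc_trees]
doc_vec = encode_document(sentence_vecs, W_in, W_rec, DIM)
probs = softmax(matvec(W_out, doc_vec))
```

Because the weights are random and untrained, `probs` carries no meaning here. In a supervised setting such as the one the abstract describes, the classification loss would be backpropagated through both stages to learn the parameters.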

Cite

Li, R., & Shindo, H. (2015). Distributed document representation for document classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9077, pp. 212–225). Springer Verlag. https://doi.org/10.1007/978-3-319-18038-0_17
