Weakly-Supervised Neural Categorization of Wikipedia Articles

Xingyu Chen; Mizuho Iwaihara

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11853 LNCS 16-22

DOI: 10.1007/978-3-030-34058-2_2

Abstract

Deep neural models are gaining popularity for many NLP tasks because of their strong expressive power and reduced need for feature engineering. Neural models often need a large number of labeled training documents. However, a single Wikipedia category does not contain enough articles for training. Weakly-supervised neural document classification can handle situations in which only a small set of labeled documents is given. However, the RNN-based approaches used for this often fail on long documents such as Wikipedia articles, because it is hard to retain memory of the important parts of a long document. To overcome these challenges, we propose a text summarization method called WS-Rank, which extracts the key sentences of a document, weighting them by class-related keywords and by sentence position in the document. After applying WS-Rank to the training and test documents to summarize them into key sentences, weakly-supervised neural classification shows remarkable improvement in classification results.
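
The abstract gives only the ingredients of WS-Rank (class-related keywords and sentence positions), not its formula. The following is a minimal sketch of keyword- and position-weighted sentence extraction under stated assumptions, not the authors' WS-Rank: the function names, the multiplicative score, the `position_decay` parameter, and the naive sentence splitter are all hypothetical choices made for illustration, and no graph-based ranking is performed.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, index, keywords, position_decay=0.1):
    """Hypothetical score: (1 + keyword hits) scaled by a position weight.

    Earlier sentences get a larger weight; position_decay is an assumed
    tuning parameter, not a value taken from the paper.
    """
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    keyword_hits = sum(1 for token in tokens if token in keywords)
    position_weight = 1.0 / (1.0 + position_decay * index)
    return (1 + keyword_hits) * position_weight


def summarize(text, keywords, top_k=2):
    """Return the top_k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    keyword_set = {k.lower() for k in keywords}
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], i, keyword_set),
        reverse=True,
    )
    kept = sorted(ranked[:top_k])  # restore document order
    return [sentences[i] for i in kept]
```

In a pipeline like the one the abstract describes, a function such as `summarize` would be applied to both training and test documents before they are passed to the classifier, so that the model sees a short list of key sentences instead of the full article.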

Cite

Chen, X., & Iwaihara, M. (2019). Weakly-Supervised Neural Categorization of Wikipedia Articles. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11853 LNCS, pp. 16–22). Springer. https://doi.org/10.1007/978-3-030-34058-2_2
