Addressing unseen word problem in text classification

Promod Yenigalla; Sibsambhu Kar; Chirag Singh; Ajay Nagar; Gaurav Mathur

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10859 LNCS 339-351

DOI: 10.1007/978-3-319-91947-8_36

8 Citations

5 Readers

Abstract

Word-based deep neural network (DNN) approaches to text classification suffer performance issues because of their limited vocabulary. Character-based convolutional neural network (CNN) models were proposed by researchers to address this issue. However, character-based models do not inherently capture the sequential relationship of words in a text. Hence, there is scope for further improvement by addressing the unseen word problem through a character model while maintaining sequential context through a word-based model. In this work, we propose methods to combine character- and word-based models for efficient text classification. The methods are evaluated on several benchmark datasets and compared with state-of-the-art results.
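The unseen word problem named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the toy training sentences, the `<unk>` token, the fixed alphabet, and all function names are assumptions made only for illustration. It shows a word-level encoder collapsing an out-of-vocabulary word to a single unknown id, while a character-level encoder still produces a usable sequence for the same word.

```python
# Illustrative only: hypothetical toy data and helper names, not the paper's method.
UNK = "<unk>"


def build_word_vocab(texts):
    """Map each training word to an integer id; id 0 is reserved for unseen words."""
    vocab = {UNK: 0}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode_words(text, vocab):
    """Word-level encoding: any word outside the vocabulary collapses to UNK."""
    return [vocab.get(word, vocab[UNK]) for word in text.lower().split()]


def encode_chars(text, alphabet="abcdefghijklmnopqrstuvwxyz "):
    """Character-level encoding over a fixed alphabet (0 = unknown character)."""
    index = {ch: i + 1 for i, ch in enumerate(alphabet)}
    return [index.get(ch, 0) for ch in text.lower()]


train = ["the movie was great", "the plot was boring"]
vocab = build_word_vocab(train)

# "greatest" never appears in training, so the word-level id is 0 (UNK).
word_ids = encode_words("the movie was greatest", vocab)

# The character-level view still shares its first five ids with "great".
char_ids = encode_chars("greatest")
```

Here the word-level sequence loses all information about "greatest", whereas the character sequence keeps its overlap with the known word "great". Combining the two views, as the abstract proposes, aims to retain word order from the first and robustness to unseen words from the second.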

Cite

Yenigalla, P., Kar, S., Singh, C., Nagar, A., & Mathur, G. (2018). Addressing unseen word problem in text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 339–351). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_36