Word-based deep neural network (DNN) approaches to text classification suffer from performance issues because their vocabulary is limited. Character-based convolutional neural network (CNN) models have been proposed to address this issue. However, character-based models do not inherently capture the sequential relationships between words in a text. Hence, there is scope for further improvement by addressing the unseen word problem through a character-based model while maintaining sequential context through a word-based model. In this work, we propose methods to combine character-based and word-based models for efficient text classification. The methods are evaluated on several benchmark datasets and compared with state-of-the-art results.
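The unseen word problem described above can be illustrated with a minimal sketch. It is not the authors' method: the toy vocabularies, the `<unk>` token, and the helper names `encode_words` and `encode_chars` are all assumptions made for illustration. It only shows why a fixed word vocabulary loses information about out-of-vocabulary words, while a character-level encoding retains it.

```python
# Hypothetical toy vocabularies (assumptions, not taken from the paper).
WORD_VOCAB = {"<unk>": 0, "the": 1, "movie": 2, "was": 3, "great": 4}
# Index 0 is reserved for characters outside the alphabet.
CHAR_VOCAB = {ch: i + 1 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}


def encode_words(text):
    """Map each whitespace token to a word id; unseen words collapse to <unk>."""
    return [WORD_VOCAB.get(tok, WORD_VOCAB["<unk>"]) for tok in text.lower().split()]


def encode_chars(text):
    """Map each token to a list of character ids; unseen words keep their spelling."""
    return [[CHAR_VOCAB.get(ch, 0) for ch in tok] for tok in text.lower().split()]


# Two different unseen words become indistinguishable at the word level...
word_ids_a = encode_words("the movie was greaaat")
word_ids_b = encode_words("the movie was awful")

# ...but remain distinct at the character level.
char_ids_a = encode_chars("the movie was greaaat")
char_ids_b = encode_chars("the movie was awful")
```

In this sketch, the word-level sequences preserve token order but map both unseen words to the same id, whereas the character-level sequences distinguish them. That is the gap a combined character and word model aims to close.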
Yenigalla, P., Kar, S., Singh, C., Nagar, A., & Mathur, G. (2018). Addressing unseen word problem in text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 339–351). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_36