Many digitized Hindi text documents are generated daily by government sites, news portals, and public and private sector organizations, and these documents need to be classified effectively into mutually exclusive, pre-defined categories. Many Hindi text processing systems already exist for information retrieval, machine translation, text summarization, simplification, keyword extraction, and related parsing and linguistic tasks, but there is still wide scope for classifying the extracted text of Hindi documents into pre-defined categories using a classifier. This paper proposes a Hindi text classification model that accepts a set of known Hindi documents, preprocesses them at the document, sentence, and word levels, extracts features, and trains an SVM classifier, which then classifies a set of unknown Hindi documents. Text classification is challenging in Hindi because of its large set of conjuncts and letter combinations, its sentence structure, and its multisense words. The experiments were performed on a set of four Hindi documents from two categories, which the SVM classified with 100% accuracy.
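The pipeline described above (preprocess, extract features, train an SVM, classify unknown documents) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the stop-word list, the four toy training documents, the term-frequency features, and the small hand-written linear SVM trained by subgradient descent on the hinge loss are all assumptions made for this example. In practice a library SVM would normally be used instead.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"है", "हैं", "की", "के", "का", "में", "ने", "को", "से", "और", "पर"}


def tokenize(document):
    """Split a document into sentences (on the danda etc.), then into words."""
    words = []
    for sentence in re.split(r"[।?!\n]+", document):
        for word in sentence.split():
            word = word.strip(".,;:'\"()-")
            if word and word not in STOP_WORDS:
                words.append(word)
    return words


def build_vocabulary(token_lists):
    """Map every distinct training word to a feature index."""
    vocab = sorted({w for tokens in token_lists for w in tokens})
    return {w: i for i, w in enumerate(vocab)}


def vectorize(tokens, vocab):
    """Term-frequency vector; words outside the vocabulary are ignored."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) or 1
    vec = [0.0] * len(vocab)
    for w, c in counts.items():
        vec[vocab[w]] = c / total
    return vec


def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Linear SVM: subgradient descent on L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Margin violated: step along the hinge-loss subgradient.
                w = [wj - lr * (reg * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regularizer shrinks the weights.
                w = [wj - lr * reg * wj for wj in w]
    return w, b


def predict(x, w, b):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical toy corpus: +1 = sports, -1 = health (not the paper's data).
train_docs = [
    ("भारत ने क्रिकेट मैच जीता। टीम ने अच्छा खेल दिखाया।", 1),
    ("खिलाड़ी ने मैच में शतक बनाया। टीम खुश है।", 1),
    ("डॉक्टर ने मरीज को दवा दी। अस्पताल में इलाज हुआ।", -1),
    ("मरीज को बुखार है। डॉक्टर ने दवा लिखी।", -1),
]
unknown_docs = ["टीम ने मैच जीता।", "डॉक्टर ने दवा दी।"]

token_lists = [tokenize(text) for text, _ in train_docs]
vocab = build_vocabulary(token_lists)
X = [vectorize(tokens, vocab) for tokens in token_lists]
y = [label for _, label in train_docs]
w, b = train_linear_svm(X, y)

predictions = [predict(vectorize(tokenize(d), vocab), w, b) for d in unknown_docs]
for doc, label in zip(unknown_docs, predictions):
    print(doc, "->", "sports" if label == 1 else "health")
```

With only a handful of documents and disjoint vocabularies per category, a sketch like this separates the classes easily, so perfect accuracy on so small a set says little about how the approach behaves on a larger corpus.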
Puri, S., & Singh, S. P. (2019). An efficient Hindi text classification model using SVM. In Lecture Notes in Networks and Systems (Vol. 75, pp. 227–237). Springer. https://doi.org/10.1007/978-981-13-7150-9_24