Many languages contain ambiguous words whose meaning depends on the context in which they are used. Such words are known as polysemous words. Word sense disambiguation is the task of identifying the correct meaning of a polysemous word from its context. Many software systems perform this task for English-language documents, but there has been little development for regional languages. Hence, we propose a system that carries out word sense disambiguation for regional languages, taking Kannada as an example. The application is implemented using the BERT model. BERT is a transformer-based language model that can be used to build state-of-the-art models for a wide range of tasks, including word sense disambiguation. It is well suited to NLP tasks because it achieves strong performance on both sentence-level and token-level tasks. The proposed system achieves an accuracy of 75%. The model could be strengthened by building and training on a larger dataset, and it could be extended to other regional languages.
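The abstract does not say how the BERT outputs are turned into a sense label, so the following is only a minimal sketch of one common approach, not the authors' method: represent the ambiguous word by a context vector and choose the sense whose prototype vector is most similar. The three-dimensional vectors, the sense names, and the helper `disambiguate` are all hypothetical stand-ins; in a real system the vectors would come from a BERT-style encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(context_vec, sense_prototypes):
    """Return the sense label whose prototype is closest to the context vector."""
    return max(
        sense_prototypes,
        key=lambda sense: cosine(context_vec, sense_prototypes[sense]),
    )


# Hypothetical toy data: one prototype vector per sense of a polysemous word.
# Assumption: a real system would take these from a contextual encoder.
SENSES = {
    "sense_a": [0.9, 0.1, 0.0],
    "sense_b": [0.1, 0.8, 0.3],
}

# Hypothetical context vector for one occurrence of the word.
context = [0.2, 0.7, 0.4]
predicted = disambiguate(context, SENSES)
print(predicted)
```

Nearest-prototype matching is used here only because it is easy to inspect; a classifier trained on top of the encoder is an equally plausible design.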
CITATION
Chandrika, C. P., & Kallimani, J. S. (2022). Word Sense Disambiguation for Indian Regional Language Using BERT Model. In Smart Innovation, Systems and Technologies (Vol. 283, pp. 127–137). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-16-9705-0_13