Application of Linguistic Knowledge in Factored Language Modeling for Hindi Language


Abstract

A language model estimates which words are more or less likely to be produced in a natural-language utterance. N-gram language modeling is the pioneering technique for constructing language models; it considers only the preceding words when predicting the next word. Factored language modeling is a formalism that brings in additional linguistic knowledge about each word, such as gender, number, part of speech, and stem, alongside the word itself to predict the next word in a sentence. This paper discusses how various combinations of linguistic features affect the predictability of the next word in a Hindi sentence. It also discusses how using linguistic features reduces perplexity by 31.71% compared with the baseline N-gram language model.
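The contrast between a word-only N-gram model and a model that also uses a linguistic factor can be illustrated with a minimal sketch. It is not the authors' implementation. The romanized Hindi toy corpus, the part-of-speech tags, the add-one smoothing, and the weight `lam` are all assumptions made for illustration. Factored language models are usually described with backoff over factors; this sketch swaps in plain linear interpolation between a word-conditioned and a POS-conditioned estimate to keep the code short.

```python
import math
from collections import Counter

BOS = ("<s>", "BOS")  # sentence-start token with a placeholder tag

# Hypothetical toy data: each token is a (word, part-of-speech) pair.
train = [
    [("ladka", "NOUN"), ("aam", "NOUN"), ("khata", "VERB"), ("hai", "AUX")],
    [("ladki", "NOUN"), ("seb", "NOUN"), ("khati", "VERB"), ("hai", "AUX")],
    [("ladka", "NOUN"), ("kitab", "NOUN"), ("padhta", "VERB"), ("hai", "AUX")],
]
test = [[("ladki", "NOUN"), ("aam", "NOUN"), ("khati", "VERB"), ("hai", "AUX")]]


def train_counts(sentences):
    """Collect counts of each word given the previous word and the previous tag."""
    word_ctx, pos_ctx = Counter(), Counter()
    word_tot, pos_tot = Counter(), Counter()
    vocab = set()
    for sent in sentences:
        prev = BOS
        for word, pos in sent:
            word_ctx[(prev[0], word)] += 1
            pos_ctx[(prev[1], word)] += 1
            word_tot[prev[0]] += 1
            pos_tot[prev[1]] += 1
            vocab.add(word)
            prev = (word, pos)
    return word_ctx, pos_ctx, word_tot, pos_tot, vocab


def prob(word, prev, model, lam):
    """Interpolate add-one-smoothed P(word | prev word) and P(word | prev tag)."""
    word_ctx, pos_ctx, word_tot, pos_tot, vocab = model
    v = len(vocab) + 1  # one extra slot for unseen words
    p_word = (word_ctx[(prev[0], word)] + 1) / (word_tot[prev[0]] + v)
    p_pos = (pos_ctx[(prev[1], word)] + 1) / (pos_tot[prev[1]] + v)
    return lam * p_word + (1 - lam) * p_pos


def perplexity(sentences, model, lam):
    """Perplexity = exp of the average negative log-probability per token."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        prev = BOS
        for word, pos in sent:
            log_sum += math.log(prob(word, prev, model, lam))
            n += 1
            prev = (word, pos)
    return math.exp(-log_sum / n)


model = train_counts(train)
baseline = perplexity(test, model, lam=1.0)  # word-only bigram model
factored = perplexity(test, model, lam=0.5)  # word + POS of the previous token
reduction = 100 * (baseline - factored) / baseline
print(f"baseline={baseline:.2f} factored={factored:.2f} reduction={reduction:.1f}%")
```

On this toy data the interpolated model scores a lower perplexity because the previous tag still carries evidence when a word pair was never seen in training. The numbers it prints are not comparable to the 31.71% reduction reported in the paper.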

Cite

Babhulgaonkar, A. R., & Sonavane, S. P. (2020). Application of Linguistic Knowledge in Factored Language Modeling for Hindi Language. In Advances in Intelligent Systems and Computing (Vol. 1025, pp. 521–531). Springer. https://doi.org/10.1007/978-981-32-9515-5_50
