This paper presents a Markov chain-based method for automatic written language identification. Given a training document in a specific language, each word is represented as a Markov chain of letters. The initial and transition probabilities are estimated from the entire training document, regarded as a set of such chains, and together form the Markov model for that language. Given a string in an unknown language, the maximum likelihood decision rule is used to identify its language. Experimental results showed that the proposed method achieved a lower error rate and faster identification than the current n-gram method.
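The idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a first-order chain over the letters of each whitespace-separated word. It also adds additive smoothing so that unseen letters do not give zero probability; the abstract does not mention smoothing. The function names and the `alphabet_size` and `alpha` parameters are hypothetical.

```python
import math
from collections import defaultdict


def train_markov_model(text):
    """Count initial letters and letter-to-letter transitions over all words."""
    initial = defaultdict(int)
    transition = defaultdict(lambda: defaultdict(int))
    for word in text.lower().split():
        initial[word[0]] += 1
        for prev, nxt in zip(word, word[1:]):
            transition[prev][nxt] += 1
    return initial, transition


def log_likelihood(text, model, alphabet_size=256, alpha=1.0):
    """Log-probability of the words in `text` under one language model.

    Assumption: additive smoothing with pseudo-count `alpha` over a nominal
    alphabet of `alphabet_size` symbols (not specified in the abstract).
    """
    initial, transition = model
    initial_total = sum(initial.values())
    score = 0.0
    for word in text.lower().split():
        # Probability of the first letter of the word.
        score += math.log(
            (initial.get(word[0], 0) + alpha)
            / (initial_total + alpha * alphabet_size)
        )
        # Probability of each following letter given the previous one.
        for prev, nxt in zip(word, word[1:]):
            row = transition.get(prev, {})
            row_total = sum(row.values())
            score += math.log(
                (row.get(nxt, 0) + alpha) / (row_total + alpha * alphabet_size)
            )
    return score


def identify_language(text, models):
    """Maximum likelihood decision: pick the language whose model scores highest."""
    return max(models, key=lambda lang: log_likelihood(text, models[lang]))
```

Here `models` is a dictionary mapping a language label to the result of `train_markov_model` on a training document in that language. Working in log space avoids numerical underflow when many small probabilities are multiplied.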
CITATION STYLE
Tran, D., & Sharma, D. (2005). Markov models for written language identification. Proceedings of the 12th International Conference on Neural Information Processing. Retrieved from http://ise.canberra.edu.au/html/DTran/Publications/P51479.pdf