A new methodology for language identification in social media code-mixed text

Abstract

Transliteration is an active research area in Natural Language Processing. Transliteration means transferring a word from one language to another, and it is mostly used on cross-language platforms. People commonly use code-mixed language to share their views on social media such as Twitter and WhatsApp. In code-mixed text, one language is written in the script of another, so the language of each word must be identified before such text can be processed. This paper therefore implements a deep learning model based on Bidirectional Long Short-Term Memory (BLSTM) for Indian social media text. The model identifies the language of origin of each word in a sequence based on the specific words that precede it. The proposed model achieves higher accuracy with word embeddings than with character embeddings.
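To make the contrast between word embeddings and character embeddings concrete, the sketch below shows how one code-mixed sentence might be turned into the integer indices that each kind of embedding layer looks up. It is a minimal illustration only, not the authors' pipeline: the example sentence, the tag set ("hi", "en"), the helper names, and the padding length are all assumptions, and the BLSTM itself is not shown.

```python
# Hypothetical Hindi-English code-mixed sentence in Roman script, with one
# language tag per word ("hi" = Hindi, "en" = English). Illustrative data only.
sentence = ["yaar", "this", "movie", "bahut", "acchi", "thi"]
tags = ["hi", "en", "en", "hi", "hi", "hi"]

PAD, UNK = 0, 1  # reserved indices for padding and unknown symbols


def build_vocab(symbols):
    """Map each distinct symbol to an integer index, in first-seen order."""
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for s in symbols:
        if s not in vocab:
            vocab[s] = len(vocab)
    return vocab


word_vocab = build_vocab(sentence)
char_vocab = build_vocab(ch for w in sentence for ch in w)
tag_vocab = {"hi": 0, "en": 1}


def encode_words(words):
    """Word-level input: one index per word; unseen words map to UNK."""
    return [word_vocab.get(w, UNK) for w in words]


def encode_chars(words, max_len):
    """Character-level input: one fixed-length row of indices per word."""
    rows = []
    for w in words:
        ids = [char_vocab.get(ch, UNK) for ch in w[:max_len]]
        rows.append(ids + [PAD] * (max_len - len(ids)))
    return rows


word_ids = encode_words(sentence)            # shape: (num_words,)
char_ids = encode_chars(sentence, max_len=6)  # shape: (num_words, max_len)
tag_ids = [tag_vocab[t] for t in tags]       # one target label per word
```

In a word-level setup, each entry of `word_ids` would index a word embedding table, and a sequence model would predict one entry of `tag_ids` per position. In a character-level setup, each row of `char_ids` would first be reduced to a single vector per word. A word absent from the vocabulary collapses to `UNK` at the word level but still keeps its spelling at the character level.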

Cite

Gupta, Y., Raghuwanshi, G., & Tripathi, A. (2021). A new methodology for language identification in social media code-mixed text. In Advances in Intelligent Systems and Computing (Vol. 1141, pp. 243–254). Springer. https://doi.org/10.1007/978-981-15-3383-9_22