Resource creation for training and testing of normalisation systems for Konkani-English code-mixed social media text

Abstract

Code-mixing is the mixing of two or more languages or language varieties in speech. Apart from its inherent linguistic complexity, the analysis of code-mixed content poses challenges owing to spelling variations and non-adherence to a formal grammar. However, any downstream Natural Language Processing task requires tools that can process and analyse code-mixed social media data. Currently there is a lack of publicly available resources for code-mixed Konkani-English social media data, while the amount of such text is increasing every day. The lack of a standard dataset for evaluating normalisation systems makes it difficult to make meaningful comparisons of their relative accuracies. In this paper, we describe the methodology for the creation of a normalisation dataset for Konkani-English Code-Mixed Social Media Text (CMST). We believe that this dataset will prove useful not only for the training and evaluation of normalisation systems but also for the linguistic analysis of the process of normalising Indian languages from native scripts to Roman. Normalisation refers to the process of writing the text of one language using the script of another language whereby the sound of the text is preserved as far as possible [3].

Cite

Phadte, A. (2018). Resource creation for training and testing of normalisation systems for Konkani-English code-mixed social media text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 264–271). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_26
