Abstract
We present a novel lexicon-based classification approach for code-switching detection on Twitter. The main aim is to develop a simple lexical look-up classifier based on frequency information retrieved from Wikipedia. We evaluate the classifier using three different language pairs: Spanish-English, Dutch-English, and German-Turkish. The results indicate that our figures for Spanish-English are competitive with current state of the art classifiers, even though the approach is simplistic and based solely on word frequency information.
Cite
CITATION STYLE
Claeser, D., Felske, D., & Kent, S. (2018). Token level code-switching detection using wikipedia as a lexical resource. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10713 LNAI, pp. 192–198). Springer Verlag. https://doi.org/10.1007/978-3-319-73706-5_16
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.