Abstract
The Discriminating between Similar Languages (DSL) shared task at VarDial challenged participants to build an automatic language identification system to discriminate between 13 languages in 6 groups of highly similar languages (or national varieties of the same language). In this paper, we describe the submissions made by team UniMelb-NLP, which took part in both the closed and open categories. We present the text representations and modeling techniques used, including cross-lingual POS tagging as well as fine-grained tags extracted from a deep grammar of English, and discuss additional data we collected for the open submissions, utilizing custom-built web corpora based on top-level domains as well as existing corpora.
Cite
Lui, M., Letcher, N., Adams, O., Duong, L., Cook, P., & Baldwin, T. (2014). Exploring Methods and Resources for Discriminating Similar Languages. In 1st Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects, VarDial 2014 at the 25th International Conference on Computational Linguistics: System Demonstrations, COLING 2014 - Proceedings (pp. 129–138). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-5315