Exploring Methods and Resources for Discriminating Similar Languages

Marco Lui; Ned Letcher; Oliver Adams; Long Duong; Paul Cook; Timothy Baldwin

Conference Proceedings

1st Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects, VarDial 2014 at the 25th International Conference on Computational Linguistics: System Demonstrations, COLING 2014 - Proceedings (2014) 129-138

DOI: 10.3115/v1/w14-5315

Abstract

The Discriminating between Similar Languages (DSL) shared task at VarDial challenged participants to build an automatic language identification system to discriminate between 13 languages in 6 groups of highly similar languages (or national varieties of the same language). In this paper, we describe the submissions made by team UniMelb-NLP, which took part in both the closed and open categories. We present the text representations and modeling techniques used, including cross-lingual POS tagging as well as fine-grained tags extracted from a deep grammar of English, and discuss additional data we collected for the open submissions, utilizing custom-built web corpora based on top-level domains as well as existing corpora.

Citation (APA)

Lui, M., Letcher, N., Adams, O., Duong, L., Cook, P., & Baldwin, T. (2014). Exploring Methods and Resources for Discriminating Similar Languages. In 1st Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects, VarDial 2014 at the 25th International Conference on Computational Linguistics: System Demonstrations, COLING 2014 - Proceedings (pp. 129–138). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-5315
