Automatic interlinear glossing as two-level sequence classification

Abstract

Interlinear glossing is a type of annotation of morphosyntactic categories and crosslinguistic lexical correspondences that allows linguists to analyse sentences in languages that they do not necessarily speak. Automatising this annotation is necessary in order to provide glossed corpora big enough to be used for quantitative studies. In this paper, we present experiments on the automatic glossing of Chintang. We decompose the task of glossing into steps suitable for statistical processing. We first perform grammatical glossing as standard supervised part-of-speech tagging. We then add lexical glosses from a stand-off dictionary applying context disambiguation in a similar way to word lemmatisation. We obtain the highest accuracy score of 96% for grammatical and 94% for lexical glossing. © 2015 Association for Computational Linguistics and The Asian Federation of Natural Language Processing.
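
The two-level decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy morphemes, labels, and dictionary entries below are invented placeholders (not real Chintang), the tagger is a simple count-based stand-in for a standard supervised tagger, and all function names are hypothetical. Level 1 assigns a grammatical label to each morpheme using the following morpheme as context; level 2 replaces stem labels with lexical glosses looked up in a separate dictionary, using the level-1 label to choose among homophonous entries.

```python
from collections import Counter, defaultdict

# Invented toy data: sentences as lists of (morpheme, grammatical label).
# These forms are placeholders for illustration, not real Chintang.
TRAIN = [
    [("ka", "V"), ("-ma", "PST")],
    [("ka", "N"), ("-ni", "PL")],
    [("tu", "V"), ("-ma", "PST")],
    [("ro", "N"), ("-ni", "PL")],
    [("ro", "N"), ("ka", "V"), ("-ma", "PST")],
]

# Invented stand-off dictionary: (stem form, level-1 label) -> lexical gloss.
LEXICON = {
    ("ka", "V"): "go",
    ("ka", "N"): "house",
    ("tu", "V"): "eat",
    ("ro", "N"): "dog",
}


def train_tagger(sentences):
    """Count labels per (form, next form), per form, and overall."""
    ctx = defaultdict(Counter)   # (form, next_form) -> label counts
    uni = defaultdict(Counter)   # form -> label counts
    overall = Counter()          # label counts, used as last-resort default
    for sent in sentences:
        forms = [form for form, _ in sent]
        for i, (form, label) in enumerate(sent):
            nxt = forms[i + 1] if i + 1 < len(forms) else "</s>"
            ctx[(form, nxt)][label] += 1
            uni[form][label] += 1
            overall[label] += 1
    return ctx, uni, overall


def tag(model, forms):
    """Level 1: grammatical label per morpheme, backing off when context is unseen."""
    ctx, uni, overall = model
    default = overall.most_common(1)[0][0]
    labels = []
    for i, form in enumerate(forms):
        nxt = forms[i + 1] if i + 1 < len(forms) else "</s>"
        if (form, nxt) in ctx:
            labels.append(ctx[(form, nxt)].most_common(1)[0][0])
        elif form in uni:
            labels.append(uni[form].most_common(1)[0][0])
        else:
            labels.append(default)
    return labels


def gloss_sentence(model, forms, lexicon=LEXICON):
    """Level 2: swap stem labels for dictionary glosses; affixes keep their label."""
    labels = tag(model, forms)
    return [lexicon.get((form, label), label) for form, label in zip(forms, labels)]


MODEL = train_tagger(TRAIN)
print(gloss_sentence(MODEL, ["ka", "-ma"]))  # ['go', 'PST']
print(gloss_sentence(MODEL, ["ka", "-ni"]))  # ['house', 'PL']
```

In this sketch the ambiguous stem "ka" receives a different lexical gloss depending on the label assigned at level 1, which mirrors the dictionary lookup with context disambiguation that the abstract compares to lemmatisation. A realistic system would replace the count-based tagger with a trained sequence classifier.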

Cite

Samardžić, T., Schikowski, R., & Stoll, S. (2015). Automatic interlinear glossing as two-level sequence classification. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2015-text, pp. 68–72). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3710
