Exploiting languages proximity for part-of-speech tagging of three French regional languages

Abstract

This paper presents experiments in part-of-speech tagging of low-resource languages. It addresses the case where neither labeled data in the target language nor a parallel corpus is available, so we rely only on the proximity of the target language to a better-resourced language. We conduct experiments on three French regional languages and exploit this proximity with two main strategies: delexicalization and transposition. The general idea is to train a model on the (better-resourced) source language and then apply it to the (regional) target language. Delexicalization handles the difference in vocabulary by creating abstract representations of the data. Transposition consists of modifying the target corpus so that the source models can be used on it. We compare several methods and propose different strategies to combine them and improve the state of the art in part-of-speech tagging in this difficult scenario.
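The two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which abstract representations or corpus modifications the paper uses, so the feature set (word shape, suffix, length flag), the function names `word_shape`, `delexicalize`, and `transpose`, and the rewrite rules in the usage note are all hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, Tuple


def word_shape(token: str) -> str:
    """Map a token to a coarse shape, collapsing runs of the same class.

    Uppercase letters become "X", lowercase letters "x", digits "d";
    any other character is kept as-is.
    """
    shape: List[str] = []
    for ch in token:
        if ch.isupper():
            code = "X"
        elif ch.islower():
            code = "x"
        elif ch.isdigit():
            code = "d"
        else:
            code = ch
        # Only append when the class changes, so "Paris" gives "Xx".
        if not shape or shape[-1] != code:
            shape.append(code)
    return "".join(shape)


def delexicalize(token: str, suffix_len: int = 2) -> Dict[str, object]:
    """Replace a word form with vocabulary-independent features.

    Assumption: shape, suffix, and a length flag stand in for whatever
    abstract representation a real system would use.
    """
    return {
        "shape": word_shape(token),
        "suffix": token[-suffix_len:].lower(),
        "is_short": len(token) <= 3,
    }


def transpose(token: str, rules: List[Tuple[str, str]]) -> str:
    """Rewrite a target-language token toward source-language spelling.

    Assumption: transposition is approximated here by ordered substring
    substitutions; the rules must be supplied by the caller.
    """
    for old, new in rules:
        token = token.replace(old, new)
    return token
```

A tagger trained on `delexicalize` features of the source language could then be applied to the same features computed on target-language tokens, optionally after `transpose`. For example, `transpose("kaza", [("k", "c"), ("z", "s")])` rewrites the made-up token `kaza` with two made-up rules; neither is a real correspondence between French and a regional language.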

Cite

Magistry, P., Ligozat, A. L., & Rosset, S. (2019). Exploiting languages proximity for part-of-speech tagging of three French regional languages. Language Resources and Evaluation, 53(4), 865–888. https://doi.org/10.1007/s10579-019-09463-7
