Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora


Abstract

This paper investigates the potential for projecting linguistic annotations, including part-of-speech tags and base noun phrase bracketings, from one language to another via automatically word-aligned parallel corpora. First, experiments assess the accuracy of unmodified direct transfer of tags and brackets from the source language, English, to the target languages, French and Chinese, both for noisy machine-aligned sentences and for clean hand-aligned sentences. Performance is then substantially boosted over both of these baselines by using training techniques optimized for very noisy data, yielding 94-96% core French part-of-speech tag accuracy and 90% French bracketing F-measure for stand-alone monolingual tools trained without the need for any human-annotated data in the given language.
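The "unmodified direct transfer" baseline mentioned in the abstract can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `project_tags`, the representation of an alignment as `(source_index, target_index)` pairs, the Penn-Treebank-style tag names, and the rule that the first link wins when several source words align to one target word are all hypothetical choices made for this example. The noise-robust training that the paper reports as the main source of improvement is not shown.

```python
from typing import List, Optional, Tuple


def project_tags(
    source_tags: List[str],
    alignment: List[Tuple[int, int]],
    target_len: int,
) -> List[Optional[str]]:
    """Copy each source tag onto the target word it is aligned to.

    Target words with no alignment link stay untagged (None).
    Assumption for this sketch: if several source words align to the
    same target word, the first link in the list is kept.
    """
    projected: List[Optional[str]] = [None] * target_len
    for src_idx, tgt_idx in alignment:
        if projected[tgt_idx] is None:
            projected[tgt_idx] = source_tags[src_idx]
    return projected


# English: "the black cat sleeps"  ->  French: "le chat noir dort"
english_tags = ["DT", "JJ", "NN", "VBZ"]
# Hypothetical hand-written alignment; note the adjective/noun reordering.
links = [(0, 0), (1, 2), (2, 1), (3, 3)]

french_tags = project_tags(english_tags, links, target_len=4)
print(french_tags)
```

With a clean alignment the reordered French adjective and noun receive the correct tags. With a noisy machine alignment, wrong or missing links produce wrong or missing tags, which is the kind of noise the abstract's robust training techniques are meant to handle.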

Cite

Yarowsky, D., & Ngai, G. (2001). Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora. In 2nd Meeting of the North American Chapter of the Association for Computational Linguistics, NAACL 2001. Association for Computational Linguistics (ACL). https://doi.org/10.3115/1073336.1073362
