Cheap translation for cross-lingual named entity recognition

Stephen Mayhew; Chen Tse Tsai; Dan Roth

Conference Proceedings, Open Access

EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings (2017) 2536-2545

DOI: 10.18653/v1/d17-1269

113 Citations

147 Readers

Abstract

Recent work in NLP has attempted to deal with low-resource languages but still assumed a resource level that is not present for most languages, e.g., the availability of Wikipedia in the target language. We propose a simple method for cross-lingual named entity recognition (NER) that works well in settings with very minimal resources. Our approach makes use of a lexicon to “translate” annotated data available in one or several high-resource language(s) into the target language, and learns a standard monolingual NER model there. Further, when Wikipedia is available in the target language, our method can enhance Wikipedia-based methods to yield state-of-the-art NER results; we evaluate on 7 diverse languages, improving the state-of-the-art by an average of 5.5% F1 points. With the minimal resources required, this is an extremely portable cross-lingual NER approach, as illustrated using a truly low-resource language, Uyghur.
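
The lexicon-based "translation" step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain word-to-word dictionary, a token-by-token lookup, and a fallback that keeps the source word when the lexicon has no entry. The function name `translate_sentence`, the toy English-to-Spanish lexicon, and the example sentence are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A token is a (word, NER tag) pair, as in CoNLL-style annotated data.
Token = Tuple[str, str]


def translate_sentence(sentence: List[Token], lexicon: Dict[str, str]) -> List[Token]:
    """Replace each source word with its lexicon entry and keep its NER tag.

    Assumption: words missing from the lexicon are copied unchanged, which
    is one simple way to handle names that have no dictionary entry.
    """
    translated = []
    for word, tag in sentence:
        target_word = lexicon.get(word.lower(), word)
        translated.append((target_word, tag))
    return translated


# Hypothetical toy lexicon (English -> Spanish), for illustration only.
TOY_LEXICON = {"lives": "vive", "in": "en"}

source = [("Maria", "B-PER"), ("lives", "O"), ("in", "O"), ("Madrid", "B-LOC")]
translated = translate_sentence(source, TOY_LEXICON)
print(translated)
```

The translated sentences keep their original labels, so under these assumptions they could be passed as training data to any standard monolingual NER learner in the target language.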

Citation (APA)

Mayhew, S., Tsai, C. T., & Roth, D. (2017). Cheap translation for cross-lingual named entity recognition. In EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 2536–2545). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d17-1269

Readers' Seniority

PhD / Post grad / Masters / Doc: 55 (76%)

Researcher: 10 (14%)

Lecturer / Post doc: 5

Professor / Associate Prof.: 2

Readers' Discipline

Computer Science: 72 (83%)

Linguistics: 10 (11%)

Engineering: 3

Neuroscience: 2
