Cheap translation for cross-lingual named entity recognition

Abstract

Recent work in NLP has attempted to deal with low-resource languages but still assumed a resource level that is not present for most languages, e.g., the availability of Wikipedia in the target language. We propose a simple method for cross-lingual named entity recognition (NER) that works well in settings with very minimal resources. Our approach makes use of a lexicon to “translate” annotated data available in one or several high-resource language(s) into the target language, and learns a standard monolingual NER model there. Further, when Wikipedia is available in the target language, our method can enhance Wikipedia-based methods to yield state-of-the-art NER results; we evaluate on 7 diverse languages, improving the state-of-the-art by an average of 5.5% F1 points. With the minimal resources required, this is an extremely portable cross-lingual NER approach, as illustrated using a truly low-resource language, Uyghur.
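The lexicon-based “translation” step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain word-to-word dictionary, BIO-style tags, and a fallback that copies the source word when the lexicon has no entry. The function name `cheap_translate`, the toy lexicon, and the example sentence are all hypothetical.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, NER tag), e.g. ("Madrid", "B-LOC")


def cheap_translate(sentence: List[Token], lexicon: Dict[str, str]) -> List[Token]:
    """Swap each source word for its lexicon entry and keep its tag unchanged.

    Assumption: one-to-one word lookup. Words missing from the lexicon
    (often names) are copied over as-is.
    """
    translated = []
    for word, tag in sentence:
        target_word = lexicon.get(word.lower(), word)
        translated.append((target_word, tag))
    return translated


# Hypothetical English-to-Spanish toy lexicon and annotated source sentence.
toy_lexicon = {"lives": "vive", "in": "en"}
source = [("Maria", "B-PER"), ("lives", "O"), ("in", "O"), ("Madrid", "B-LOC")]

target = cheap_translate(source, toy_lexicon)
print(target)
```

The resulting token and tag pairs could then serve as training data for any standard monolingual NER model in the target language, which is the second step the abstract describes.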

Cite

Mayhew, S., Tsai, C. T., & Roth, D. (2017). Cheap translation for cross-lingual named entity recognition. In EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 2536–2545). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d17-1269
