Unsupervised active learning of CRF model for cross-lingual named entity recognition

Mohamed Farouk Abdel Hady; Abubakrelsedik Karali; Eslam Kamal; Rania Ibrahim

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8774 23-34

DOI: 10.1007/978-3-319-11656-3_3

Abstract

Manually annotating training data for information extraction models is time-consuming and expensive, yet it is necessary for building information extraction systems. Active learning has proven effective at reducing manual annotation effort in supervised learning tasks, where a human judge is asked to annotate only the examples that are most informative with respect to a given model. However, in most cases reliable human judges are not available for every language. In this paper, we propose a cross-lingual unsupervised active learning paradigm (XLADA) that generates high-quality, automatically annotated training data from a word-aligned parallel corpus. To evaluate the paradigm, we applied XLADA to English-French and English-Chinese bilingual corpora and then trained French and Chinese information extraction models. The experimental results show that XLADA can produce effective models without manually annotated training data.
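The abstract does not spell out how labels are obtained from the word-aligned corpus, so the following is only a minimal sketch of one common way such a pipeline can be set up, not the authors' implementation. It assumes that entity labels from the English side are copied to the aligned target-language tokens, and that the sentences sent to this automatic "annotator" are the ones the current target model is least confident about. The function names `project_labels` and `least_confident`, the label set, and the example sentence pair are all hypothetical.

```python
from typing import List, Tuple


def project_labels(
    src_labels: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Copy each source token's entity label to the target tokens aligned with it.

    Assumption: `alignment` holds (source_index, target_index) pairs produced
    by some word aligner; unaligned target tokens keep the "O" (outside) label.
    """
    tgt_labels = ["O"] * tgt_len
    for src_idx, tgt_idx in alignment:
        if src_labels[src_idx] != "O":
            tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels


def least_confident(confidences: List[float], k: int) -> List[int]:
    """Return the indices of the k sentences with the lowest model confidence.

    Assumption: uncertainty sampling stands in for the selection step of
    active learning; the scores would come from the current target model.
    """
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return ranked[:k]


# English: "President Barack Obama visited Paris"
src_labels = ["O", "PER", "PER", "O", "LOC"]
# French:  "Le président Barack Obama a visité Paris" (7 tokens)
alignment = [(0, 1), (1, 2), (2, 3), (3, 5), (4, 6)]

projected = project_labels(src_labels, alignment, tgt_len=7)
selected = least_confident([0.9, 0.2, 0.5], k=2)
```

Here the projected labels replace the human judge: the selected sentences are labeled from their English translations rather than by a native-speaker annotator, which is the sense in which the loop needs no manual annotation in this sketch.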

Citation (APA)

Hady, M. F. A., Karali, A., Kamal, E., & Ibrahim, R. (2014). Unsupervised active learning of CRF model for cross-lingual named entity recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8774, pp. 23–34). Springer Verlag. https://doi.org/10.1007/978-3-319-11656-3_3
