zNLP: Identifying parallel sentences in Chinese-English comparable corpora

Zheng Zhang; Pierre Zweigenbaum

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2017) 51-55

DOI: 10.18653/v1/w17-2510

Abstract

This paper describes the zNLP system for the BUCC 2017 shared task. Our system identifies parallel sentence pairs in Chinese-English comparable corpora in three steps: it translates Chinese sentences into English word by word, uses the search engine Solr to select near-parallel candidate sentences, and then applies an SVM classifier to identify the true parallel sentences among those candidates. It obtains an F1-score of 45% on the test set and 43% on the training set.
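To make the first two steps of the pipeline concrete, here is a minimal sketch in Python. It is not the authors' implementation. The dictionary `TOY_DICT`, the function names, and the example sentences are all hypothetical. Candidate retrieval uses plain Jaccard token overlap as a stand-in for a Solr query, and the SVM classification step is left out. The `f1_score` helper shows the standard F1 computation over predicted and gold sentence pairs.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon; the lexicon used by the zNLP system is not given here.
TOY_DICT: Dict[str, str] = {
    "我": "i",
    "喜欢": "like",
    "猫": "cats",
    "狗": "dogs",
}


def translate_word_by_word(tokens: List[str], lexicon: Dict[str, str]) -> List[str]:
    """Map each pre-segmented Chinese token to its lexicon entry, dropping unknown tokens."""
    return [lexicon[t] for t in tokens if t in lexicon]


def jaccard(a: List[str], b: List[str]) -> float:
    """Token-set overlap; a simple stand-in for a search engine relevance score."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_candidates(
    query: List[str], english_sentences: List[str], top_k: int = 2
) -> List[Tuple[int, float]]:
    """Return up to top_k (sentence index, score) pairs with non-zero overlap, best first."""
    scored = [
        (i, jaccard(query, sentence.lower().split()))
        for i, sentence in enumerate(english_sentences)
    ]
    scored = [(i, score) for i, score in scored if score > 0.0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def f1_score(predicted: Set[Tuple[int, int]], gold: Set[Tuple[int, int]]) -> float:
    """Harmonic mean of precision and recall over sets of (source, target) index pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    english = ["I like cats", "Dogs bark loudly", "The weather is nice"]
    query = translate_word_by_word(["我", "喜欢", "猫"], TOY_DICT)
    print(query)
    print(select_candidates(query, english))
```

In a full system along the lines the abstract describes, the candidates returned here would be passed to a trained classifier that decides which pairs are truly parallel. The overlap score is only a placeholder for that decision.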

Cite

Zhang, Z., & Zweigenbaum, P. (2017). zNLP: Identifying parallel sentences in Chinese-English comparable corpora. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 51–55). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-2510
