BADLUC@WMT 2016: A Bilingual Document Alignment Platform Based on Lucene

Abstract

We participated in the Bilingual Document Alignment shared task of WMT 2016 with the intent of testing a plain cross-lingual information retrieval platform built on top of the Apache Lucene framework. We devised a number of interesting variants, including one that considers only the URLs of the pages and that achieves, without any heuristic, surprisingly high performance. We finally submitted the output of a system that combines two sources of information from the documents (text and URL) with a post-processing step, reaching an accuracy of 92% on the development dataset distributed for the shared task.

Cite

Jakubina, L., & Langlais, P. (2016). BADLUC@WMT 2016: A Bilingual Document Alignment Platform Based on Lucene. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2, pp. 703–709). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-2370
