AUT Document Alignment Framework for BUCC Workshop Shared Task

Abstract

This paper presents a framework for aligning collections of comparable documents. Our feature-based model considers different characteristics of documents when evaluating their similarity. The model uses only the content of the documents, in a setting where no links, special tags, or metadata are available. We also apply a filtering mechanism that makes the model applicable to large collections of data. According to the results, the model recognizes related documents in the target language with a recall of 45.67% for the 1-best and 62% for the 5-best.
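
The 1-best and 5-best recall figures can be read as the share of source documents whose correct target document appears among the top 1 or top 5 ranked candidates. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes each source document has exactly one gold target; the function name `recall_at_k`, the document identifiers, and the toy rankings are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def recall_at_k(ranked: Dict[str, List[str]], gold: Dict[str, str], k: int) -> float:
    """Fraction of source documents whose gold target is in the top-k candidates.

    Assumption: one gold target per source document.
    """
    if not gold:
        return 0.0
    hits = sum(
        1 for src, tgt in gold.items()
        if tgt in ranked.get(src, [])[:k]
    )
    return hits / len(gold)


# Toy data invented for illustration; these are not results from the paper.
ranked = {
    "en_1": ["fr_7", "fr_2", "fr_9"],  # gold target ranked first
    "en_2": ["fr_4", "fr_5", "fr_1"],  # gold target ranked second
    "en_3": ["fr_3", "fr_8", "fr_6"],  # gold target not retrieved
}
gold = {"en_1": "fr_7", "en_2": "fr_5", "en_3": "fr_0"}

r1 = recall_at_k(ranked, gold, 1)
r5 = recall_at_k(ranked, gold, 5)
print(f"recall@1 = {r1:.2%}, recall@5 = {r5:.2%}")
```

Recall at 5 can never be lower than recall at 1 under this definition, which is consistent with the gap between the two figures reported in the abstract.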

Cite

Zafarian, A., Aghasadeghi, A., Azadi, F., Ghiasifard, S., Alipanahloo, Z., Bakhshaei, S., & Ziabary, S. M. M. (2015). AUT Document Alignment Framework for BUCC Workshop Shared Task. In 8th Workshop on Building and Using Comparable Corpora, BUCC 2015 - co-located with 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2015 - Proceedings (pp. 79–87). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3412
