This paper describes the FBK HLT-MT system for cross-lingual semantic textual similarity measurement. Our approach is based on supervised regression with an ensemble of decision trees. To assign a semantic similarity score to an input sentence pair, the model combines features from state-of-the-art machine translation quality estimation methods with distance metrics between cross-lingual embeddings of the two sentences. In our analysis, we compare different techniques for composing sentence vectors, several distance features, and ways of producing training data. The proposed system achieves a mean Pearson's correlation of 0.39533, ranking 7th among all participants in the cross-lingual STS task organized within the SemEval 2016 evaluation campaign.
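As a rough illustration of two ingredients named in the abstract, the sketch below composes a sentence vector from word embeddings, measures the distance between two sentences in a shared bilingual space, and computes the Pearson's correlation used for evaluation. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy embedding table `EMB`, the helper names, and the choice of averaging as the composition technique are all hypothetical. The quality estimation features and the tree-ensemble regressor are omitted.

```python
import math

# Hypothetical toy bilingual embeddings in a shared 3-d space.
# These values are invented for illustration only.
EMB = {
    "cat": [0.9, 0.1, 0.0],
    "gato": [0.85, 0.15, 0.05],
    "sleeps": [0.1, 0.8, 0.2],
    "duerme": [0.12, 0.75, 0.25],
    "car": [0.0, 0.2, 0.9],
}


def compose_mean(tokens, emb):
    """Average the vectors of known tokens (one possible composition).

    Returns None when no token has an embedding.
    """
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either is a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson's correlation between predicted and gold scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0.0 or std_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (std_x * std_y)


# One distance feature for an English-Spanish sentence pair.
en = compose_mean(["cat", "sleeps"], EMB)
es = compose_mean(["gato", "duerme"], EMB)
feature = cosine_similarity(en, es)
```

In a full system, `feature` would be one column of the feature matrix given to the regressor, and `pearson` would compare the regressor's predictions with the gold similarity scores.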
Ataman, D., De Souza, J. G. C., Turchi, M., & Negri, M. (2016). FBK HLT-MT at SemEval-2016 Task 1: Cross-lingual semantic similarity measurement using quality estimation features and compositional bilingual word embeddings. In SemEval 2016 - 10th International Workshop on Semantic Evaluation, Proceedings (pp. 570–576). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/s16-1086