A proposal on evaluation measures for RTE

Richard Bergmair

Conference Proceedings

TextInfer 2009 - 2009 Workshop on Applied Textual Inference at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 - Proceedings (2009) 10-17

DOI: 10.3115/1708141.1708144

Abstract

We outline problems with the interpretation of accuracy in the presence of bias, arguing that the issue is a particularly pressing concern for RTE evaluation. Furthermore, we argue that average precision scores are unsuitable for RTE, and should not be reported. We advocate mutual information as a new evaluation measure that should be reported in addition to accuracy and confidence-weighted score.
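The abstract's point about bias can be illustrated with a minimal sketch. It assumes the textbook definition of mutual information between gold labels and system predictions, which may differ from the exact formulation in the paper; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items where the prediction equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def mutual_information(gold, pred):
    """Empirical mutual information (in bits) between gold labels and predictions.

    Assumption: plain plug-in estimate from label counts, not necessarily
    the estimator proposed in the paper.
    """
    n = len(gold)
    joint = Counter(zip(gold, pred))
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    mi = 0.0
    for (g, p), count in joint.items():
        p_joint = count / n
        p_indep = (gold_counts[g] / n) * (pred_counts[p] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi


# Toy, made-up data: a label distribution skewed towards "yes".
gold = ["yes"] * 8 + ["no"] * 2
always_yes = ["yes"] * 10   # a system that ignores its input
perfect = list(gold)        # a system that is always right

print(accuracy(gold, always_yes), mutual_information(gold, always_yes))
print(accuracy(gold, perfect), mutual_information(gold, perfect))
```

On this toy data the constant "yes" system reaches 80% accuracy while sharing no information with the gold labels, which is the kind of interpretive problem under bias that the abstract describes.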

Citation (APA)

Bergmair, R. (2009). A proposal on evaluation measures for RTE. In TextInfer 2009 - 2009 Workshop on Applied Textual Inference at the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009 - Proceedings (pp. 10–17). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1708141.1708144
