Towards more fine-grained and reliable NLP performance prediction

Abstract

Performance prediction, the task of estimating a system's performance without running experiments, reduces the experimental burden caused by the combinatorial explosion of datasets, languages, tasks, and models. In this paper, we make two contributions to improving performance prediction for NLP tasks. First, we examine performance predictors not only for holistic measures of accuracy such as F1 or BLEU, but also for fine-grained performance measures such as accuracy over individual classes of examples. Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration. We analyze four types of NLP tasks and demonstrate both the feasibility of fine-grained performance prediction and the need for reliability analysis of performance prediction methods in future work. We make our code publicly available: https://github.com/neulab/Reliable-NLPPP.
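
To make the two reliability angles concrete, the following is a minimal sketch, not the authors' implementation. It assumes a percentile bootstrap as one common way to obtain a confidence interval, and it measures calibration as the fraction of true values that fall inside their predicted intervals. The function names (`bootstrap_ci`, `coverage`) and the error values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for stat(values) (assumed method)."""
    rng = random.Random(seed)
    n = len(values)
    # Recompute the statistic on resamples drawn with replacement, then sort.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


def coverage(intervals, targets):
    """Fraction of true values that fall inside their predicted interval."""
    hits = sum(lo <= t <= hi for (lo, hi), t in zip(intervals, targets))
    return hits / len(targets)


# Hypothetical absolute errors of a performance predictor on held-out experiments.
abs_errors = [0.8, 1.5, 0.3, 2.1, 1.0, 0.6, 1.8, 0.9]
low, high = bootstrap_ci(abs_errors)
print(f"95% CI for mean absolute error: [{low:.2f}, {high:.2f}]")
```

Under these assumptions, a well-calibrated predictor that reports 95% intervals should yield a `coverage` value close to 0.95 on held-out experiments; a much lower value suggests the intervals are too narrow.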

Cite

Ye, Z., Liu, P., Fu, J., & Neubig, G. (2021). Towards more fine-grained and reliable NLP performance prediction. In EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 3703–3714). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.eacl-main.324
