Predicting performance for natural language processing tasks

Abstract

Given the number of possible combinations of tasks, languages, and domains in natural language processing (NLP) research, it is computationally prohibitive to exhaustively test newly proposed models on each possible experimental setting. In this work, we explore whether one can obtain plausible estimates of how well an NLP model will perform under an experimental setting without actually training or testing the model. To do so, we build regression models that predict the evaluation score of an NLP experiment given the experimental settings as input. Experimenting on 9 different NLP tasks, we find that our predictors produce meaningful predictions over unseen languages and different modeling architectures, outperforming reasonable baselines as well as human experts. Going further, we outline how our predictor can be used to find a small subset of representative experiments that should be run in order to obtain plausible predictions for all other experimental settings.
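
The abstract says only that regression models map experimental settings to an evaluation score; it does not specify the model family or the features. As a rough illustration of that idea, the sketch below fits a small ridge-regularized linear regressor in plain Python. Everything specific in it is an assumption made for illustration: the two features (log10 of training-set size and a language-similarity value), the function names, and the toy numbers, which are synthetic and not taken from the paper.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ridge(features: Sequence[Sequence[float]],
              scores: Sequence[float],
              l2: float = 1e-8) -> List[float]:
    """Fit weights [bias, w1, w2, ...] by solving (X^T X + l2 I) w = X^T y."""
    rows = [[1.0] + list(f) for f in features]  # prepend a bias column
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (l2 if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Predict the evaluation score for one unseen experimental setting."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Synthetic toy data (NOT from the paper). Each row is a hypothetical setting:
# [log10(training set size), similarity to a transfer language in [0, 1]].
TOY_SETTINGS = [[3.0, 0.2], [4.0, 0.5], [5.0, 0.9], [3.5, 0.8], [4.5, 0.1]]
# Scores generated from an invented rule: 5 + 4 * log_size + 10 * similarity.
TOY_SCORES = [5 + 4 * s[0] + 10 * s[1] for s in TOY_SETTINGS]

if __name__ == "__main__":
    weights = fit_ridge(TOY_SETTINGS, TOY_SCORES)
    # Estimate the score of a setting that was never "run".
    print(predict(weights, [4.2, 0.6]))
```

A linear model is used here only because it is short and self-contained; any regressor with the same fit and predict shape could be substituted without changing the workflow of training on settings that were run and querying settings that were not.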

Cite

Xia, M., Anastasopoulos, A., Xu, R., Yang, Y., & Neubig, G. (2020). Predicting performance for natural language processing tasks. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 8625–8646). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.764
