ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models


Abstract

Nowadays, pretrained language models (PLMs) have dominated the majority of NLP tasks. However, little research has been conducted on systematically evaluating the language abilities of PLMs. In this paper, we present a large-scale empirical study on genEral language ability evaluation of PLMs (ElitePLM). In our study, we design four evaluation dimensions, i.e., memory, comprehension, reasoning, and composition, to measure ten widely used PLMs within five categories. Our empirical results demonstrate that: (1) PLMs with varying training objectives and strategies are good at different ability tests; (2) fine-tuning PLMs on downstream tasks is usually sensitive to data size and distribution; (3) PLMs have excellent transferability between similar tasks. Moreover, the prediction results of PLMs in our experiments are released as an open resource for deeper and more detailed analysis of the language abilities of PLMs. This paper can guide future work in selecting, applying, and designing PLMs for specific tasks. We have made all the details of our experiments publicly available at https://github.com/RUCAIBox/ElitePLM.


Citation (APA)

Li, J., Tang, T., Gong, Z., Yang, L., Yu, Z., Chen, Z., … Wen, J. R. (2022). ElitePLM: An Empirical Study on General Language Ability Evaluation of Pretrained Language Models. In NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 3519–3539). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.naacl-main.258

