The Solution of Huawei Cloud & Noah’s Ark Lab to the NLPCC-2020 Challenge: Light Pre-Training Chinese Language Model for NLP Task

Abstract

Pre-trained language models have achieved great success in natural language processing. However, they are difficult to deploy on resource-constrained devices because of their expensive computation. This paper introduces our solution to the Natural Language Processing and Chinese Computing (NLPCC) 2020 challenge on Light Pre-Training Chinese Language Model for NLP (http://tcci.ccf.org.cn/conference/2020/) (https://www.cluebenchmarks.com/NLPCC.html). The proposed solution applies a state-of-the-art BERT knowledge-distillation method (TinyBERT) with an advanced Chinese pre-trained language model (NEZHA) as the teacher model; the resulting student is dubbed TinyNEZHA. In addition, we introduce several effective techniques in the fine-tuning stage to further boost the performance of TinyNEZHA. In the official evaluation of the NLPCC-2020 challenge, TinyNEZHA achieves a score of 77.71, ranking first among all participating teams. Compared with BERT-base, TinyNEZHA obtains almost the same results while being 9× smaller and 8× faster at inference.
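
As a rough illustration of the TinyBERT-style layer-wise distillation the abstract refers to, the sketch below combines hidden-state, attention, and soft-label losses between a teacher (e.g. NEZHA) and a smaller student. This is a hedged PyTorch sketch, not the authors' released code: the tensor shapes, the layer_map argument, and the temperature value are illustrative assumptions.

import torch.nn.functional as F

def distillation_loss(student_hidden, teacher_hidden,
                      student_attn, teacher_attn,
                      student_logits, teacher_logits,
                      layer_map, temperature=1.0):
    """Illustrative TinyBERT-style distillation objective.

    student_hidden / teacher_hidden: lists of [batch, seq, dim] tensors.
    student_attn / teacher_attn: lists of [batch, heads, seq, seq] tensors.
    layer_map: for each student layer i, the index of the teacher layer it mimics.
    """
    loss = 0.0
    for s_idx, t_idx in enumerate(layer_map):
        # Match hidden states of mapped layers with an MSE loss (a linear
        # projection would be needed if student and teacher widths differ;
        # omitted here for brevity).
        loss = loss + F.mse_loss(student_hidden[s_idx], teacher_hidden[t_idx])
        # Match attention matrices of the mapped layers with an MSE loss.
        loss = loss + F.mse_loss(student_attn[s_idx], teacher_attn[t_idx])
    # Soft-label loss on the prediction layer, with temperature scaling.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return loss + soft_loss

In TinyBERT the distillation runs in two stages (general and task-specific); the soft-label term above typically applies only in the task-specific stage.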

Citation (APA)

Zhang, Y., Yu, J., Wang, K., Yin, Y., Chen, C., & Liu, Q. (2020). The Solution of Huawei Cloud & Noah’s Ark Lab to the NLPCC-2020 Challenge: Light Pre-Training Chinese Language Model for NLP Task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12431 LNAI, pp. 524–533). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60457-8_43
