The Solution of Huawei Cloud & Noah’s Ark Lab to the NLPCC-2020 Challenge: Light Pre-Training Chinese Language Model for NLP Task

Abstract

Pre-trained language models have achieved great success in natural language processing. However, they are difficult to deploy on resource-constrained devices because of their expensive computation. This paper introduces our solution to the Natural Language Processing and Chinese Computing (NLPCC) 2020 challenge on Light Pre-Training Chinese Language Model for NLP (http://tcci.ccf.org.cn/conference/2020/) (https://www.cluebenchmarks.com/NLPCC.html). The proposed solution applies a state-of-the-art BERT knowledge-distillation method (TinyBERT) with an advanced Chinese pre-trained language model (NEZHA) as the teacher model; the resulting student is dubbed TinyNEZHA. In addition, we introduce several effective techniques in the fine-tuning stage to further boost the performance of TinyNEZHA. In the official evaluation of the NLPCC-2020 challenge, TinyNEZHA achieves a score of 77.71, ranking first among all participating teams. Compared with BERT-base, TinyNEZHA obtains almost the same results while being 9× smaller and 8× faster at inference.
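
As a rough illustration of the TinyBERT-style layer-wise distillation the abstract refers to, the sketch below combines hidden-state, attention, and soft-label losses between a teacher (e.g. NEZHA) and a smaller student. This is a hedged PyTorch sketch, not the authors' released code: the tensor shapes, the layer_map argument, and the temperature value are illustrative assumptions.

import torch.nn.functional as F

def distillation_loss(student_hidden, teacher_hidden,
                      student_attn, teacher_attn,
                      student_logits, teacher_logits,
                      layer_map, temperature=1.0):
    """Illustrative TinyBERT-style distillation objective.

    student_hidden / teacher_hidden: lists of [batch, seq, dim] tensors.
    student_attn / teacher_attn: lists of [batch, heads, seq, seq] tensors.
    layer_map: for each student layer i, the index of the teacher layer it mimics.
    """
    loss = 0.0
    for s_idx, t_idx in enumerate(layer_map):
        # Match hidden states of mapped layers with an MSE loss (a linear
        # projection would be needed if student and teacher widths differ;
        # omitted here for brevity).
        loss = loss + F.mse_loss(student_hidden[s_idx], teacher_hidden[t_idx])
        # Match attention matrices of the mapped layers with an MSE loss.
        loss = loss + F.mse_loss(student_attn[s_idx], teacher_attn[t_idx])
    # Soft-label loss on the prediction layer, with temperature scaling.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    return loss + soft_loss

In TinyBERT the distillation runs in two stages (general and task-specific); the soft-label term above typically applies only in the task-specific stage.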

Citation (APA)

Zhang, Y., Yu, J., Wang, K., Yin, Y., Chen, C., & Liu, Q. (2020). The Solution of Huawei Cloud & Noah’s Ark Lab to the NLPCC-2020 Challenge: Light Pre-Training Chinese Language Model for NLP Task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12431 LNAI, pp. 524–533). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60457-8_43
