Distill BERT to Traditional Models in Chinese Machine Reading Comprehension

4 citations · 6 Mendeley readers

Abstract

Recently, unsupervised representation learning has been extremely successful in natural language processing. More and more pre-trained language models have been proposed and have achieved state-of-the-art results, especially in machine reading comprehension. However, these pre-trained language models are huge, with hundreds of millions of parameters to train, which makes them too time-consuming to use in real industrial settings. We therefore propose a method that distills the pre-trained language model into a traditional reading comprehension model, so that the distilled model offers faster inference while retaining high accuracy on machine reading comprehension. We evaluate the proposed method on the Chinese machine reading comprehension dataset CMRC2018 and greatly improve the accuracy of the original traditional model. To the best of our knowledge, we are the first to apply distillation of a pre-trained language model to Chinese machine reading comprehension.
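
The abstract does not spell out the distillation objective, so the sketch below only illustrates one common way such teacher-student training is set up for span-extraction reading comprehension: the student is trained on a weighted mix of the teacher's temperature-softened start/end distributions and the gold answer span. The temperature T, the weight alpha, and the loss form are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch (assumptions noted above, not the authors' exact code) of
# distilling a BERT "teacher" into a lighter "student" span-extraction reader.
import torch
import torch.nn.functional as F

def distillation_loss(student_start, student_end,   # student logits [B, L]
                      teacher_start, teacher_end,   # teacher logits [B, L]
                      gold_start, gold_end,         # gold span indices [B]
                      T=2.0, alpha=0.5):
    """Combine soft-target (teacher) and hard-target (gold span) losses."""
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    def kl(s, t):
        return F.kl_div(F.log_softmax(s / T, dim=-1),
                        F.softmax(t / T, dim=-1),
                        reduction="batchmean") * (T * T)
    soft = kl(student_start, teacher_start) + kl(student_end, teacher_end)
    # Hard targets: standard cross-entropy on the annotated answer span.
    hard = (F.cross_entropy(student_start, gold_start) +
            F.cross_entropy(student_end, gold_end))
    return alpha * soft + (1.0 - alpha) * hard

# Example: batch of 2 passages, 8 tokens each (random tensors, shapes only).
B, L = 2, 8
s_start, s_end = torch.randn(B, L), torch.randn(B, L)
t_start, t_end = torch.randn(B, L), torch.randn(B, L)
g_start, g_end = torch.tensor([1, 3]), torch.tensor([2, 5])
loss = distillation_loss(s_start, s_end, t_start, t_end, g_start, g_end)
```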

Cite

APA: Ren, X., Shi, R., & Li, F. (2020). Distill BERT to Traditional Models in Chinese Machine Reading Comprehension. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 13901–13902). AAAI Press.
