We propose a simple yet effective Multi-Layer RAndom Perturbation Training algorithm (RAPT) to enhance model robustness and generalization. The key idea is to apply randomly sampled noise to each input to generate label-preserving artificial input points. To encourage the model to generate more diverse examples, the noise is added at a combination of the model's layers. The model is then regularized to minimize the posterior difference between clean and noisy inputs. We apply RAPT to robust and efficient BERT training, and conduct comprehensive fine-tuning experiments on GLUE tasks. Our results show that RAPT outperforms both the standard fine-tuning approach and an adversarial training method, while requiring 22% less training time.
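The abstract does not specify the noise distribution, the divergence used for the posterior-difference regularizer, or how the layer combination is chosen, so the following is only a minimal sketch of the idea under assumptions: Gaussian noise, a symmetric KL consistency term, and a uniformly random layer subset. `TinyEncoder`, `epsilon`, `alpha`, and `n_noisy` are illustrative stand-ins, not the authors' architecture or hyperparameters.

```python
# Hedged sketch of multi-layer random perturbation training (RAPT-style).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Stand-in for a BERT-style encoder: a stack of layers plus a classifier head."""
    def __init__(self, dim=64, n_layers=4, n_classes=2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(n_layers)
        )
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, x, noisy_layers=(), epsilon=1e-3):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i in noisy_layers:
                # Small, label-preserving random perturbation at selected layers.
                x = x + epsilon * torch.randn_like(x)
        return self.classifier(x)

def rapt_loss(model, x, y, n_noisy=2, epsilon=1e-3, alpha=1.0):
    # Clean forward pass and the standard task loss.
    clean_logits = model(x)
    task_loss = F.cross_entropy(clean_logits, y)

    # Perturb a randomly sampled combination of layers (assumption: uniform choice).
    chosen = torch.randperm(len(model.layers))[:n_noisy].tolist()
    noisy_logits = model(x, noisy_layers=chosen, epsilon=epsilon)

    # Regularize the posterior difference between clean and noisy inputs;
    # the symmetric KL here is an assumption of this sketch.
    p = F.log_softmax(clean_logits, dim=-1)
    q = F.log_softmax(noisy_logits, dim=-1)
    consistency = 0.5 * (
        F.kl_div(q, p, log_target=True, reduction="batchmean")
        + F.kl_div(p, q, log_target=True, reduction="batchmean")
    )
    return task_loss + alpha * consistency

# Usage: one training step on dummy data.
model = TinyEncoder()
x, y = torch.randn(8, 64), torch.randint(0, 2, (8,))
loss = rapt_loss(model, x, y)
loss.backward()
```

Because the perturbation needs only extra forward passes with sampled noise, rather than the inner gradient loop of adversarial training, a scheme like this is cheaper per step, which is consistent with the reported training-time savings.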
Pereira, L. K., Taya, Y., & Kobayashi, I. (2021). Multi-Layer Random Perturbation Training for Improving Model Generalization. In BlackboxNLP 2021 - Proceedings of the 4th BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (pp. 303–310). Association for Computational Linguistics (ACL).