TopicBERT for energy efficient document classification


Abstract

Prior research notes that BERT's computational cost grows quadratically with sequence length, leading to longer training times, higher GPU memory requirements, and greater carbon emissions. While recent work addresses these scalability issues at pre-training, they remain prominent in fine-tuning, especially for long-sequence tasks such as document classification. Our work therefore focuses on optimizing the computational cost of fine-tuning for document classification. We achieve this through complementary learning of topic and language models in a unified framework, named TopicBERT, which significantly reduces the number of self-attention operations, a main performance bottleneck. Consequently, our model achieves a 1.4x (~40%) speedup with a 40% reduction in CO2 emissions while retaining 99.9% of performance over 5 datasets.
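As a rough illustration of the idea in the abstract, the sketch below fuses a document-topic vector with BERT's [CLS] representation of a truncated input and classifies from the concatenation. This is a minimal sketch under stated assumptions, not the authors' released implementation: it assumes a Hugging Face BertModel, treats the topic vector as a precomputed input rather than jointly trained, and the class name TopicBertClassifier is hypothetical. The speedup intuition is that self-attention cost is quadratic in sequence length, so running BERT over a shorter window while the topic vector supplies document-level context cuts the dominant cost.

import torch
import torch.nn as nn
from transformers import BertModel

class TopicBertClassifier(nn.Module):
    """Hypothetical sketch: fuse a topic vector with BERT's [CLS] embedding."""

    def __init__(self, num_topics, num_labels, bert_name="bert-base-uncased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # Classify from the concatenation of [CLS] and topic proportions.
        self.classifier = nn.Linear(hidden + num_topics, num_labels)

    def forward(self, input_ids, attention_mask, topic_vector):
        # input_ids covers only a truncated window of the document, so
        # self-attention runs over fewer tokens (its cost is quadratic in
        # length); the topic vector carries the document-level signal.
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        fused = torch.cat([cls, topic_vector], dim=-1)
        return self.classifier(fused)

In the paper itself the topic model and BERT are trained jointly in the unified framework described above; keeping the topic vector as a plain input here is purely to keep the sketch self-contained.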

Cite

APA

Chaudhary, Y., Gupta, P., Saxena, K., Kulkarni, V., Runkler, T., & Schütze, H. (2020). TopicBERT for energy efficient document classification. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 1682–1690). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.findings-emnlp.152
