In 2017, Vaswani et al. proposed a new neural network architecture named the Transformer. This architecture quickly revolutionized natural language processing. Models such as GPT and BERT, which rely on the Transformer architecture, have outperformed the previous state-of-the-art networks by such a wide margin that virtually all recent cutting-edge models are Transformer-based. In this paper, we provide an overview and explanation of these latest models. We cover auto-regressive models such as GPT, GPT-2, and XLNet, as well as auto-encoder models such as BERT and the many post-BERT models like RoBERTa, ALBERT, and ERNIE 1.0/2.0.