Comparing BERT Against Traditional Machine Learning Models in Text Classification

Eduardo C. Garrido-Merchan; Roberto Gozalo-Brizuela; Santiago Gonzalez-Carvajal

Journal Article, Open Access

Journal of Computational and Cognitive Engineering (2023) 2(4) 352-356

DOI: 10.47852/bonviewJCCE3202838

119 Citations

300 Readers

Abstract

The Bidirectional Encoder Representations from Transformers (BERT) model has emerged as a popular state-of-the-art model in recent years. It can handle natural language processing (NLP) tasks such as supervised text classification without human supervision. Its flexibility in handling any corpus while delivering strong results has made this approach very popular in academia and industry, although other approaches had been used successfully before it. We first present BERT and review classical NLP approaches. We then empirically compare, across a suite of different scenarios, the behavior of BERT against a traditional term frequency-inverse document frequency (TF-IDF) vocabulary fed to machine learning models. The purpose of this work is to add empirical evidence supporting the use of BERT as a default for NLP tasks. The experiments show the superiority of BERT and its independence from features of the NLP problem, such as the language of the text, which supports using BERT as a default technique for NLP problems.
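For readers unfamiliar with the traditional baseline the abstract refers to, the following is a minimal sketch of a TF-IDF vocabulary fed to a simple classifier. It is an illustration only: the abstract does not say which machine learning models or preprocessing the authors used, so the nearest-centroid classifier, the whitespace tokenizer, the smoothed IDF formula, the toy documents, and every function name here are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real pipelines usually do more.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(doc, idf):
    # Term frequency times IDF, L2-normalized; terms outside the vocabulary are dropped.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def fit_centroids(docs, labels, idf):
    # Sum the TF-IDF vectors of each class; only the direction matters for cosine.
    centroids = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for t, w in tfidf_vector(doc, idf).items():
            centroids[label][t] += w
    return centroids


def predict(doc, idf, centroids):
    # Return the label whose class centroid has the highest cosine similarity.
    vec = tfidf_vector(doc, idf)

    def cosine(centroid):
        dot = sum(w * centroid.get(t, 0.0) for t, w in vec.items())
        norm = math.sqrt(sum(w * w for w in centroid.values()))
        return dot / norm if norm else 0.0

    return max(centroids, key=lambda label: cosine(centroids[label]))


# Toy corpus, invented for this sketch (not data from the paper).
train_docs = [
    "great movie loved the acting",
    "wonderful film great story",
    "terrible movie hated the plot",
    "awful film boring story",
]
train_labels = ["pos", "pos", "neg", "neg"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)
print(predict("loved this wonderful story", idf, centroids))
print(predict("boring plot hated it", idf, centroids))
```

The point of the sketch is the contrast the abstract draws: a pipeline like this depends on a fixed vocabulary and hand-chosen preprocessing for each corpus, whereas BERT is fine-tuned from pretrained contextual representations.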

Cite

Garrido-Merchan, E. C., Gozalo-Brizuela, R., & Gonzalez-Carvajal, S. (2023). Comparing BERT Against Traditional Machine Learning Models in Text Classification. Journal of Computational and Cognitive Engineering, 2(4), 352–356. https://doi.org/10.47852/bonviewJCCE3202838
