The volume of harmful posts among the large amounts of data published daily on social media makes it necessary to adopt automated technologies for online content moderation. Sentence classification and sentiment analysis are Natural Language Processing (NLP) techniques used to detect hate speech on social media platforms such as Facebook and Instagram. However, several difficulties reduce the effectiveness of these tools for the Portuguese language. Previous research has shown that NLP models achieve high accuracy when trained on datasets focused on Brazilian Portuguese. In this work, we propose the creation of a large-scale linguistic corpus for Brazilian Portuguese composed of posts collected from the social network Twitter. The experiments were performed by fine-tuning a pretrained transformer model.
CITATION STYLE
Rosa, C. C. S., Martinez, F. V., & Ishii, R. (2023). Natural Language Processing Techniques for Hate Speech Evaluation for Brazilian Portuguese. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14107 LNCS, pp. 104–117). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-37114-1_8