Subjectivity detection is often approached as a preparatory binary task for sentiment analysis, even though subjectivity is, in theory, typically defined as a matter of degree. In this work, we approach subjectivity analysis as a regression task and test the effectiveness of a transformer-based RoBERTa model in annotating the subjectivity of online news, including news from social media, based on a small subset of human-labeled training data. Experiments comparing our model to an existing rule-based subjectivity regressor and a state-of-the-art binary classifier reveal that: 1) our model correlates highly with human subjectivity ratings and outperforms the widely used rule-based Pattern subjectivity regressor (De Smedt and Daelemans, 2012); 2) our model performs well as a binary classifier and generalizes to the benchmark subjectivity dataset (Pang and Lee, 2004); 3) in contrast, state-of-the-art classifiers trained on the benchmark dataset perform catastrophically on our human-labeled data. These results bring to light issues with the gold-standard subjectivity dataset, and with the models trained on it, which appear to distinguish the origin/style of the texts rather than subjectivity as perceived by human English speakers.
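The sketch below illustrates the general kind of setup the abstract describes: fine-tuning a RoBERTa model with a single-output regression head on human subjectivity ratings, then comparing its predictions and the rule-based Pattern subjectivity scores against the human gold ratings. It is a minimal, hedged example, not the authors' released code; the file name, column names, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a RoBERTa subjectivity regressor,
# compared against the rule-based Pattern regressor (De Smedt & Daelemans, 2012).
import pandas as pd
from scipy.stats import pearsonr
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

MODEL_NAME = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# num_labels=1 with problem_type="regression" gives a single-score regression head (MSE loss).
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=1, problem_type="regression")

# Hypothetical CSV with columns "text" and "subjectivity" (mean human rating per text).
df = pd.read_csv("subjectivity_ratings.csv")
ds = Dataset.from_pandas(df).train_test_split(test_size=0.2, seed=42)

def tokenize(batch):
    enc = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
    enc["labels"] = [float(v) for v in batch["subjectivity"]]
    return enc

ds = ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="subj-regressor", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
trainer = Trainer(model=model, args=args,
                  train_dataset=ds["train"], eval_dataset=ds["test"])
trainer.train()

# Correlate model predictions with the held-out human ratings.
preds = trainer.predict(ds["test"]).predictions.squeeze(-1)
gold = ds["test"]["labels"]
print("RoBERTa regressor vs. human ratings:", pearsonr(preds, gold))

# Rule-based baseline: Pattern's sentiment() returns (polarity, subjectivity).
from pattern.en import sentiment
pattern_scores = [sentiment(t)[1] for t in ds["test"]["text"]]
print("Pattern regressor vs. human ratings:", pearsonr(pattern_scores, gold))
```

For binary evaluation on the Pang and Lee (2004) benchmark, the predicted scores could simply be thresholded (e.g. at the midpoint of the rating scale) to yield subjective/objective labels.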
Citation
Savinova, E., & del Prado Martín, F. M. (2023). Analyzing Subjectivity Using a Transformer-Based Regressor Trained on Naïve Speakers’ Judgements. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 305–314). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.wassa-1.27