While deep learning models have greatly improved performance on many tasks related to sentiment analysis and classification, they are often criticized as untrustworthy because of their black-box nature. As a result, numerous explainability techniques have been proposed to better understand model predictions and to improve the models themselves. In this work, we introduce InfoBarometer, the first benchmark for examining interpretability methods for sentiment analysis in the German automotive sector based on online news. Each news article in our dataset is annotated with the overall sentiment (positive, negative, or neutral), the target of the sentiment (focusing on innovation-related topics such as electromobility), and the rationales, i.e., textual explanations for the sentiment label that can be leveraged during both training and evaluation. We compare different state-of-the-art approaches to sentiment analysis and observe that even models that perform very well in classification do not score high on explainability metrics such as plausibility and faithfulness. The best-performing model, BERT, achieves a macro F1-score of 73.8 on polarity classification. Moreover, we evaluate different interpretability algorithms (LIME, SHAP, Integrated Gradients, Saliency) both quantitatively and qualitatively against rationales explicitly marked by human annotators. Our experiments demonstrate that the textual explanations often do not agree with human interpretations and rarely help to justify the model's decision. However, global features provide useful insights that help uncover spurious features in the model and biases within the dataset. We intend to make our dataset public for other researchers.
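To illustrate what comparing model explanations with human-marked rationales can look like, the following is a minimal sketch of one common plausibility measure: token-level F1 between the top-k attributed tokens and the annotated rationale tokens. The abstract does not state the exact metric used, so the choice of token-level F1, the function names `top_k_tokens` and `token_f1`, and the example scores are all assumptions made for illustration.

```python
from typing import Sequence, Set


def top_k_tokens(scores: Sequence[float], k: int) -> Set[int]:
    """Return the indices of the k tokens with the highest attribution scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def token_f1(predicted: Set[int], gold: Set[int]) -> float:
    """Token-level F1 between predicted and human-marked rationale indices."""
    overlap = len(predicted & gold)
    if overlap == 0:
        # Covers empty sets and disjoint sets alike.
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical attribution scores for a six-token sentence, as they might
# come from an attribution method such as LIME or SHAP (values are made up).
scores = [0.05, 0.40, 0.10, 0.30, 0.02, 0.13]

# Hypothetical human rationale: token positions 1, 2, and 3 were marked.
human_rationale = {1, 2, 3}

# Assumption: select as many tokens as the annotator marked.
predicted = top_k_tokens(scores, k=len(human_rationale))
plausibility = token_f1(predicted, human_rationale)
print(f"predicted rationale: {sorted(predicted)}, token F1: {plausibility:.3f}")
```

Setting k to the length of the human rationale is only one possible convention; a fixed k or a score threshold would yield different values, which is one reason plausibility scores are hard to compare across studies.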
Zielinski, A., Spolwind, C., Grimm, A., & Kroll, H. (2023). A Dataset for Explainable Sentiment Analysis in the German Automotive Industry. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 138–148). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.wassa-1.13