Abstract
Automated predictions require explanations that humans can interpret. One type of explanation is a rationale, i.e., a selection of input features, such as relevant text snippets, from which the model computes its outcome. However, a single overall selection does not provide a complete explanation, for example when a decision weighs several distinct aspects. To address this, we present a novel self-interpretable model called ConRAT. Inspired by how human explanations for high-level decisions are often based on key concepts, ConRAT extracts a set of text snippets as concepts and infers which of them are described in the document. It then explains the outcome with a linear aggregation of concepts. Two regularizers drive ConRAT to build interpretable concepts. In addition, we propose two techniques to further boost rationale and predictive performance. Experiments on both single- and multi-aspect sentiment classification tasks show that ConRAT is the first model to generate concepts that align with human rationalization while using only the overall label. Furthermore, it outperforms state-of-the-art methods trained on each aspect label independently.
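To make the abstract's description more concrete, the sketch below illustrates the general idea of concept-based rationalization: each concept selects relevant snippets via attention, a gate estimates whether the concept is present in the document, and a linear layer aggregates the concept scores into the prediction. All names, the attention-based selection, and the gating are illustrative assumptions; this is not the paper's actual ConRAT architecture and omits its two regularizers and the two additional techniques.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptRationalizer(nn.Module):
    """Minimal sketch of concept-based rationalization (hypothetical, not ConRAT itself):
    per-concept attention over input snippets, a presence gate per concept, and a
    linear aggregation of gated concept scores into the final prediction."""

    def __init__(self, embed_dim: int, num_concepts: int, num_classes: int):
        super().__init__()
        # One attention query per concept, used to select relevant snippets.
        self.concept_queries = nn.Parameter(torch.randn(num_concepts, embed_dim))
        # Gate: is this concept described in the document at all?
        self.concept_gate = nn.Linear(embed_dim, 1)
        # Per-concept score feeding the linear aggregation over concepts.
        self.concept_score = nn.Linear(embed_dim, 1)
        self.aggregate = nn.Linear(num_concepts, num_classes)

    def forward(self, snippet_embeds: torch.Tensor):
        # snippet_embeds: (batch, num_snippets, embed_dim)
        # Each concept attends over the snippets -> soft rationale selection.
        attn = torch.einsum("bse,ce->bcs", snippet_embeds, self.concept_queries)
        attn = F.softmax(attn, dim=-1)                       # (batch, concepts, snippets)
        concept_repr = torch.einsum("bcs,bse->bce", attn, snippet_embeds)
        gate = torch.sigmoid(self.concept_gate(concept_repr)).squeeze(-1)   # concept presence
        score = self.concept_score(concept_repr).squeeze(-1)                # per-concept score
        logits = self.aggregate(gate * score)                # linear aggregation of concepts
        return logits, attn, gate

# Usage example with random embeddings (batch of 4 documents, 10 snippets each).
model = ConceptRationalizer(embed_dim=64, num_concepts=5, num_classes=2)
logits, attn, gate = model(torch.randn(4, 10, 64))
```

The attention weights serve as the per-concept rationale (which snippets each concept selects), while the gate indicates which concepts the model considers present; the linear aggregation keeps the mapping from concepts to the outcome interpretable.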
Citation
Antognini, D., & Faltings, B. (2021). Rationalization through Concepts. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 761–775). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.68