Improving Interpretability via Explicit Word Interaction Graph Layer


Abstract

Recent NLP literature has seen growing interest in improving model interpretability. Along this direction, we propose a trainable neural network layer that learns a global interaction graph between words and then selects more informative words using the learned word interactions. Our layer, which we call WIGRAPH, can be plugged into any neural-network-based NLP text classifier right after its word embedding layer. Across multiple SOTA NLP models and various NLP datasets, we demonstrate that adding the WIGRAPH layer substantially improves NLP models' interpretability and enhances models' prediction performance at the same time.
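To make the idea concrete, below is a minimal PyTorch sketch of how a trainable word-interaction layer could sit right after an embedding layer and reweight word embeddings by learned pairwise interactions. The class name, the low-rank factorization of the vocabulary-level interaction matrix, and the sigmoid-based soft word selection are illustrative assumptions, not the authors' actual WIGRAPH implementation.

```python
import torch
import torch.nn as nn


class WordInteractionGraphLayer(nn.Module):
    """Hypothetical sketch: a trainable layer that scores pairwise word
    interactions and uses them to softly select informative words."""

    def __init__(self, vocab_size: int, rank: int = 64):
        super().__init__()
        # Low-rank factors U, V stand in for a full |V| x |V| interaction
        # matrix (an assumption made here to keep the sketch memory-friendly).
        self.U = nn.Parameter(torch.randn(vocab_size, rank) * 0.01)
        self.V = nn.Parameter(torch.randn(vocab_size, rank) * 0.01)

    def forward(self, token_ids: torch.Tensor, embeddings: torch.Tensor) -> torch.Tensor:
        # token_ids:  (batch, seq_len)       integer word ids
        # embeddings: (batch, seq_len, dim)  output of the embedding layer
        u = self.U[token_ids]                            # (batch, seq_len, rank)
        v = self.V[token_ids]                            # (batch, seq_len, rank)
        # Pairwise interaction scores among the words in each sentence.
        interactions = torch.bmm(u, v.transpose(1, 2))   # (batch, seq_len, seq_len)
        # A word's importance is its average interaction with the other words,
        # squashed into (0, 1) to act as a soft selection weight.
        importance = torch.sigmoid(interactions.mean(dim=-1, keepdim=True))
        # Rescale embeddings so more informative words are emphasized.
        return embeddings * importance
```

In use, a classifier would apply this layer to the embedding output, e.g. `layer(token_ids, embedding(token_ids))`, before its encoder, mirroring the placement right after the word embedding layer described in the abstract.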

Citation (APA)

Sekhon, A., Chen, H., Shrivastava, A., Wang, Z., Ji, Y., & Qi, Y. (2023). Improving Interpretability via Explicit Word Interaction Graph Layer. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (Vol. 37, pp. 13528–13537). AAAI Press. https://doi.org/10.1609/aaai.v37i11.26586
