Neuron-level Interpretation of Deep NLP Models: A Survey

Hassan Sajjad; Nadir Durrani; Fahim Dalvi

Journal Article, Open Access

Transactions of the Association for Computational Linguistics (2022) 10 1285-1303

DOI: 10.1162/tacl_a_00519

Abstract

The proliferation of Deep Neural Networks in various domains has increased the need for interpretability of these models. Preliminary work along this line, and the papers that surveyed it, focused on high-level representation analysis. However, a recent branch of work has concentrated on interpretability at a more granular level: analyzing neurons within these models. In this paper, we survey the work done on neuron analysis, including: i) methods to discover and understand neurons in a network; ii) evaluation methods; iii) major findings, including cross-architectural comparisons, that neuron analysis has unraveled; iv) applications of neuron probing, such as controlling the model, domain adaptation, and so forth; and v) a discussion of open issues and future research directions.

Sajjad, H., Durrani, N., & Dalvi, F. (2022). Neuron-level Interpretation of Deep NLP Models: A Survey. Transactions of the Association for Computational Linguistics, 10, 1285–1303. https://doi.org/10.1162/tacl_a_00519
