Abstract
The proliferation of deep neural networks across domains has increased the need for interpretability of these models. Preliminary work along this line, and the papers that surveyed it, focused on high-level representation analysis. However, a recent branch of work has concentrated on interpretability at a more granular level: analyzing individual neurons within these models. In this paper, we survey the work done on neuron analysis, including: i) methods to discover and understand neurons in a network; ii) evaluation methods; iii) major findings, including cross-architectural comparisons, that neuron analysis has unraveled; iv) applications of neuron probing, such as controlling the model and domain adaptation; and v) a discussion of open issues and future research directions.
Cite
Sajjad, H., Durrani, N., & Dalvi, F. (2022). Neuron-level Interpretation of Deep NLP Models: A Survey. Transactions of the Association for Computational Linguistics, 10, 1285–1303. https://doi.org/10.1162/tacl_a_00519