The behavior of simple recurrent neural networks trained on regular languages is analyzed in terms of accuracy and interpretability. We use controlled amounts of noise and L1 regularization to obtain stable and accurate responses that are at the same time highly interpretable, and introduce a shocking mechanism that reactivates silent neurons when learning stalls due to excessive regularization. Proper parameter tuning allows the networks to develop a strong generalization capacity while providing solutions that may be interpreted as finite automata. Experiments carried out with different regular languages show that, in all cases, the trained networks display activation patterns that automatically cluster into a set of discrete states, with no need for explicit quantization. Analysis of the transitions between states in response to the input symbols reveals that the networks are in fact implementing a finite state machine that matches the regular expression used to generate the training data.
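The extraction step described in the abstract, recording hidden activations, letting them cluster into discrete states, and reading off a transition table, can be sketched as follows. Note the assumptions: this is not the paper's trained network but a hypothetical one-neuron tanh RNN with hand-picked weights (stand-ins for what noisy, L1-regularized training might produce) that recognizes strings over {a, b} containing at least one 'a'; the greedy 0.1-tolerance clustering is likewise an illustrative choice.

```python
import numpy as np

# Hypothetical hand-crafted 1-neuron tanh "RNN" whose saturated
# activations mimic the discrete states reported in the paper.
# Weights are illustrative assumptions, not learned values.
W_H, W_X, B = 5.0, 10.0, -2.0  # recurrent weight, input weight, bias

def step(h, sym):
    """One RNN step; input 'a' is encoded as 1.0, 'b' as 0.0."""
    x = 1.0 if sym == "a" else 0.0
    return np.tanh(W_H * h + W_X * x + B)

# 1) Record hidden activations and transitions over random strings.
rng = np.random.default_rng(0)
activations, transitions = [], []  # transitions: (h_before, symbol, h_after)
for _ in range(200):
    string = "".join(rng.choice(["a", "b"], size=rng.integers(1, 10)))
    h = -1.0  # initial hidden state (start state of the automaton)
    for sym in string:
        h2 = step(h, sym)
        transitions.append((h, sym, h2))
        activations.append(h2)
        h = h2

# 2) Cluster activations greedily: values within 0.1 of the previous
#    cluster center are merged; no quantization grid is imposed.
centers = []
for v in sorted(set(activations) | {-1.0}):
    if not centers or v - centers[-1] > 0.1:
        centers.append(v)

def cluster(h):
    """Index of the nearest cluster center (= discrete state label)."""
    return int(np.argmin([abs(h - c) for c in centers]))

# 3) Build the state transition table; if every (state, symbol) pair
#    maps to a single next state, the network behaves as a DFA.
table = {}
for h, sym, h2 in transitions:
    table.setdefault((cluster(h), sym), set()).add(cluster(h2))

assert all(len(v) == 1 for v in table.values())  # deterministic machine
```

Under these assumptions the activations fall into exactly two clusters (roughly -1 and +1), and the recovered table is the two-state DFA for "contains at least one 'a'": state 0 loops on 'b' and moves to state 1 on 'a', while state 1 is absorbing.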
Oliva, C., & Lago-Fernández, L. F. (2019). On the Interpretation of Recurrent Neural Networks as Finite State Machines. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11727 LNCS, pp. 312–323). Springer Verlag. https://doi.org/10.1007/978-3-030-30487-4_25