Extracting automata from neural networks using active learning

Abstract

Deep learning is one of the most advanced forms of machine learning. Most modern deep learning models are based on artificial neural networks, and benchmarking studies show that neural networks have produced results comparable to, and in some cases superior to, those of human experts. However, the resulting neural networks are typically regarded as incomprehensible black-box models, which not only limits their applications but also hinders testing and verification. In this paper, we present an active learning framework to extract automata from neural network classifiers, which can help users understand the classifiers. In more detail, we use Angluin's L* algorithm as the learner and the neural network under learning as the oracle, employing abstraction interpretation of the neural network to answer membership and equivalence queries. Our abstraction consists of value, symbol and word abstractions. The factors that may affect the abstraction are also discussed in the paper. We have implemented our approach in a prototype. To evaluate it, we have applied the prototype to an MNIST classifier and found that the abstraction with interval number 2 and block size 1 × 28 offers the best performance in terms of F1 score. We have also compared our extracted DFA against the DFAs learned by the passive learning algorithms provided in LearnLib, and the experimental results show that our DFA performs better on the MNIST dataset.
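To make the three abstraction levels more concrete, the following is a minimal Python sketch of how an image might be turned into a word over a small alphabet. It is an illustration based only on the abstract, not the paper's definitions: the function names, the equal-width intervals, and the use of the block mean to summarize a block are all assumptions.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels in 0..255, e.g. 28 x 28 for MNIST


def value_abstraction(value: float, intervals: int = 2, max_value: int = 255) -> int:
    """Map a pixel value to the index of an equal-width interval (assumed scheme)."""
    width = (max_value + 1) / intervals
    return min(int(value / width), intervals - 1)


def symbol_abstraction(block: List[int], intervals: int = 2) -> int:
    """Summarize a block of pixels as one symbol.

    Assumption: the block is represented by its mean pixel value, which is
    then passed through the value abstraction.
    """
    mean = sum(block) / len(block)
    return value_abstraction(mean, intervals)


def word_abstraction(image: Image, block_h: int = 1, block_w: int = 28,
                     intervals: int = 2) -> List[int]:
    """Split the image into blocks (row-major order) and emit one symbol per block."""
    rows, cols = len(image), len(image[0])
    word = []
    for r in range(0, rows, block_h):
        for c in range(0, cols, block_w):
            block = [image[i][j]
                     for i in range(r, min(r + block_h, rows))
                     for j in range(c, min(c + block_w, cols))]
            word.append(symbol_abstraction(block, intervals))
    return word


# Hypothetical image: top half black, bottom half white.
example = [[0] * 28 for _ in range(14)] + [[255] * 28 for _ in range(14)]
example_word = word_abstraction(example)
```

Under these assumptions, interval number 2 with block size 1 × 28 yields a two-symbol alphabet and one symbol per image row, so each MNIST image becomes a word of length 28. A membership query from the learner would then be answered by relating such a word back to the classifier's output; that step is not shown here.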

Cite

Xu, Z., Wen, C., Qin, S., & He, M. (2021). Extracting automata from neural networks using active learning. PeerJ Computer Science, 7, 1–28. https://doi.org/10.7717/peerj-cs.436
