Deep learning (DL) techniques are proven effective in many chal-lenging tasks, and become widely-adopted in practice. However, previous work has shown that DL libraries, the basis of building and executing DL models, contain bugs and can cause severe con-sequences. Unfortunately, existing testing approaches still cannot comprehensively exercise DL libraries. They utilize existing trained models and only detect bugs in model inference phase. In this work we propose Muffin to address these issues. To this end, Muffin applies a specifically-designed model fuzzing approach, which al-lows it to generate diverse DL models to explore the target library, instead of relying only on existing trained models. Muffin makes differential testing feasible in the model training phase by tailoring a set of metrics to measure the inconsistencies between different DL libraries. In this way, Muffin can best exercise the library code to detect more bugs. To evaluate the effectiveness of Muffin, we conduct experiments on three widely-used DL libraries. The results demonstrate that Muffin can detect 39 new bugs in the latest release versions of popular DL libraries, including Tensorflow, CNTK, and Theano.
CITATION STYLE
Gu, J., Luo, X., Zhou, Y., & Wang, X. (2022). Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing. In Proceedings - International Conference on Software Engineering (Vol. 2022-May, pp. 1418–1430). IEEE Computer Society. https://doi.org/10.1145/3510003.3510092
Mendeley helps you to discover research relevant for your work.