Bug localization plays an important role in software maintenance. Traditional approaches treat source code from a lexical perspective, while recent research indicates that exploiting program structure improves bug localization. The control flow graph (CFG) is a widely used graph representation of program structure. Although applying a graph neural network for feature learning is straightforward and has proven effective in various software mining problems, it is ill-suited to the CFG, because adjacent nodes in the CFG may be semantically unrelated. At the same time, earlier statements may affect the semantics of subsequent statements along the execution path, a property we call the flowing nature of the control flow graph. In this paper, we argue that the flowing nature should be considered explicitly and propose a novel model named cFlow for bug localization, which employs a specially designed flow-based GRU for feature learning from the CFG. The flow-based GRU exploits the program structure represented by the CFG to transmit the semantics of statements along the execution path, which reflects the flowing nature. Experimental results on widely used real-world software projects show that cFlow significantly outperforms state-of-the-art bug localization methods, indicating that exploiting the program structure of the CFG with respect to its flowing nature is beneficial for bug localization.
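The idea of transmitting statement semantics along the execution path can be illustrated with a small sketch. This is not the cFlow implementation, whose details the abstract does not give. It assumes an acyclic CFG (loop back edges removed), random untrained weights, and mean-pooling of predecessor states at merge points. The names `GRUCell`, `propagate`, `cfg_preds`, and `features` are hypothetical.

```python
import math
import random
from graphlib import TopologicalSorter  # Python 3.9+

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

class GRUCell:
    """Plain GRU cell with random, untrained weights (illustration only)."""
    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        def mat():
            return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
        self.dim = dim
        self.Wz, self.Uz, self.Wr, self.Ur, self.Wh, self.Uh = (mat() for _ in range(6))

    def step(self, x, h):
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.Wz, x), matvec(self.Uz, h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.Wr, x), matvec(self.Ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.Wh, x), matvec(self.Uh, rh))]
        # New state: gated mix of the incoming state and the candidate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def propagate(cfg_preds, features, cell):
    """Visit CFG nodes in execution order; each node's incoming state is the
    mean of its predecessors' states (an assumed aggregation rule)."""
    hidden = {}
    for node in TopologicalSorter(cfg_preds).static_order():
        preds = cfg_preds.get(node, [])
        if preds:
            h_in = [sum(hidden[p][i] for p in preds) / len(preds) for i in range(cell.dim)]
        else:
            h_in = [0.0] * cell.dim  # entry node starts from a zero state
        hidden[node] = cell.step(features[node], h_in)
    return hidden

# Toy CFG for an if/else: entry -> cond -> {then, else} -> merge.
# The mapping is node -> list of predecessor nodes.
cfg_preds = {
    "cond": ["entry"],
    "then": ["cond"],
    "else": ["cond"],
    "merge": ["then", "else"],
}
DIM = 4
rng = random.Random(1)
# Stand-in statement embeddings; a real system would learn these from code.
features = {n: [rng.uniform(-1, 1) for _ in range(DIM)]
            for n in ["entry", "cond", "then", "else", "merge"]}
cell = GRUCell(DIM)
hidden = propagate(cfg_preds, features, cell)
```

In this sketch the state of `merge` depends on every statement that can execute before it, while the `then` and `else` branches do not influence each other even though both neighbor `cond` in the graph. That is the distinction the abstract draws between flow-based propagation and neighborhood aggregation.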
Ma, Y. F., & Li, M. (2022). The flowing nature matters: feature learning from the control flow graph of source code for bug localization. Machine Learning, 111(3), 853–870. https://doi.org/10.1007/s10994-021-06078-4