Modeling Hierarchical Syntax Structure with Triplet Position for Source Code Summarization

Juncai Guo; Jin Liu; Yao Wan; Li Li; Pingyi Zhou

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2022), Vol. 1, pp. 486-500

DOI: 10.18653/v1/2022.acl-long.37

24 Citations

58 Readers

Abstract

Automatic code summarization, which aims to describe source code in natural language, has become an essential task in software maintenance. Researchers have pursued this goal through various machine learning-based approaches. One key challenge that keeps these approaches from being practical is their failure to retain the semantic structure of source code, which state-of-the-art methods have unfortunately overlooked. Existing approaches represent the syntax structure of code by modeling Abstract Syntax Trees (ASTs). However, the hierarchical structures of ASTs have not been well explored. In this paper, we propose CODESCRIBE to model the hierarchical syntax structure of code by introducing a novel triplet position for code summarization. Specifically, CODESCRIBE leverages a graph neural network and a Transformer to preserve the structural and sequential information of code, respectively. In addition, we propose a pointer-generator network that attends to both the structure and the sequential tokens of code for better summary generation. Experiments on two real-world datasets in Java and Python demonstrate the effectiveness of our proposed approach when compared with several state-of-the-art baselines.
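The abstract does not spell out how the pointer-generator network combines its inputs. As a rough illustration only, the sketch below shows the generic pointer-generator idea: a generation distribution over the vocabulary is mixed with a copy distribution derived from attention over source positions. It uses a single attention source, whereas the abstract describes attention over both structure and sequential tokens, so this is not CODESCRIBE's formulation. The function name, arguments, and toy numbers are all hypothetical.

```python
def pointer_generator_mix(p_vocab, attention, source_ids, p_gen):
    """Mix a generation distribution with a copy distribution.

    Generic single-source pointer-generator mixing (an assumption for
    illustration, not the paper's exact model).

    p_vocab:    probabilities over the vocabulary (sums to 1)
    attention:  attention weights over source positions (sums to 1)
    source_ids: vocabulary id of the token at each source position
    p_gen:      probability in [0, 1] of generating rather than copying
    """
    # Scale the generation distribution by the generation probability.
    final = [p_gen * p for p in p_vocab]
    # Add copy mass: each source position contributes its attention
    # weight to the vocabulary entry of the token found there.
    for weight, token_id in zip(attention, source_ids):
        final[token_id] += (1.0 - p_gen) * weight
    return final


# Toy example: vocabulary of 4 entries, source sequence of 2 tokens.
p_vocab = [0.5, 0.3, 0.2, 0.0]   # the decoder alone never emits id 3
attention = [0.75, 0.25]         # attention over the two source tokens
source_ids = [3, 1]              # source tokens map to vocabulary ids 3 and 1
final = pointer_generator_mix(p_vocab, attention, source_ids, p_gen=0.6)
```

In this toy case, vocabulary id 3 has zero generation probability but still receives probability mass through copying, which is why such mechanisms help with identifiers that are rare in the summary vocabulary.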

Cite

Guo, J., Liu, J., Wan, Y., Li, L., & Zhou, P. (2022). Modeling Hierarchical Syntax Structure with Triplet Position for Source Code Summarization. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 486–500). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.37
