Bi-LSTM-Based Neural Source Code Summarization

Sarah Aljumah; Lamia Berriche

Journal Article, Open Access

Applied Sciences (Switzerland) (2022) 12(24)

DOI: 10.3390/app122412587

5 Citations

13 Readers

Abstract

Featured Application: Code comment generation.

Software developers often rely on code summarization when fixing or reusing code. Software documentation is essential for software maintenance, which accounts for the highest cost in software development because code is difficult to modify. To help reduce the cost and time spent on software development and maintenance, we introduce an automated comment summarization and commenting technique using state-of-the-art techniques in summarization. We use deep neural networks, specifically bidirectional long short-term memory (Bi-LSTM), combined with an attention model to enhance performance. In this study, we propose two different scenarios: one that uses the code text and the structure of the code represented in an abstract syntax tree (AST), and another that uses only the code text. For the first scenario, we propose two encoder-based models that encode the code text and the AST independently. Previous works have used different deep neural network techniques to generate comments. The methodologies proposed in this study scored higher than previous works based on the gated recurrent unit encoder. We conducted our experiment on a dataset of 2.1 million pairs of Java methods and comments. Additionally, we showed that the code structure is beneficial for method signatures featuring unclear words.

Cite

Aljumah, S., & Berriche, L. (2022). Bi-LSTM-Based Neural Source Code Summarization. Applied Sciences (Switzerland), 12(24). https://doi.org/10.3390/app122412587
