CPC: Automatically classifying and propagating natural language comments via program analysis


Abstract

Code comments provide abundant information that has been leveraged to help perform various software engineering tasks, such as bug detection, specification inference, and code synthesis. However, developers are less motivated to write and update comments, making it infeasible and error-prone to leverage comments to facilitate software engineering tasks. In this paper, we propose to leverage program analysis to systematically derive, refine, and propagate comments. For example, by propagation via program analysis, comments can be passed on to code entities that are not commented, such that code bugs can be detected by leveraging the propagated comments. Developers usually comment on different aspects of code elements like methods, and use comments to describe various contents, such as functionalities and properties. To utilize comments more effectively, a fine-grained and elaborated taxonomy of comments and a reliable classifier to automatically categorize a comment are needed. In this paper, we build a comprehensive taxonomy and propose using program analysis to propagate comments. We develop a prototype, CPC, and evaluate it on 5 projects. The evaluation results demonstrate that 41573 new comments can be derived by propagation from other code locations with 88% accuracy. Among them, we can derive precise functional comments for 87 native methods that have neither existing comments nor source code. Leveraging the propagated comments, we detect 37 new bugs in large open-source projects, 30 of which have been confirmed and fixed by developers, and 304 defects in existing comments (by looking at inconsistencies between existing and propagated comments), including 12 incomplete comments and 292 wrong comments. This demonstrates the effectiveness of our approach. Our user study confirms that propagated comments align well with existing comments in terms of quality.
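To make the idea of propagation concrete, the following is a minimal sketch, not CPC's actual algorithm. It assumes a deliberately simplified model in which an uncommented method that only forwards to another method can inherit that method's comment. The `Method` type, the `delegates_to` field, and the sample method names are all hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Method:
    """Hypothetical, simplified view of a method for illustration only."""
    name: str
    comment: Optional[str] = None
    # Name of the single method this one merely forwards to, if any.
    delegates_to: Optional[str] = None


def propagate_comments(methods: List[Method]) -> Dict[str, str]:
    """Derive comments for uncommented wrapper methods from their delegates."""
    by_name = {m.name: m for m in methods}
    derived: Dict[str, str] = {}
    for m in methods:
        if m.comment is not None or m.delegates_to is None:
            continue  # already commented, or nothing to propagate from
        seen = {m.name}  # guards against delegation cycles
        target = by_name.get(m.delegates_to)
        # Follow the delegation chain until a commented method is found.
        while (target is not None and target.comment is None
               and target.delegates_to and target.name not in seen):
            seen.add(target.name)
            target = by_name.get(target.delegates_to)
        if target is not None and target.comment is not None:
            derived[m.name] = target.comment
    return derived


# Hypothetical sample data: two uncommented wrappers and one commented method.
sample = [
    Method("readAll", delegates_to="readAllInternal"),
    Method("readAllInternal", delegates_to="nativeRead"),
    Method("nativeRead", comment="Reads all bytes until end of stream."),
    Method("loopA", delegates_to="loopB"),
    Method("loopB", delegates_to="loopA"),
]
derived_comments = propagate_comments(sample)
```

In this toy model, a derived comment that disagrees with a hand-written one on the same method would be a candidate inconsistency to inspect, which loosely mirrors how the abstract describes finding defects in existing comments.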

Cite

Zhai, J., Xu, X., Shi, Y., Tao, G., Pan, M., Ma, S., … Zhang, X. (2020). CPC: Automatically classifying and propagating natural language comments via program analysis. In Proceedings - International Conference on Software Engineering (pp. 1359–1371). IEEE Computer Society. https://doi.org/10.1145/3377811.3380427
