Exploring the Characteristics of Identifiers: A Large-Scale Empirical Study on 5,000 Open Source Projects

Jingxuan Zhang; Siyuan Liu; Junpeng Luo; Jiahui Liang; Zhiqiu Huang

Journal Article, Open Access

IEEE Access (2020) 8 140607-140620

DOI: 10.1109/ACCESS.2020.3013694

Abstract

Informative identifiers are crucial for the comprehensibility and maintainability of programs. Exploring the properties of identifiers and investigating their impact on software artifacts has been an important research focus. However, to enable such work, we first need a comprehensive understanding of the main characteristics of identifiers, which have unfortunately not been sufficiently studied. For example, it remains unclear which Part of Speech (POS) tags developers commonly use when defining identifiers. To address such open issues, we conducted a large-scale empirical study on the naturalness of identifiers, based on 5,000 open source Java and Android projects, covering five dimensions of identifiers: distributions, compositions, POS tags, lengths, and initializations. The results yield five key findings about identifiers in programs, including the observation that three POS tags (nouns, verbs, and adjectives) are the most commonly used when developers define identifiers. Furthermore, based on our findings, we provide implications and insights for developers, researchers, and Integrated Development Environments (IDEs) in contexts where identifier-related activities are performed or identifier-related functionality is provided.

Cite

Zhang, J., Liu, S., Luo, J., Liang, J., & Huang, Z. (2020). Exploring the Characteristics of Identifiers: A Large-Scale Empirical Study on 5,000 Open Source Projects. IEEE Access, 8, 140607–140620. https://doi.org/10.1109/ACCESS.2020.3013694
