An empirical study on the importance of source code entities for requirements traceability

27Citations
Citations of this article
53Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Requirements Traceability (RT) links help developers during program comprehension and maintenance tasks. However, creating RT links is a laborious and resource-consuming task. Information Retrieval (IR) techniques are useful to automatically create traceability links. However, IR-based techniques typically have low accuracy (precision, recall, or both) and thus, creating RT links remains a human intensive process. We conjecture that understanding how developers verify RT links could help improve the accuracy of IR-based RT techniques to create RT links. Consequently, we perform an empirical study consisting of four case studies. First, we use an eye-tracking system to capture developers’ eye movements while they verify RT links. We analyse the obtained data to identify and rank developers’ preferred types of Source Code Entities (SCEs), e.g., domain vs. implementation-level source code terms and class names vs. method names. Second, we perform another eye-tracking case study to confirm that it is the semantic content of the developers’ preferred types of SCEs and not their locations that attract developers’ attention and help them in their task to verify RT links. Third, we propose an improved term weighting scheme, i.e., Developers Preferred Term Frequency/Inverse Document Frequency (D P T F / I D F), that uses the knowledge of the developers’ preferred types of SCEs to give more importance to these SCEs into the term weighting scheme. We integrate thisweighting scheme with an IR technique, i.e., Latent Semantic Indexing (LSI), to create a new technique to RT link recovery. Using three systems (iTrust, Lucene, and Pooka), we show that the proposed technique statistically improves the accuracy of the recovered RT links over a technique based on LSI and the usual Term Frequency/Inverse Document Frequency (T F / I D F) weighting scheme. Finally, we compare the newly proposed D P T F / I D F with our original Domain Or Implementation/Inverse Document Frequency (D O I / I D F) weighting scheme.

Cite

CITATION STYLE

APA

Ali, N., Sharafi, Z., Guéhéneuc, Y. G., & Antoniol, G. (2015). An empirical study on the importance of source code entities for requirements traceability. Empirical Software Engineering, 20(2), 442–478. https://doi.org/10.1007/s10664-014-9315-y

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free