Noise in Mylyn interaction traces and its impact on developers and recommendation systems

Zéphyrin Soh; Foutse Khomh; Yann Gaël Guéhéneuc; Giuliano Antoniol

Journal Article

Empirical Software Engineering (2018) 23(2) 645-692

DOI: 10.1007/s10664-017-9529-x

Abstract

Interaction traces (ITs) are developers’ logs collected while developers maintain or evolve software systems. Researchers use ITs to study developers’ editing styles and recommend relevant program entities when developers perform changes on source code. However, when using ITs, they make assumptions that may not necessarily be true. This article assesses the extent to which researchers’ assumptions are true and examines noise in ITs. It also investigates the impact of noise on previous studies. This article describes a quasi-experiment collecting both Mylyn ITs and video-screen captures while 15 participants performed four realistic software maintenance tasks. It assesses the noise in ITs by comparing Mylyn ITs and the ITs obtained from the video captures. It proposes an approach to correct noise and uses this approach to revisit previous studies. The collected data show that Mylyn ITs can miss, on average, about 6% of the time spent by participants performing tasks and can contain, on average, about 85% of false edit events, which are not real changes to the source code. The approach to correct noise reveals about 45% of misclassification of ITs. It can improve the precision and recall of recommendation systems from the literature by up to 56% and 62%, respectively. Mylyn ITs include noise that biases subsequent studies and, thus, can prevent researchers from assisting developers effectively. They must be cleaned before use in studies and recommendation systems. The results on Mylyn ITs open new perspectives for the investigation of noise in ITs generated by other monitoring tools such as DFlow, FeedBag, and Mimec, and for future studies based on ITs.

Citation (APA)

Soh, Z., Khomh, F., Guéhéneuc, Y. G., & Antoniol, G. (2018). Noise in Mylyn interaction traces and its impact on developers and recommendation systems. Empirical Software Engineering, 23(2), 645–692. https://doi.org/10.1007/s10664-017-9529-x
