Detecting Duplicate Contributions in Pull-Based Model Combining Textual and Change Similarities

Abstract

Communication and coordination among open source software (OSS) developers who do not work in the same physical location have always been challenging. The pull-based development model, the state-of-the-art collaborative development mechanism, provides high openness and transparency to improve the visibility of contributors’ work. However, because this model is parallel and uncoordinated, more than one contributor may still submit duplicate contributions that solve the same problem. If not detected in time, duplicate pull-requests cause contributors and reviewers to waste time and energy on redundant work. In this paper, we propose an approach that combines textual and change similarities to automatically detect duplicate contributions in the pull-based model at submission time. For a newly arriving contribution, we first compute the textual similarity and the change similarity between it and each existing contribution. Our method then returns a list of candidate duplicate contributions that are most similar to the new contribution in terms of the combined textual and change similarity. The evaluation shows that, on average, 83.4% of the duplicates can be found using the combined textual and change similarity, compared with 54.8% using only textual similarity and 78.2% using only change similarity.
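
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measures or how they are combined, so everything concrete here is an assumption made for illustration: textual similarity is approximated by cosine similarity over word counts, change similarity by Jaccard overlap of changed file paths, and the two are merged with a hypothetical `text_weight` parameter. The function names and the dictionary layout of a contribution are likewise hypothetical, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity of raw term counts (assumed stand-in for textual similarity)."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def jaccard_similarity(files_a, files_b):
    """Jaccard overlap of changed file paths (assumed stand-in for change similarity)."""
    a, b = set(files_a), set(files_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(new_pr, old_pr, text_weight=0.5):
    """Weighted average of the two similarities; the weighting scheme is an assumption."""
    text_sim = cosine_similarity(new_pr["text"], old_pr["text"])
    change_sim = jaccard_similarity(new_pr["files"], old_pr["files"])
    return text_weight * text_sim + (1 - text_weight) * change_sim


def rank_candidates(new_pr, existing_prs, top_k=3, text_weight=0.5):
    """Return the top_k (id, score) pairs, most similar existing contribution first."""
    scored = [(combined_similarity(new_pr, pr, text_weight), pr["id"])
              for pr in existing_prs]
    # Sort by descending score; break ties by ascending id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(pr_id, score) for score, pr_id in scored[:top_k]]
```

Under these assumptions, each contribution is a dictionary with `id`, `text` (for example, title plus description), and `files` (changed paths), and `rank_candidates(new_pr, existing_prs)` yields the candidate list that a reviewer would inspect at submission time.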

Cite

Li, Z. X., Yu, Y., Wang, T., Yin, G., Mao, X. J., & Wang, H. M. (2021). Detecting Duplicate Contributions in Pull-Based Model Combining Textual and Change Similarities. Journal of Computer Science and Technology, 36(1), 191–206. https://doi.org/10.1007/s11390-020-9935-1