Automatic Core-Developer Identification on GitHub: A Validation Study

Abstract

Many open-source software projects are self-organized and do not maintain official lists with information on developer roles. As a result, knowing which developers take on core and maintainer roles is often tacit knowledge, despite its relevance. We propose a method to automatically identify core developers based on role permissions of privileged events triggered in GitHub issues and pull requests. In an empirical study on 25 GitHub projects, (1) we validate the set of automatically identified core developers with a sample of project-reported developer lists, and (2) we use our set of identified core developers to assess the accuracy of state-of-the-art unsupervised developer classification methods. Our results indicate that the set of core developers, which we extracted from privileged issue events, is sound, and that the accuracy of state-of-the-art unsupervised classification methods depends mainly on the data source (commit data versus issue data) rather than on the network-construction method (directed versus undirected, etc.). Looking ahead, our results can guide researchers and practitioners in choosing appropriate unsupervised classification methods, and our method can help create reliable ground-truth data for training supervised classification methods.
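
The idea of identifying core developers from privileged events, and of checking the result against a project-reported list, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the set of event types treated as privileged, the record layout, the function names, and the sample data are all assumptions made for illustration only.

```python
from collections import defaultdict

# Assumption: an illustrative set of event types that typically require
# elevated permissions; the study's actual event list is not given here.
PRIVILEGED_EVENTS = {"merged", "labeled", "milestoned", "locked"}


def identify_core_developers(events, min_events=1):
    """Return actors who triggered at least `min_events` privileged events."""
    counts = defaultdict(int)
    for event in events:
        if event["event"] in PRIVILEGED_EVENTS:
            counts[event["actor"]] += 1
    return {actor for actor, n in counts.items() if n >= min_events}


def precision_recall(predicted, reference):
    """Compare a predicted set of core developers against a reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical, hand-made event records (not data from the study).
events = [
    {"actor": "alice", "event": "merged"},
    {"actor": "alice", "event": "labeled"},
    {"actor": "bob", "event": "commented"},
    {"actor": "carol", "event": "locked"},
]
core = identify_core_developers(events)

# Hypothetical project-reported developer list used as the reference.
reported = {"alice", "dave"}
precision, recall = precision_recall(core, reported)
```

In this toy data, only actors with at least one event from the assumed privileged set are classified as core, and the same precision and recall helper could in principle be used to score any other classification method against the identified set.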

Cite

Bock, T., Alznauer, N., Joblin, M., & Apel, S. (2023). Automatic Core-Developer Identification on GitHub: A Validation Study. ACM Transactions on Software Engineering and Methodology, 32(6). https://doi.org/10.1145/3593803
