Exploring identical users on GitHub and Stack Overflow

Abstract

Analyzing the behaviour of developers across platforms (in this paper, GitHub and Stack Overflow) can reveal interesting facts about development activities. Only a few datasets exist for analysing cross-platform user behaviour, especially across GitHub and Stack Overflow. In these datasets, users on GitHub and Stack Overflow are identified by matching email addresses. To increase the number of identifiable users, this paper retrieves potentially identifiable users between GitHub and Stack Overflow without relying solely on email addresses. It employs classification-based link prediction, which formulates the user identification problem as a link prediction problem on the bipartite graph consisting of GitHub users and Stack Overflow users. With this identification method, the paper generates a probabilistic dataset containing pairs of users with probabilities (or confidences). The paper also publishes the identification tool to enable further data generation on newly appearing datasets of GitHub, Stack Overflow, and other platforms. The generated dataset and tool are highly helpful for accelerating research on mining software repositories.
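
The idea of classification-based link prediction on a bipartite user graph can be illustrated with a minimal sketch. It is not the authors' tool: the field names (`login`, `display_name`, `location`), the two similarity features, the toy training pairs, and the hand-written logistic regression are all assumptions chosen for illustration. Each candidate edge between a GitHub user and a Stack Overflow user is turned into a feature vector, a classifier is trained on pairs whose identity is already known (for example through matching email addresses), and the classifier's output is read as the probability that the two accounts belong to the same person.

```python
import math
from difflib import SequenceMatcher


def pair_features(gh_user, so_user):
    """Feature vector for one candidate edge (GitHub user, Stack Overflow user).

    Both features are hypothetical examples, not the paper's feature set.
    """
    name_sim = SequenceMatcher(
        None, gh_user["login"].lower(), so_user["display_name"].lower()
    ).ratio()
    gh_loc = gh_user.get("location")
    same_location = 1.0 if gh_loc and gh_loc == so_user.get("location") else 0.0
    return [name_sim, same_location]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=2000, lr=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def link_probability(model, gh_user, so_user):
    """Probability that the two accounts belong to the same person."""
    w, b = model
    x = pair_features(gh_user, so_user)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy labelled pairs; label 1 stands for "identity known", label 0 for "different people".
GH = [
    {"login": "alice-dev", "location": "Tokyo"},
    {"login": "bobcoder", "location": "Berlin"},
]
SO = [
    {"display_name": "alice dev", "location": "Tokyo"},
    {"display_name": "Bob Coder", "location": "Berlin"},
]
LABELLED = [(GH[0], SO[0], 1), (GH[1], SO[1], 1), (GH[0], SO[1], 0), (GH[1], SO[0], 0)]


def build_model():
    X = [pair_features(g, s) for g, s, _ in LABELLED]
    y = [label for _, _, label in LABELLED]
    return train_logistic(X, y)


if __name__ == "__main__":
    model = build_model()
    candidate = ({"login": "carol_k", "location": "Osaka"},
                 {"display_name": "carol k", "location": "Osaka"})
    print(link_probability(model, *candidate))
```

Scoring every unlabelled candidate pair this way yields rows of the form (GitHub user, Stack Overflow user, probability), which is the shape of a probabilistic dataset as described above. A real setting would need richer features and far more labelled pairs than this toy example.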

Cite

Komamizu, T., Hayase, Y., Amagasa, T., & Kitagawa, H. (2017). Exploring identical users on GitHub and stack overflow. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (pp. 584–589). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2017-109
