Abstract
Software developers are increasingly active on GitHub and StackOverflow, generating a large amount of valuable data in both communities. Researchers mine this information to understand developer behavior, but previous work has mainly focused on data within a single community. In this paper, we propose a novel approach to mining developer behavior across GitHub and StackOverflow. The approach links accounts from the two communities using a CART decision tree that leverages features drawn from usernames, user behaviors, and writing styles. It then explores cross-site developer behavior through T-graph analysis, LDA-based topic clustering, and cross-site tagging. We conducted several experiments to evaluate the approach. The results show that the precision and F-score of our identity linkage method are higher than those of previous methods in software communities. In particular, we found that (1) active issue committers are also active question askers; (2) for most developers, the topics of their content on GitHub are similar to those of their questions and answers on StackOverflow; (3) developers' concerns on StackOverflow shift over time along with the GitHub projects they are currently participating in; (4) developers' concerns on GitHub are more relevant to their answers than to their questions and comments on StackOverflow.
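As a rough illustration of the identity-linkage idea, the sketch below scores a pair of usernames with a string-similarity feature and then picks one CART-style split threshold by minimizing Gini impurity. It is not the authors' implementation: the similarity measure, the toy labeled pairs, and all function names are assumptions made for illustration, and the method described above also uses behavioral and writing-style features in a full decision tree rather than a single split.

```python
from difflib import SequenceMatcher


def username_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two usernames (assumed feature)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return the threshold that minimizes weighted Gini impurity (one CART split)."""
    best_threshold, best_score = None, float("inf")
    points = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


# Hypothetical labeled pairs: (GitHub name, StackOverflow name, 1 if same person).
pairs = [
    ("jsmith", "JSmith", 1),
    ("alice-dev", "alice_dev", 1),
    ("bob42", "bob42", 1),
    ("jsmith", "mkowalski", 0),
    ("alice-dev", "zed", 0),
    ("bob42", "carol_x", 0),
]
features = [username_similarity(g, s) for g, s, _ in pairs]
labels = [y for _, _, y in pairs]
threshold = best_split(features, labels)


def predict_same_user(github_name: str, so_name: str) -> bool:
    """Depth-one tree: link the accounts if similarity exceeds the learned threshold."""
    return username_similarity(github_name, so_name) > threshold
```

A realistic version would train on many labeled account pairs and combine several features per pair; the single-feature, single-split form here only shows how a CART split is chosen.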
Cite
Xiong, Y., Meng, Z., Shen, B., & Yin, W. (2017). Mining developer behavior across GitHub and StackOverflow. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, SEKE (pp. 578–583). Knowledge Systems Institute Graduate School. https://doi.org/10.18293/SEKE2017-062