Predicting Popularity of Open Source Projects Using Recurrent Neural Networks

7Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

GitHub is the largest open source software development platform with millions of repositories on variety of topics. The number of stars received by a repository is often considered as a measure of its popularity. Predicting the number of stars of a repository has been associated with the number of forks, commits, followers, documentation size, and programming language in the literature. We extend prior studies in terms of input features and algorithm: We define six features from GitHub events corresponding to the development activities, and additional six features incorporating the influence of users (followers and contributors) on the popularity of projects into their development activities. We propose a time-series based forecast model using Recurrent Neural Networks to predict the number of stars received in consecutive k days. We assess the performance of our proposed model with varying k (1, 7, 14, 30 days) and with varying input features. Our analysis on five topmost starred repositories in data visualization area shows that the error rate ranges between 19.76 and 70.57 among the projects. The best performing models use either features from development activities only, or all metrics including all the features.

Cite

CITATION STYLE

APA

Sahin, S. E., Karpat, K., & Tosun, A. (2019). Predicting Popularity of Open Source Projects Using Recurrent Neural Networks. In IFIP Advances in Information and Communication Technology (Vol. 556, pp. 80–90). Springer New York LLC. https://doi.org/10.1007/978-3-030-20883-7_8

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free