Predicting the programming language: Extracting knowledge from Stack Overflow posts

Abstract

Stack Overflow (SO) is an important source of knowledge for developers. It provides authoritative advice and detailed technical information on a range of computer science and software engineering topics. The goal of this paper is to explore mechanisms for extracting the implicit knowledge present in SO questions. In particular, we want to extract information about programming languages and how they relate to those questions. The proposed approach builds a classifier that predicts the programming language from the content of a question (text and source code snippets). The method produces word embeddings in which each term of a question is represented in a vector space, making it possible to perform operations such as comparing words, sentences, and questions. The method was evaluated on a set of 18,000 questions related to 18 different programming languages. Results show that it is possible to extract interesting, non-evident information from this highly unstructured data source.
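To make the idea of comparing questions in a vector space concrete, the following is a minimal sketch and not the authors' method. It swaps the learned word embeddings described above for plain term-count vectors and uses a nearest-centroid rule with cosine similarity. The function names, the tokenizer, and the six toy questions are all assumptions made for illustration; none of them come from the paper or its dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase word-like tokens plus a few symbols common in code.
    return re.findall(r"[a-z_]+|[{}();:<>#$]", text.lower())


def vectorize(text):
    # Represent a question as a sparse term-count vector (a stand-in for embeddings).
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(examples):
    # Sum the vectors of each language; the sum points in the same direction
    # as the mean, which is all that cosine similarity depends on.
    centroids = defaultdict(Counter)
    for text, lang in examples:
        centroids[lang].update(vectorize(text))
    return dict(centroids)


def predict(centroids, text):
    # Assign the language whose centroid is most similar to the question vector.
    vec = vectorize(text)
    return max(centroids, key=lambda lang: cosine(vec, centroids[lang]))


# Hypothetical toy data, invented for this sketch (not from the 18,000-question set).
TOY_QUESTIONS = [
    ("How do I use def and self in a class? import numpy", "python"),
    ("list comprehension with lambda and print", "python"),
    ("public static void main string args system out println", "java"),
    ("how to fix nullpointerexception in public class with extends", "java"),
    ("#include stdio printf malloc pointer segfault", "c"),
    ("struct pointer with malloc and free in header file", "c"),
]

model = train(TOY_QUESTIONS)
```

Calling `predict(model, "why does malloc return a null pointer")` picks the label whose centroid shares the most weighted vocabulary with the query. A learned embedding would additionally place related terms that never co-occur in the query near each other, which term counts cannot do.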

Cite

Baquero, J. F., Camargo, J. E., Restrepo-Calle, F., Aponte, J. H., & González, F. A. (2017). Predicting the programming language: Extracting knowledge from Stack Overflow posts. In Communications in Computer and Information Science (Vol. 735, pp. 199–210). Springer Verlag. https://doi.org/10.1007/978-3-319-66562-7_15
