Stack Overflow (SO) is an important source of knowledge for developers. It provides authoritative advice and detailed technical information on a wide range of computer science and software engineering topics. The goal of this paper is to explore mechanisms for extracting the implicit knowledge present in SO questions. In particular, we want to extract information about programming languages and how they relate to those questions. The proposed approach builds a classifier that predicts the programming language of a question from its content (text and source code snippets). The method produces word embeddings: each term of a question is represented in a vector space, which makes it possible to compare words, sentences, and questions. The method was evaluated on a set of 18,000 questions covering 18 different programming languages. The results show that non-evident information can be extracted from this highly unstructured data source.
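To make the idea of comparing questions in an embedding space concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny hand-made embedding table (`EMBEDDINGS`), averages word vectors to represent a question, and picks the language whose centroid is closest by cosine similarity. The vectors, the names `embed`, `cosine`, `predict_language`, and `CENTROIDS`, and the nearest-centroid rule are all hypothetical stand-ins for learned embeddings and a trained classifier.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# A real system would learn these vectors from the question corpus.
EMBEDDINGS = {
    "def": [0.9, 0.1, 0.0],
    "import": [0.8, 0.2, 0.1],
    "print": [0.7, 0.3, 0.1],
    "public": [0.1, 0.9, 0.1],
    "static": [0.0, 0.9, 0.2],
    "void": [0.1, 0.8, 0.3],
}


def embed(tokens):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_language(tokens, centroids):
    """Return the language whose centroid is most similar to the question."""
    question = embed(tokens)
    return max(centroids, key=lambda lang: cosine(question, centroids[lang]))


# Assumed per-language centroids, built here from a few typical keywords.
CENTROIDS = {
    "python": embed(["def", "import", "print"]),
    "java": embed(["public", "static", "void"]),
}

print(predict_language(["how", "to", "import", "a", "module"], CENTROIDS))
print(predict_language(["public", "static", "void", "main"], CENTROIDS))
```

Averaging word vectors is only one simple way to represent a whole question; it was chosen here because it keeps the sketch short and dependency-free.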
Baquero, J. F., Camargo, J. E., Restrepo-Calle, F., Aponte, J. H., & González, F. A. (2017). Predicting the programming language: Extracting knowledge from stack overflow posts. In Communications in Computer and Information Science (Vol. 735, pp. 199–210). Springer Verlag. https://doi.org/10.1007/978-3-319-66562-7_15