Benchmarking Library Recognition in Tweets

Abstract

Software developers often use social media such as Twitter to share programming knowledge, including new tools, sample code snippets, and programming tips. One topic they discuss is software libraries, and their tweets may contain useful information about a library. A good understanding of this information, e.g., developers' views on a library, can help in weighing the pros and cons of using the library and in gauging the general sentiment towards it. However, it is not trivial to recognize whether a word actually refers to a library or has another meaning. For example, a tweet mentioning the word 'pandas' may refer to the Python pandas library or to the animal. In this work, we created the first benchmark dataset and investigated the task of distinguishing whether a tweet refers to a programming library or to something else. Recently, pre-trained Transformer models (PTMs) have achieved great success in natural language processing and computer vision. Therefore, we extensively evaluated a broad set of modern PTMs, including both general-purpose and domain-specific ones, on this task of recognizing programming libraries in tweets. Experimental results show that PTMs outperform the best-performing baseline methods by 5% to 12% in terms of F1-score under within-, cross-, and mixed-library settings.
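
To make the task framing concrete, the following is a minimal sketch of library recognition as binary classification scored with F1. It is not the authors' method or any of their baselines: the cue-word list, the `predict_is_library` helper, and the four example tweets are all hypothetical and invented here for illustration only.

```python
from typing import List, Tuple

# Hypothetical cue words; an assumption for this sketch, not taken from the paper.
PROGRAMMING_CUES = {"python", "import", "dataframe", "code", "library", "pip", "csv"}


def predict_is_library(tweet: str) -> bool:
    """Naive keyword rule: predict 'library' if any cue word appears in the tweet."""
    tokens = {tok.strip(".,!?#@()'\"").lower() for tok in tweet.split()}
    return bool(tokens & PROGRAMMING_CUES)


def f1_score(y_true: List[bool], y_pred: List[bool]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example tweets; True means the tweet refers to the pandas library.
EXAMPLES: List[Tuple[str, bool]] = [
    ("Loading a CSV with pandas in Python is a one-liner", True),
    ("The pandas at the zoo were asleep all afternoon", False),
    ("pandas 1.4 groupby feels much faster now", True),
    ("Wrote some code about pandas for my kid's zoo project", False),
]

labels = [label for _, label in EXAMPLES]
preds = [predict_is_library(text) for text, _ in EXAMPLES]
score = f1_score(labels, preds)
print(f"F1 = {score:.2f}")
```

On these toy tweets the keyword rule misses the third tweet (no cue word) and wrongly flags the fourth (the word "code" appears in a tweet about the animal), which illustrates why surface cues alone are a weak signal for this kind of disambiguation.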

Cite

Zhang, T., Chandrasekaran, D. P., Thung, F., & Lo, D. (2022). Benchmarking Library Recognition in Tweets. In IEEE International Conference on Program Comprehension (Vol. 2022-March, pp. 343–353). IEEE Computer Society. https://doi.org/10.1145/3524610.3527916
