Identifying COVID-19 English informative tweets using limited labelled data

Abstract

Identifying COVID-19 informative tweets is useful for building monitoring systems that track the latest updates. Existing approaches to identifying informative tweets rely on a large number of labelled tweets to achieve good performance. As labelling is an expensive and laborious process, there is a need for approaches that can identify COVID-19 informative tweets using limited labelled data. In this paper, we propose a simple yet novel labelled-data-efficient approach that achieves a state-of-the-art (SOTA) F1-score of 91.23 on the WNUT COVID-19 dataset using just 1000 tweets (14.3% of the full training set). Our approach starts with limited labelled data, augments it using data augmentation methods, and then fine-tunes the model on the augmented dataset. It is the first work to approach the task of identifying COVID-19 English informative tweets using limited labelled data while still achieving new SOTA performance.
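
The augmentation step of the pipeline described above can be illustrated with a minimal sketch. The abstract does not name the augmentation methods used, so the random word swap and random word deletion below are assumptions chosen only for illustration, and the function names, example tweets, and label strings are hypothetical. The sketch shows how a small labelled set can be enlarged before fine-tuning; the fine-tuning step itself is omitted.

```python
import random


def random_swap(tokens, rng):
    """Swap two randomly chosen token positions (no-op for fewer than 2 tokens)."""
    if len(tokens) < 2:
        return list(tokens)
    i, j = rng.sample(range(len(tokens)), 2)
    out = list(tokens)
    out[i], out[j] = out[j], out[i]
    return out


def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(labelled, n_copies=2, seed=0):
    """Return the original (text, label) pairs plus n_copies perturbed variants of each.

    The perturbations are illustrative stand-ins, not the paper's methods.
    Each variant inherits the label of the tweet it was derived from.
    """
    rng = random.Random(seed)
    out = list(labelled)
    for text, label in labelled:
        tokens = text.split()
        for _ in range(n_copies):
            op = rng.choice([random_swap, random_delete])
            out.append((" ".join(op(tokens, rng)), label))
    return out


# Hypothetical examples; the label strings are illustrative.
data = [
    ("new covid cases reported in the city today", "INFORMATIVE"),
    ("stay safe everyone", "UNINFORMATIVE"),
]
augmented = augment(data, n_copies=2, seed=0)
```

Under these assumptions, `augmented` holds the two original tweets followed by four perturbed variants, and this enlarged set would then be passed to the fine-tuning step. For scale, 1000 tweets being 14.3% of the training set implies a full training set of roughly 7000 tweets.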

Cite

Kothuru, S., & Santhanavijayan, A. (2023). Identifying COVID-19 English informative tweets using limited labelled data. Social Network Analysis and Mining, 13(1). https://doi.org/10.1007/s13278-023-01025-8
