Abstract
Deep learning algorithms can identify crisis-related tweets to reduce the information overload that prevents humanitarian organisations from using valuable Twitter posts. However, they rely heavily on human-labelled data, which are unavailable for emerging crises. Because each crisis has its own features, such as location, time and social media response, current models are known to generalise poorly to unseen disaster events when pre-trained on past ones. Tweet classifiers for low-resource languages like Arabic have the additional issue of limited labelled data duplicates caused by the absence of good language resources. Thus, we propose a novel domain adaptation approach that automatically labels tweets from emerging Arabic crisis events without relying on human-labelled data; the automatically labelled tweets are then used, together with available human-labelled data, to train a model. We evaluate our work on data from seven Arabic events from 2018 to 2020, covering different crisis types (flood, explosion, virus and storm). Results show that our method outperforms self-training in classifying crisis-related tweets in real-time scenarios.
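For readers unfamiliar with the self-training baseline named above, the following is a minimal sketch of the general technique. It is not the domain adaptation method proposed in the paper. The bag-of-words centroid classifier, the function names, the confidence margin, the `threshold` and `max_rounds` parameters and the English toy tweets are all assumptions made for illustration only.

```python
from collections import Counter
import math


def featurise(text):
    """Bag-of-words token counts (lowercased, whitespace tokenisation)."""
    return Counter(text.lower().split())


def train(examples):
    """Build one summed token-count vector (a centroid) per label."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(featurise(text))
    return centroids


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def predict(centroids, text):
    """Return (label, confidence); confidence is the margin between
    the best and second-best cosine scores (a hypothetical choice)."""
    vec = featurise(text)
    scores = sorted(
        ((cosine(vec, c), label) for label, c in centroids.items()),
        reverse=True,
    )
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def self_train(labelled, unlabelled, threshold=0.2, max_rounds=5):
    """Generic self-training loop: train on labelled data, pseudo-label
    the confident unlabelled tweets, add them to the training set, repeat."""
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        model = train(labelled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(model, text)
            target = confident if conf >= threshold else remaining
            target.append((text, label))
        if not confident:
            break  # nothing new was labelled confidently; stop early
        labelled.extend(confident)
        pool = [text for text, _ in remaining]
    return train(labelled), labelled


# Hypothetical toy data: labelled tweets from a past event and
# unlabelled tweets from an emerging one.
past_event = [
    ("flood water rising in the city", "related"),
    ("great football match tonight", "unrelated"),
]
emerging_event = ["flood water blocks roads", "football match result"]

model, training_set = self_train(past_event, emerging_event)
```

In this sketch the classifier's own confident predictions become training labels, so early mistakes can be reinforced in later rounds. That is a commonly cited weakness of self-training when the emerging event differs from the past ones.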
Cite
ALRashdi, R., & O’Keefe, S. (2022). Domain Adaptation for Arabic Crisis Response. In WANLP 2022 - 7th Arabic Natural Language Processing - Proceedings of the Workshop (pp. 249–259). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.wanlp-1.23