Cross-lingual disaster-related multi-label tweet classification with manifold mixup

Abstract

Distinguishing informative and actionable messages from a social media platform like Twitter is critical for facilitating disaster management. For this purpose, we compile a multilingual dataset of over 130K samples for multi-label classification of disaster-related tweets. We present a masking-based loss function for partially labeled samples and demonstrate the effectiveness of Manifold Mixup in the text domain. Our main model is based on Multilingual BERT, which we further improve with Manifold Mixup. We show that our model generalizes to unseen disasters in the test set. Furthermore, we analyze the capability of our model for zero-shot generalization to new languages. Our code, dataset, and other resources are available on GitHub.
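
The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `masked_bce` and `manifold_mixup`, the Beta(alpha, alpha) sampling of the mixing weight, and the rule for combining label masks (a label counts only when both mixed samples are annotated for it) are all assumptions made for illustration. The idea is that a masking-based loss skips labels a sample was never annotated for, while Manifold Mixup interpolates hidden representations (rather than raw inputs) together with their label vectors.

```python
import math
import random


def masked_bce(probs, labels, mask):
    """Mean binary cross-entropy over annotated labels only.

    probs  -- predicted probability per label, each in (0, 1)
    labels -- target per label (0/1, or a soft value after mixing)
    mask   -- 1 if the label is annotated for this sample, else 0
    """
    eps = 1e-12  # guards against log(0)
    total, count = 0.0, 0
    for p, y, m in zip(probs, labels, mask):
        if m:
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
            count += 1
    # A sample with no annotated labels contributes no loss.
    return total / count if count else 0.0


def manifold_mixup(h_a, h_b, y_a, y_b, m_a, m_b, lam):
    """Interpolate two hidden vectors and their label vectors with weight lam.

    Assumption (not from the source): a mixed label is kept only when
    both samples are annotated for it, so the masks are multiplied.
    """
    h = [lam * a + (1 - lam) * b for a, b in zip(h_a, h_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    m = [a * b for a, b in zip(m_a, m_b)]
    return h, y, m


def sample_lambda(alpha=1.0):
    """Draw the mixing weight from Beta(alpha, alpha); alpha is a hypothetical default."""
    return random.betavariate(alpha, alpha)
```

In a full model, `h_a` and `h_b` would be the activations of two training samples at a randomly chosen hidden layer, the mixed vector would be passed through the remaining layers, and `masked_bce` would be applied to the resulting predictions against the mixed labels and mask.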

Cite

Chowdhury, J. R., Caragea, C., & Caragea, D. (2020). Cross-lingual disaster-related multi-label tweet classification with manifold mixup. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 292–298). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-srw.39
