Data Augmentation for Rare Symptoms in Vaccine Side-Effect Detection

Bosung Kim; Ndapa Nakashole

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2022), pp. 310–315

DOI: 10.18653/v1/2022.bionlp-1.29

Abstract

We study the problem of entity detection and normalization applied to patient self-reports of symptoms that arise as side-effects of vaccines. Our application domain presents unique challenges that render traditional classification methods ineffective: the number of entity types is large, and many symptoms are rare, resulting in a long-tail distribution of training examples per entity type. We tackle these challenges with an autoregressive model that generates standardized names of symptoms. We introduce a data augmentation technique to increase the number of training examples for rare symptoms. Experiments on real-life patient vaccine symptom self-reports show that our approach outperforms strong baselines, and that additional examples improve performance on the long-tail entities.

Cite

Kim, B., & Nakashole, N. (2022). Data Augmentation for Rare Symptoms in Vaccine Side-Effect Detection. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 310–315). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.bionlp-1.29
