Data Veracity of Patients and Health Consumers Reported Adverse Drug Reactions on Twitter: Key Linguistic Features, Twitter Variables, and Association Rules

Abstract

As Twitter has emerged as an important data source for pharmacovigilance, the heterogeneous veracity of its data has become a major concern for extracted adverse drug reactions (ADRs). Our objective is to categorize different levels of data veracity and to explore linguistic features of tweets and Twitter variables that may be used to automatically screen for high-veracity tweets containing ADR-related information. We annotated a published Twitter corpus with linguistic features drawn from existing studies and clinical experts. Multinomial logistic regression models found that first-person pronouns, expression of negative sentiment, and the ADR and drug name appearing in the same sentence were significantly associated with higher levels of data veracity (p<0.05); use of medical terminology and fewer indications were associated with good data veracity (p<0.05); and a lower number of drugs was marginally associated with good data veracity (p=0.053). These findings suggest opportunities for developing machine learning models for automatic screening of ADR-related tweets using key linguistic features, Twitter variables, and association rules.

Lyu, T., Eidson, A., Jun, J., Zhou, X., Cui, X., & Liang, C. (2022). Data Veracity of Patients and Health Consumers Reported Adverse Drug Reactions on Twitter: Key Linguistic Features, Twitter Variables, and Association Rules. In Studies in Health Technology and Informatics (Vol. 290, pp. 552–556). IOS Press BV. https://doi.org/10.3233/SHTI220138
