Abstract
Background: Social media platforms offer valuable insights into patients’ experiences, revealing organic conversations that reflect their immediate concerns and needs. Through active listening to lived experiences, we can identify unmet needs and discover the real-world challenges that patients and caregivers face.
Objective: The aim of our study is to develop a reusable framework to collect and analyze evolving social media data, capturing insights into the experiences of individuals with myelodysplastic syndromes (MDS) and higher-risk MDS (HR-MDS), and of their caregivers. The findings can inform the development of appropriate patient support interventions.
Methods: We conducted a structured Google search of English-language websites relevant to MDS from January 1, 2008, to December 31, 2022, using validated URLs and keywords. Data were sourced from MDS-specific platforms to ensure clinical relevance. Contextual embeddings (rather than simple keyword matching) were applied to detect semantically meaningful mentions of “MDS.” Scraping algorithms collected, cleaned, and standardized the data. Posts were classified as originating from patients or caregivers using decision-tree tagging based on contextual summaries. Users were categorized as HR-MDS based on explicit mentions of “high-risk” or on references to criteria aligned with National Comprehensive Cancer Network guidelines (eg, blast count, transplant, chemotherapy use). Each post was analyzed for major themes and sentiment using a supervised machine learning classifier, while latent topics were identified through a semisupervised model.
Results: We analyzed ~5.5 million words from 42,000 posts across 5500 threads by ~4000 users from the United States, United Kingdom, and Canada. Of the 1249 HR-MDS users identified, 587 (47%) were patients and 662 (53%) were caregivers. Dominant sentiments among HR-MDS users included concern (n=974, 78%), anxiety (n=749, 60%), frustration (n=724, 58%), fear (n=724, 58%), and confusion (n=612, 49%). Concern was the top sentiment among caregivers (n=390, 59%), while anxiety led among patients (n=323, 55%). Key topics included blood counts (n=674, 54%), disease burden (n=537, 43%), quality of life (n=450, 36%), treatment options (n=387, 31%), and disease progression (n=387, 31%). Anxiety was frequently tied to health (n=600, 48%), treatment (n=325, 26%), and the diagnostic process (n=250, 20%). Fear stemmed from complications (n=237, 19%) and progression (n=240, 19%). Confusion about diagnosis and disease understanding was reported by 300 (24%) users. Information-seeking behaviors revealed user interest in treatment interventions (n=238, 19%) and ongoing research (n=212, 17%).
Conclusions: Natural language processing techniques show promise for identifying the emerging, complex themes and sentiments expressed by HR-MDS users, thereby highlighting the unmet needs, barriers, and facilitators associated with the disease.
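As an illustration only, the HR-MDS categorization rule described in the Methods (an explicit mention of “high-risk,” or a reference to criteria such as blast count, transplant, or chemotherapy use) might be sketched as follows. The regular expressions, the function names `is_hr_mds` and `share`, and the choice to flag a user when any one post matches are all assumptions made for this sketch; the abstract does not give the authors' actual rules, which also relied on contextual embeddings rather than plain keyword matching.

```python
import re

# Hypothetical patterns: the abstract names the criteria but not the exact rules.
EXPLICIT_HR = re.compile(r"\b(high|higher)[- ]risk\b", re.IGNORECASE)
PROXY_CRITERIA = re.compile(r"\b(blasts?|transplant\w*|chemo\w*)\b", re.IGNORECASE)


def is_hr_mds(posts):
    """Return True if any post mentions high risk explicitly or a proxy criterion.

    Assumption: one matching post is enough to flag the user.
    """
    return any(EXPLICIT_HR.search(p) or PROXY_CRITERIA.search(p) for p in posts)


def share(count, total):
    """Percentage of total, rounded to the nearest whole number."""
    return round(100 * count / total)


if __name__ == "__main__":
    print(is_hr_mds(["My husband was told he has high-risk MDS."]))
    print(is_hr_mds(["Feeling tired today."]))
    # Patient and caregiver shares of the 1249 HR-MDS users reported in the Results.
    print(share(587, 1249), share(662, 1249))
```

A keyword rule like this would over-include posts that mention transplant or chemotherapy in passing, which is one reason a context-aware approach such as the one the authors describe may be preferable.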
Author-supplied keywords
- caregiver perspectives
- digital health
- health communication
- high-risk MDS
- information-seeking behavior
- machine learning
- myelodysplastic syndromes
- natural language processing
- patient experience
- patient-centered insights
- qualitative data mining
- real-world evidence
- sentiment analysis
- social media listening
- unmet needs
Cite
Marwah, R., Mishra, S., Gross, B., Couturiaux, S., Calara, R., Sabate Estrella, E. J., & Hogea, C. (2025). Social Media Insights Into Disease Burden in Patients and Caregivers of Myelodysplastic Syndrome: Subcohort Analysis of High-Risk Patients. Journal of Medical Internet Research, 27(1). https://doi.org/10.2196/65460