Contextual Gaps in Machine Learning for Mental Illness Prediction: The Case of Diagnostic Disclosures

Abstract

Obtaining training data for machine learning (ML) prediction of mental illness from social media is labor intensive. To work around this, ML teams extrapolate proxy signals, alternative signs in the data, to assess illness status and create training datasets. However, it has not been established whether these signals are valid, whether they align with important contextual factors, or how proxy quality impacts downstream model integrity. We use ML and qualitative methods to evaluate whether a popular proxy signal, diagnostic self-disclosure, produces a conceptually sound ML model of mental illness. Our findings identify major conceptual errors visible only through qualitative investigation: training data built from diagnostic disclosures encodes a narrow vision of diagnosis experiences that propagates into paradoxes in the downstream ML model. This gap is obscured by the strong performance of the ML classifier (F1 = 0.91). We discuss the implications of conceptual gaps in creating training data for human-centered models, and make suggestions for improving research methods.
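For readers unfamiliar with proxy labeling, the sketch below illustrates the general pattern the paper critiques. It is a toy Python example using scikit-learn, not the authors' pipeline; the disclosure regular expression, the corpus, and the model choice are all illustrative assumptions.

```python
# Minimal sketch, assuming a toy setup (NOT the authors' pipeline).
# It illustrates proxy labeling: posts matching a diagnostic
# self-disclosure phrase become positive training labels, and a
# classifier trained on those labels is scored with F1.
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical disclosure pattern; real studies use curated phrase lists.
DISCLOSURE = re.compile(r"\bi (was|am|have been) diagnosed with\b", re.IGNORECASE)

def proxy_label(post: str) -> int:
    """Return 1 if the post contains a diagnostic self-disclosure, else 0."""
    return int(bool(DISCLOSURE.search(post)))

# Fabricated posts standing in for scraped social media data.
posts = [
    "I was diagnosed with depression last spring and it changed everything.",
    "Loving the sunshine at the park today!",
    "I am diagnosed with anxiety and finally started therapy.",
    "Tried a new recipe tonight and it turned out great.",
    "I have been diagnosed with bipolar disorder for ten years now.",
    "Weekend hike photos are up, what a view.",
]
labels = [proxy_label(p) for p in posts]

X_train, X_test, y_train, y_test = train_test_split(
    posts, labels, test_size=0.5, random_state=0, stratify=labels
)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(X_train, y_train)

# A high F1 here mostly reflects the model re-learning the disclosure
# phrasing itself, echoing the paper's point that strong metrics can
# mask conceptual gaps in what the proxy label actually measures.
print("F1:", f1_score(y_test, clf.predict(X_test)))
```

Note how the labeling rule and the classifier attend to the same surface cues, which is one way the circularity the paper describes can arise even as reported performance looks strong.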

Citation (APA)

Chancellor, S., Feuston, J. L., & Chang, J. (2023). Contextual Gaps in Machine Learning for Mental Illness Prediction: The Case of Diagnostic Disclosures. Proceedings of the ACM on Human-Computer Interaction, 7(CSCW2). https://doi.org/10.1145/3610181
