A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

Stephen R. Pfohl; Haoran Zhang; Yizhe Xu; Agata Foryciarz; Marzyeh Ghassemi; Nigam H. Shah

Journal Article, Open Access

Scientific Reports (2022) 12(1)

DOI: 10.1038/s41598-022-07167-7

Abstract

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.
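To make the notions of disaggregated and worst-case performance concrete, the following is a minimal sketch, not the authors' evaluation code. It assumes binary labels, real-valued risk scores, and a single subgroup attribute per patient, and it uses AUC as the metric purely for illustration. The function names (`auc`, `disaggregated_auc`, `worst_case`) and the toy data are hypothetical.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def disaggregated_auc(labels, scores, groups):
    """Compute AUC separately within each subgroup."""
    by_group = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in by_group.items()}


def worst_case(per_group):
    """Return the subgroup with the lowest metric value, and that value."""
    g = min(per_group, key=per_group.get)
    return g, per_group[g]


# Toy data (invented for illustration): two subgroups, A and B.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.3, 0.4, 0.6, 0.7, 0.2]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = auc(labels, scores)
per_group = disaggregated_auc(labels, scores, groups)
worst_group, worst_auc = worst_case(per_group)

print(f"overall AUC: {overall:.4f}")
print(f"per-group AUC: {per_group}")
print(f"worst case: group {worst_group}, AUC {worst_auc:.2f}")
```

In this toy example the pooled AUC is high while subgroup B lags behind subgroup A, which is the kind of gap that average-case evaluation can hide.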

Cite

Pfohl, S. R., Zhang, H., Xu, Y., Foryciarz, A., Ghassemi, M., & Shah, N. H. (2022). A comparison of approaches to improve worst-case predictive model performance over patient subpopulations. Scientific Reports, 12(1). https://doi.org/10.1038/s41598-022-07167-7
