Design matters in patient-level prediction: evaluation of a cohort vs. case-control design when developing predictive models in observational healthcare datasets


Abstract

Background: The design used to create labelled data for training prediction models from observational healthcare databases (e.g., case-control or cohort) may affect the clinical usefulness of the resulting models. We aim to investigate hypothetical design issues and determine how the design impacts prediction model performance.

Aim: To empirically investigate differences between models developed using a case-control design and a cohort design.

Methods: Using a US claims database, we replicated two published prediction models (dementia and type 2 diabetes) that were developed using a case-control design, and trained models for the same prediction questions using cohort designs. We validated each model on data mimicking the point in time at which the models would be applied in clinical practice. We calculated the models' discrimination and calibration-in-the-large performance.

Results: The dementia models obtained an area under the receiver operating characteristic curve (AUROC) of 0.560 and 0.897 for the case-control and cohort designs, respectively. The type 2 diabetes models obtained an AUROC of 0.733 and 0.727 for the case-control and cohort designs, respectively. The dementia and diabetes case-control models were both poorly calibrated, whereas the dementia cohort model achieved good calibration. We show that careful construction of a case-control design can lead to discriminative performance comparable to that of a cohort design, but case-control designs over-represent the outcome class, leading to miscalibration.

Conclusions: Any case-control design can be converted to a cohort design. We recommend that researchers with observational data use the less subjective and generally better calibrated cohort design when extracting labelled data. However, if a carefully constructed case-control design is used, then the model must be prospectively validated using a cohort design for fair evaluation, and it must be recalibrated.
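The link between over-representing the outcome class and miscalibration can be illustrated with a small sketch. It is not the authors' code or data: the toy predictions, the 50% training outcome fraction, the 10% target outcome rate, and the function names are all assumptions made for illustration. The recalibration shown is a simple intercept shift on the log-odds scale (often called prior correction), which is one common way to recalibrate and not necessarily the method used in the paper.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def calibration_in_the_large(predicted, observed):
    """Return (mean predicted risk, observed outcome rate)."""
    n = len(predicted)
    return sum(predicted) / n, sum(observed) / n

def recalibrate_intercept(p, sample_rate, target_rate):
    """Shift a predicted risk from the outcome fraction seen in training
    (sample_rate) to the outcome rate of the target population (target_rate)
    by subtracting the difference in log-odds."""
    offset = logit(sample_rate) - logit(target_rate)
    return inv_logit(logit(p) - offset)

def auroc(predicted, observed):
    """Probability that a random case is ranked above a random non-case
    (ties count as 0.5)."""
    cases = [p for p, y in zip(predicted, observed) if y == 1]
    controls = [p for p, y in zip(predicted, observed) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return wins / (len(cases) * len(controls))

# Hypothetical validation cohort: 1 outcome in 10 patients (10% outcome rate).
observed = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical risks from a model trained on a 1:1 case-control sample,
# so the predictions are centred far above the true outcome rate.
predicted = [0.70, 0.60, 0.55, 0.50, 0.45, 0.40, 0.40, 0.35, 0.30, 0.25]

adjusted = [
    recalibrate_intercept(p, sample_rate=0.5, target_rate=0.1)
    for p in predicted
]

print("before:", calibration_in_the_large(predicted, observed))
print("after: ", calibration_in_the_large(adjusted, observed))
print("AUROC before/after:", auroc(predicted, observed), auroc(adjusted, observed))
```

Because the intercept shift is a monotone transformation, it leaves the ranking of patients and therefore the AUROC unchanged; only the calibration moves. This is consistent with the abstract's observation that a case-control model can discriminate comparably yet still be miscalibrated.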


Cite

Reps, J. M., Ryan, P. B., Rijnbeek, P. R., & Schuemie, M. J. (2021). Design matters in patient-level prediction: evaluation of a cohort vs. case-control design when developing predictive models in observational healthcare datasets. Journal of Big Data, 8(1). https://doi.org/10.1186/s40537-021-00501-2
