The optimal ratio of cases to controls for estimating the classification accuracy of a biomarker

Holly Janes; Margaret Pepe

Journal Article, Open Access

Biostatistics (2006) 7(3) 456-468

DOI: 10.1093/biostatistics/kxj018

Abstract

The case-control design is frequently used to study the discriminatory accuracy of a screening or diagnostic biomarker. Yet, the appropriate ratio in which to sample cases and controls has never been determined. It is common for researchers to sample equal numbers of cases and controls, a strategy that can be optimal for studies of association. However, considerations are quite different when the biomarker is to be used for classification. In this paper, we provide an expression for the optimal case-control ratio, when the accuracy of the biomarker is quantified by the receiver operating characteristic (ROC) curve. We show how it can be integrated with choosing the overall sample size to yield an efficient study design with specified power and type-I error. We also derive the optimal case-control ratios for estimating the area under the ROC curve and the area under part of the ROC curve. Our methods are applied to a study of a new marker for adenocarcinoma in patients with Barrett's esophagus. © The Author 2005. Published by Oxford University Press. All rights reserved.
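The abstract does not state the expression itself, so the following is only a minimal sketch of the underlying allocation argument, not the authors' derivation. It assumes the variance of the accuracy estimate can be written as a/n_cases + b/n_controls; under a fixed total sample size, that form is minimized at n_cases/n_controls = sqrt(a/b). For the empirical ROC curve at false-positive rate t, the usual large-sample variance would give a = ROC(t)(1 - ROC(t)) and b = ROC'(t)^2 t(1 - t), but that mapping is an assumption here. The function names and the numeric values are hypothetical.

```python
import math


def optimal_case_control_ratio(a: float, b: float) -> float:
    """Return n_cases / n_controls minimizing a/n_cases + b/n_controls
    when the total n_cases + n_controls is held fixed.

    a and b are the case and control variance components (assumed form).
    """
    if a <= 0 or b <= 0:
        raise ValueError("variance components must be positive")
    return math.sqrt(a / b)


def allocate(a: float, b: float, total: float) -> tuple[float, float]:
    """Split a fixed total sample size according to the optimal ratio."""
    ratio = optimal_case_control_ratio(a, b)
    n_cases = total * ratio / (1.0 + ratio)
    return n_cases, total - n_cases


def variance(a: float, b: float, n_cases: float, n_controls: float) -> float:
    """Assumed variance of the accuracy estimate for a given allocation."""
    return a / n_cases + b / n_controls


if __name__ == "__main__":
    # Hypothetical variance components, chosen only for illustration.
    a, b = 0.16, 0.04
    n_cases, n_controls = allocate(a, b, total=300)
    print("ratio:", optimal_case_control_ratio(a, b))
    print("cases:", n_cases, "controls:", n_controls)
    print("variance at optimum:", variance(a, b, n_cases, n_controls))
    print("variance at 1:1:", variance(a, b, 150, 150))
```

With unequal variance components the sketch departs from the 1:1 split, which is the contrast the abstract draws with studies of association.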

Cite

Janes, H., & Pepe, M. (2006). The optimal ratio of cases to controls for estimating the classification accuracy of a biomarker. Biostatistics, 7(3), 456–468. https://doi.org/10.1093/biostatistics/kxj018
