Abstract
We consider a situation where rich historical data from a large study are available for the coefficients and their standard errors in an established regression model describing the association between a binary outcome variable Y and a set of predicting factors X. We would like to use this summary information to improve estimation and prediction in an expanded model of interest, Y|X,B. The additional variable B is a new biomarker, measured on a small number of subjects in a new data set. We develop and evaluate several approaches for translating the external information into constraints on regression coefficients in a logistic regression model of Y|X,B. Borrowing from the measurement error literature, we establish an approximate relationship between the regression coefficients in the models Pr(Y=1|X,β), Pr(Y=1|X,B,γ) and E(B|X,θ) for a Gaussian distribution of B. For binary B we propose an alternative expression. Simulation results comparing these methods indicate that historical information on Pr(Y=1|X,β) can improve the efficiency of estimation and enhance the predictive power in the regression model of interest Pr(Y=1|X,B,γ). We illustrate our methodology by enhancing the high-grade prostate cancer prevention trial risk calculator with two new biomarkers: prostate cancer antigen 3 and TMPRSS2:ERG.
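The approximate relationship for Gaussian B can be illustrated numerically. The sketch below is not taken from the paper: it assumes a scalar X, a linear model B|X ~ N(θ0 + θ1·X, σ²), and the standard logistic-to-probit approximation from the measurement error literature with scaling constant 1.7, which gives logit Pr(Y=1|X) ≈ (γ0 + γB·θ0 + (γX + γB·θ1)·X) / sqrt(1 + γB²σ²/1.7²). The function names and all parameter values are hypothetical, and the paper's exact expression may differ.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def marginal_prob_exact(x, gamma0, gamma_x, gamma_b, theta0, theta1, sigma, n_nodes=80):
    """Pr(Y=1|X=x), obtained by integrating the expanded logistic model
    over B|X ~ N(theta0 + theta1*x, sigma^2) with Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables: B = mean + sqrt(2) * sigma * node
    b = theta0 + theta1 * x + np.sqrt(2.0) * sigma * nodes
    return np.sum(weights * expit(gamma0 + gamma_x * x + gamma_b * b)) / np.sqrt(np.pi)


def marginal_prob_approx(x, gamma0, gamma_x, gamma_b, theta0, theta1, sigma):
    """Closed-form approximation (assumed form, constant 1.7):
    the reduced-model linear predictor is the expanded one with B replaced
    by E(B|X), attenuated by sqrt(1 + gamma_b^2 sigma^2 / 1.7^2)."""
    scale = np.sqrt(1.0 + (gamma_b * sigma) ** 2 / 1.7 ** 2)
    linear = gamma0 + gamma_b * theta0 + (gamma_x + gamma_b * theta1) * x
    return expit(linear / scale)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    params = dict(gamma0=-1.0, gamma_x=0.5, gamma_b=0.8,
                  theta0=0.2, theta1=0.3, sigma=1.0)
    for x in (-2.0, 0.0, 2.0):
        exact = marginal_prob_exact(x, **params)
        approx = marginal_prob_approx(x, **params)
        print(f"x={x:+.1f}  exact={exact:.4f}  approx={approx:.4f}")
```

Under these assumptions, the implied reduced-model coefficients are β0 ≈ (γ0 + γB·θ0)/c and βX ≈ (γX + γB·θ1)/c with c = sqrt(1 + γB²σ²/1.7²), which is the kind of link that lets external estimates of β act as a constraint on γ.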
Cite
Cheng, W., Taylor, J. M. G., Gu, T., Tomlins, S. A., & Mukherjee, B. (2019). Informing a risk prediction model for binary outcomes with external coefficient information. Journal of the Royal Statistical Society. Series C: Applied Statistics, 68(1), 121–139. https://doi.org/10.1111/rssc.12306