SUMMARY Recent advances in statistical software have led to the rapid diffusion of new methods for modelling longitudinal data. Multilevel (also known as hierarchical or random-effects) models for binary outcomes have generally been based on a logistic–normal specification, by analogy with earlier work for normally distributed data. The appropriate application and interpretation of these models remain somewhat unclear, especially when compared with the computationally more straightforward semiparametric or 'marginal' modelling approaches based on generalized estimating equations (GEE). In this paper we pose two interrelated questions. First, what limits should be placed on the interpretation of the coefficients and inferences derived from random-effects models involving binary outcomes? Second, what diagnostic checks are appropriate for evaluating whether such random-effects models provide adequate fits to the data? We address these questions by means of an extended case study using data on adolescent smoking from a large cohort study. Bayesian estimation methods are used to fit a discrete-mixture alternative to the standard logistic–normal model, and posterior predictive checking is used to assess model fit. Surprising parallels in the parameter estimates from the logistic–normal and mixture models are described and used to question the interpretability of the so-called 'subject-specific' regression coefficients from the standard multilevel approach. Posterior predictive checks suggest a serious lack of fit of both multilevel models. The results do not provide final answers to the two questions posed, but we expect that lessons learned from the case study will provide general guidance for further investigation of these important issues.
CITATION
Carlin, J. B. (2001). A case study on the choice, interpretation and checking of multilevel models for longitudinal binary outcomes. Biostatistics, 2(4), 397–416. https://doi.org/10.1093/biostatistics/2.4.397