Multiple imputation (MI) is increasingly used to handle missing data in epidemiologic research. When data on both the exposure and the outcome are missing, an alternative to standard MI is the "multiple imputation, then deletion" (MID) method, in which observations with imputed outcomes are deleted prior to analysis. While MID has been shown to provide efficiency gains over standard MI when the analysis and imputation models are the same, its performance in the presence of auxiliary variables for the incomplete outcome is not well understood. Using simulated data, we evaluated the performance of standard MI and MID in regression settings where data were missing on both the outcome and the exposure and where an auxiliary variable associated with the incomplete outcome was included in the imputation model. When the auxiliary variable was unrelated to missingness in the outcome, both standard MI and MID produced negligible bias when estimating regression parameters, with standard MI being more efficient in most settings. However, when the auxiliary variable was also associated with missingness in the outcome, MID produced markedly biased parameter estimates. On the basis of these results, we recommend that researchers use standard MI rather than MID in the presence of auxiliary variables associated with an incomplete outcome.
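To make the difference between the two procedures concrete, the following is a minimal Python sketch, not the simulation code used in the study. It analyses the same set of imputed data sets twice, once keeping every row (standard MI) and once dropping rows whose outcome was imputed (MID), and pools the regression slopes with Rubin's rules. Everything in it is an illustrative assumption: the function names, the sample size, the missingness rates, the true slope of 2, and in particular the imputation step, which draws missing values from the known data-generating model instead of fitting an imputation model. It also omits the auxiliary variable that is central to the study.

```python
import random
import statistics

def slope_and_variance(x, y):
    """OLS slope of y on x and its estimated sampling variance."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return beta, rss / (n - 2) / sxx

def pool_rubin(estimates, variances):
    """Rubin's rules: pooled estimate and total variance W + (1 + 1/m) * B."""
    m = len(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)  # sample variance across imputations
    return statistics.fmean(estimates), within + (1 + 1 / m) * between

def analyse(imputed_sets, y_was_missing, delete_imputed_outcomes):
    """Standard MI keeps all rows; MID drops rows whose outcome was imputed."""
    estimates, variances = [], []
    for x, y in imputed_sets:
        if delete_imputed_outcomes:
            keep = [i for i, miss in enumerate(y_was_missing) if not miss]
            x = [x[i] for i in keep]
            y = [y[i] for i in keep]
        beta, var = slope_and_variance(x, y)
        estimates.append(beta)
        variances.append(var)
    return pool_rubin(estimates, variances)

# Toy data (assumed): x ~ N(0, 1), y = 2x + N(0, 1), values missing completely at random.
rng = random.Random(0)
n, m = 500, 20
x_true = [rng.gauss(0, 1) for _ in range(n)]
y_true = [2 * xi + rng.gauss(0, 1) for xi in x_true]
x_missing = [rng.random() < 0.2 for _ in range(n)]
y_missing = [rng.random() < 0.3 for _ in range(n)]

def impute_once():
    """Stand-in imputation: draws from the known model, not from a fitted one."""
    x, y = [], []
    for i in range(n):
        xi, yi = x_true[i], y_true[i]
        if x_missing[i] and y_missing[i]:
            xi = rng.gauss(0, 1)
            yi = 2 * xi + rng.gauss(0, 1)
        elif x_missing[i]:
            # x given y is N(0.4 * y, 0.2) under the assumed model
            xi = rng.gauss(0.4 * yi, 0.2 ** 0.5)
        elif y_missing[i]:
            yi = 2 * xi + rng.gauss(0, 1)
        x.append(xi)
        y.append(yi)
    return x, y

imputed_sets = [impute_once() for _ in range(m)]
mi_est, mi_var = analyse(imputed_sets, y_missing, delete_imputed_outcomes=False)
mid_est, mid_var = analyse(imputed_sets, y_missing, delete_imputed_outcomes=True)
print(f"standard MI: slope={mi_est:.3f}, variance={mi_var:.5f}")
print(f"MID:         slope={mid_est:.3f}, variance={mid_var:.5f}")
```

In this sketch the only difference between the two analyses is the `delete_imputed_outcomes` flag. Rows with an imputed exposure but an observed outcome are retained under both approaches.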
Sullivan, T. R., Salter, A. B., Ryan, P., & Lee, K. J. (2015). Bias and Precision of the “Multiple Imputation, Then Deletion” Method for Dealing with Missing Outcome Data. American Journal of Epidemiology, 182(6), 528–534. https://doi.org/10.1093/aje/kwv100