A Probabilistic Record Linkage Model for Survival Data

Abstract

In the absence of a unique identifier, combining information from multiple files relies on partially identifying variables (e.g., gender, initials). In a record linkage procedure, these variables are used to distinguish record pairs that belong together (matches) from record pairs that do not (nonmatches). Generally, the combined discriminating power of the partially identifying variables is too low, which causes imperfect linkage: some true nonmatches are classified as matches and some true matches as nonmatches. To avoid bias in further analyses, it is necessary to correct for imperfect linkage. In this article, pregnancy data from the Perinatal Registry of the Netherlands were used to estimate the associations between the (baseline) characteristics from the first delivery and the time to a second delivery. Because of privacy regulations, no unique identifier was available to determine which pregnancies belonged to the same woman. To deal with imperfect linkage in a time-to-event setting, where we have one file with baseline characteristics and one file with event times, we developed a joint model in which the record linkage procedure and the time-to-event analysis are performed simultaneously. R code and example data are available as online supplemental material.
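
To illustrate how partially identifying variables can separate matches from nonmatches, the following is a minimal sketch of a classical Fellegi-Sunter style match weight. It is not the joint model described in the abstract and is not taken from the authors' supplemental R code. The function names and the m- and u-probabilities (the chance that a variable agrees among matches and among nonmatches, respectively) are hypothetical values chosen for illustration only.

```python
import math

def match_weight(agreements, m, u):
    """Sum of log-likelihood ratios over the linking variables.

    agreements: list of booleans, True if the variable agrees in the pair.
    m: assumed P(agree | match) per variable.
    u: assumed P(agree | nonmatch) per variable.
    """
    total = 0.0
    for agree, m_k, u_k in zip(agreements, m, u):
        if agree:
            total += math.log(m_k / u_k)
        else:
            total += math.log((1 - m_k) / (1 - u_k))
    return total

def match_probability(weight, prior):
    """Posterior match probability from a weight and a prior match proportion."""
    odds = math.exp(weight) * prior / (1 - prior)
    return odds / (1 + odds)

# Hypothetical probabilities for two variables (e.g., gender, initials).
m = [0.95, 0.90]
u = [0.50, 0.05]

w_agree = match_weight([True, True], m, u)
w_disagree = match_weight([False, False], m, u)
```

A pair that agrees on both variables receives a positive weight and a pair that disagrees on both receives a negative one. When u is close to m, as for gender here, the variable contributes little, which is the sense in which the combined strength of the variables can be too low for error-free linkage.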

Hof, M. H., Ravelli, A. C., & Zwinderman, A. H. (2017). A Probabilistic Record Linkage Model for Survival Data. Journal of the American Statistical Association, 112(520), 1504–1515. https://doi.org/10.1080/01621459.2017.1311262
