Researchers often use data sets that link information from multiple sources, but the non-linkage biases caused by differences between linked and non-linked subjects are poorly understood, especially in business data sets. We address these knowledge gaps by studying biases in linkable 2010 UK Small Business Survey data sets. We identify correlates of business linkage propensity and, for the first time, of its components: consent to linkage and register identifier appendability. We also take a novel approach to evaluating non-linkage bias risks by computing data set representativeness indicators (comparable, decomposable measures of sample subset similarity). We find that the main impacts on linkage propensities and bias risks arise from two sources: differences between consenters and non-consenters, which are explicable given business survey response processes, and differences between subjects with and without identifiers, which are caused by register undercoverage of very small businesses. We then discuss the consequences for the analysis of linked business data sets, and the implications of the evaluation methods we introduce for producers and users of linked data sets.
CITATION STYLE
Moore, J. C., Smith, P. W. F., & Durrant, G. B. (2018). Correlates of record linkage and estimating risks of non-linkage biases in business data sets. Journal of the Royal Statistical Society. Series A: Statistics in Society, 181(4), 1211–1230. https://doi.org/10.1111/rssa.12342