Household surveys often fail to capture the top tail of income and wealth distributions, as evidenced by studies based on tax data. Yet to date there is no consensus on how best to reconcile the two sources of information, given the multiple biases at play. This paper contributes a novel method, rooted in standard calibration theory, to directly confront the problem of survey non-response by combining survey microdata with anonymous tax data under reasonable assumptions. Our key innovation is to endogenously determine a “merging point” between the datasets, above which we start to incorporate information from tax data into the survey, under the assumption that the rate of representativeness is first constant and then decreasing with income. This is followed by a “reweighting” step and a “replacing” step, which preserve the microdata structure of the original survey, assuming no re-ranking of observations. We illustrate our approach with simulations, which show that our method is robust to income misreporting and performs better than alternative methods. We also apply it to real data from five countries, both developed and less developed, finding that the correction changes both the levels and the trends of income inequality. We discuss several limits to our approach and suggest some guidelines for future research.
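To make the reweighting idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the merging point is already known (the paper determines it endogenously) and that the tax data arrive as population counts in income brackets starting at that point. The function name `reweight_top` and its parameters are hypothetical. Survey weights above the merging point are scaled so that each bracket matches the tax count, and weights below it are scaled uniformly so that the total population is unchanged.

```python
import numpy as np

def reweight_top(income, weight, tax_edges, tax_counts):
    """Toy reweighting of survey weights using bracketed tax counts.

    tax_edges  : ascending bracket edges; tax_edges[0] is taken as the
                 (assumed known) merging point, the last edge may be inf.
    tax_counts : population count reported by tax data in each bracket.
    """
    income = np.asarray(income, dtype=float)
    weight = np.asarray(weight, dtype=float)
    new_w = weight.copy()
    total = weight.sum()
    merging_point = tax_edges[0]

    # Above the merging point: scale weights so each bracket's weighted
    # count equals the tax count. Brackets with no survey observations
    # are left untouched (a replacing step would be needed for those).
    for lo, hi, n_tax in zip(tax_edges[:-1], tax_edges[1:], tax_counts):
        in_bin = (income >= lo) & (income < hi)
        n_survey = weight[in_bin].sum()
        if n_survey > 0:
            new_w[in_bin] *= n_tax / n_survey

    # Below the merging point: uniform rescaling that keeps the total
    # population equal to the original survey total.
    below = income < merging_point
    top_total = new_w[~below].sum()
    new_w[below] *= (total - top_total) / weight[below].sum()
    return new_w
```

The uniform rescaling below the merging point mirrors the abstract's assumption of a constant rate of representativeness in that range. The bracket-by-bracket adjustment above it is a crude stand-in for a rate that decreases with income.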
CITATION STYLE
Blanchet, T., Flores, I., & Morgan, M. (2022). The weight of the rich: improving surveys using tax data. Journal of Economic Inequality, 20(1), 119–150. https://doi.org/10.1007/s10888-021-09509-3