Synthetic Individual Income Tax Data: Methodology, Utility, and Privacy Implications

Abstract

The United States Internal Revenue Service Statistics of Income (SOI) Division possesses invaluable administrative tax data from individual income tax returns that could vastly expand our understanding of how tax policies affect behavior and how those policies could be made more effective. However, only a small number of government analysts and researchers can access the raw data. The public use file (PUF) that SOI has produced for more than 60 years has become increasingly difficult to protect using traditional statistical disclosure control methods. The vast amount of personal information available in public and private databases, combined with enormous computational power, creates unprecedented disclosure risks. SOI and researchers at the Urban Institute are developing synthetic data that represent the statistical properties of the administrative data without revealing any individual taxpayer information. This paper presents quality estimates of the first fully synthetic PUF and shows how it performs in tax model microsimulations as compared with the PUF and the confidential administrative data.

Cite

Bowen, C. M. K., Bryant, V., Burman, L., Czajka, J., Khitatrakun, S., MacDonald, G., … Zwiefel, N. (2022). Synthetic Individual Income Tax Data: Methodology, Utility, and Privacy Implications. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13463 LNCS, pp. 191–204). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-13945-1_14
