Correcting Selection Bias in Big Data by Pseudo-Weighting

Abstract

Nonprobability samples, for example observational studies, online opt-in surveys, or register data, do not come from a sampling design and therefore may suffer from selection bias. To correct for selection bias, Elliott and Valliant (EV) proposed a pseudo-weight estimation method that applies a two-sample setup for a probability sample and a nonprobability sample drawn from the same population, sharing some common auxiliary variables. By estimating the propensities of inclusion in the nonprobability sample given the two samples, we may correct the selection bias by (pseudo) design-based approaches. This paper expands the original method, allowing for large sampling fractions in either sample or for high expected overlap between selected units in each sample, conditions often present in administrative data sets and more frequently occurring with Big Data.
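
As a rough illustration of the two-sample idea summarized above, the following minimal sketch computes odds-based pseudo-weights for a single categorical auxiliary variable. It is not the estimator developed in the paper: it assumes the simpler setting of small sampling fractions and negligible overlap between the samples, and it assumes the probability sample is a simple random sample with a known, constant inclusion probability. The function name `pseudo_weights`, the parameter `pi_p`, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def pseudo_weights(x_np, x_p, pi_p):
    """Odds-based pseudo-weights for nonprobability-sample units (sketch).

    x_np : category labels of the auxiliary variable, nonprobability sample
    x_p  : category labels of the auxiliary variable, probability sample
    pi_p : inclusion probability of the probability sample, assumed known
           and constant (simple random sampling); an assumption of this sketch
    """
    x_np = np.asarray(x_np)
    x_p = np.asarray(x_p)
    weights = np.empty(len(x_np), dtype=float)
    for c in np.unique(x_np):
        n_np = np.sum(x_np == c)
        n_p = np.sum(x_p == c)
        if n_p == 0:
            raise ValueError(f"category {c!r} is absent from the probability sample")
        # Estimated odds of belonging to the nonprobability sample rather than
        # the probability sample, given x = c, in the stacked two-sample data.
        odds = n_np / n_p
        # Estimated propensity of inclusion in the nonprobability sample:
        # probability-sample inclusion probability times the odds.
        propensity = pi_p * odds
        weights[x_np == c] = 1.0 / propensity
    return weights


# Toy population of 1000 units: 600 in category "a", 400 in category "b".
# The probability sample is a 10% simple random sample (60 "a", 40 "b");
# the nonprobability sample over-represents "b" (30 "a", 80 "b").
x_p = np.array(["a"] * 60 + ["b"] * 40)
x_np = np.array(["a"] * 30 + ["b"] * 80)
# Toy outcome: y = 1 in category "a", y = 3 in category "b".
y_np = np.where(x_np == "a", 1.0, 3.0)

w = pseudo_weights(x_np, x_p, pi_p=0.1)
naive_mean = y_np.mean()
weighted_mean = np.average(y_np, weights=w)
```

In this toy setup the population mean is 1.8. The unweighted nonprobability-sample mean is about 2.45, while the pseudo-weighted mean recovers 1.8 and the weights sum to the population size of 1000. With continuous or multiple auxiliary variables, the cell counts would typically be replaced by a fitted propensity model such as a logistic regression on the stacked samples.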

Cite

Liu, A. C., Scholtus, S., & De Waal, T. (2023). Correcting Selection Bias in Big Data by Pseudo-Weighting. Journal of Survey Statistics and Methodology, 11(5), 1181–1203. https://doi.org/10.1093/jssam/smac029
