We introduce the normal-inverse-gamma summation operator, which combines Bayesian regression results from different data sources and leads to a simple split-and-merge algorithm for big data regressions. The summation operator is also useful for computing the marginal likelihood and facilitates Bayesian model selection methods such as Bayesian LASSO, stochastic search variable selection, and Markov chain Monte Carlo model composition. Observations are scanned in one pass, and the sampler then iteratively combines normal-inverse-gamma distributions without reloading the data. Simulation studies demonstrate that our algorithms can efficiently handle highly correlated big data. A real-world data set on employment and wage is also analyzed.
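The split-and-merge idea can be illustrated with a minimal sketch. It is not the paper's operator as defined there. It assumes the textbook conjugate normal-inverse-gamma update for linear regression and relies on the fact that the sufficient statistics (X'X, X'y, y'y, n) add across data chunks. The function names (`chunk_stats`, `merge_stats`, `nig_posterior`), the prior values, and the simulated data are all hypothetical choices made for illustration.

```python
from functools import reduce

import numpy as np


def chunk_stats(X, y):
    """Sufficient statistics of one data chunk: (X'X, X'y, y'y, n)."""
    return X.T @ X, X.T @ y, float(y @ y), len(y)


def merge_stats(s1, s2):
    """Merge the statistics of two chunks by elementwise addition."""
    return tuple(a + b for a, b in zip(s1, s2))


def nig_posterior(stats, mu0, Lambda0, a0, b0):
    """Standard conjugate update, assuming the prior
    beta | sigma^2 ~ N(mu0, sigma^2 * inv(Lambda0)), sigma^2 ~ IG(a0, b0)."""
    XtX, Xty, yty, n = stats
    Lambda_n = Lambda0 + XtX
    mu_n = np.linalg.solve(Lambda_n, Lambda0 @ mu0 + Xty)
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * (yty + mu0 @ Lambda0 @ mu0 - mu_n @ Lambda_n @ mu_n)
    return mu_n, Lambda_n, a_n, b_n


# Simulated data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=1000)

# Hypothetical weakly informative prior.
mu0, Lambda0, a0, b0 = np.zeros(3), np.eye(3), 1.0, 1.0

# One pass over the data in four chunks, then merge without reloading.
chunks = zip(np.array_split(X, 4), np.array_split(y, 4))
merged = reduce(merge_stats, (chunk_stats(Xc, yc) for Xc, yc in chunks))

post_merged = nig_posterior(merged, mu0, Lambda0, a0, b0)
post_full = nig_posterior(chunk_stats(X, y), mu0, Lambda0, a0, b0)
```

Under these assumptions, `post_merged` and `post_full` agree up to floating-point error, which is why each chunk needs to be read only once.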
CITATION STYLE
Qian, H. (2018). Big data Bayesian linear regression and variable selection by normal-inverse-gamma summation. Bayesian Analysis, 13(4), 1007–1031. https://doi.org/10.1214/17-BA1083