It's all relative: Regression analysis with compositional predictors

Gen Li; Yan Li; Kun Chen

Journal Article, Open Access

Biometrics (2023) 79(2) 1318-1329

DOI: 10.1111/biom.13703

Abstract

Compositional data reside in a simplex and measure fractions or proportions of parts to a whole. Most existing regression methods for such data rely on log-ratio transformations that are inadequate or inappropriate in modeling high-dimensional data with excessive zeros and hierarchical structures. Moreover, such models usually lack a straightforward interpretation due to the interrelation between parts of a composition. We develop a novel relative-shift regression framework that directly uses proportions as predictors. The new framework provides a paradigm shift for regression analysis with compositional predictors and offers a superior interpretation of how shifting concentration between parts affects the response. New equi-sparsity and tree-guided regularization methods and an efficient smoothing proximal gradient algorithm are developed to facilitate feature aggregation and dimension reduction in regression. A unified finite-sample prediction error bound is derived for the proposed regularized estimators. We demonstrate the efficacy of the proposed methods in extensive simulation studies and a real gut microbiome study. Guided by the taxonomy of the microbiome data, the framework identifies important taxa at different taxonomic levels associated with the neurodevelopment of preterm infants.
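The abstract's point about log-ratio transformations and excessive zeros can be illustrated with a small sketch. This is not the paper's relative-shift method; it only shows, using the standard centered log-ratio (clr) transform, why a composition containing a zero cannot be log-ratio transformed without some adjustment. The helper names `closure` and `clr` and the toy counts are assumptions made for illustration.

```python
import math

def closure(counts):
    """Scale nonnegative counts so they sum to 1 (a point on the simplex)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]

def clr(composition):
    """Centered log-ratio transform; defined only when every part is positive."""
    if any(p <= 0 for p in composition):
        raise ValueError("clr is undefined for compositions containing zeros")
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical toy compositions (not data from the paper).
dense = closure([10, 30, 60])   # all parts positive: clr works
sparse = closure([0, 40, 60])   # one part is zero: clr fails

print(clr(dense))
try:
    clr(sparse)
except ValueError as err:
    print("sparse composition:", err)
```

In practice zeros are often replaced by small pseudo-counts before a log-ratio transform is applied; working directly with the proportions, as the abstract proposes, avoids the need for such a replacement.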

Cite (APA)

Li, G., Li, Y., & Chen, K. (2023). It’s all relative: Regression analysis with compositional predictors. Biometrics, 79(2), 1318–1329. https://doi.org/10.1111/biom.13703
