Compositional data analysis

Abstract

Compositional data are nonnegative data carrying relative, rather than absolute, information. These are often data with a constant-sum constraint on the sample values, for example, proportions or percentages summing to 1 or 100%, respectively. Ratios between components of a composition are important since they are unaffected by the particular set of components chosen. Logarithms of ratios (logratios) are the fundamental transformation in the ratio approach to compositional data analysis. All data thus need to be strictly positive, so zero values present a major problem. Components that group together based on domain knowledge can be amalgamated (i.e., summed) to create new components, and this can alleviate the problem of data zeros. Once compositional data are transformed to logratios, regular univariate and multivariate statistical analysis can be performed, such as dimension reduction and clustering, as well as modeling. Alternative methodologies that come close to the ideals of the logratio approach are also considered, especially those that avoid the problem of data zeros, which is particularly acute in large bioinformatic data sets.
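The ideas in the abstract can be illustrated with a minimal sketch, which is not taken from the article itself. It assumes NumPy, uses made-up four-part compositions, and picks the centered logratio (CLR) as one common logratio transformation; the helper names `close` and `clr` are hypothetical. It shows that a ratio between two parts is unchanged when a subset of parts is re-closed, and that amalgamating parts can remove a zero so that logarithms are defined.

```python
import numpy as np

def close(x):
    """Rescale each row to sum to 1 (the constant-sum constraint)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centered logratio: log of each part minus the row mean of the logs."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

# Illustrative data only: two samples, four parts, one zero in the second sample.
comp = close([[10, 20, 30, 40],
              [ 5,  0, 45, 50]])

# Ratios do not depend on which other parts are included:
# drop part 1, re-close, and compare the ratio of part 0 to part 2.
sub = close(comp[:, [0, 2, 3]])
ratio_full = comp[:, 0] / comp[:, 2]
ratio_sub = sub[:, 0] / sub[:, 1]

# Amalgamate (sum) parts 0 and 1, assuming domain knowledge groups them.
# The zero disappears, so the logratio transform is finite everywhere.
amalg = np.column_stack([comp[:, 0] + comp[:, 1], comp[:, 2], comp[:, 3]])
clr_amalg = clr(amalg)
```

After the transform, `clr_amalg` can be passed to ordinary multivariate methods; each of its rows sums to zero by construction.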

Cite

Greenacre, M. (2021, March 7). Compositional data analysis. Annual Review of Statistics and Its Application. Annual Reviews Inc. https://doi.org/10.1146/annurev-statistics-042720-124436