The asymptotic behavior of estimates and information criteria in linear models is studied in the context of hierarchically correlated sampling units. The work is motivated by biological data collected on species, where autocorrelation is based on the species' genealogical tree. Hierarchical autocorrelation is also found in many other kinds of data, such as data from microarray experiments or human languages. Similar correlation also arises in ANOVA models with nested effects. In the context of Brownian motion evolution on the genealogical tree, I show that the best linear unbiased estimators are almost surely convergent but may not be consistent for some parameters, such as the intercept and lineage effects. For the purpose of model selection, I show that the usual BIC does not provide an appropriate approximation to the posterior probability of a model. To correct for this, an effective sample size is introduced for parameters that are inconsistently estimated. For biological studies, this work implies that tree-aware sampling design is desirable; adding more sampling units may not help ancestral reconstruction, and only strong lineage effects may be detected with high power. © Institute of Mathematical Statistics.
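The inconsistency of the intercept estimate can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical tree in which the root splits into two stems of length 0.5, each ending in a star of tips at total height 1, with unit-rate Brownian motion along the branches. Under these assumptions the covariance between two tips is the path length they share from the root, and the variance of the generalized least squares (BLUE) estimate of the ancestral state is 1 / (1' V^{-1} 1). All function names are illustrative, and the paper's definition of effective sample size is not reproduced here.

```python
# Illustrative sketch (assumptions, not the paper's code): variance of the
# GLS (BLUE) estimate of the ancestral state under unit-rate Brownian motion
# on a hypothetical tree made of a few clades.

def bm_covariance(n_clades, tips_per_clade, stem, height=1.0):
    """Covariance matrix of tip values.

    Hypothetical tree: the root splits into `n_clades` stems of length `stem`;
    each stem ends in a star of `tips_per_clade` tips, all at distance
    `height` from the root. cov(i, j) is the shared path length from the root.
    """
    n = n_clades * tips_per_clade
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                V[i][j] = height
            elif i // tips_per_clade == j // tips_per_clade:
                V[i][j] = stem
    return V

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            if f != 0.0:
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def intercept_variance(V):
    """Variance of the GLS intercept estimate: 1 / (1' V^{-1} 1)."""
    ones = [1.0] * len(V)
    return 1.0 / sum(solve(V, ones))

def variances(sizes=(1, 4, 16, 64), n_clades=2, stem=0.5):
    """Intercept variance for increasing numbers of tips per clade."""
    return [intercept_variance(bm_covariance(n_clades, m, stem)) for m in sizes]

if __name__ == "__main__":
    for m, v in zip((1, 4, 16, 64), variances()):
        print(f"{2 * m:4d} tips: variance = {v:.4f}")
```

For this particular tree the variance works out to stem / n_clades + (height - stem) / (n_clades × tips_per_clade), so it decreases toward 0.25 rather than toward 0 as tips are added within the two clades. Adding independent clades near the root, by contrast, would shrink the limit, which is one way to read the abstract's point about tree-aware sampling design.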
CITATION STYLE
Ané, C. (2008). Analysis of comparative data with hierarchical autocorrelation. Annals of Applied Statistics, 2(3), 1078–1102. https://doi.org/10.1214/08-AOAS173