How to get the most out of phylogenetic imputation without abusing it


Abstract

Phylogenies are viewed as potentially powerful resources for predicting missing values in trait datasets, but they are often misused. Critically, many imputed values that rely completely or partially on phylogenetic information are trusted without a convincing demonstration that the data meet the requirements for the predictions to be at least minimally valuable. I argue that phylogenetic signal, which is the mainstay of phylogenetic imputation, is often interpreted as ‘strong’ because the outcome of randomization tests has prevailed over the actual strength of the signal in determining whether it is strong or not. This circumstance has led many researchers to draw conclusions from ‘strong’ signals that are actually far more labile than a phylogenetic random walk (i.e. Brownian motion). Although trait evolutionary trajectories that nearly fit Brownian motion are typically considered strongly conserved, the Brownian process is subject to high levels of stochasticity that may yield spurious predictions under some circumstances. To my knowledge, very few studies (if any) that rely on phylogenetically imputed information have rigorously evaluated the expected accuracy of individual predictions, although among-lineage variability in prediction accuracy can be dramatic even for strongly conserved traits. Here, I advocate for a Monte Carlo approach based on trait simulations to assess the prediction accuracy expected for each missing value in the traits of interest, which can be continuous or discrete. The framework is presented in a detailed step-by-step R tutorial conceived to help non-specialized researchers identify highly likely spurious predictions without the need for advanced technical and statistical skills.
Although phylogenetic imputation has important limitations, I suggest that leveraging advances in our understanding of such hindrances and using the technique with caution and restraint will allow trait-based research to progress further while sampling efforts continue to replace imputed data.
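
The simulation-based check described in the abstract can be illustrated with a minimal sketch. It is not the article's R tutorial. It is a pure-Python toy that assumes a hypothetical five-tip tree (TIPS, COV), a continuous trait evolving under Brownian motion with unit rate, and a root state fixed at 0 and treated as known. Each simulated tip value is hidden in turn, predicted from the other tips by the Brownian-motion conditional expectation, and scored across simulations. All function names are illustrative.

```python
import math
import random

# Hypothetical 5-tip ultrametric tree, encoded as its phylogenetic covariance
# matrix: entry [i][j] is the branch length shared by tips i and j from the root.
# Newick equivalent: (((A:0.1,B:0.1):0.4,C:0.5):0.5,(D:0.8,E:0.8):0.2);
TIPS = ["A", "B", "C", "D", "E"]
COV = [
    [1.0, 0.9, 0.5, 0.0, 0.0],
    [0.9, 1.0, 0.5, 0.0, 0.0],
    [0.5, 0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.2],
    [0.0, 0.0, 0.0, 0.2, 1.0],
]


def cholesky(m):
    """Lower-triangular factor L such that L times its transpose equals m."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def invert(m):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def simulate_bm(chol, rng):
    """One Brownian-motion realisation at the tips (root state 0, rate 1)."""
    z = [rng.gauss(0.0, 1.0) for _ in chol]
    return [sum(l * zk for l, zk in zip(row, z)) for row in chol]


def predict_tip(i, values, prec):
    """Conditional expectation of tip i given all other tips (zero-mean normal)."""
    others = sum(prec[i][j] * values[j] for j in range(len(values)) if j != i)
    return -others / prec[i][i]


def expected_accuracy(cov, tips, n_sims=2000, seed=1):
    """Per-tip score: 1 - (squared prediction error / squared true value)."""
    rng = random.Random(seed)
    chol, prec = cholesky(cov), invert(cov)
    sse = [0.0] * len(tips)
    sst = [0.0] * len(tips)
    for _ in range(n_sims):
        x = simulate_bm(chol, rng)
        for i in range(len(tips)):
            sse[i] += (predict_tip(i, x, prec) - x[i]) ** 2
            sst[i] += x[i] ** 2
    return {t: 1.0 - e / s for t, e, s in zip(tips, sse, sst)}


for tip, score in expected_accuracy(COV, TIPS).items():
    print(f"{tip}: expected accuracy = {score:.2f}")
```

In this toy setting the trait follows Brownian motion exactly, so the imputation model matches the generating process. Even so, tips with a close relative (A and B) should score much higher than tips that diverged near the root (D and E), which illustrates why accuracy is worth assessing for each missing value and not only for the trait as a whole.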


Cite

Molina-Venegas, R. (2024). How to get the most out of phylogenetic imputation without abusing it. Methods in Ecology and Evolution, 15(3), 456–463. https://doi.org/10.1111/2041-210X.14198
