Evolutionary Sparse Learning for Phylogenomics

This article is free to access.

Abstract

We introduce a supervised machine learning approach with sparsity constraints for phylogenomics, referred to as evolutionary sparse learning (ESL). ESL builds models with genomic loci (such as genes, proteins, genomic segments, and positions) as parameters. Using the Least Absolute Shrinkage and Selection Operator, ESL selects only the most important genomic loci to explain a given phylogenetic hypothesis or presence/absence of a trait. ESL models do not directly involve conventional parameters such as rates of substitutions between nucleotides, rate variation among positions, and phylogeny branch lengths. Instead, ESL directly employs the concordance of variation across sequences in an alignment with the evolutionary hypothesis of interest. ESL provides a natural way to combine different molecular and nonmolecular data types and incorporate biological and functional annotations of genomic loci in model building. We propose positional, gene, function, and hypothesis sparsity scores, illustrate their use through an example, and suggest several applications of ESL. The ESL framework has the potential to drive the development of a new class of computational methods that will complement traditional approaches in evolutionary genomics, particularly for identifying influential loci and sequences given a phylogeny and building models to test hypotheses. ESL's fast computational times and small memory footprint will also help democratize big data analytics and improve scientific rigor in phylogenomics.
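
The selection idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy alignment, the clade labels, the penalty value `lam`, and the helper names (`one_hot_columns`, `lasso_cd`) are all hypothetical, and a plain squared-error Lasso fitted by coordinate descent stands in for whatever loss ESL actually uses. Each alignment position is one-hot encoded by base, the response is +1 for species inside the hypothesized clade and -1 outside, and the per-position score is simply the sum of absolute coefficients at that position, which is an assumed aggregate and not necessarily the paper's definition of a positional sparsity score.

```python
# Toy alignment (hypothetical data): rows are species, columns are positions.
alignment = {
    "sp1": "AACT", "sp2": "AACG", "sp3": "AATT",   # inside the clade
    "sp4": "AGCT", "sp5": "AGCG", "sp6": "AGTT",   # outside the clade
}
# Hypothesis to explain: +1 for clade members, -1 otherwise.
response = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]


def one_hot_columns(alignment):
    """Return (labels, columns): one centered 0/1 column per (position, base)."""
    seqs = list(alignment.values())
    n = len(seqs)
    labels, columns = [], []
    for pos in range(len(seqs[0])):
        bases = sorted({s[pos] for s in seqs})
        if len(bases) < 2:
            continue  # invariant positions cannot discriminate the hypothesis
        for base in bases:
            col = [1.0 if s[pos] == base else 0.0 for s in seqs]
            mean = sum(col) / n
            labels.append((pos, base))
            columns.append([v - mean for v in col])
    return labels, columns


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_cd(columns, y, lam, sweeps=100):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            z = sum(v * v for v in col) / n
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            resid = [r - delta * v for r, v in zip(resid, col)]
            beta[j] = new
    return beta


labels, columns = one_hot_columns(alignment)
mean_y = sum(response) / len(response)
y = [v - mean_y for v in response]
beta = lasso_cd(columns, y, lam=0.1)  # lam is an arbitrary illustrative value

# Aggregate absolute coefficients per position (assumed scoring rule).
scores = {pos: 0.0 for pos in range(len(next(iter(alignment.values()))))}
for (pos, _base), coef in zip(labels, beta):
    scores[pos] += abs(coef)

selected = [pos for pos, s in sorted(scores.items()) if s > 1e-6]
print("positions retained by the sparse model:", selected)
```

In this toy case only the position whose variation is concordant with the clade split keeps a nonzero weight; the invariant position is skipped and the two positions whose variation is unrelated to the split are shrunk to zero.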

Citation (APA)

Kumar, S., & Sharma, S. (2021). Evolutionary Sparse Learning for Phylogenomics. Molecular Biology and Evolution, 38(11), 4674–4682. https://doi.org/10.1093/molbev/msab227
