Computational strategies for eukaryotic pangenome analyses

5Citations
Citations of this article
30Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Over the last few years, pangenome analyses have been applied to eukaryotes, especially to important crops. A handful of eukaryotic pangenome studies have demonstrated widespread variation in gene presence/absence among plant species and its implications on agronomically important traits. In this chapter, we focus on the methodology of pangenome analysis, which can generally be classified into two different types of approaches, a homolog-based strategy and a "map-to-pan" strategy. In a homolog-based strategy, the genomes of individuals are independently assembled, and the presence/absence of a gene family is determined by clustering protein sequences into homologs. Alternatively, in a "map-to-pan" strategy, pangenome sequences are constructed by combining a well-annotated reference genome with newly identified non-reference representative sequences, from which the presence/absence of a gene is then determined based on read coverage after individual reads are mapped to the pangenome. We highlight the advantages and limitations of the homolog-based strategy and several variant approaches to the "map-to-pan" strategy. We conclude that the "map-to-pan" strategy is highly recommended for eukaryotic pangenome analysis. However, programs and parameters for pangenome analysis need to be carefully selected for eukaryotes with different genome sizes.

Cite

CITATION STYLE

APA

Hu, Z., Wei, C., & Li, Z. (2020). Computational strategies for eukaryotic pangenome analyses. In The Pangenome: Diversity, Dynamics and Evolution of Genomes (pp. 293–307). Springer International Publishing. https://doi.org/10.1007/978-3-030-38281-0_13

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free