This article is free to access.
Background: Sequencing prokaryotic genomes has revolutionized our understanding of the many roles played by microorganisms. However, the proportions of cells and taxa of bacteria or archaea on Earth that have a sequenced genome remain unknown. This study aimed to explore this basic question using large-scale alignment between the sequences released by the Earth Microbiome Project and 155,810 prokaryotic genomes from public databases.
Results: The median proportions of genome-sequenced cells and taxa (at 100% identity in the 16S-V4 region) in different biomes reached 38.1% (16.4-86.3%) and 18.8% (9.1-52.6%), respectively. The sequenced proportions of prokaryotic genomes in biomes were significantly negatively correlated with the alpha diversity indices, and the proportions sequenced in host-associated biomes were significantly higher than those in free-living biomes. Because sequencing has favored a set of cosmopolitan OTUs that are found in multiple samples, only 2.1% of the global prokaryotic taxa are represented by sequenced genomes. Most of the biomes were occupied by a few predominant taxa with a high relative abundance and much higher genome-sequenced proportions than the numerous rare taxa.
Conclusions: These results reveal the current state of prokaryotic genome sequencing for Earth's biomes, provide a basis for more reasonable and efficient exploration of prokaryotic genomes, and promote our understanding of microbial ecological functions.
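The distinction between the cell proportion and the taxon proportion can be illustrated with a minimal sketch. It assumes that read counts per OTU stand in for cell abundance and that a set of OTUs with an exact 16S-V4 match to a sequenced genome is already known; the function name, OTU labels, and counts are hypothetical toy values, not data or code from the study, and the sketch ignores any corrections the authors may have applied.

```python
def sequenced_proportions(otu_counts, sequenced_otus):
    """Return (cell_proportion, taxon_proportion) for one sample.

    otu_counts     -- dict mapping OTU id to read count (assumed proxy for cells)
    sequenced_otus -- set of OTU ids assumed to match a sequenced genome exactly
    """
    # Only OTUs actually observed in the sample count as present taxa.
    present = {otu: n for otu, n in otu_counts.items() if n > 0}
    if not present:
        raise ValueError("sample contains no reads")

    matched = [otu for otu in present if otu in sequenced_otus]

    # Cell proportion: abundance-weighted share of reads from matched OTUs.
    cell_prop = sum(present[otu] for otu in matched) / sum(present.values())
    # Taxon proportion: unweighted share of distinct OTUs that are matched.
    taxon_prop = len(matched) / len(present)
    return cell_prop, taxon_prop


# Hypothetical sample: one dominant OTU with a sequenced genome, three rare ones without.
sample = {"otu_a": 700, "otu_b": 200, "otu_c": 50, "otu_d": 50}
sequenced = {"otu_a"}
cell_prop, taxon_prop = sequenced_proportions(sample, sequenced)
print(f"cells: {cell_prop:.1%}, taxa: {taxon_prop:.1%}")
```

In this toy sample a single abundant, sequenced OTU yields a much higher cell proportion than taxon proportion, the same direction of difference that the abstract reports for biomes dominated by a few predominant taxa.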
Zhang, Z., Wang, J., Wang, J., Wang, J., & Li, Y. (2020). Estimate of the sequenced proportion of the global prokaryotic genome. Microbiome, 8(1). https://doi.org/10.1186/s40168-020-00903-z