De novo reconstruction of satellite repeat units from sequence data

Abstract

Satellite DNAs are long, tandemly repeating sequences in a genome and may be organized as high-order repeats (HORs). They are enriched in centromeres and are challenging to assemble. Existing algorithms for identifying satellite repeats either require the complete assembly of satellites or only work for simple repeat structures without HORs. Here we describe Satellite Repeat Finder (SRF), a new algorithm for reconstructing satellite repeat units and HORs from accurate reads or assemblies without prior knowledge of repeat structures. Applying SRF to real sequence data, we show that SRF could reconstruct known satellites in human and well-studied model organisms. We also find that satellite repeats are pervasive in various other species, accounting for up to 12% of their genome contents, but are often underrepresented in assemblies. With the rapid progress in genome sequencing, SRF will help the annotation of new genomes and the study of satellite DNA evolution even if such repeats are not fully assembled.
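
To make the terms "repeat unit" and "HOR" concrete, here is a toy Python sketch. It is not SRF's algorithm, which the abstract does not describe; the function names, the two six-base monomers, and the three-copy array are all made up for illustration. The array repeats exactly with the period of the HOR unit, while the monomer period is only approximate because the two monomers differ at one base.

```python
def smallest_period(seq: str) -> int:
    """Length of the shortest unit that, repeated a whole number of times, gives seq."""
    n = len(seq)
    for p in range(1, n + 1):
        if n % p == 0 and seq[:p] * (n // p) == seq:
            return p
    return n  # only reached for an empty sequence


def shift_identity(seq: str, p: int) -> float:
    """Fraction of positions where seq matches itself shifted by p bases."""
    pairs = len(seq) - p
    if pairs <= 0:
        raise ValueError("shift must be shorter than the sequence")
    matches = sum(seq[i] == seq[i + p] for i in range(pairs))
    return matches / pairs


# Hypothetical monomers that differ at a single base (illustrative only).
monomer_a = "ACGTAC"
monomer_b = "ACGTTC"
hor_unit = monomer_a + monomer_b   # a two-monomer higher-order unit
array = hor_unit * 3               # a short tandem array of that unit

hor_period = smallest_period(array)           # exact period: the HOR unit length
monomer_identity = shift_identity(array, 6)   # approximate repeat at monomer length
hor_identity = shift_identity(array, 12)      # exact repeat at HOR length
```

Real satellite arrays contain sequencing errors and divergence between copies, so exact string periodicity like this would not suffice on real data; the sketch only shows why a monomer-level period and an HOR-level period can coexist in one array.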

Cite

Zhang, Y., Chu, J., Cheng, H., & Li, H. (2023). De novo reconstruction of satellite repeat units from sequence data. Genome Research, 33(11), 1994–2001. https://doi.org/10.1101/gr.278005.123
