TRASH: Tandem Repeat Annotation and Structural Hierarchy

Abstract

Motivation: The advent of long-read DNA sequencing is allowing complete assembly of highly repetitive genomic regions for the first time, including the megabase-scale satellite repeat arrays found in many eukaryotic centromeres. The assembly of such repetitive regions creates a need for their de novo annotation, including patterns of higher order repetition. To annotate tandem repeats, methods are required that can be widely applied to diverse genome sequences, without prior knowledge of monomer sequences.

Results: Tandem Repeat Annotation and Structural Hierarchy (TRASH) is a tool that identifies and maps tandem repeats in nucleotide sequence, without prior knowledge of repeat composition. TRASH analyses a FASTA assembly file, identifies regions occupied by repeats and then precisely maps them and their higher order structures. To demonstrate the applicability and scalability of TRASH for centromere research, we apply our method to the recently published Col-CEN genome of Arabidopsis thaliana and the complete human CHM13 genome.
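
To give a rough sense of how a tandem repeat can be detected without knowing the monomer in advance, the sketch below estimates the repeat period of a sequence from the distances between consecutive occurrences of identical k-mers. It is an illustration only and should not be read as TRASH's actual algorithm; the function name `estimate_repeat_period` and the default `k=6` are assumptions introduced here.

```python
from collections import Counter


def estimate_repeat_period(seq, k=6):
    """Guess the tandem repeat period of seq (hypothetical helper).

    Records the distance between each k-mer and its previous occurrence,
    then returns the most frequent distance, or None if no k-mer repeats.
    """
    last_seen = {}          # k-mer -> position of its most recent occurrence
    distances = Counter()   # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            distances[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    if not distances:
        return None
    return distances.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy array: a 10 bp monomer repeated 20 times (made-up sequence).
    array = "ACGTTGCAAG" * 20
    print(estimate_repeat_period(array))
```

In a perfectly periodic array every k-mer recurs at a distance equal to the monomer length, so that distance dominates the histogram. Real satellite arrays contain diverged monomers, so a practical tool would need to tolerate mismatches and then align individual repeat units, which this sketch does not attempt.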

Cite

Wlodzimierz, P., Hong, M., & Henderson, I. R. (2023). TRASH: Tandem Repeat Annotation and Structural Hierarchy. Bioinformatics, 39(5). https://doi.org/10.1093/bioinformatics/btad308
