RRCA: Ultra-fast multiple in-species genome alignments

Sebastian Wandelt; Ulf Leser

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8542 LNBI, pp. 247–261 (2014)

DOI: 10.1007/978-3-319-07953-0_20

Abstract

Multiple sequence alignment (MSA) is an important method in bioinformatics, used, for instance, to reconstruct phylogenetic trees or to identify functional domains within genes. Finding an optimal MSA is computationally intractable, so many alignment heuristics have been proposed. However, computing MSAs for sequences at chromosome or genome scale in reasonable time and with good alignment quality remains an open challenge. In this paper we propose RRCA, a very fast method for computing high-quality in-species MSAs at genome scale. RRCA uses referential compression to efficiently find long common subsequences in the to-be-aligned sequences. A colinear subcollection of these subsequences is used for an initial alignment, and the subsequences not yet covered are aligned recursively following the same approach. Our evaluation shows that RRCA achieves MSAs of similar quality to current state-of-the-art methods while often being orders of magnitude faster across our datasets. For instance, RRCA aligns eight human Chromosome 22 sequences (around 50 MB each) within one minute on a consumer computer, a task that takes competitors hours to days. © 2014 Springer International Publishing.
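The three steps named in the abstract (referential matching, colinear selection, recursion on uncovered regions) can be illustrated with a small sketch. This is not the RRCA implementation: it aligns a single sequence against one reference rather than computing a full MSA, it uses a plain k-mer dictionary where a real tool would likely use a compressed index, and the function names (`referential_matches`, `colinear_chain`, `align`) and the choice to halve the minimum match length at each recursion level are assumptions made only for illustration.

```python
def referential_matches(ref, seq, min_len):
    """Greedy referential parse of seq against ref.

    Returns (ref_pos, seq_pos, length) exact matches of at least min_len,
    sorted by seq_pos and non-overlapping in seq.
    """
    k = min_len
    index = {}  # k-mer -> positions in ref (illustrative stand-in for a real index)
    for i in range(len(ref) - k + 1):
        index.setdefault(ref[i:i + k], []).append(i)
    matches, j = [], 0
    while j <= len(seq) - k:
        best_pos, best_len = -1, 0
        for i in index.get(seq[j:j + k], []):
            n = k
            while i + n < len(ref) and j + n < len(seq) and ref[i + n] == seq[j + n]:
                n += 1
            if n > best_len:
                best_pos, best_len = i, n
        if best_len:
            matches.append((best_pos, j, best_len))
            j += best_len  # skip past the matched stretch
        else:
            j += 1
    return matches


def colinear_chain(matches):
    """Pick the subset of matches that is ordered in both ref and seq
    and covers the most characters (quadratic dynamic programming)."""
    if not matches:
        return []
    score = [m[2] for m in matches]
    prev = [-1] * len(matches)
    for b in range(len(matches)):
        for a in range(b):
            ref_a, _, len_a = matches[a]
            if ref_a + len_a <= matches[b][0] and score[a] + matches[b][2] > score[b]:
                score[b] = score[a] + matches[b][2]
                prev[b] = a
    b = max(range(len(matches)), key=score.__getitem__)
    chain = []
    while b != -1:
        chain.append(matches[b])
        b = prev[b]
    return chain[::-1]


def align(ref, seq, min_len=8):
    """Return aligned exact blocks (ref_start, seq_start, length), recursing
    into the gaps between chained blocks with a shorter minimum match length."""
    def rec(r0, r1, s0, s1, k):
        chain = colinear_chain(referential_matches(ref[r0:r1], seq[s0:s1], k))
        blocks, pr, ps = [], r0, s0
        for rp, sp, n in chain:
            rp, sp = rp + r0, sp + s0  # back to global coordinates
            if k > 2:
                blocks += rec(pr, rp, ps, sp, k // 2)
            blocks.append((rp, sp, n))
            pr, ps = rp + n, sp + n
        if chain and k > 2:
            blocks += rec(pr, r1, ps, s1, k // 2)
        return blocks
    return rec(0, len(ref), 0, len(seq), min_len)


if __name__ == "__main__":
    reference = "ACGTTGCAAGCTTAGGCATC"
    sample = "ACGTTGCAAGATTAGGCATC"  # one substitution at position 10
    for block in align(reference, sample):
        print(block)
```

In this toy example the two sequences differ by a single substitution, so the blocks returned cover everything except that one position. Regions left uncovered after the recursion bottoms out would still need a conventional aligner, which the sketch omits.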

Cite

Wandelt, S., & Leser, U. (2014). RRCA: Ultra-fast multiple in-species genome alignments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8542 LNBI, pp. 247–261). Springer Verlag. https://doi.org/10.1007/978-3-319-07953-0_20
