A Simplified Description of Child Tables for Sequence Similarity Search

Abstract

Finding related nucleotide or protein sequences is a fundamental, diverse, and incompletely solved problem in bioinformatics. It is often tackled by seed-and-extend methods, which first find 'seed' matches of diverse types, such as spaced seeds, subset seeds, or minimizers. Seeds are usually found using an index of the reference sequence(s), which stores seed positions in a suffix array or related data structure. A child table is a fundamental way to achieve fast lookup in an index, but previous descriptions have been overly complex. This paper aims to provide a more accessible description of child tables, and demonstrate their generality: they apply equally to all the above-mentioned seed types and more. We also show that child tables can be used without LCP (longest common prefix) tables, reducing the memory requirement.
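
To make the idea of an index that "stores seed positions in a suffix array" concrete, here is a minimal sketch of exact seed lookup in Python. It is not the paper's child-table method: it uses a naive suffix array construction and plain binary search, which is the slower baseline that a child table is meant to speed up. The function names (`build_suffix_array`, `find_seed`) and the example sequence are assumptions made for illustration only, and only contiguous exact seeds are handled, not spaced seeds, subset seeds, or minimizers.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of text, sorted lexicographically.

    Naive construction (sorts full suffix strings); suitable for small examples only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_seed(text: str, suffix_array: List[int], seed: str) -> List[int]:
    """Return the sorted positions in text where seed occurs exactly.

    Uses two binary searches over the suffix array to find the contiguous
    block of suffixes that start with seed.
    """
    k = len(seed)

    # Lower bound: first suffix whose length-k prefix is >= seed.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] < seed:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-k prefix is > seed.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + k] <= seed:
            lo = mid + 1
        else:
            hi = mid

    # Suffix-array order is lexicographic, so sort to report by position.
    return sorted(suffix_array[start:lo])


if __name__ == "__main__":
    reference = "GATTACATTAG"  # hypothetical reference sequence
    sa = build_suffix_array(reference)
    print(find_seed(reference, sa, "TTA"))
```

Each lookup here costs a logarithmic number of string comparisons. In a seed-and-extend search, the positions returned would then be extended into longer alignments.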

Citation

Frith, M. C., & Shrestha, A. M. S. (2018). A Simplified Description of Child Tables for Sequence Similarity Search. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 15(6), 2067–2073. https://doi.org/10.1109/TCBB.2018.2796064
