Filtering bio-sequence based on sequence descriptor

Te Wen Hsieh; Huang Cheng Kuo; Jen Peng Huang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 3916 LNBI 14-23

DOI: 10.1007/11691730_3

Abstract

Similarity searching in biological sequence databases has received substantial attention over the past decade, especially since the sequencing of the human genome. As databases continue to grow, fast similarity search is becoming an important issue. Transforming sequences into numerical vectors, called sequence descriptors, and storing them in a multidimensional data structure is a promising method for indexing bio-sequences. In this paper, we present an effective sequence transformation method, called SD (Sequence Descriptor), which uses multiple features of a sequence, including Count, RPD (Relative Position Dispersion), and APD (Absolute Position Dispersion), to represent the original sequence data. In contrast to the q-gram transformation method, SD avoids the problem of exponentially growing vector size. We also present a transformation, called ST (Segment Transformation), which recursively divides sequence data into equal-length subsequences, transforms each subsequence, and concatenates the results. Experiments on human genome data show that our transformation method is more effective than the q-gram transformation method. © Springer-Verlag Berlin Heidelberg 2006.
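
The vector-size contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes DNA sequences over the alphabet ACGT, and it uses only the Count feature, because the abstract does not define RPD or APD. The function names (qgram_vector, count_vector, segment_transform) and the depth parameter are hypothetical, and the recursive halving is only one possible reading of ST.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA sequences over a four-letter alphabet


def qgram_vector(seq, q):
    """Count every length-q substring; the vector has len(ALPHABET)**q entries."""
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=q))}
    vec = [0] * len(index)
    for i in range(len(seq) - q + 1):
        gram = seq[i:i + q]
        if gram in index:  # skip grams with symbols outside ALPHABET (e.g. N)
            vec[index[gram]] += 1
    return vec


def count_vector(seq):
    """Per-symbol counts: a fixed-size vector, independent of any q."""
    return [seq.count(ch) for ch in ALPHABET]


def segment_transform(seq, depth):
    """Hypothetical reading of ST: halve the sequence `depth` times and
    concatenate the Count vectors of the resulting segments."""
    if depth == 0 or len(seq) < 2:
        return count_vector(seq)
    mid = len(seq) // 2  # halves are equal only when the length is even
    left = segment_transform(seq[:mid], depth - 1)
    right = segment_transform(seq[mid:], depth - 1)
    return left + right


if __name__ == "__main__":
    for q in (1, 2, 3, 4):
        # q-gram vector length grows as 4**q
        print(q, len(qgram_vector("ACGTACGT", q)))
    # segment-based vector length grows as 4 * 2**depth
    print(segment_transform("AACCGGTT", 1))
```

In this sketch the q-gram vector has 4^q entries, whereas the segment-based vector has 4 · 2^depth entries, so the growth is governed by the chosen number of segments instead of by q.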

Hsieh, T. W., Kuo, H. C., & Huang, J. P. (2006). Filtering bio-sequence based on sequence descriptor. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3916 LNBI, pp. 14–23). https://doi.org/10.1007/11691730_3
