Probabilistic approach for DNA compression


Abstract

Rapid advancements in research in the field of DNA sequence discovery have led to a vast range of compression algorithms. Two bits suffice to store each of the four bases of a DNA sequence, but efficient algorithms have pushed this limit lower. With the constant decrease in the price of memory and communication channel bandwidth, the need for such compression algorithms is often questioned. The algorithm discussed in this chapter compresses the DNA sequence and also allows one to generate finite-length sequences, which can be used to find approximate pattern matches. DNA sequences are mainly of two types: repetitive and non-repetitive. The compression technique used is meant for the non-repetitive parts of the sequence, where we make use of the fact that a DNA sequence consists of only 4 characters. The algorithm achieves a bit/base ratio of 1.3-1.4 (depending on the database), but more importantly, one of the stages of the algorithm can be used for efficient discovery of approximate patterns. © 2009 Springer-Verlag Berlin Heidelberg.
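To make the two-bit figure and the bit/base ratio concrete, the following is a minimal Python sketch of the fixed two-bit baseline that the abstract compares against. It is not the chapter's probabilistic algorithm, which the abstract does not describe in detail. The names `pack_2bit`, `unpack_2bit`, and `bits_per_base`, and the A/C/G/T code assignment, are assumptions made for illustration.

```python
# Illustrative baseline only: a fixed 2-bit code per base.
# This is NOT the chapter's algorithm; all names here are hypothetical.

BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed code assignment
CODE_TO_BASE = "ACGT"


def pack_2bit(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_CODE[base]
        # Left-align a partial final chunk so decoding reads from the top bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack_2bit(data: bytes, n_bases: int) -> str:
    """Recover the first n_bases bases from 2-bit packed data."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(CODE_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


def bits_per_base(compressed_bytes: int, n_bases: int) -> float:
    """Bit/base ratio of a compressed representation."""
    return 8 * compressed_bytes / n_bases
```

Under this baseline, 1000 bases occupy 250 bytes (2 bits/base). A ratio of 1.3 bits/base, the lower end of the range reported above, would correspond to roughly 163 bytes for the same sequence.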

Cite

Probabilistic approach for DNA compression. (2009). Studies in Computational Intelligence, 190, 279–289. https://doi.org/10.1007/978-3-642-00193-2_14
