Abstract
Motivation: The spatial uniformity (or lack of it) of symbols in biological sequences needs to be characterized, given its implications for identifying the properties of the structures associated with those sequences. Methods: A one-dimensional version of a recently introduced percolation-based approach is presented, which allows the accurate quantification of symbol distributions even in the presence of co-existing densities. An enhanced version of this methodology, which uses an agglomerative process to organize the sequence hierarchically into subsequences, is also proposed and illustrated. Results: The potential of the proposed methodology is illustrated on synthetic and real data (1881 zebrafish and 1200 Xenopus proteins) and compared with two alternative multiscale methodologies. The results are encouraging and include the possibility of identifying particularly remarkable amino acid arrangements in proteins. © The Author 2004. Published by Oxford University Press. All rights reserved.
Cite
da Fontoura Costa, L. (2005). Biological sequence analysis through the one-dimensional percolation transform and its enhanced version. Bioinformatics, 21(5), 608–616. https://doi.org/10.1093/bioinformatics/bti050