Motivation: Upstream sequences contain short motifs that mediate transcriptional regulation by specifically binding different transcription factors. The presence of common motifs in the regulatory regions of two genes can be taken as a clue to potential co-regulation. A pattern count-based (dis)similarity metric between sequences could thus be used to classify genes according to their putative regulatory properties.
Results: We present several metrics that rely on probability theory and aim to compare sequences on the basis of pattern counts. We compare these metrics with several classical dissimilarity and similarity metrics, and illustrate their behaviour with a biological example.
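To make the idea of comparing sequences by pattern counts concrete, here is a minimal sketch. It is not one of the probability-based metrics presented in the paper. It assumes that patterns are overlapping words of fixed length k, and it uses the Euclidean distance between count vectors as a stand-in for a classical dissimilarity. The function names (`count_patterns`, `euclidean_dissimilarity`, `shared_patterns`) and the choice of k are hypothetical and serve only as illustration.

```python
from collections import Counter
from math import sqrt


def count_patterns(seq, k=2):
    """Count overlapping occurrences of every k-letter word in seq.

    Assumption: a "pattern" is any word of length k; the paper may
    define patterns differently.
    """
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def euclidean_dissimilarity(counts_a, counts_b):
    """Euclidean distance between two pattern-count vectors.

    Patterns absent from one sequence count as 0 (Counter returns 0
    for missing keys).
    """
    patterns = set(counts_a) | set(counts_b)
    return sqrt(sum((counts_a[p] - counts_b[p]) ** 2 for p in patterns))


def shared_patterns(counts_a, counts_b):
    """Patterns found at least once in both sequences, sorted."""
    return sorted(set(counts_a) & set(counts_b))


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    a = count_patterns("ACGTACGT", k=2)
    b = count_patterns("ACGTTTTT", k=2)
    print(shared_patterns(a, b))
    print(euclidean_dissimilarity(a, b))
```

A raw distance of this kind ignores how likely each pattern is to occur by chance, which is the kind of limitation a probability-based metric is meant to address.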
van Helden, J. (2004). Metrics for comparing regulatory sequences on the basis of pattern counts. Bioinformatics, 20(3), 399–406. https://doi.org/10.1093/bioinformatics/btg425