Abstract
Motivation: The transcription start site (TSS) has been located for an increasing number of genes across several organisms. Statistical tests have shown that some cis-acting regulatory elements have positional preferences with respect to the TSS, but few strategies have emerged for locating elements by their positional preferences. This paper elaborates such a strategy. First, we align promoter regions without gaps, anchoring the alignment on each promoter's TSS. Second, we apply a novel word-specific mask. Third, we apply a clustering test related to gapless BLAST statistics. The test examines whether any specific word is placed unusually consistently with respect to the TSS. Finally, our program A-GLAM, an extension of the GLAM program, uses significant word positions as new 'anchors' to realign the sequences. A Gibbs sampling algorithm then locates putative cis-acting regulatory elements. Usually, Gibbs sampling requires a preliminary masking step, to avoid convergence onto a dominant but uninteresting signal from a DNA repeat. However, since the positional anchors focus A-GLAM on the motif of interest, masking DNA repeats during Gibbs sampling becomes unnecessary.
Results: In a set of human DNA sequences with experimentally characterized TSSs, the placement of 791 octonucleotide words was unusually consistent (multiple test corrected P < 0.05). Alignments anchored on these words sometimes located statistically significant motifs inaccessible to GLAM or AlignACE.
© The Author 2005. Published by Oxford University Press. All rights reserved.
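The idea of TSS-anchored word placement can be illustrated with a minimal Python sketch. It is not A-GLAM and does not reproduce the paper's word-specific mask or its clustering test related to gapless BLAST statistics. As a stand-in, it assumes a simple null model in which each occurrence of a word falls uniformly on the available offsets, and it applies a binomial tail with a Bonferroni correction over offsets. The function names (`word_offsets`, `binomial_tail`, `best_anchor`) and the input format (a list of `(sequence, tss_index)` pairs) are hypothetical.

```python
from collections import Counter, defaultdict
from math import comb


def word_offsets(promoters, k=8):
    """Map each k-letter word to its start offsets relative to the TSS.

    `promoters` is a list of (sequence, tss_index) pairs (assumed format);
    an offset of 0 means the word starts exactly at the TSS.
    """
    offsets = defaultdict(list)
    for seq, tss in promoters:
        for i in range(len(seq) - k + 1):
            offsets[seq[i:i + k]].append(i - tss)
    return offsets


def binomial_tail(n, x, p):
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(x, n + 1))


def best_anchor(word_offset_list, n_positions):
    """Return (offset, count, corrected_p) for a word's most frequent offset.

    Assumed null model (not the paper's test): each occurrence lands
    uniformly on one of `n_positions` offsets. The p-value is
    Bonferroni-corrected over the offsets and capped at 1.
    """
    counts = Counter(word_offset_list)
    offset, count = counts.most_common(1)[0]
    p = binomial_tail(len(word_offset_list), count, 1.0 / n_positions)
    return offset, count, min(1.0, p * n_positions)
```

A word whose occurrences pile up at one offset across many promoters would be a candidate anchor for realignment; a real analysis would also need to correct for the number of words tested, as the abstract's multiple-test-corrected threshold implies.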
Cite
Tharakaraman, K., Mariño-Ramírez, L., Sheetlin, S., Landsman, D., & Spouge, J. L. (2005). Alignments anchored on genomic landmarks can aid in the identification of regulatory elements. Bioinformatics, 21(SUPPL. 1). https://doi.org/10.1093/bioinformatics/bti1028