Use of ChIP-Seq data for the design of a multiple promoter-alignment method

Ionas Erb; Juan R. González-Vallinas; Giovanni Bussotti; Enrique Blanco; Eduardo Eyras; Cédric Notredame

Journal Article (Open Access)

Nucleic Acids Research (2012) 40(7)

DOI: 10.1093/nar/gkr1292

Abstract

We address the challenge of regulatory sequence alignment with a new method, Pro-Coffee, a multiple aligner specifically designed for homologous promoter regions. Pro-Coffee uses a dinucleotide substitution matrix estimated on alignments of functional binding sites from TRANSFAC. We designed a validation framework using several thousand families of orthologous promoters. This dataset was used to evaluate the accuracy for predicting true human orthologs among their paralogs. We found that whereas other methods achieve on average 73.5% accuracy, and 77.6% when trained on that same dataset, the figure goes up to 80.4% for Pro-Coffee. We then applied a novel validation procedure based on multi-species ChIP-seq data. Trained and untrained methods were tested for their capacity to correctly align experimentally detected binding sites. Whereas the average number of correctly aligned sites for two transcription factors is 284 for default methods and 316 for trained methods, Pro-Coffee achieves 331, 16.5% above the default average. We find a high correlation between a method's performance when classifying orthologs and its ability to correctly align proven binding sites. Not only does this have interesting biological consequences, it also allows us to conclude that any method that is trained on the ortholog data set will result in functionally more informative alignments. © 2011 The Author(s).
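
The relative figure quoted for the ChIP-seq validation can be checked from the site counts in the abstract. The following is only a minimal arithmetic sketch: the helper name `relative_gain` and the variable names are illustrative assumptions, not part of Pro-Coffee or of the authors' evaluation code.

```python
def relative_gain(value: float, baseline: float) -> float:
    """Return the percentage by which `value` exceeds `baseline`."""
    return (value - baseline) / baseline * 100.0


# Average numbers of correctly aligned binding sites, as quoted in the abstract.
default_sites = 284      # default (untrained) methods
trained_sites = 316      # methods trained on the ortholog data set
pro_coffee_sites = 331   # Pro-Coffee

# Gains relative to the default average (hypothetical variable names).
gain_trained = relative_gain(trained_sites, default_sites)
gain_pro_coffee = relative_gain(pro_coffee_sites, default_sites)

print(f"trained vs default:    {gain_trained:.1f}%")
print(f"Pro-Coffee vs default: {gain_pro_coffee:.1f}%")
```

Under this reading, Pro-Coffee's 331 sites sit about 16.5% above the default average of 284, which matches the figure stated in the abstract.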

Citation (APA)

Erb, I., González-Vallinas, J. R., Bussotti, G., Blanco, E., Eyras, E., & Notredame, C. (2012). Use of ChIP-Seq data for the design of a multiple promoter-alignment method. Nucleic Acids Research, 40(7). https://doi.org/10.1093/nar/gkr1292