A computational approach for identifying pseudogenes in the ENCODE regions.

ISSN: 14656914

Abstract

BACKGROUND: Pseudogenes are inheritable genetic elements showing sequence similarity to functional genes but with deleterious mutations. We describe a computational pipeline for identifying them, which, in contrast to previous work, explicitly uses intron-exon structure in parent genes to classify pseudogenes. We require alignments between duplicated pseudogenes and their parents to span intron-exon junctions, and this can be used to distinguish between true duplicated pseudogenes and processed pseudogenes (with insertions).

RESULTS: Applying our approach to the ENCODE regions, we identify about 160 pseudogenes, 10% of which have clear 'intron-exon' structure and are thus likely generated from recent duplications.

CONCLUSION: Detailed examination of our results and comparison of our annotation with the GENCODE reference annotation demonstrate that our computational pipeline provides a good balance between identifying all pseudogenes and delineating the precise structure of duplicated genes.
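The classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Block` type, the `classify` function, and the `min_gap` and `tol` thresholds are all hypothetical, and the sketch assumes a plus-strand locus and alignment blocks already expressed in parent mRNA coordinates. A candidate is called duplicated when every parent intron-exon junction covered by the alignment coincides with a long gap at the pseudogene locus, and processed when the alignment runs through those junctions without a gap, even if an unrelated insertion interrupts it elsewhere.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One ungapped alignment block (hypothetical representation)."""
    parent_start: int  # 0-based, half-open, in parent mRNA coordinates
    parent_end: int
    locus_start: int   # genomic coordinates at the pseudogene locus
    locus_end: int


def junctions(exon_lengths: List[int]) -> List[int]:
    """Positions of intron-exon junctions in parent mRNA coordinates."""
    pos, out = 0, []
    for length in exon_lengths[:-1]:
        pos += length
        out.append(pos)
    return out


def classify(exon_lengths: List[int], blocks: List[Block],
             min_gap: int = 30, tol: int = 3) -> str:
    """Return 'duplicated', 'processed', or 'ambiguous'.

    min_gap and tol are illustrative thresholds, not published values.
    """
    if not blocks:
        return "ambiguous"
    blocks = sorted(blocks, key=lambda b: b.parent_start)
    lo, hi = blocks[0].parent_start, blocks[-1].parent_end
    # Only junctions strictly inside the aligned parent range are informative.
    covered = [j for j in junctions(exon_lengths) if lo < j < hi]
    if not covered:
        return "ambiguous"
    # Parent positions at which the locus shows a long, intron-like gap.
    gap_positions = [a.parent_end for a, b in zip(blocks, blocks[1:])
                     if b.locus_start - a.locus_end >= min_gap]
    with_gap = sum(any(abs(g - j) <= tol for g in gap_positions)
                   for j in covered)
    if with_gap == len(covered):
        return "duplicated"   # intron-like gaps at every covered junction
    if with_gap == 0:
        return "processed"    # junctions are contiguous; other gaps are insertions
    return "ambiguous"
```

For example, a three-exon parent whose alignment breaks at both junctions would be called duplicated, while an alignment with a single large gap in the middle of an exon would still be called processed, since the gap does not coincide with a junction.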

Zheng, D., & Gerstein, M. B. (2006). A computational approach for identifying pseudogenes in the ENCODE regions. Genome Biology, 7 Suppl 1.
