Mining of assembled expressed sequence tag (EST) data for protein families: application to the G protein-coupled receptor superfamily.

Abstract

The availability of large expressed sequence tag (EST) databases has led to a revolution in the way new genes are identified. Mining of these databases using known protein sequences as queries is a powerful technique for discovering orthologous and paralogous genes. The scientist is often confronted, however, by an enormous amount of search output owing to the inherent redundancy of EST data. In addition, high search sensitivity often cannot be achieved using only a single member of a protein superfamily as a query. In this paper a technique for addressing both of these issues is described. Assembled EST databases are queried with every member of a protein superfamily, the results are integrated and false positives are pruned from the set. The result is a set of assemblies enriched in members of the protein superfamily under consideration. The technique is applied to the G protein-coupled receptor (GPCR) superfamily in the construction of a GPCR Resource. A novel full-length human GPCR identified from the GPCR Resource is presented, illustrating the utility of the method.
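
The integrate-and-prune step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Hit` record, the function names, the E-value cutoff of `1e-5`, and the use of each assembly's top match against a general protein database as the pruning criterion are all assumptions introduced here, since the abstract does not specify how false positives are identified.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Hit(NamedTuple):
    """One search hit (hypothetical record layout)."""
    query_id: str      # superfamily member used as the query
    assembly_id: str   # EST assembly that was hit
    evalue: float      # search E-value; lower is better


def integrate_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Merge hits from all queries, keeping the best hit per assembly."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.assembly_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.assembly_id] = hit
    return best


def prune_false_positives(
    best: Dict[str, Hit],
    top_protein_match: Dict[str, str],
    family_ids: Set[str],
    max_evalue: float = 1e-5,  # assumed cutoff, not taken from the paper
) -> List[str]:
    """Return assembly IDs that survive two assumed pruning rules.

    1. The best hit must be at or below `max_evalue`.
    2. If the assembly's top match in a general protein database is
       known, that match must itself be a superfamily member.
    """
    kept: List[str] = []
    for assembly_id, hit in best.items():
        if hit.evalue > max_evalue:
            continue
        top = top_protein_match.get(assembly_id)
        if top is not None and top not in family_ids:
            continue
        kept.append(assembly_id)
    return sorted(kept)
```

Querying with every superfamily member and keeping the best hit per assembly means each assembly is reported once however many queries found it, so the redundancy problem is handled at the integration step, while sensitivity comes from using the whole superfamily as queries.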

Conklin, D., Yee, D. P., Millar, R., Engelbrecht, J., & Vissing, H. (2000). Mining of assembled expressed sequence tag (EST) data for protein families: application to the G protein-coupled receptor superfamily. Briefings in Bioinformatics, 1(1), 93–99. https://doi.org/10.1093/bib/1.1.93
