UniRef clusters: A comprehensive and scalable alternative for improving sequence similarity searches

  • Suzek B
  • Wang Y
  • Huang H
 et al. 
  • 168

    Readers

    Mendeley users who have this article in their library.
  • 117

    Citations

    Citations of this article.

Abstract

Motivation: UniRef databases provide full-scale clustering of UniProtKB sequences and are utilized for a broad range of applications, particularly similarity-based functional annotation. Non-redundancy and intra-cluster homogeneity in UniRef were recently improved by adding a sequence length overlap threshold. Our hypothesis is that these improvements would enhance the speed and sensitivity of similarity searches and improve the consistency of annotation within clusters.
Results: Intra-cluster molecular function consistency was examined by analysis of Gene Ontology terms. Results show that UniRef clusters bring together proteins of identical molecular function in more than 97% of the clusters, implying that clusters are useful for annotation and can also be used to detect annotation inconsistencies. To examine coverage in similarity results, BLASTP searches against UniRef50 followed by expansion of the hit lists with cluster members demonstrated advantages compared with searches against UniProtKB sequences; the searches are concise (∼7 times shorter hit list before expansion), faster (∼6 times) and more sensitive in detection of remote similarities (>96% recall at e-value Availability and implementation: Web access and file download from UniProt website at http://www.uniprot.org/uniref and ftp://ftp.uniprot.org/pub/databases/uniprot/uniref. BLAST searches against UniRef are available at http://www.uniprot.org/blast/
Contact: huang@dbi.udel.edu

Get free article suggestions today

Mendeley saves you time finding and organizing research

Sign up here
Already have an account ?Sign in

Find this document

Authors

  • Baris E. Suzek

  • Yuqi Wang

  • Hongzhan Huang

  • Peter B. McGarvey

  • Cathy H. Wu

Cite this document

Choose a citation style from the tabs below

Save time finding and organizing research with Mendeley

Sign up for free