How to Cluster Protein Sequences: Tools, Tips and Commands

  • Pavlopoulos G

Abstract

Proteins are the key molecules that facilitate most biological processes within a cell, so their discovery, annotation and characterization are of great importance. In systems biology, large-scale clustering of proteins by sequence to detect homology, orthology, families, common domains or functional similarities is becoming a major challenge, especially in the omics era, in which the number of sequences produced is growing exponentially. Although many applications with different strengths are available for this purpose, becoming familiar with them often involves a steep learning curve, and users often give up after getting lost in the README files before running any analysis. To help the community overcome this hesitance, this article describes tools and ways to cluster proteins into groups or families and emphasizes the basic commands, which can be executed in a simple Unix terminal. Notably, both graph-based and sequence-based approaches are described.

Citation (APA)

Pavlopoulos, G. A. (2017). How to Cluster Protein Sequences: Tools, Tips and Commands. MOJ Proteomics & Bioinformatics, 5(5). https://doi.org/10.15406/mojpb.2017.05.00174
