How to Cluster Protein Sequences: Tools, Tips and Commands

Georgios A Pavlopoulos

Journal Article (Open Access)

MOJ Proteomics & Bioinformatics (2017) 5(5)

DOI: 10.15406/mojpb.2017.05.00174

Abstract

Proteins are the key molecules that facilitate most biological processes within a cell. Their discovery, annotation and characterization are therefore of great importance. In Systems Biology, large-scale clustering of proteins by sequence to detect homology, orthology, families, common domains or functional similarities is becoming a great challenge, especially in the -Omics era, in which the number of sequences produced is growing exponentially. Although a plethora of applications with different strengths is available today for this purpose, becoming familiar with such tools often involves a steep learning curve, and users often give up when they get lost in the README files before running any analysis. To help the community overcome this hesitance, this article describes tools and ways to cluster proteins into groups or families and emphasizes their basic commands, which can be executed in a simple Unix terminal. Notably, both graph-based and sequence-based approaches are described.

Citation (APA)

Pavlopoulos, G. A. (2017). How to Cluster Protein Sequences: Tools, Tips and Commands. MOJ Proteomics & Bioinformatics, 5(5). https://doi.org/10.15406/mojpb.2017.05.00174
