A multi-metric algorithm for hierarchical clustering of same-length protein sequences

Abstract

The identification of meaningful groups of proteins has long been a major area of interest in structural and functional genomics. Successful protein clustering can yield significant insight, assisting both in tracing the evolutionary history of the respective molecules and in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences that allows the construction of a subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method uses sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied to a real-world dataset of clonotypic immunoglobulin (IG) sequences from chronic lymphocytic leukemia (CLL) patients, showing promising results.
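
As a rough illustration of the two measures named above, the following minimal sketch computes per-position identity and amino-acid similarity for a set of same-length sequences and performs a single top-down split. It is not the authors' algorithm: the similarity groups in `SIMILARITY_GROUPS`, the function names, and the rule of splitting at the least conserved position are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import Counter

# Hypothetical physicochemical grouping of amino acids; the abstract does
# not state which similarity groups the method relies on.
SIMILARITY_GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}
GROUP_OF = {aa: name for name, members in SIMILARITY_GROUPS.items() for aa in members}


def check_same_length(sequences):
    """Raise ValueError unless all sequences share one length."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same length")


def position_identity(sequences, pos):
    """Fraction of sequences carrying the most common residue at `pos`."""
    column = [s[pos] for s in sequences]
    return Counter(column).most_common(1)[0][1] / len(column)


def position_similarity(sequences, pos):
    """Fraction of sequences whose residue at `pos` falls in the most common group."""
    groups = [GROUP_OF.get(s[pos], s[pos]) for s in sequences]
    return Counter(groups).most_common(1)[0][1] / len(groups)


def mean_scores(sequences):
    """Return (identity, similarity) averaged over all positions."""
    check_same_length(sequences)
    length = len(sequences[0])
    identity = sum(position_identity(sequences, p) for p in range(length)) / length
    similarity = sum(position_similarity(sequences, p) for p in range(length)) / length
    return identity, similarity


def split_once(sequences):
    """Split at the least conserved position (an assumed rule, for illustration).

    Returns (position, sequences with the dominant residue, remaining sequences).
    """
    check_same_length(sequences)
    length = len(sequences[0])
    pos = min(range(length), key=lambda p: position_identity(sequences, p))
    dominant = Counter(s[pos] for s in sequences).most_common(1)[0][0]
    with_dominant = [s for s in sequences if s[pos] == dominant]
    rest = [s for s in sequences if s[pos] != dominant]
    return pos, with_dominant, rest


if __name__ == "__main__":
    toy = ["ACDE", "ACDE", "AVDE", "AVKE", "ACDE"]  # made-up sequences
    print(mean_scores(toy))
    print(split_once(toy))
```

Applying `split_once` recursively to each resulting subset would produce a subset hierarchy, and comparing `mean_scores` before and after a split gives one way to judge whether the split made the subsets more homogeneous under both measures.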

Cite

Tsarouchis, S., Kotouza, M. T., Psomopoulos, F. E., & Mitkas, P. A. (2018). A multi-metric algorithm for hierarchical clustering of same-length protein sequences. In IFIP Advances in Information and Communication Technology (Vol. 520, pp. 189–199). Springer New York LLC. https://doi.org/10.1007/978-3-319-92016-0_18
