Motivation: Describing domain architecture is a critical step in the functional characterization of proteins. However, some orphan domains do not match any profile stored in dedicated domain databases and are thereby difficult to analyze.Results: We present here an original novel approach, called TREMOLO-HCA, for the analysis of orphan domain sequences and inspired from our experience in the use of Hydrophobic Cluster Analysis (HCA). Hidden relationships between protein sequences can be more easily identified from the PSI-BLAST results, using information on domain architecture, HCA plots and the conservation degree of amino acids that may participate in the protein core. This can lead to reveal remote relationships with known families of domains, as illustrated here with the identification of a hidden Tudor tandem in the human BAHCC1 protein and a hidden ET domain in the Saccharomyces cerevisiae Taf14p and human AF9 proteins. The results obtained in such a way are consistent with those provided by HHPRED, based on pairwise comparisons of HHMs. Our approach can, however, be applied even in absence of domain profiles or known 3D structures for the identification of novel families of domains. It can also be used in a reverse way for refining domain profiles, by starting from known protein domain families and identifying highly divergent members, hitherto considered as orphan. © 2013 The Author 2013. Published by Oxford University Press. All rights reserved.
CITATION STYLE
Faure, G., & Callebaut, I. (2013). Identification of hidden relationships from the coupling of Hydrophobic Cluster Analysis and Domain Architecture information. Bioinformatics, 29(14), 1726–1733. https://doi.org/10.1093/bioinformatics/btt271
Mendeley helps you to discover research relevant for your work.