A simple method to predict protein-binding from aligned sequences - Application to MHC superfamily and β2-microglobulin

Elodie Duprat; Marie Paule Lefranc; Olivier Gascuel

Journal Article, Open Access

Bioinformatics (2006) 22(4) 453-459

DOI: 10.1093/bioinformatics/bti826

Abstract

Motivation: The MHC superfamily (MhcSF) consists of immune system MHC class I (MHC-I) proteins, along with proteins with an MHC-I-like structure that are involved in a large variety of biological processes. Non-covalent binding of β2-microglobulin (B2M) to MHC-I proteins is required for their surface expression and function, whereas MHC-I-like proteins may or may not interact with B2M. This study was designed to predict B2M binding (or non-binding) of newly identified MhcSF proteins, in order to decipher their function, understand the molecular recognition mechanisms and identify deleterious mutations. IMGT standardization of MhcSF protein domains provides a unique numbering of the multiple alignment positions, and thus the conditions needed to develop such a predictive tool.

Method: We combine a simple-Bayes classifier with IMGT unique numbering. Our method involves two steps: (1) selection of discriminant binary features, each of which associates an alignment position with an amino acid group; and (2) training of the classifier by estimating the frequencies of the selected features, conditional on the B2M binding property.

Results: Our dataset contains aligned sequences of 806 allelic forms of 47 MhcSF proteins, corresponding to 9 receptor types and 4 mammalian species. Eighteen discriminant features are selected; they either belong to B2M contact sites or stabilize the molecular structure required for this contact. Three leave-one-out procedures are used to assess classifier performance, corresponding to B2M binding prediction for: (1) new proteins, (2) species not represented in the dataset and (3) new receptor types. The prediction accuracy is high, i.e. 98, 94 and 70%, respectively. Application of our classifier to lower vertebrate MHC-I proteins indicates that these proteins bind to B2M and should therefore be expressed on the cell surface by a process similar to that of mammalian MHC-I proteins.

These results demonstrate the usefulness and accuracy of our (simple) approach, which should apply to other function or interaction prediction problems.

© The Author 2005. Published by Oxford University Press. All rights reserved.
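The two-step idea in the abstract (binary features that pair an alignment position with an amino acid group, then a simple-Bayes classifier built from class-conditional feature frequencies) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sequences, the group labels, the three features in `FEATURES`, the 0-based positions (rather than IMGT unique numbering), and the Laplace smoothing parameter `alpha` are all assumptions made for illustration, and the feature selection step is skipped by fixing the features by hand. The evaluation function holds out every sequence of one group at a time, in the spirit of the leave-one-out procedures over proteins, species, or receptor types.

```python
import math

# Hypothetical toy data, NOT from the paper:
# (aligned sequence, binds B2M?, group such as a protein name).
DATA = [
    ("GSHSMRY", True, "P1"),
    ("GSHSLRY", True, "P1"),
    ("GPHSMRY", True, "P2"),
    ("APQALKF", False, "P3"),
    ("APHALKF", False, "P3"),
    ("GPQALKY", False, "P4"),
]

# Hypothetical hand-picked features: (0-based alignment position, amino acid group).
FEATURES = [(1, "ST"), (3, "S"), (5, "RH")]


def encode(seq, features=FEATURES):
    """Binary vector: 1 if the residue at the position belongs to the group."""
    return [1 if seq[pos] in group else 0 for pos, group in features]


def train(data, features=FEATURES, alpha=1.0):
    """Estimate class priors and per-class feature frequencies.

    Laplace smoothing (alpha) is an assumption of this sketch; it keeps
    every estimated probability strictly between 0 and 1.
    """
    model = {}
    for label in (True, False):
        rows = [encode(seq, features) for seq, y, _ in data if y == label]
        n = len(rows)
        prior = (n + alpha) / (len(data) + 2 * alpha)
        freqs = [
            (sum(row[j] for row in rows) + alpha) / (n + 2 * alpha)
            for j in range(len(features))
        ]
        model[label] = (prior, freqs)
    return model


def log_posterior(model, seq, label, features=FEATURES):
    """Unnormalized log posterior under the feature-independence assumption."""
    prior, freqs = model[label]
    x = encode(seq, features)
    return math.log(prior) + sum(
        math.log(p if xi else 1.0 - p) for xi, p in zip(x, freqs)
    )


def predict(model, seq, features=FEATURES):
    """Return True if 'binds B2M' is the more probable class."""
    return log_posterior(model, seq, True, features) > log_posterior(
        model, seq, False, features
    )


def leave_one_group_out(data, features=FEATURES):
    """Hold out all sequences of one group at a time; return overall accuracy."""
    correct = 0
    for group in sorted({g for _, _, g in data}):
        train_set = [row for row in data if row[2] != group]
        model = train(train_set, features)
        for seq, y, g in data:
            if g == group and predict(model, seq, features) == y:
                correct += 1
    return correct / len(data)
```

Calling `predict(train(DATA), seq)` classifies a new aligned sequence, and `leave_one_group_out(DATA)` estimates how well the classifier generalizes to groups it has never seen. Holding out whole groups rather than single sequences matters when many allelic forms of the same protein are nearly identical, since leaving out one allele at a time would overstate accuracy.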

Cite

Duprat, E., Lefranc, M. P., & Gascuel, O. (2006). A simple method to predict protein-binding from aligned sequences - Application to MHC superfamily and β2-microglobulin. Bioinformatics, 22(4), 453–459. https://doi.org/10.1093/bioinformatics/bti826
