Developmental and Comparative Immunology 34 (2010) 785–790

Review

The C1q domain containing proteins: Where do they come from and what do they do?

Tristan M. Carland, Lena Gerwick *

Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, 9500 Gilman Drive MC 0212, La Jolla, CA 92037, United States

Article info

Article history: Received 3 February 2010; Received in revised form 26 February 2010; Accepted 27 February 2010; Available online 16 March 2010

Keywords: Complement; gC1q; C1q; cbln; Bacteria; Teleosts; Mammals

Abstract

The gene sequence encoding an N-terminal collagen stalk followed by a globular complement 1q domain (gC1q), an architecture that characterizes the C1q A, B and C chains of the first complement component (C1), did not become prevalent until the cephalochordates and urochordates. However, genes encoding only the globular complement 1q domain (ghC1q) are more ancient, as they exist within many lower vertebrate and invertebrate genomes and are even present in the prokaryotes. These genes can be divided into two groups. The first, which appears to be the more ancient form, encodes proteins that are not secreted (cghC1q). The second group encodes proteins in which the globular domain is preceded by a signal peptide indicating secretion (sghC1q). In this review we examine bioinformatic evidence for C1q domain containing (C1qDC) genes in many organisms and integrate these observations with research performed and published on the biochemistry and functions of this fascinating set of proteins.

© 2010 Elsevier Ltd. All rights reserved.

Contents

1. Introduction
2. C1q structure and function
2.1. C1q-like proteins
3. ghC1q proteins (precerebellin, precerebellin-like and CAPRINs)
3.1. Immune response and sghC1q proteins
4. Conclusion
Acknowledgements
References
* Corresponding author. Tel.: +1 858 534 0566; fax: +1 858 534 0529.
E-mail addresses: tcarland@ucsd.edu (T.M. Carland), lgerwick@ucsd.edu (L. Gerwick).
doi:10.1016/j.dci.2010.02.014

1. Introduction

Genes encoding complement component 3 (C3) have been extensively investigated within invertebrate genomes and traced in evolutionary history to the cnidarian radiation [1–3]. By contrast, the genes encoding proteins that contain a C1q domain (C1qDC) (Table 1 and Figs. 1 and 2) have so far been only partially investigated from an evolutionary perspective [4]. These genes exist within many of the sequenced mammalian, lower vertebrate and invertebrate genomes, and functions have been described for some of the C1qDC proteins. However, many have not been characterized at all. For example, 32 open reading frames encoding C1qDC proteins have been found within the human genome [5], while at least 52 exist within the zebrafish genome [6]. In this review we will broadly cover all known C1qDC proteins found within the metazoa and suggest a comprehensive set of abbreviations with which to refer to them (Table 1 and Fig. 1). The focus of this review will be primarily on the C1qDC proteins in which the globular domain is preceded only by a short N-terminal amino acid sequence (as exemplified by the precerebellin-like proteins) (Fig. 2) [7]. It will also contain a brief discussion of the C1q-like proteins that contain an N-terminal collagen portion (Fig. 2).

The C1qDC proteins are a large group of proteins with many members that have been organized into groups on several occasions. In 2005, Tom Tang et al. [5] divided the human C1qDC proteins into three sub-families based on their sequence homology. In 2007 Ghai et al. [8] broadly agreed with this scheme, using an alignment of the human C1q proteins that divided them into two families with subgroups: the larger family contained the C1q-like and cerebellin-like subgroups, while the smaller family was composed of EMILINs and multimerins. This was largely reiterated phylogenetically by Mei in 2008 [6] using the zebrafish C1qDC proteins. In 2008, using the mouse genome, Yuzaki [9] further divided what had been established as sub-family B into what were referred to as the Cbln (precerebellin) and C1ql (C1q-like) groups.

Table 1
Definitions of the abbreviations used in the text.

Abbreviation: Definition
C1qDC: C1q domain (gC1q) containing proteins. Refers to all proteins that contain a C1q domain; includes proteins with and without collagen.
gC1q: Globular C1q domain. Structural term that refers to the amino acid sequence that folds into the "jelly-roll" topology.
C1q: Complement component 1, sub-component q. C1q forms a hexamer of heterotrimers (total of 18 peptides) in which each heterotrimer is a globular head of C1q A, B and C chains.
C1q-like: A peptide that contains a collagen portion and a gC1q.
ghC1q: Globular head C1q protein. Protein that contains only a gC1q and a short N-terminal sequence that does not form a special motif. Exemplified by the precerebellins and CAPRINs.
cghC1q: A (cellular) globular head C1q protein that does not contain a signal peptide. Exemplified by CAPRINs.
sghC1q: A (secreted) globular head C1q protein that contains a signal peptide. Exemplified by precerebellin and precerebellin-like proteins.

Two C1qDC proteins, the human C1q globular domain (PDB:1PK6) [10] and adiponectin (ACRP 30) (PDB:1C3H) [11], have been crystallized and their X-ray structures determined to resolutions of 1.9 and 2.1 angstroms, respectively.
From these crystal structures the 3D conformations have been deduced, revealing that the C1q domain is characterized by its ability to fold into a jelly-roll topology of five pairs of anti-parallel β-strands creating two β-sheets, generally referred to as the globular domain (gC1q) [10–12].

2. C1q structure and function

Of all the C1qDC proteins, the mammalian first complement component (C1) has been the most thoroughly studied, both structurally and functionally. Sub-component q of C1 (C1q) forms a hexamer of heterotrimers (18 peptides in total) in which each heterotrimer contributes a globular head made up of the C1q A, B and C chains (Fig. 1). C1q associates with C1s and C1r to form the C1 complex. This complex initiates the classical complement pathway, in which it binds to IgM, IgG or C-reactive protein (CRP) on the cell surface, thus activating C4. This initiates the formation of the membrane attack complex and the subsequent breaching of the cell membrane [8,13]. C1q has also been studied for its ability to interact with a diverse set of molecules, including ligands on the surfaces of pathogens. These interactions have been mapped to different binding sites on the C1q globular head [14,15]. The nature of this binding appears to be a charged pattern recognition between the C1q peptides and the ligand; however, no specific amino acid motif that promotes this interaction has been identified [8].

Fig. 1. Flowchart outlining the relationship of C1qDC to C1q-like proteins, C1q, ghC1q, sghC1q and cghC1q proteins. C1qDC = C1q domain containing; C1q-like proteins = peptide that has a collagen domain preceding a gC1q domain; C1q = first complement component consisting of C1q A, B and C chains; ghC1q = globular head C1q; cghC1q = globular head C1q domain protein containing no signal peptide, probably intracellular function; sghC1q = globular head C1q domain protein that contains a signal peptide, probably extracellular function.

2.1. C1q-like proteins

A C1q-like gene has been detected in the medicinal leech Hirudo medicinalis [16]. It contains a 5′ nucleotide sequence that encodes the amino acid repeat Gly-Pro-X, in which X can be any of the other amino acids and which forms the collagen helix, and a 3′ end that encodes the amino acids needed to form the globular C1q domain (Fig. 2). There are at least three receptors that can interact with C1qDC proteins: CR1, gC1qR, and α2β1 integrin [17]. One of them, gC1qR, interacts with the globular C1q domain [18]. This ligand-receptor interaction was exploited when it was found that the leech C1q-like peptide elicited chemotactic behavior that could be blocked by a human antibody against the gC1q receptor [16]. Experiments using both human and murine mast cells have also shown that gC1qR is involved in chemotaxis [19]. The results from the study of the leech C1q-like protein indicate that gC1qR must be highly conserved, since a human gC1qR antibody appears to be able to block the leech receptor [16].

C1q-like gene copies have also been found in the urochordate Ciona intestinalis (sea squirt) and the cephalochordate Branchiostoma floridae (Florida lancelet), and more C1q-like genes can be expected to be found as more sequence information becomes available (Table 2). However, the isolated case of a C1q-like peptide in the leech is especially interesting, since none of the sequenced platyhelminth, nematode, molluscan or echinoderm genomes contain C1q-like genes. Several open reading frames in the echinoderm genome contain 5′ codons coding for glycine and proline residues, but not in the systematic, repeated Gly-Pro-X fashion of collagen. This gene motif, as mentioned above, does appear in transcriptomes of amphioxus, lamprey, and several teleost fishes.
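The distinction drawn here between a systematic Gly-Pro-X repeat and scattered glycine and proline residues can be illustrated with a short sketch. This is not the method used in the studies cited. The function names, the toy sequences and the threshold of five consecutive repeats are assumptions made purely for illustration, and the scan works on a translated protein sequence (one-letter codes) rather than on codons.

```python
import re

# Hypothetical helper: length, in triplets, of the longest uninterrupted
# run of Gly-Pro-X ("GPx" in one-letter code) in a protein sequence.
def longest_gly_pro_x_run(protein: str) -> int:
    # The lookahead tries every start position, so overlapping candidate
    # runs are all considered and the true maximum is found.
    runs = re.findall(r"(?=((?:GP[A-Z])+))", protein.upper())
    return max((len(run) // 3 for run in runs), default=0)

# Hypothetical rule of thumb: call the stretch "collagen-like" only if the
# repeat is systematic. The default of 5 repeats is an arbitrary assumption.
def has_collagen_like_stalk(protein: str, min_repeats: int = 5) -> bool:
    return longest_gly_pro_x_run(protein) >= min_repeats

# Toy sequences, invented for this example (not real proteins).
systematic = "MKT" + "GPAGPSGPKGPQGPE" + "LLVAF"  # five tandem Gly-Pro-X
scattered = "MKTGPALGSPKPGQLLVAF"                  # Gly and Pro present, no tandem repeat

print(longest_gly_pro_x_run(systematic), has_collagen_like_stalk(systematic))
print(longest_gly_pro_x_run(scattered), has_collagen_like_stalk(scattered))
```

A sequence such as the second one is rich in glycine and proline yet never strings the triplet together, which is the situation described for the echinoderm open reading frames.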
Few of these putative C1q-like proteins have been characterized; however, some of them appear to bind to a variety of carbohydrates and hence may function as lectins [20]. As mentioned, a C1q-like protein with lectin properties was isolated from lamprey (an agnathan), as indicated by its isolation using an N-acetyl-D-glucosamine-Sepharose™ affinity column. Furthermore, the lamprey C1q-like protein has a mass of 480 kDa under native and non-reducing conditions, indicating that it exists as an 18-peptide multimeric protein, matching the structure of mammalian C1q [20]. In addition to being able to bind to N-acetyl-D-glucosamine, the lamprey C1q-like protein, when co-purified with MASP-A, was able to cleave the C3 molecule also isolated from lamprey serum.

In conclusion, C1q-like genes did not become common in genomes or transcriptomes until the evolution of the urochordates and cephalochordates. The exception, at this time, appears to be the medicinal leech; however, it is not known if this