Saturday, August 16, 2008

Putting cytosine deamination to work

The effect of cytosine deamination on a random pool of amino acids, and how it might facilitate evolution, has been described. Cytosine deamination also does not result in any stop codon formation. Bollenbach et al. (2007) briefly describe a few more optimal features of the genetic code, which are discussed in more detail by Itzkovitz and Alon (2007).
These include:
1) Quote:
They (Itzkovitz and Alon) compared the actual genetic code with an ensemble of all other codes that are equally optimized with respect to mistranslation or mutation (for more on this statistical approach, see also Alff-Steinberger 1969; Haig and Hurst 1991; Freeland and Hurst 1998). Assuming that the usage frequencies of the different amino acids are fixed, while their codon assignments vary in the ensemble, they find that the actual code is far better than other possible codes in minimizing the number of amino acids incorporated until translation is interrupted after a frameshift error occurred. This new observation by Itzkovitz and Alon could therefore be seen as reviving the basis for Crick’s theory of a comma-less code, modified by the constraints imposed on the code by the need to be robust to other kinds of translation errors and mutations. Another possible interpretation of their result is that the amino acid usage has adjusted to reduce the effects of frameshift errors; alternative genetic codes would have had a different amino acid usage coadapted to them. It has been shown previously that amino acid usage is rather malleable, and, for example, influenced by GC content (Knight et al. 2001b).
2) Quote:
Itzkovitz and Alon suggest another, quite unanticipated, type of optimality: the code is highly optimal for encoding arbitrary additional information, i.e., information other than the amino acid sequence in protein-coding sequences. Optimality for encoding additional information is particularly important and relevant given the known signals contained in the nucleotide sequence of coding regions. These include RNA splicing signals, which are encoded in the nucleotide sequence together with the amino acid sequence of the prospective protein (Cartegni et al. 2002), as well as signals recognized by the translation apparatus.
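The first property quoted above (how many amino acids are incorporated before translation is interrupted after a frameshift) can be illustrated with a minimal sketch. It assumes the standard genetic code; the function name `codons_until_stop` and the example sequence are made up for illustration and do not come from Itzkovitz and Alon (2007).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_until_stop(dna, frame):
    """Count the codons read in the given frame (0, 1 or 2) before a stop.

    Returns None if no stop codon is met before the sequence ends.
    """
    count = 0
    for i in range(frame, len(dna) - 2, 3):
        if CODON_TABLE[dna[i:i + 3]] == "*":
            return count
        count += 1
    return None


# Hypothetical toy gene: ATG GCT GAT AAG CCT TGA in the correct frame.
gene = "ATGGCTGATAAGCCTTGA"
for frame in range(3):
    print(frame, codons_until_stop(gene, frame))
```

In this toy sequence a shift into frame 2 hits a hidden stop (TGA) after a single codon, while frame 1 contains no stop at all. The statistical claim in the quote is about how often such early stops occur on average, which a single sequence cannot show.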
Bollenbach et al. (2007) also briefly mention how the code could have evolved:
1) Quote:
(1) the code has evolved under selection pressure to optimize certain functions such as minimization of the impact of mutations (Sonneborn 1965) or translation errors (Woese 1965a);
Random mutation is a source of variability, yet selection pressure is believed to have selected for a system to put constraints on variability. Why?

2) Quote:
(2) the number of amino acids in the code has increased over evolutionary time according to evolution of the pathways for amino acid biosynthesis (Wong 1975)
Why was selection so strong in removing the other variants that encode fewer amino acids? Is there evidence of organisms whose codes specify only 5, 6, 9, 13, 18, etc. amino acids?
Bollenbach et al. (2007) also point out the following:
Quote:
The discovery of variant codes (Barrell et al. 1979; Fox 1987; Knight et al. 2001a) made the connection between evolvability and universality even more puzzling. On one hand, they prove that the genetic codes can evolve; on the other hand, if they could easily evolve, why are all variations minor? It was recently proposed that extensive horizontal gene transfer during early evolution can account for both evolution toward optimality and the near universality of the genetic code (Vetsigian et al. 2006).
3) Quote:
(3) direct chemical interactions between amino acids and short nucleic acid sequences originally led to corresponding assignments in the genetic code (Woese et al. 1966b).
Bollenbach et al. (2007) conclude with the following:
Quote:
As we learn more about the functions of the genetic code, it becomes ever clearer that the degeneracy in the genetic code is not exploited in such a way as to optimize one function, but rather to optimize a combination of several different functions simultaneously. Looking deeper into the structure of the code, we wonder what other remarkable properties it may bear. While our understanding of the genetic code has increased substantially over the last decades, it seems that exciting discoveries are waiting to be made.
The vertebrate immune system exploits these optimal features of the genetic code by "putting cytosine deamination to work". Antibody diversification is crucial in limiting the frequency of environmentally acquired infections and thereby increasing the fitness of the organism. Initial diversification of antibodies is achieved by assembling variable (V), diversity (D) and joining (J) gene segments through non-homologous recombination (V(D)J recombination). Further diversification is carried out by somatic hypermutation (SHM) and class switch recombination. Central to the initiation of these diversification processes is the activation-induced cytidine deaminase (AID) protein. AID deaminates cytosine to uracil in single-stranded DNA (ssDNA, which arises during gene transcription) and therefore depends on active transcription of the various antibody genes. The induced mutation is resolved by at least four pathways (Figure 1):
1) Copying of the base by high-fidelity polymerases during DNA replication.
2) Short-Patch Base Excision Repair (SP-BER): removal of the uracil by uracil-DNA glycosylase and subsequent repair of the base.
3) Long-Patch Base Excision Repair (LP-BER)
4) Mismatch repair (MMR)

Figure 1: Activation-induced cytosine deamination and the pathways involved in resolving the induced mutation. 1) Normal DNA replication results in a C:G→T:A transition. 2) Successful SP-BER resolves the mutation; however, the recruitment of error-prone translesion polymerases (e.g. REV1) results in transversions (REV1; C:G→G:C) and transitions. 3) LP-BER can also resolve the mutation; however, recruitment of low-fidelity polymerases (e.g. Pol η) also causes transition and transversion mutations. 4) MMR can also resolve the mutation; however, the recruitment of low-fidelity polymerases through this pathway is a major cause of A:T transitions.
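Pathway 1 in the caption above can be traced base by base. The following is a minimal sketch, not a model of the real enzymes: it ignores strand polarity (strands are written in the same left-to-right orientation), assumes the polymerase reads uracil as thymine, and uses made-up helper names (`deaminate`, `replicate`) and a made-up four-base sequence.

```python
# Base-pairing rules; uracil templates an adenine, just like thymine.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "U": "A"}


def deaminate(strand, position):
    """Convert the cytosine at `position` to uracil (the AID reaction)."""
    if strand[position] != "C":
        raise ValueError("AID only deaminates cytosine")
    return strand[:position] + "U" + strand[position + 1:]


def replicate(template):
    """Return the strand a high-fidelity polymerase would pair with `template`."""
    return "".join(COMPLEMENT[base] for base in template)


top = "GACT"                      # hypothetical stretch of an antibody gene
bottom = replicate(top)           # "CTGA": the original C:G pair sits at index 2
damaged_top = deaminate(top, 2)   # "GAUT": U:G mismatch
new_bottom = replicate(damaged_top)  # "CTAA": A is inserted opposite U
new_top = replicate(new_bottom)      # "GATT": the next round fixes T opposite A

print(top, "->", new_top)
```

After two rounds of copying, the original C:G pair has become a T:A pair in the lineage derived from the deaminated strand, which is the transition described for pathway 1.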

AID causes somatic hypermutation and its activity is limited to certain genetic regions of the immune system. If the system ran unchecked, mutations might be introduced into proto-oncogenes, possibly resulting in cancerous growth. The system is therefore controlled (Figure 2): both the activity and the gene expression of AID are regulated. The type of error-repair pathway, and the subsequent recruitment of various low-fidelity polymerases, determines the type of mutations left after the repair process, and these also seem to be controlled. Current research focuses on the mechanisms that control the downstream repair pathways and on why this system is selectively targeted to the small region of the antibody genes.

Figure 2: Controlled variability of somatic hypermutation.

Thus, the immune system exploits the properties of the genetic code for the purpose of controlled variability. Is the system limited to vertebrates, or can similar systems be found in other organisms? Cytosine deaminases are found in bacteria as well. Error-prone repair systems are also present. Will we discover an active system in bacteria that exploits the properties of the genetic code for the purpose of controlled variability under selective pressure? Will RecA and LexA play a part?

References:
Peled JU, Kuang FL, Iglesias-Ussel MD, Roa S, Kalis SL, Goodman MF et al. The biochemistry of somatic hypermutation. Annu Rev Immunol. 2008;26:481-511.

Teng G, Papavasiliou FN. Immunoglobulin somatic hypermutation. Annu Rev Genet. 2007;41:107-20.

Goodman MF, Scharff MD, Romesberg FE. AID-initiated purposeful mutations in immunoglobulin genes. Adv Immunol. 2007;94:127-55.

Basu U, Chaudhuri J, Phan RT, Datta A, Alt FW. Regulation of activation induced deaminase via phosphorylation. Adv Exp Med Biol. 2007;596:129-37.
