Codon Optimized for Failure

STATUS: Active (last updated Endy 17:07, 17 Oct 2005 (EDT))

What if we had a genetic code that supported translation of all 20 natural amino acids, but in which any point mutation resulted in a nonsense codon? In other words, what if every point mutation resulted in a translation stop? Such a code might form the basis of a "fail-fast" genome.

For this to work, every point mutation of every coding codon must map to a stop codon. Presume that we only need twenty coding codons. For a three-base codon, there are 9 possible single point mutations (3 positions x 3 alternative bases):

 AAA <- original codon
 AAC <- the nine single point mutations...
 AAG
 AAT
 ACA
 AGA
 ATA
 CAA
 GAA
 TAA

Thus, starting out simplemindedly, we need 20x9 = 180 "stop/nonsense" codons (with more brain power, it should be possible to optimize the mapping into fewer stop/nonsense codons). Since a three-base codon table over four bases only encodes 4^3 = 64 codons, we should consider bigger codons. Peter Schultz's group has shown that 4-, 5-, and 6-base codons work.

So, what about using 4-base codons? A four-base codon table would encode 4^4 = 256 codons. For a four-base codon, there are 12 possible single point mutations (4 positions x 3 alternative bases):

 AAAA <- original codon
 AAAC <- the twelve single point mutations...
 AAAG
 AAAT
 AACA
 AAGA
 AATA
 ACAA
 AGAA
 ATAA
 CAAA
 GAAA
 TAAA

Thus, continuing simplemindedly, we need 20x12 = 240 "stop/nonsense" codons. Since a four-base codon table encodes 256 codons, and 240 < 256, we should have enough "room" in mutation space to bracket each coding codon with 12 failure codons! (Strictly, the 20 coding codons need slots too, which brings the simpleminded total to 260, so a few failure codons would have to be shared between neighboring coding codons.)
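The counting above can be checked with a small script. This is only a sketch: the function names `point_mutants` and `naive_budget` are made up for illustration, and the "naive" budget assumes every coding codon gets its own private set of failure codons, with no sharing between neighbors.

```python
BASES = "ACGT"

def point_mutants(codon):
    """Return every codon that differs from `codon` at exactly one position."""
    return [
        codon[:i] + b + codon[i + 1:]
        for i in range(len(codon))
        for b in BASES
        if b != codon[i]
    ]

def naive_budget(codon_length, n_coding=20):
    """Return (failure codons per coding codon, failure codons needed,
    total slots needed including the coding codons, size of the codon table).

    Assumes no failure codon is shared between two coding codons.
    """
    per_codon = 3 * codon_length          # 3 alternative bases per position
    failures = n_coding * per_codon
    total = failures + n_coding
    table_size = len(BASES) ** codon_length
    return per_codon, failures, total, table_size

if __name__ == "__main__":
    for length in (3, 4):
        print(length, naive_budget(length))
    print(point_mutants("AAA"))
```

For three-base codons the budget is far larger than the 64-codon table. For four-base codons the 240 failure codons fit under 256 on their own, but not together with the 20 coding codons unless some failure codons are shared.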


 * I'm not sure I exactly understand what you're trying to say/calculate here. It's pretty easy to show that a 3-base codon isn't sufficient for specifying 20 codons under the constraint that any individual mutation results in a stop codon. For a 4-base codon, it's actually really easy to design a code encoding 64 codons, much more efficiently than what you have here. Just make one of the bases a parity base. For example, let A=0, T=1, G=2, C=3, and suppose the last base is the parity base, which must be the sum of the first 3 bases mod 4. For example, AAAA is legal, while AAAT, AAAG, and AAAC are illegal (stop). The other 9 point mutations, which change one of the first three bases, also fail the parity check and are illegal. Thus, the first 3 bases can be anything, the fourth base depends on the first 3, and all single-base mutations are illegal. Therefore 64 codons can be coded. In fact, you can just use the existing 3-base genetic code and add the extra parity base. --Austin 18:01, 17 Oct 2005 (EDT)
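The parity-base idea in the comment above can be verified by brute force. The following is a minimal sketch, using the mapping A=0, T=1, G=2, C=3 from the comment; the helper names (`parity_base`, `encode`, `is_legal`, `point_mutants`) are hypothetical and chosen only for this example.

```python
from itertools import product

VALUE = {"A": 0, "T": 1, "G": 2, "C": 3}
BASE = {v: b for b, v in VALUE.items()}

def parity_base(triplet):
    """Parity base: sum of the three base values, mod 4."""
    return BASE[sum(VALUE[b] for b in triplet) % 4]

def encode(triplet):
    """Extend a 3-base codon to a 4-base codon by appending its parity base."""
    return triplet + parity_base(triplet)

def is_legal(codon):
    """A 4-base codon is legal only if its last base matches the parity."""
    return len(codon) == 4 and codon[3] == parity_base(codon[:3])

def point_mutants(codon):
    """Every codon that differs from `codon` at exactly one position."""
    return [
        codon[:i] + b + codon[i + 1:]
        for i in range(len(codon))
        for b in VALUE
        if b != codon[i]
    ]

# All 64 legal 4-base codons, one per ordinary 3-base codon.
LEGAL = [encode("".join(t)) for t in product(VALUE, repeat=3)]

# True if no single point mutation of a legal codon is itself legal.
ALL_MUTANTS_ILLEGAL = all(
    not is_legal(m) for codon in LEGAL for m in point_mutants(codon)
)

if __name__ == "__main__":
    print(len(LEGAL), ALL_MUTANTS_ILLEGAL)
```

The check works because changing any one of the first three bases shifts the sum by a nonzero amount mod 4, so the stored parity base no longer matches, and changing the parity base itself breaks the match directly.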