- 1 Introduction
- 2 Genome Editing in Higher Eukaryotes
- 3 Protein-Directed Nuclease Technologies
- 4 The CRISPR/Cas9 System
- 5 Future Directions
- 6 iGEM Connections
- 7 References
Introduction
The ability to alter the genomes of living organisms has been critical to our understanding of genetics and to the development of synthetic biology as a viable field. In its simplest form, genome editing involves generating gene knockouts, where expression is eliminated through insertion or removal of a region of genomic DNA, or knockins, where a new coding region is inserted to produce a novel gene product. This process is fairly straightforward in bacteria and yeast, where a cell's own homologous recombination machinery can be used to make genomic insertions, albeit with low efficiency. To carry this out, the cells are transformed with exogenous DNA (usually on a plasmid) containing the desired insertion sequence flanked by homology regions that match the sequences on either side of the target site. Because very few cells undergo successful recombination, the inserted sequence must contain a selectable marker, such as an antibiotic resistance gene, to facilitate selection of modified cells. Most knockouts are therefore generated simply by inserting a marker in place of an existing gene, eliminating its expression.
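The cassette layout described above can be sketched in a few lines of Python. This is an illustrative toy rather than a cloning protocol: the function names, the 4 bp arm length, and every sequence are invented for the example, and real homology arms are far longer.

```python
def build_knockout_cassette(genome, gene_start, gene_end, marker, arm_length):
    """Return upstream arm + marker + downstream arm for replacing
    genome[gene_start:gene_end] (0-based, end-exclusive coordinates)."""
    if gene_start - arm_length < 0 or gene_end + arm_length > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream_arm = genome[gene_start - arm_length:gene_start]
    downstream_arm = genome[gene_end:gene_end + arm_length]
    return upstream_arm + marker + downstream_arm


def recombine(genome, gene_start, gene_end, marker):
    """Idealised outcome of a successful recombination event:
    the targeted region is swapped for the selectable marker."""
    return genome[:gene_start] + marker + genome[gene_end:]


if __name__ == "__main__":
    # Hypothetical 24 bp "genome" with the target gene at positions 8..16.
    genome = "AAAACCCC" + "GGGGGGGG" + "TTTTAAAA"
    marker = "ATGCAT"  # stand-in for an antibiotic resistance gene
    print(build_knockout_cassette(genome, 8, 16, marker, arm_length=4))
    print(recombine(genome, 8, 16, marker))
```

The sketch ignores the low efficiency mentioned above; in practice only the rare cells carrying the recombined genome survive selection.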
Genome Editing in Higher Eukaryotes
Genomic insertions can also be generated in higher eukaryotes using homologous recombination, but the process is significantly more involved. Knockout mice have been a staple of genetics research since the 1980s, but they can take upwards of a year to generate. The process begins with embryonic stem cells (ESCs) harvested from a mouse blastocyst. They are then transfected with insert DNA by electroporation, and successfully recombined cells are selected using an antibiotic such as neomycin. The surviving ESCs are then injected into another blastocyst and implanted into a surrogate mouse's uterus. Some of the resulting pups will be chimeric animals with a portion of their cells containing the modification. Subsequent breeding of the chimeras allows for generation of a knockout animal.
Recombination alone is generally not a viable strategy for genome editing outside of ESCs, including in tissue culture cells and live animals, because of a high ratio of off-target insertions. The process can be greatly enhanced, however, if the insertion site is cleaved to generate a double-stranded break. Several technologies exist to generate these breaks on a sequence-specific basis, including zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the recently developed CRISPR/Cas9 system.
Protein-Directed Nuclease Technologies
Existing nuclease-based technologies require recognition of the genomic target site by a sequence-specific DNA-binding domain. Both ZFNs and TALENs use DNA-binding protein modules tethered to an endonuclease domain to generate breaks at the correct positions, though off-target cleavage does occur because of the limited length of the recognition sequence and the flexible DNA-binding specificity of different modules. Double-stranded breaks generated by the nucleases can be repaired through homology-directed repair (HDR), which allows insertion of a new sequence with flanking homology arms, or by nonhomologous end joining (NHEJ), an imperfect process that often results in gene knockouts without any additional insertion.
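To see why recognition length matters for off-target cleavage, a rough back-of-the-envelope estimate can help. The sketch below assumes a genome of uniformly random, independent bases, which real genomes are not, so the numbers only indicate orders of magnitude. The function name and the 3.2 billion bp genome size are assumptions made for the example; the site lengths are the ones quoted in this article.

```python
def expected_random_matches(site_length, genome_size, both_strands=True):
    """Expected number of exact matches to one specific site of the given
    length in a random genome (each base A/C/G/T with probability 1/4)."""
    positions = max(genome_size - site_length + 1, 0)
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** site_length


if __name__ == "__main__":
    HUMAN_GENOME_BP = 3_200_000_000  # assumed approximate haploid size
    for label, length in [("single ZFN unit", 12), ("ZFN pair", 24),
                          ("single TAL domain", 19), ("TALEN pair", 38),
                          ("Cas9 guide", 20)]:
        hits = expected_random_matches(length, HUMAN_GENOME_BP)
        print(f"{label:>18} ({length} bp): {hits:.3g} expected chance matches")
```

Under these assumptions a 12 bp site is expected to occur by chance many times in a genome of this size, whereas a 24 bp paired site is not, which is one reason dimeric nucleases improve specificity. The estimate says nothing about partial matches, which are the practical source of off-target cleavage.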
Zinc Finger Nucleases (ZFNs)
ZFNs are chimeric proteins consisting of a modular zinc finger-based DNA-binding domain fused to a nonspecific endonuclease domain from the FokI restriction enzyme. Each zinc finger DNA-binding domain is selected to bind a 12 bp sequence, but because the FokI endonuclease must form a dimer to generate double-stranded breaks, two ZFN units with different binding domains can be combined to recognize up to 24 bp of sequence. ZFNs have been used successfully in a variety of organisms, including recent clinical trials in which HIV-resistant T cells were generated by knocking out the CCR5 co-receptor (ClinicalTrials.gov identifier NCT01044654).
The greatest challenge with using ZFNs comes from the fact that there is no obvious correlation between the amino acid sequences of zinc finger DNA-binding domains and their target sites, so binding domains for specific sequences cannot be designed rationally. They are instead generated by a directed evolution or selection process, and it is unclear what range of target sequences is accessible.
Transcription Activator-Like Effector Nucleases (TALENs)
TALENs are a more recent development that gets around the problem of having to select for specific zinc finger binding domains. TALENs use a DNA-binding domain derived from transcription activator-like (TAL) effectors from the Xanthomonas genus of bacterial plant pathogens. TAL binding domains contain a variable number of imperfect repeats, each generally 34 amino acids long. Amino acids 12 and 13 of each repeat form the repeat-variable diresidue (RVD), which specifies recognition of a single nucleotide at the target site. Although work continues on deciphering the code that relates RVDs to nucleotides, our present knowledge of the system allows for rational design of TAL binding domains to target DNA sequences of around 19 bp. The TAL domains are fused to the FokI endonuclease domain as with ZFNs, and the resulting dimer can be engineered to target a 38 bp region.
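Because each RVD reads a single nucleotide, designing a TAL binding domain amounts to a lookup. The sketch below uses the four most commonly cited RVD assignments (NI for A, HD for C, NN for G, NG for T), which come from the TAL code literature listed in the references rather than from the text above. NN is also reported to bind A, so the mapping is a simplification, and the function name is invented for the example.

```python
# Commonly cited RVD-to-nucleotide assignments (a simplification:
# NN is reported to recognise A as well as G).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvds(target):
    """Return one RVD per nucleotide of the target, in 5'-to-3' order."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported characters in target: {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]


if __name__ == "__main__":
    site = "TCCAGGATCACTGGCATTA"  # hypothetical 19 bp target
    rvds = design_rvds(site)
    print(len(rvds), "repeats:", " ".join(rvds))
```

A real design would also account for context effects between neighbouring repeats and for the flanking thymidine preference, neither of which is modelled here.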
The CRISPR/Cas9 System
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) are a genomic feature present in a wide range of bacteria and archaea. Originally discovered in E. coli, CRISPR regions facilitate a form of acquired immunity against foreign double-stranded DNA (dsDNA) sequences in the cell, such as those introduced by a virus or plasmid. Although the mechanism by which the CRISPR system functions is not entirely understood, recent work has shown that the CRISPR-associated Cas9 endonuclease from Streptococcus pyogenes can be repurposed to generate site-specific double-stranded breaks through the use of a synthetic guide RNA.
CRISPR loci contain a series of repeating sequences separated by short spacer regions of about 30 bp that store sequences from previous invaders of the cell. These spacers are transcribed and used in a mechanism similar to eukaryotic RNA interference (RNAi) to degrade exogenous DNA matching the stored sequence. The process is facilitated by a series of CRISPR-associated (cas) genes that vary in function and orientation between species, but all systems fall into one of three main types. In general, transcripts of the CRISPR locus are processed into short CRISPR RNAs (crRNAs), which direct cleavage of complementary foreign DNA by either a single nuclease (type II CRISPR systems) or a multisubunit complex known as CASCADE (types I and III).
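The repeat-spacer-repeat layout of a CRISPR locus can be illustrated with a short Python sketch. It assumes the repeats are exact copies of a known sequence, whereas real arrays can contain degenerate repeats, and the function names and all sequences are invented for the example.

```python
def extract_spacers(array, repeat):
    """Split a CRISPR array on its repeat sequence and return the spacers.

    Whatever precedes the first repeat (the leader) and follows the last
    repeat is discarded; only sequence between two repeats is a spacer."""
    parts = array.split(repeat)
    return [part for part in parts[1:-1] if part]


def spacers_matching(invader, spacers):
    """Return the stored spacers that occur verbatim in an invading sequence."""
    return [spacer for spacer in spacers if spacer in invader]


if __name__ == "__main__":
    repeat = "GTTTTAGAGC"                      # hypothetical repeat
    stored = ["AAAACCCCGG", "TTTTGGGGCC"]       # hypothetical spacers
    array = "ATAT" + repeat + stored[0] + repeat + stored[1] + repeat + "CG"
    spacers = extract_spacers(array, repeat)
    print(spacers)
    print(spacers_matching("GG" + stored[1] + "AA", spacers))
```

Exact string matching stands in for crRNA base-pairing here; the sketch does not model transcription, crRNA processing, or matches on the opposite strand.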
CRISPR/Cas9 Genome Editing
Streptococcus pyogenes uses a type II CRISPR system in which a single endonuclease, Cas9, is necessary for RNA-directed cleavage of foreign dsDNA. Cas9 forms a complex with two small RNA molecules: a crRNA and a trans-activating crRNA (tracrRNA) that is complementary to the pre-crRNA. The tracrRNA is necessary not only for processing of the pre-crRNA into mature crRNA by RNase III, but also for Cas9 DNA cleavage. When a Cas9/tracrRNA/crRNA complex is formed, 20 nucleotides of the crRNA are oriented to base-pair directly with one strand of the target DNA. It was recently demonstrated that a single synthetic RNA, dubbed sgRNA, containing elements of both the crRNA and tracrRNA can be used to stimulate Cas9 cleavage at a site determined by the sequence of the DNA base-pairing region from the crRNA portion. Thus, an easily targeted complex containing just a single RNA and one molecule of Cas9 is sufficient to generate a double-strand break at a specific 20 bp site.
Several recent publications have shown CRISPR/Cas9 targeting rates similar to those obtained with ZFNs and TALENs in a variety of eukaryotic cells, including zebrafish embryos, human embryonic kidney (HEK) 293T cells, and K562 immortalised human myelogenous leukemia cells. The system has also been used to modify S. pneumoniae and E. coli cells very efficiently, with nearly 100% and 65% of recovered cells, respectively, containing the desired mutations.
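Choosing a 20 bp target for an sgRNA can be sketched as a simple sequence scan. The example below lists every 20-nucleotide stretch that is immediately followed by a 3'-NGG sequence, the requirement this article notes for current Cas9 targeting, on both strands of an input sequence. The function names and the tuple layout are assumptions made for the example, and the scan does no off-target scoring.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_cas9_sites(seq, guide_length=20):
    """List candidate sites as (strand, start, protospacer, pam) tuples.

    A lookahead is used so that overlapping candidates are all reported.
    For the '-' strand, start is an offset into the reverse complement
    of the input, not a coordinate on the input itself."""
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_length)
    sites = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(template):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    example = "GATTACAGATTACAGATTACATGGCCTTAA"  # hypothetical sequence
    for site in find_cas9_sites(example):
        print(site)
```

The protospacer returned for each site is the sequence that would be copied into the DNA base-pairing region of the sgRNA under these assumptions.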
Future Directions
One of the most difficult problems arising from any of the previously mentioned technologies is the number of off-target insertions and deletions that are generated. Although the characteristics of ZFNs are reasonably well understood, if highly dependent on which zinc finger is used, newer technologies such as TALENs and CRISPR/Cas9 have yet to be used enough to accurately gauge how precise they can be. TALENs can be made to recognize long stretches of target DNA by increasing the number of repeat modules, but some flexibility remains in target site binding. They also generally require thymidines at either end of the target site. CRISPR/Cas9 binds a much shorter region of the DNA with high specificity, but presently is limited to targets followed by a 3'-NGG or NAG sequence (the protospacer adjacent motif, or PAM), thus restricting the possible locations for insertions. Additional research may yield altered forms of either recognition scheme that expand the array of functional targets and reduce off-target binding. Alternative technologies such as group II intron-directed targeting may also prove viable in the future.
iGEM Connections
Harvard 2011
Because generation of novel zinc finger binding domains has been the greatest challenge in the use of ZFNs, the 2011 Harvard iGEM team used a computational approach to design 55,000 possible zinc finger sequences in the hopes of targeting 6 specific DNA sequences for which no zinc finger binding domains presently exist. They then synthesized and expressed these zinc finger sequences using a one-hybrid selection system in E. coli to identify functional motifs. They identified 15 sequences that may correspond to functional zinc finger binding domains, although additional characterization of each module has not been completed.
Freiburg 2012
To simplify the creation of new TALENs, the 2012 University of Freiburg iGEM team developed a kit-based approach for cloning novel TALENs using Golden Gate cloning. Cloning of TAL binding domains using traditional methods is very difficult due to the large number of repeating sequences, but the team's approach reduces the number of necessary steps and increases the likelihood of generating full-length TALENs from existing TAL module sequences.
Georgia Tech 2011
The 2011 Georgia Tech iGEM team sought to transfer the CRISPR system from Streptococcus thermophilus, commonly used to make yogurt, to E. coli and Bacillus subtilis as a way of reducing antibiotic resistance in wild bacterial populations. Their plan involved introducing spacer sequences matching those of common antibiotic resistance genes, such that the modified cells would degrade resistance plasmids that they encountered. The project was modeled in silico but never progressed to the experimental stage.
Arizona State 2011
The 2011 Arizona State iGEM team experimented with generating permanent resistance in E. coli to specific sequences by introducing artificial CRISPR spacers into strains lacking genes necessary for generation of new spacers. They used a GFP expression plasmid as a target, but the complexity of the E. coli CRISPR system caused problems, including inconsistent assembly of tandem repeats. They ultimately switched to working with Listeria innocua, which has a simpler type II CRISPR system, but the project did not advance past cloning of spacer regions. They also used Bacillus halodurans to investigate the assembly of an RNA-targeting CRISPR system and made some progress in cloning the spacers there as well.
References
- Thomas KR, Folger KR, and Capecchi MR. High frequency targeting of genes to specific sites in the mammalian genome. Cell. 1986 Feb 14;44(3):419-28. DOI:10.1016/0092-8674(86)90463-0
Homologous recombination in mouse ESCs
- Kuehn MR, Bradley A, Robertson EJ, and Evans MJ. A potential animal model for Lesch-Nyhan syndrome through introduction of HPRT mutations into mice. Nature. 1987 Mar 19-25;326(6110):295-8. DOI:10.1038/326295a0
Knockout mouse generation
- Mussolino C and Cathomen T. RNA guides genome engineering. Nat Biotechnol. 2013 Mar;31(3):208-9. DOI:10.1038/nbt.2527
Comparison of ZFN, TALEN, and CRISPR technologies
- Santiago Y, Chan E, Liu PQ, Orlando S, Zhang L, Urnov FD, Holmes MC, Guschin D, Waite A, Miller JC, Rebar EJ, Gregory PD, Klug A, and Collingwood TN. Targeted gene knockout in mammalian cells by using engineered zinc-finger nucleases. Proc Natl Acad Sci U S A. 2008 Apr 15;105(15):5809-14. DOI:10.1073/pnas.0800940105
Zinc finger nuclease editing in CHO cells
- Perez EE, Wang J, Miller JC, Jouvenot Y, Kim KA, Liu O, Wang N, Lee G, Bartsevich VV, Lee YL, Guschin DY, Rupniewski I, Waite AJ, Carpenito C, Carroll RG, Orange JS, Urnov FD, Rebar EJ, Ando D, Gregory PD, Riley JL, Holmes MC, and June CH. Establishment of HIV-1 resistance in CD4+ T cells by genome editing using zinc-finger nucleases. Nat Biotechnol. 2008 Jul;26(7):808-16. DOI:10.1038/nbt1410
CCR5 knockout in CD4+ T cells using ZFNs
- Wilen CB, Wang J, Tilton JC, Miller JC, Kim KA, Rebar EJ, Sherrill-Mix SA, Patro SC, Secreto AJ, Jordan AP, Lee G, Kahn J, Aye PP, Bunnell BA, Lackner AA, Hoxie JA, Danet-Desnoyers GA, Bushman FD, Riley JL, Gregory PD, June CH, Holmes MC, and Doms RW. Engineering HIV-resistant human CD4+ T cells with CXCR4-specific zinc-finger nucleases. PLoS Pathog. 2011 Apr;7(4):e1002020. DOI:10.1371/journal.ppat.1002020
Engineering HIV-resistant human CD4+ T cells with CXCR4-specific ZFNs
- Moscou MJ and Bogdanove AJ. A simple cipher governs DNA recognition by TAL effectors. Science. 2009 Dec 11;326(5959):1501. DOI:10.1126/science.1178817
Deciphering TAL domain DNA binding
- Boch J, Scholze H, Schornack S, Landgraf A, Hahn S, Kay S, Lahaye T, Nickstadt A, and Bonas U. Breaking the code of DNA binding specificity of TAL-type III effectors. Science. 2009 Dec 11;326(5959):1509-12. DOI:10.1126/science.1178811
TAL DNA sequence specificity
- Christian M, Cermak T, Doyle EL, Schmidt C, Zhang F, Hummel A, Bogdanove AJ, and Voytas DF. Targeting DNA double-strand breaks with TAL effector nucleases. Genetics. 2010 Oct;186(2):757-61. DOI:10.1534/genetics.110.120717
- Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, and Charpentier E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science. 2012 Aug 17;337(6096):816-21. DOI:10.1126/science.1225829
Cas9 function and RNA targeting
- Gasiunas G, Barrangou R, Horvath P, and Siksnys V. Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc Natl Acad Sci U S A. 2012 Sep 25;109(39):E2579-86. DOI:10.1073/pnas.1208507109
- Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, Peterson RT, Yeh JR, and Joung JK. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013 Mar;31(3):227-9. DOI:10.1038/nbt.2501
Zebrafish CRISPR/Cas9 targeting
- Cho SW, Kim S, Kim JM, and Kim JS. Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease. Nat Biotechnol. 2013 Mar;31(3):230-2. DOI:10.1038/nbt.2507
Human cell line CRISPR/Cas9 targeting
- Jiang W, Bikard D, Cox D, Zhang F, and Marraffini LA. RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Nat Biotechnol. 2013 Mar;31(3):233-9. DOI:10.1038/nbt.2508
Bacterial CRISPR/Cas9 targeting