CRISPR-Cas9 is the third generation of gene editing technology, following ZFNs and TALENs. Within just a few years it has spread all over the world and become the mainstream gene editing system, valued as one of the most efficient, simplest, lowest-cost, and easiest-to-use technologies for editing and modifying genes.
1. What is the CRISPR-Cas system
The CRISPR-Cas system is a natural immune system of prokaryotes. After being invaded by a virus, certain bacteria can store a short segment of the viral genome in a region of their own DNA called CRISPR. When the same virus invades again, the bacteria can recognize it from the stored fragment and cut the viral DNA to render it ineffective.
The CRISPR-Cas system consists of two parts: the CRISPR locus and the Cas genes (CRISPR-associated genes).
1. CRISPR (/ˈkrɪspər/) is a family of repetitive sequences in the genomes of prokaryotes. The full name of CRISPR is Clustered Regularly Interspaced Short Palindromic Repeats. CRISPR loci are found in about 40% of sequenced bacteria and 90% of sequenced archaea. (Note: archaea are single-celled prokaryotes that are distinct from bacteria; many of them live in extreme environments such as deep-sea volcanoes, hot springs on land, and salt-alkali lakes.)
The CRISPR locus mainly consists of a leader sequence (leader), repeat sequences (repeat), and spacer sequences (spacer).
① Leader sequence: rich in AT bases and located upstream of the CRISPR array; it is considered to be the promoter of the CRISPR sequence.
② Repeat sequences: approximately 20–50 bp in length, containing 5–7 bp palindromic sequences. Their transcripts can form hairpin structures that stabilize the overall secondary structure of the RNA.
③ Spacer sequence: It is a foreign DNA sequence captured by bacteria. This is equivalent to the "blacklist" of the bacterial immune system. When these foreign genetic materials invade again, the CRISPR/Cas system will accurately attack them.
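The leader/repeat/spacer layout described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the repeat sequence is already known and is identical at every position; the function name `split_crispr_array` and all sequences are hypothetical toy values, not real genomic data.

```python
from typing import List


def split_crispr_array(array: str, repeat: str) -> List[str]:
    """Return the spacers that lie between consecutive copies of `repeat`."""
    parts = array.split(repeat)
    # parts[0] is whatever precedes the first repeat (the leader side) and
    # parts[-1] is whatever follows the last repeat; everything in between
    # is a spacer flanked by a repeat on both sides.
    return [p for p in parts[1:-1] if p]


# Toy sequences (assumptions for illustration only).
leader = "ATATTATAATTA"                       # AT-rich leader
repeat = "GTTTTAGAGCTATGCTGTTTTG"             # one repeat unit
spacer_1 = "ACGTACGTACGTACGTACGTACGTACGTAC"   # most recently captured
spacer_2 = "TTGACCATGCAATCGGATCCTAGGCTAACT"   # captured earlier

# leader - repeat - spacer - repeat - spacer - repeat
array = leader + repeat + spacer_1 + repeat + spacer_2 + repeat

for index, spacer in enumerate(split_crispr_array(array, repeat), start=1):
    print(index, spacer)
```

Real CRISPR arrays often have slightly degenerate repeats, so dedicated tools search for approximate repeats rather than using an exact string split as this sketch does.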
2. The Cas genes are located near the CRISPR locus or scattered elsewhere in the genome. The proteins they encode can interact with the CRISPR sequence region, hence the name CRISPR-associated (Cas) genes.
The Cas protein encoded by the Cas gene is crucial in the defense process. Currently, multiple types of Cas genes such as Cas1-Cas10 have been discovered.
Based on the role of Cas proteins in the bacterial immune defense process, CRISPR-Cas systems are currently divided into two major categories.
The first category (Class 1): the effectors that cleave exogenous nucleic acids are complexes formed by multiple Cas proteins. This class includes type I, type III, and type IV systems.
The second category (Class 2): the effector is a single Cas protein, such as the type II Cas9 protein or the type V Cpf1 protein.
Currently, the most widely used CRISPR system is the type II CRISPR-Cas system, which is the CRISPR-Cas9 system.
2. The working principle of CRISPR-Cas9
The working mechanism of CRISPR-Cas9 can be understood in three stages.
1. The first stage: acquisition of the highly variable CRISPR spacers (capturing foreign DNA and registering it on the "blacklist")
Acquisition of the highly variable CRISPR spacers means that a short DNA sequence from an invading phage or plasmid is integrated into the genome of the host bacterium. The integration site lies between the two repeat sequences at the 5' end of the CRISPR array. The order of spacers in the CRISPR locus from 5' to 3' therefore also records the chronological order in which foreign genetic material invaded, with the most recent invader at the 5' end.
The acquisition of new spacer sequences may be divided into three steps:
Step 1: The proteins encoded by Cas1 and Cas2 scan the invading DNA and identify a PAM (protospacer adjacent motif) region; the DNA sequence adjacent to the PAM then serves as the candidate protospacer sequence.
Step 2: The Cas1/2 protein complex cuts the protospacer sequence from the foreign DNA, and with the assistance of other enzymes, inserts the protospacer sequence downstream of the leader region adjacent to the CRISPR sequence.
Step 3: DNA will be repaired to close the open double-stranded gap. In this way, a new spacer sequence is added to the CRISPR sequence of the genome.
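The three acquisition steps above can be mimicked with a toy model. This sketch is not a biochemical simulation: it assumes an NGG PAM (the motif recognized by the widely used Streptococcus pyogenes Cas9), a fixed protospacer length of 20 nt, and that the first usable PAM is always chosen. The names `acquire_spacer` and `build_array` and all sequences are hypothetical.

```python
import re
from typing import List

PROTOSPACER_LEN = 20                       # assumed length for this toy model
PAM_PATTERN = re.compile(r"(?=[ACGT]GG)")  # assumed PAM: NGG (zero-width match)


def acquire_spacer(invader: str, spacers: List[str]) -> List[str]:
    """Return a new spacer list with one protospacer from `invader` added."""
    # Step 1: scan the invading DNA for a PAM with enough sequence upstream.
    for match in PAM_PATTERN.finditer(invader):
        pam_start = match.start()
        if pam_start >= PROTOSPACER_LEN:
            # Step 2: excise the protospacer next to the PAM and place it at
            # the leader-proximal (5') end, i.e. at the front of the list.
            protospacer = invader[pam_start - PROTOSPACER_LEN:pam_start]
            return [protospacer] + spacers
    return list(spacers)  # no usable PAM found: array unchanged


def build_array(leader: str, repeat: str, spacers: List[str]) -> str:
    """Step 3: rebuild the array so every spacer is flanked by repeats."""
    return leader + repeat + "".join(s + repeat for s in spacers)


old_spacers = ["TTGACCATGCAATCGGATCC"]  # toy record of an earlier invader
phage = "ATGC" * 7 + "AGGTTT"           # toy invading DNA with one AGG PAM
new_spacers = acquire_spacer(phage, old_spacers)
print(new_spacers)
print(build_array("ATATTATAAT", "GTTTTAGAGCTATGCTGTTTTG", new_spacers))
```

Because new spacers are always added at the front of the list, the list order reproduces the 5' to 3' chronological record described earlier.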
2. The second stage: expression of the CRISPR locus (including transcription and post-transcriptional maturation processing)
The CRISPR sequence is transcribed under the control of the leader region to produce pre-crRNA (the crRNA precursor), and tracrRNA (trans-activating crRNA), which is complementary to the repeat region of the pre-crRNA, is also transcribed. Pre-crRNA forms double-stranded RNA with tracrRNA through complementary base pairing and assembles into a complex with the Cas9 protein. The complex selects the corresponding "ID card number" (spacer RNA) according to the type of intruder and, with the assistance of ribonuclease III (RNase III), trims this "ID card" into a short crRNA (containing a single spacer RNA and part of the repeat region).
crRNA, Cas9 and tracrRNA form the final complex to prepare for the next step of cutting.
3. The third stage: activation of the CRISPR/Cas system (targeted interference)
The final complex composed of crRNA, Cas9 and tracrRNA acts like a guided missile that can attack the invader's DNA precisely. The complex scans the entire foreign DNA sequence and identifies protospacer sequences that are complementary to the crRNA. It then localizes to the PAM/protospacer region, and the DNA double helix is unwound to form an R-loop: the crRNA hybridizes to the complementary strand, while the other strand remains free.
Cas9 then cleaves both strands at a precise site 3 nucleotides upstream of the PAM, producing blunt-ended products.
The HNH domain of Cas9 protein is responsible for cutting the DNA strand that is complementary to crRNA, while the RuvC domain is responsible for cutting the other non-complementary DNA strand. Finally, under the action of Cas9, DNA double-strand breaks (DSB) occur, the expression of foreign DNA is silenced, and the invaders are eliminated in one fell swoop.
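The targeting and cleavage logic of this stage can be sketched as a string search. The sketch assumes an NGG PAM and represents the crRNA spacer with DNA letters; because the crRNA is complementary to the target strand, its sequence matches the protospacer on the non-target strand, which is the strand the function receives. The name `cas9_cut` and the sequences are hypothetical.

```python
from typing import Optional, Tuple


def cas9_cut(dna: str, spacer: str) -> Optional[Tuple[str, str]]:
    """Cut `dna` (non-target strand, 5'->3') at a protospacer followed by NGG.

    Returns the two blunt-ended fragments, or None if no valid site exists.
    """
    start = dna.find(spacer)
    while start != -1:
        pam_start = start + len(spacer)
        pam = dna[pam_start:pam_start + 3]
        if len(pam) == 3 and pam[1:] == "GG":
            # Both strands are cut at the same position, 3 nt upstream of
            # the PAM, so a single index describes the blunt-ended break.
            cut = pam_start - 3
            return dna[:cut], dna[cut:]
        start = dna.find(spacer, start + 1)  # matched, but no PAM: keep looking
    return None


spacer = "ACGTTGCAACGTTGCAACGT"             # toy 20-nt spacer
invader = "CCCC" + spacer + "TGG" + "AAAA"  # protospacer followed by a TGG PAM
fragments = cas9_cut(invader, spacer)
print(fragments)

# The same protospacer without a PAM is left intact.
print(cas9_cut("CCCC" + spacer + "TTT" + "AAAA", spacer))
```

The PAM check is what stops the system from cutting the spacer stored in the bacterium's own CRISPR array, which is followed by a repeat rather than a PAM.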
3. CRISPR-Cas9 gene editing technology and applications...
The tracrRNA and crRNA can also guide Cas9 when they are fused into a single guide RNA (sgRNA).
CRISPR-Cas9 gene editing technology uses an artificially designed sgRNA to recognize the target genomic sequence and guide the Cas9 nuclease to cut both DNA strands, forming a double-strand break. Repair of the break then produces a gene knockout, knock-in, or other change, ultimately achieving the purpose of modifying genomic DNA.
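Designing an sgRNA starts with listing the sites Cas9 could target. The sketch below enumerates candidate sites on both strands, assuming a 20-nt guide and an NGG PAM; real design tools additionally score off-target risk and efficiency, which is omitted here. The names `revcomp` and `find_guides` and the example sequence are hypothetical.

```python
import re
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_guides(genome: str, guide_len: int = 20) -> List[Tuple[str, int, str, str]]:
    """List (strand, start, protospacer, pam) for every candidate site.

    For the "-" strand, `start` is a coordinate on the reverse complement.
    """
    # The lookahead keeps the match zero-width so overlapping sites are found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for strand, seq in (("+", genome), ("-", revcomp(genome))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


target = "AAAAA" + "ACGT" * 5 + "AGG" + "TTT"  # toy target region
for site in find_guides(target):
    print(site)
```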
Wide application of CRISPR-Cas9
1. Gene knockout (Knock-out)
Cas9 can cut the target genome to form a DNA double-strand break. Under normal circumstances, cells repair the broken DNA through the highly efficient non-homologous end joining (NHEJ) pathway. During repair, however, base insertions or deletions usually occur, resulting in a frameshift mutation. (Frameshift mutation: a change in the reading frame of the DNA caused by the deletion or insertion of bases at a site, in a number that is not a multiple of three, which alters the whole series of downstream codons so that a gene that originally encoded a certain peptide chain now encodes a completely different sequence.) The target gene thereby loses its function, achieving gene knockout. To improve the specificity of the CRISPR system, one catalytic domain of Cas9 can be mutated to produce a Cas9 nickase, which cuts only a single DNA strand and creates a nick. To create a double-strand break with nickases, two sgRNAs can be designed to target the two complementary strands of the DNA. When both sgRNAs bind their target sequences specifically, the paired nicks form a DNA break, and a frameshift mutation arises during repair to achieve gene knockout.
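The effect of an NHEJ indel on the reading frame can be checked with a short translation sketch. It uses the standard genetic code; the toy gene, the deletion positions, and the function names `translate` and `is_frameshift` are assumptions made for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence from its first base up to the first stop."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_frameshift(indel_length: int) -> bool:
    """An indel shifts the frame unless its length is a multiple of three."""
    return indel_length % 3 != 0


wild_type = "ATGGCTGAAACCTTGTAA"              # toy gene
one_base_deletion = wild_type[:4] + wild_type[5:]
three_base_deletion = wild_type[:3] + wild_type[6:]

print(translate(wild_type))            # original peptide
print(translate(one_base_deletion))    # every downstream codon changes
print(translate(three_base_deletion))  # one residue lost, frame preserved
```

Comparing the three outputs shows why knockout strategies rely on indels whose length is not a multiple of three: an in-frame deletion removes residues but can leave a partly functional protein.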
2. Gene knock-in (Knock-in)
When a DNA double-strand break occurs and a DNA repair template is present in the cell, the broken region of the genome can be repaired by homology-directed repair (HDR) using that template, achieving gene knock-in. The repair template consists of the sequence to be inserted, flanked by homologous sequences (homology arms) that match the regions upstream and downstream of the target site. The length and position of the homology arms depend on the size of the edit. DNA repair templates can be linear single- or double-stranded oligodeoxynucleotides or double-stranded DNA plasmids. HDR occurs at a low rate in cells, usually less than 10%. To increase the success rate of gene knock-in, many scientists are working on improving HDR efficiency, for example by synchronizing the edited cells to the cell-cycle phase in which HDR is most active so that repair is biased toward HDR, or by chemically inhibiting genes involved in NHEJ.
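The structure of a repair template (left homology arm, insert, right homology arm) can be sketched as follows. The arm length is a free parameter here, not a recommendation, and the names `build_hdr_template` and `apply_hdr` as well as the sequences are hypothetical.

```python
def build_hdr_template(genome: str, cut_site: int, insert: str, arm_len: int) -> str:
    """Return left homology arm + insert + right homology arm."""
    left_arm = genome[max(0, cut_site - arm_len):cut_site]
    right_arm = genome[cut_site:cut_site + arm_len]
    return left_arm + insert + right_arm


def apply_hdr(genome: str, cut_site: int, insert: str) -> str:
    """Idealized outcome of HDR: the insert is copied in at the break."""
    return genome[:cut_site] + insert + genome[cut_site:]


genome = "AAAACCCCGGGGTTTT"  # toy locus
cut_site = 8                 # position of the double-strand break
insert = "ATG"               # toy sequence to knock in

template = build_hdr_template(genome, cut_site, insert, arm_len=4)
edited = apply_hdr(genome, cut_site, insert)
print(template)
print(edited)
```

In this idealized model the template always appears as a substring of the edited genome, which reflects how the arms anchor the insert to the sequence on either side of the break.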
3. Gene suppression, gene activation (Repression or Activation)
Cas9 binds and cuts its target gene independently, and both functions can be modified through point mutations. When both the RuvC and HNH domains are inactivated, the resulting dCas9 can still bind target genes under the guidance of sgRNA but can no longer cut DNA. Binding dCas9 to the transcription start site of a gene can therefore block the initiation of transcription and inhibit gene expression; dCas9 bound to the promoter region of a gene can also be coupled to transcriptional repressors or activators, thereby repressing or activating transcription of the downstream target gene.
Therefore, dCas9 differs from Cas9 and Cas9 nickase in that the activation or repression it causes is reversible and does not permanently change the genomic DNA.
4. Multiplex Editing
By delivering plasmids encoding multiple sgRNAs into cells, multiple genes can be edited at the same time, which also makes it possible to screen for gene function across the genome. Applications of multiplex editing include using paired Cas9 nickases to improve the accuracy of gene knockouts, deleting large genomic regions, and editing different genes simultaneously. Normally, 2 to 7 different sgRNAs can be constructed on one plasmid for multiplex CRISPR gene editing.
5. Functional genome screening
Using CRISPR-Cas9 for gene editing can produce a large number of cells carrying mutations in different genes. These mutant cells can then be used to confirm whether a phenotypic change is caused by a particular gene or genetic factor. The traditional method for genome screening is shRNA technology, but shRNA has its limitations: it has high off-target effects and cannot fully inhibit every gene, resulting in false negative results. Genome screening with the CRISPR-Cas9 system has the advantages of high specificity and irreversibility, and has been widely adopted. At present, CRISPR screens are used to identify genes that regulate phenotypes, such as genes that confer resistance to chemotherapy drugs or toxins and genes that affect tumor migration, and to construct viral screening libraries for large-scale screening of candidate genes.