In the Human Genome Project, biologists and medical researchers around the world set out to determine the sequence of the euchromatic regions of the human genome. The number of human genes turned out to be smaller than originally expected, and exons, the coding sequences that are translated into protein, account for only about 1.5% of the total length.
In modern genetics, a gene is a specific nucleotide sequence on a DNA (deoxyribonucleic acid) molecule that has a genetic effect; in other words, it is a DNA segment with genetic effect. Genes are located on chromosomes and are arranged linearly along them. Genes not only pass genetic information to the next generation through replication but also allow that information to be expressed. Differences in hair, skin color, eyes and nose among different populations are caused by genetic differences.
Humans have only one genome, with early estimates of about 50,000 to 100,000 genes.
As the human genome is gradually decoded, a blueprint of life will emerge, and people's lives will change enormously. Gene-based drugs have already entered people's lives, and treating more diseases with genes is no longer an extravagant hope. As our understanding of human beings reaches a new level, the causes of many diseases will be uncovered, drugs will be better designed, treatment will be tailored to the individual case, lifestyle and eating habits may be adjusted according to a person's genetic makeup, the overall health of humankind will improve, and the foundation of 21st-century medicine will be laid.
Using genes, people can improve varieties of fruits and vegetables and raise the quality of crops; more transgenic plants, animals and foods will appear, and humans may breed super crops in the new century. By controlling the biochemical characteristics of the human body, humans will be able to restore or repair the functions of cells and organs, and even change the course of human evolution.
Human Genome Project
The Human Genome Project (HGP) was first proposed by American scientists in 1985 and officially launched in 1990. Scientists from the United States, Britain, France, Germany, Japan and China took part in the $3 billion project. According to the plan, by 2005 the code of all of the roughly 100,000 genes in the human body was to be deciphered and the human gene map drawn. In other words, the aim was to uncover the secret of the 3 billion base pairs that make up the roughly 100,000 human genes. The Human Genome Project, the Manhattan Project (atomic bomb) and the Apollo program are together called the three major scientific projects.
In 1986, the Nobel laureate Renato Dulbecco published the essay "A Turning Point in Cancer Research: Sequencing the Human Genome" (Science, 231: 1055-1056). The article pointed out: "If we want to know more about tumors, we must pay attention to the genome of cells from now on. ... Which species should we start with? If you want to understand human tumors, you must start with humans. ... A detailed understanding of DNA will greatly promote human tumor research."
What is a genome? A genome is the complete set of genes of a species. The human genome has two aspects: genetic information and genetic material. To reveal the mystery of life, it is necessary to study the existence, structure and function of genes, and the relationships between genes, at the level of the whole genome.
Why choose the human genome for research? Because human beings are the most advanced creatures produced by "evolution", studying the human genome helps us to know ourselves, to grasp the laws of birth, aging, illness and death, to diagnose and treat diseases, and to understand the origin of life.
The project sets out to determine the sequence of the 3 billion base pairs of human genomic DNA, find all human genes, locate them on the chromosomes and decipher all human genetic information.
The Human Genome Project also includes research on the genomes of Escherichia coli, yeast, nematodes, fruit flies and mice, which are known as its five "model organisms".
The purpose of HGP is to decode life: to understand the origin of life, the laws of growth and development, the reasons for differences between species and between individuals, and the mechanisms of disease and of life phenomena such as longevity and aging, and so to provide a scientific basis for the diagnosis and treatment of diseases.
The main task of HGP is human DNA sequencing. Its further tasks cover sequencing technology, human genome sequence variation, functional genomics technology, comparative genomics, research on social, legal and ethical issues, bioinformatics and computational biology, and education and training.
1. Genetic map
A genetic map, also known as a linkage map, uses genetic markers with genetic polymorphism (a locus with more than one allele, each occurring in the population at a frequency above 1%) as "signposts", and genetic distance as the map distance. Genetic distance is the percentage of crossover recombination between two loci during meiosis; a recombination rate of 1% is called 1 cM (centimorgan). Establishing the genetic map creates the conditions for identifying and locating genes. Significance: more than 6,000 genetic markers divide the human genome into more than 6,000 regions, so that linkage analysis can find evidence that a disease gene or phenotype gene lies close to a marker. The gene can then be located within this known region, isolated and studied. For diseases, finding and analyzing the genes involved is the key.
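The definition of map distance above can be illustrated with a short Python sketch. The function name and the offspring counts are hypothetical, and the direct conversion of 1% recombination to 1 cM is assumed to hold only for closely linked loci.

```python
def map_distance_cm(recombinants, total):
    """Return the genetic distance in centimorgans (cM).

    Assumes 1% recombination between two loci equals 1 cM,
    as in the definition above.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinants <= total:
        raise ValueError("recombinants must lie between 0 and total")
    return 100.0 * recombinants / total


# Hypothetical cross: 12 recombinant offspring out of 400 scored.
print(map_distance_cm(12, 400))  # 3.0
```

In this made-up cross the two loci would be placed 3 cM apart on the linkage map.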
First-generation markers: classical genetic markers, such as ABO blood group markers and HLA markers. In the middle and late 1970s, restriction fragment length polymorphism (RFLP) markers appeared, with as many as 10^5 loci. Here the DNA strand is cut at specific sites by restriction endonucleases. Variation at a single "point" of the DNA may produce fragments of different lengths (allelic fragments). The polymorphism can be displayed by gel electrophoresis, and linkage analysis between the fragment polymorphism and a disease phenotype can locate the disease gene, as was done for Huntington's disease. However, each digestion yields only 2 to 3 fragments, so the information is limited.
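The RFLP principle described above can be sketched in a few lines of Python. The two allele sequences are invented for illustration; the recognition site GAATTC, cut after the first G, is that of the enzyme EcoRI, and the helper name digest is hypothetical.

```python
def digest(dna, site="GAATTC", cut_offset=1):
    """Cut dna at every occurrence of site and return the fragment lengths.

    cut_offset is the position inside the site where the strand is cut
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    bounds = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Two invented alleles that differ at a single nucleotide:
# the point change A -> G destroys the second recognition site.
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(digest(allele_a))  # [5, 14, 9]
print(digest(allele_b))  # [5, 23]
```

On a gel, the two alleles would show different band patterns, which is the polymorphism used as a marker.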
Second-generation markers: in 1985, minisatellite core sequences, or variable number tandem repeats (VNTR), were found to provide fragments of different lengths, with a repeat unit of 6 to 12 nucleotides. In 1989, the microsatellite marker system was discovered and established, with a repeat unit of 2 to 6 nucleotides; these markers are also known as short tandem repeats (STR).
Third-generation markers: in 1996, E. S. Lander of MIT proposed the SNP (single nucleotide polymorphism) genetic marker system. The mutation rate of each nucleotide is about 10^-9, and the number of biallelic markers in the human genome can reach 3 million, on average about one per 1,250 base pairs. Three to four adjacent markers can form 8 to 16 haplotypes.
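The figure of 8 to 16 haplotypes follows from simple counting: n biallelic markers can combine in at most 2^n ways. The Python sketch below enumerates these combinations; the marker alleles are invented, and the counts are the theoretical maximum rather than what any real population shows.

```python
from itertools import product


def haplotypes(markers):
    """Return every possible haplotype for a list of biallelic markers.

    markers is a list of (allele_1, allele_2) pairs, one pair per SNP.
    """
    return ["".join(combo) for combo in product(*markers)]


# Hypothetical adjacent SNPs, each with two alleles.
three_snps = [("A", "G"), ("C", "T"), ("G", "T")]
four_snps = three_snps + [("A", "C")]

print(len(haplotypes(three_snps)))  # 8
print(len(haplotypes(four_snps)))   # 16
```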
2. Physical map
A physical map gives the arrangement and spacing of all the genes that make up the genome, and is drawn by measuring the DNA molecules of the genome. Its purpose is to arrange the genetic information about genes, and their relative positions on each chromosome, in a linear order. The physical map of a DNA molecule is the order of its restriction fragments, that is, the position of each restriction fragment on the DNA chain. Because restriction endonucleases cut at specific sequences, DNA molecules with different nucleotide sequences yield different fragments after digestion and thus a unique digestion pattern. The physical map is therefore one of the structural characteristics of a DNA molecule. DNA is a very large molecule, and a restriction fragment used in a sequencing reaction is only a very small part of it. The position of these fragments on the DNA chain is the first problem to be solved, so the physical map is the basis of sequence determination and can be regarded as the blueprint that guides DNA sequencing. Broadly speaking, DNA sequencing begins with making a physical map, which is its first step. There are many ways to make a physical map of DNA. Here a common and simple method, partial digestion of labeled fragments, is used to illustrate the principle.
Determining the physical map of DNA by partial digestion involves two basic steps:
(1) Complete digestion: an appropriate restriction endonuclease is chosen to digest the radioisotope-labeled DNA chain completely. The digestion products are separated by gel electrophoresis and visualized by autoradiography; the resulting pattern shows the number and size of the restriction fragments that make up the DNA chain.
(2) Partial digestion: the DNA chain to be mapped is labeled with a radioisotope and then partially digested with the same enzyme. By controlling the reaction conditions, the cleavage sites on the DNA chain are cut at random, so that not every site is cut. The partial digestion products are likewise separated by electrophoresis and visualized by autoradiography. By comparing the autoradiographs from the two steps, the positions of the restriction fragments on the DNA chain can be deduced from the fragment sizes and the differences between them. The following is a detailed description of the DNA physical map of the histone gene.
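The comparison of the two digests can be sketched in Python under one added assumption: the DNA is labeled at one end only, so every band on the partial-digest autoradiograph runs from the labeled end to some cleavage site (or to the far end of the uncut molecule). The fragment sizes below are invented toy values in base pairs, not histone gene data, and the function name is hypothetical.

```python
def order_fragments(complete_digest, labeled_partials):
    """Deduce the order of restriction fragments from the labeled end.

    complete_digest:  sizes of all fragments after complete digestion
                      (their order is unknown).
    labeled_partials: sizes of the end-labeled products seen after
                      partial digestion, including the uncut molecule.
    """
    ladder = sorted(set(labeled_partials))
    # The difference between neighbouring ladder bands is one fragment.
    ordered = [b - a for a, b in zip([0] + ladder, ladder)]
    # Cross-check against step (1): both digests must agree on the sizes.
    if sorted(ordered) != sorted(complete_digest):
        raise ValueError("partial-digest ladder does not match complete digest")
    return ordered


complete = [300, 500, 200, 1000]   # step (1): sizes only, order unknown
partials = [2000, 500, 1700, 700]  # step (2): labeled bands
print(order_fragments(complete, partials))  # [500, 200, 1000, 300]
```

Here the map reads, from the labeled end, 500, 200, 1000 and 300 bp; the complete digest serves only as a consistency check on the ladder.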
A complete physical map should include the contig map of overlapping DNA clone fragments in the different vectors used for the human genome; the map of cutting sites of restriction endonucleases that produce large fragments; the marker map of DNA fragments or specific DNA sequences (sequence-tagged sites, STS); the marker map of characteristic sequences that occur widely in the genome (such as CpG sequences, Alu sequences and isochores); and the cytogenetic map of the human genome (the regions, bands and sub-bands of chromosomes, or positions given as a percentage of chromosome length).
The basic principle is to break the huge DNA molecule, which is too large to work on directly, into pieces and then splice them back together. Map distances are given in Mb, kb and bp, and the STS (sequence-tagged site) sequences of DNA probes serve as signposts. In 1998, a physical map of contiguous clones with 52,000 sequence-tagged sites was completed, covering most of the human genome. One of the main tasks in constructing a physical map is to join DNA clone fragments that contain the corresponding STS sequences into overlapping groups, or "contigs". Libraries of human DNA fragments that use YAC (yeast artificial chromosome) as the vector have been used to build highly representative contigs with a total coverage of 100%. In recent years, more reliable BAC, PAC or cosmid libraries have been developed.
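The joining of clones into contigs can be pictured with a small Python sketch. It assumes, as a simplification, that two clones overlap whenever they share at least one STS; the clone and STS names are invented, and real contig building also has to order the clones and cope with false or missing STS hits.

```python
def build_contigs(clones):
    """Group clones into contigs by shared STS markers.

    clones maps a clone name to the set of STS markers found on it.
    Two clones are treated as overlapping if they share any STS.
    """
    contigs = []
    unassigned = set(clones)
    while unassigned:
        seed = min(unassigned)  # pick deterministically
        unassigned.remove(seed)
        contig = {seed}
        frontier = [seed]
        while frontier:
            current = frontier.pop()
            linked = {c for c in unassigned if clones[c] & clones[current]}
            unassigned -= linked
            contig |= linked
            frontier.extend(linked)
        contigs.append(sorted(contig))
    return contigs


# Hypothetical clone library: A-B-C chain together, D-E form a second contig.
library = {
    "cloneA": {"sts1", "sts2"},
    "cloneB": {"sts2", "sts3"},
    "cloneC": {"sts3", "sts4"},
    "cloneD": {"sts7"},
    "cloneE": {"sts7", "sts8"},
}
print(build_contigs(library))
# [['cloneA', 'cloneB', 'cloneC'], ['cloneD', 'cloneE']]
```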
3. Sequence map
With the genetic map and the physical map completed, sequencing becomes the most important task. DNA sequence analysis is a multi-stage process that includes DNA fragmentation, base analysis and the translation of DNA information. Sequencing yields the sequence map of the genome.
Significance of HGP to human beings
1. Contribution of HGP to the study of human disease genes
Genes related to human diseases are important information about the structural and functional integrity of the human genome. For monogenic diseases, the new approaches of "positional cloning" and "positional candidate cloning" have led to the discovery of a large number of causative genes, for example for Huntington's disease, hereditary colon cancer and breast cancer, laying the foundation for gene diagnosis and gene therapy of these diseases. At present, polygenic diseases such as cardiovascular diseases, tumors, diabetes, neuropsychiatric diseases (Alzheimer's disease, schizophrenia) and autoimmune diseases are the focus of disease gene research. Health-related research is an important part of HGP. In 1997, the Cancer Genome Anatomy Project and the Environmental Genome Project were proposed one after another.
2. Contribution of HGP to medicine
Gene diagnosis; gene therapy and treatment based on genome knowledge; disease prevention based on genome information; identification of susceptibility genes; and intervention in the lifestyle and environmental factors of at-risk populations.
3. Contribution of HGP to biotechnology
(1) Genetically engineered drugs: secreted proteins (polypeptide hormones, growth factors, chemokines, coagulation and anticoagulation factors, etc.) and their receptors.
(2) Diagnostic and research reagent industry: gene and antibody kits, biochips for diagnosis and research, disease and drug screening models.
(3) Promoting cell, embryo and tissue engineering: embryonic and adult stem cells, cloning technology and organ reconstruction.
4. Contribution of HGP to the pharmaceutical industry
Screening of drug targets: high-throughput receptor-binding and enzyme-binding assays have been established by combining combinatorial chemistry with techniques for isolating natural compounds. Knowledge-based drug design: advanced structural analysis, prediction and simulation of genes and their protein products, in particular the "pocket" where a drug acts.
Individualized drug therapy: pharmacogenomics.
5. Important influence of HGP on society and the economy
The biological industry and the information industry are two economic pillars of a country; the discovery of new functional genes brings social and economic benefits; genetically modified food; genetically engineered drugs (such as weight-loss drugs and height-increasing drugs).
6. Impact of HGP on the study of biological evolution
The evolutionary history of organisms is written in the "heavenly book" of each genome; the paramecium is a relative of human beings (1.3 billion years); humans evolved from a kind of monkey 3 to 4 million years ago; humans "walked out of Africa" for the first time 2 million years ago, as ancient apes; the human "Eve" came from Africa 200,000 years ago, the second "out of Africa".
7. Negative impact of HGP
Jurassic Park is not just a science fiction story; biological weapons for the selective extermination of particular races; gene patent wars; predatory wars over genetic resources; genetics and personal privacy.