What is the Human Genome Project, and what impact does it have on human society?

The Human Genome Project (HGP) was first proposed by American scientists in 1985 and officially launched in 1990. Scientists from the United States, Britain, France, Germany, Japan and China took part in the project, which had a budget of $3 billion. Under the plan, by 2005 the codes of all of the roughly 20,000-25,000 genes in the human body were to be deciphered and a map of the human genes drawn. In other words, the aim was to uncover the secrets of the 3 billion base pairs that make up those 20,000-25,000 human genes. The Human Genome Project, the Manhattan Project and the Apollo program are together called the three major scientific projects, and the HGP is known as the "moon landing program" of the life sciences.

The Human Genome Project (HGP) is a large-scale, transnational and interdisciplinary scientific exploration project. Its purpose is to determine the nucleotide sequence of the 3 billion base pairs of the human (haploid) genome, so as to draw the human genome map, identify the genes it contains and their sequences, and ultimately decipher human genetic information. The genome project is an important step in humanity's exploration of its own mysteries, and it is another great undertaking in the history of science after the Manhattan Project and the Apollo moon landing program. By 2005 the sequencing work of the project had been completed. The publication of the working draft of the human genome in 2001 (completed and published independently by the publicly funded international Human Genome Project and by the private company Celera Genomics) is regarded as a milestone in the project's success.

Significance of the gene map

A gene map can effectively serve as a spatiotemporal map of the expression of all genes under normal or controlled conditions. From this map we can learn how a given gene is expressed in different tissues, and at what levels, at different times; how the expression levels of different genes in one tissue vary over time; and how the expression levels of different genes differ across tissues at a specific time.

The Human Genome Project is an international cooperative effort with the following goals: describing the characteristics of the human genome, sequencing and mapping the DNA of selected model organisms, developing new technologies for genome research, addressing the ethical, legal and social issues raised by human genome research, training scientists who can use the technologies and resources developed by the HGP for biological research, and promoting human health.

Other materials

Contribution to human disease gene research

Genes related to human diseases carry important information about the structural and functional integrity of the human genome. For monogenic diseases, the new approaches of "positional cloning" and "positional candidate cloning" have led to the discovery of a large number of disease-causing genes, such as those for Huntington's disease, hereditary colon cancer and breast cancer, laying the foundation for gene diagnosis and gene therapy of these diseases. At present, polygenic diseases such as cardiovascular diseases, tumors, diabetes, neuropsychiatric diseases (Alzheimer's disease, schizophrenia) and autoimmune diseases are the focus of disease gene research. Health-related research is an important part of the HGP. In 1997, the "Cancer Genome Anatomy Project" and the "Environmental Genome Project" were proposed one after another.

Contribution to medicine

Gene diagnosis; gene therapy and other treatments based on genome knowledge; disease prevention based on genome information; identification of susceptibility genes; and lifestyle and environmental interventions for at-risk populations.

Contribution to biotechnology

(1) Genetically engineered drugs

Secreted proteins (polypeptide hormones, growth factors, chemokines, coagulation and anticoagulation factors, etc.) and their receptors.

(2) Diagnostic and research reagent industry

Gene and antibody kits, biochips for diagnosis and research, disease and drug screening models.

(3) Promoting cell, embryo and tissue engineering

Embryonic and adult stem cells, cloning technology, organ reconstruction.

Contribution to the pharmaceutical industry

Screening drug targets: high-throughput receptor and enzyme binding assays were established by combining combinatorial chemistry with natural compound separation technology. Knowledge-based drug design: advanced structural analysis, prediction and simulation of genes and protein products, in particular the "pocket" where a drug acts.

Individualized drug therapy: pharmacogenomics.

Important influence on society and the economy

The biological industry and the information industry are two economic pillars of a country; social and economic benefits of discovering new functional genes; genetically modified food; genetically engineered drugs (such as weight-loss drugs and height-increasing drugs).

Influence on the study of biological evolution

The evolutionary history of organisms is engraved in the "heavenly book" of each genome; Paramecium is a relative of human beings, separated by 1.3 billion years; humans evolved from a kind of monkey 3-4 million years ago; humans "walked out of Africa" for the first time as ancient apes 2 million years ago; the human "Eve" came from Africa 200,000 years ago, possibly a second "out of Africa"?

Negative effects

Jurassic Park is not just a science fiction story; biological weapons for race-selective extermination; gene patent wars; predatory wars over genetic resources; genetic information and personal privacy.

Application examples

Disease genes

One of the key applications of human genome research is finding disease genes of unknown biochemical function through positional cloning. The method involves mapping the chromosome region containing the gene through linkage analysis of affected families, and then examining the region to find the gene.

Positional cloning is very useful, but it is also tedious. When the method was first proposed in the early 1980s, researchers wishing to carry out positional cloning had to generate genetic markers to track inheritance, perform chromosome walking to obtain genomic DNA covering the region, and analyze a region of about 1 Mb by direct sequencing or indirect gene identification. By the mid-1990s, with the development of human genetic and physical maps supported by the Human Genome Project, the first two obstacles had been cleared. The remaining obstacle, however, was still difficult to overcome.

All this is changing with the availability of the draft human genome sequence. The human genome sequence in public databases makes it possible to identify candidate genes rapidly by computer, after which mutation detection in the relevant candidate genes is aided by gene structure information.

Now, for Mendelian diseases, a gene search can often be carried out within a few months by a research group of suitable size. At least 30 disease genes have been positionally cloned directly through the publicly available genome sequence. Because most of the human sequence was obtained in the past 12 months, many similar findings may not yet have been published.

In addition, in many cases the genome sequence plays a supporting role, for example by providing candidate microsatellite markers for fine genetic linkage analysis. (In 2001, scientists in Shanghai and Beijing, China, discovered the gene for hereditary opalescent dentin type II.)

The genome sequence also helps to reveal the mechanisms behind many common chromosome deletion syndromes. In several cases, recurrent deletions were found to result from homologous recombination and unequal crossing-over between large, nearly identical duplications within the same chromosome. Examples include the DiGeorge/velocardiofacial syndrome region on chromosome 22 and the recurrent deletion of Williams-Beuren syndrome on chromosome 7.

The availability of the genome sequence also allows rapid identification of paralogues of disease genes, which is valuable for two reasons. First, mutations in a paralogous gene may cause a related genetic disease. A good example found by using the genome sequence is achromatopsia (total color blindness).

The CNGA3 gene encodes a subunit of the cyclic GMP-gated channel of cone photoreceptors, and mutations in it had been found in some families with achromatopsia. A computer search of the genome sequence revealed the paralogous gene encoding the corresponding β subunit, CNGB3 (which did not appear in EST databases). CNGB3 was quickly shown to be the cause of achromatopsia in other families. Another example is the presenilin-1 and presenilin-2 genes, mutations in which can lead to early-onset Alzheimer's disease.

The second reason is that paralogues can provide opportunities for treatment. For example, in individuals with sickle cell disease or β-thalassemia, which are caused by mutations in the β-globin gene, attempts are being made to reactivate the hemoglobin genes expressed in the fetus.

We systematically searched for paralogues of 971 known human disease genes with entries in both the Online Mendelian Inheritance in Man database (OMIM) and the SwissProt or TrEMBL protein databases. We identified 286 potential paralogues (requiring a match of at least 50 amino acids, with identity over 70% but less than 90% if on the same chromosome, and less than 95% if on a different chromosome). Although this analysis may pick up some pseudogenes, 89% of the matches show homology over more than one exon in the new target sequence, which suggests that many of them are functional. This analysis shows the potential for rapidly identifying disease genes by computer.
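The match criteria quoted above can be written down as a small filter. This is only a minimal sketch under stated assumptions: the `Match` record and the function name are hypothetical, the alignment values are assumed to come from some upstream protein search, and the 70% lower bound is assumed to apply on both the same-chromosome and different-chromosome branches.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """Hypothetical record for one alignment between a disease gene and a genomic hit."""
    aligned_length: int    # number of matched amino acids
    identity: float        # fraction of identical residues, 0.0 to 1.0
    same_chromosome: bool  # True if the hit lies on the disease gene's own chromosome


def is_potential_paralogue(match: Match) -> bool:
    """Apply the thresholds from the text: >= 50 residues, identity above 70%,
    and below 90% (same chromosome) or below 95% (different chromosome)."""
    if match.aligned_length < 50:
        return False
    if match.identity <= 0.70:
        return False
    # The upper bound excludes near-identical hits, which are likely the gene itself.
    upper_bound = 0.90 if match.same_chromosome else 0.95
    return match.identity < upper_bound


candidates = [
    Match(aligned_length=120, identity=0.82, same_chromosome=True),
    Match(aligned_length=120, identity=0.92, same_chromosome=True),
    Match(aligned_length=120, identity=0.92, same_chromosome=False),
    Match(aligned_length=40, identity=0.85, same_chromosome=False),
]
kept = [m for m in candidates if is_potential_paralogue(m)]
```

The stricter upper bound on the same chromosome presumably guards against counting the disease gene itself or a very recent local duplicate; that reading is an interpretation, not something the text states.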

Drug targets

Over the past century, the pharmaceutical industry has largely relied on a limited set of drug targets to develop new treatments. The most recent compendium lists 483 drug targets, which can be regarded as accounting for all the drugs on the market. Knowing all human genes and proteins will greatly expand the search for suitable drug targets. Although only a minority of human genes can serve as drug targets, the number can be predicted to exceed several thousand, which will lead to large-scale use of genome research in drug research and development. Some examples illustrate this point:

⑴ The neurotransmitter serotonin (5-HT) mediates rapid excitatory responses through ligand-gated channels. The previously identified 5-HT3A receptor gene produces functional receptors, but their conductance is much smaller than that observed in vivo. Cross-hybridization experiments and EST analysis failed to reveal other homologues of the known receptor.

Recently, however, a putative homologue was identified in a PAC clone from the long arm of chromosome 11 by a low-stringency search of the draft human genome sequence. The homologue was shown to be expressed in the striatum, caudate nucleus and hippocampus, and a full-length cDNA was then obtained. This gene, which encodes a serotonin receptor, was named 5-HT3B. When assembled with 5-HT3A into a heterodimer, it appears to account for the large-conductance neuronal serotonin channel. Given the central role of the serotonin pathway in mood disorders and schizophrenia, the discovery of a major new therapeutic target is of great interest.

⑵ The contractile and inflammatory effects of the cysteinyl leukotrienes, formerly known as the slow-reacting substance of anaphylaxis (SRS-A), are mediated by specific receptors. A second such receptor, CysLT2, was identified by combining a mouse EST with the human genome sequence. This led to the cloning of a gene with 38% amino acid identity to the only other receptor previously identified. The new receptor shows high-affinity binding to several leukotrienes and maps to a region of chromosome 13 linked to allergic asthma. The gene is expressed in airway smooth muscle and in the heart. Because the leukotriene pathway is an important target in the development of anti-asthma drugs, the discovery of a new receptor is significant.

⑶ The senile plaques of Alzheimer's disease are rich in deposits of amyloid β-protein. Amyloid β-protein is produced by proteolysis of the amyloid precursor protein (APP). One of the enzymes involved is the β-site APP-cleaving enzyme (BACE), a transmembrane aspartic protease. A computer search of the draft human genome sequence recently identified a new sequence homologous to BACE, encoding a protein named BACE2, which has 52% amino acid sequence identity with BACE. It contains two active protease sites and, like APP, maps to the obligatory Down syndrome region of chromosome 21. This raises the question of whether extra copies of BACE2 and APP accelerate the deposition of β-amyloid in the brains of patients with Down syndrome.

In view of these examples, we systematically identified paralogues of traditional drug target proteins in the genome sequence. The target list used corresponds to 603 entries in the SwissProt database with unique accession codes.

Basic biology

One example is the solution of a mystery that had puzzled researchers for decades: the molecular basis of bitter taste. Humans and other animals vary in their response to particular bitter tastes (a response polymorphism). Recently, researchers mapped this trait in humans and mice, and then searched the corresponding regions of the draft human genome sequence for G protein-coupled receptors. These studies soon led to the discovery of a new family of such proteins, almost all of which proved to be expressed in taste buds. Experiments have confirmed that the receptors, expressed in cultured cells, respond to specific bitter substrates.

The human genome map is the property of all mankind, and the results of this research should be shared by and benefit all mankind; this is the common understanding of scientists from all the countries participating in the Human Genome Project. It is worth noting that, in the field of human genome research, some private companies are currently scrambling to patent their results. Celera Genomics of the United States has said that it wants to apply for patents on some of its research results and provide them to pharmaceutical companies for a fee.

Some important genes that control human diseases have been discovered.

Examples include genes for obesity and bronchial asthma. New discoveries of such genes are reported every year. They have improved our understanding of many important disease mechanisms and have accelerated the shift of medical thinking from treatment towards prevention. For example, on May 28, 1998, Professor Xia Jiahui of Hunan Medical University published the cloning of GJB3, a gene causing human neural high-frequency deafness, the first such achievement in China.

Driven by the Human Genome Project, several new disciplines have emerged, such as genomics and bioinformatics.

Industrialization of biotechnology. Some world-class large companies have shifted their focus to life science research and biotechnology products. This trend is also closely related to the human genome project.

Progress and future

On June 26, 2000, scientists from the United States, Britain, France, Germany, Japan and China who participated in the Human Genome Project announced that the draft human genome had been completed. The finished map requires that the clones used for sequencing faithfully represent the genome structure of the chromosomes, that the sequence error rate be less than one in ten thousand, that 95% of the euchromatin regions be sequenced, and that each gap be smaller than 150 kb. The finished map was to be completed in 2003, two years ahead of schedule.
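Sequence error rates like the one-in-ten-thousand figure above are often expressed on the Phred quality scale, Q = -10 log10(p). The text itself does not use that scale, so the sketch below is only an illustrative conversion; the function name is hypothetical.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability into a Phred-scaled quality value."""
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10.0 * math.log10(error_rate)


# One error in ten thousand bases (99.99% accuracy) corresponds to about Q40.
finished_quality = phred_quality(1 / 10_000)

# One error in a hundred bases corresponds to about Q20.
one_percent_quality = phred_quality(0.01)
```

On this scale every additional 10 points means a tenfold lower error rate, so a one-in-ten-thousand standard is a hundred times stricter than a 1% error rate.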

Completing the finished map of the human genome sequence

⑴ Produce finished sequence from the clones of the current physical map, which covers more than 96% of the euchromatin region of the genome. About 1 Gb of finished sequence has already been obtained. The rest exists in draft form, and all clones are expected to reach 8-10-fold coverage (99.99% accuracy) by about mid-2001, using established and increasingly automated protocols.

(2) Screen additional libraries to fill the gaps. Use FISH or other methods to determine the size of the gaps that cannot be closed. Chromosomes 22 and 21 were finished in this way. Completion is expected in 2003.

(3) Develop new technologies to close the gaps that are difficult to fill, of which there are roughly several hundred.

Working draft of the genome sequence: by sequencing BAC contig clones of known chromosomal position to 4-5-fold coverage (coverage at the level of any BAC clone should not be less than 3-fold), more than 90% of the genome sequence can be obtained, with an error rate below 1%. The working draft can be used to understand genome structure, identify and analyze genes, locate and clone disease genes, and find SNPs.
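The link between fold coverage and the fraction of sequence obtained can be illustrated with the idealized Lander-Waterman model, in which reads land uniformly at random and the expected fraction of bases covered at least once is 1 - e^(-c) for coverage c. This is a simplifying assumption, not the project's own calculation: real clone-based sequencing is neither uniform nor free of repeats, so actual yields are lower.

```python
import math


def expected_covered_fraction(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read at the given
    fold coverage, under the idealized uniform-random (Poisson) model."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


for fold in (3, 4, 5):
    fraction = expected_covered_fraction(fold)
    print(f"{fold}x coverage -> about {fraction:.1%} of bases covered")
```

Under this model 3-fold coverage already covers roughly 95% of bases and 4-5-fold covers about 98-99%, which is consistent with the "more than 90%" figure quoted for the working draft once real-world losses are allowed for.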

Uses of the draft

1. With the draft, many disease-related genes have been identified.

2. Through SNPs (the differences between people), the draft provides a framework for understanding the genetic basis and evolution of human traits.

3. With the draft, researchers have new tools for studying regulatory regions and gene networks.

4. Comparison with other genomes can reveal shared regulatory elements, and the genetic context shared with other species can provide functional and regulatory information at the individual level.

5. The draft is also a starting point for studying how the genome is packed in three dimensions into the nucleus. This packing may affect gene regulation.

6. In applications, the information in the draft can be used to develop new technologies, such as DNA chips and protein chips, as a supplement to traditional methods. At present, such a chip can contain all members of a protein family, so that the members active in specific diseased tissues can be found.

On February 12, 2001, Celera Genomics of the United States and the Human Genome Project published detailed maps of the human genome and their preliminary analyses in Science and Nature respectively. The government-funded Human Genome Project adopted a map-based strategy, while Celera adopted a "shotgun" strategy. Two different organizations thus reached a common goal by different methods, completing the sequencing of the whole human genome, and their results are strikingly similar. The basic completion of human genome sequencing has opened a new era in the life sciences, with far-reaching significance for understanding the nature of life, human evolution, biological inheritance, individual differences, pathogenesis, disease prevention, new drug development, health and longevity, and biology as a whole.

Numerous discoveries

1. Analysis shows that the whole human genome is about 2.91 Gbp, with about 39,000 genes. The average gene size is 27 kbp. The G+C content is low, only 38%, and chromosome 2 has the highest G+C content. So far, 9% of the base pair sequence has not been determined. Chromosome 19 contains the most genes, while chromosome 13 contains the fewest, and so on. (See the cmbi special report "Significant Progress in Life Sciences" for details.)

2. At present, more than 26,000 functional genes have been found and located, 42% of which are of unknown function. Among the genes of known function, enzymes account for 10.28%, nucleases for 7.5%, signal transduction for 12.2%, transcription factors for 6.0%, signal molecules for 10.2%, and receptor molecules for 5%. Discovering and understanding the functions of these genes is of great significance for the study of gene function and for new drug screening.

3. The number of genes is surprisingly small. Some researchers had predicted about 140,000 human genes, but Celera put the total at 26,383 to 39,114, no more than 40,000 and only about twice the number in nematodes or fruit flies. Only about 300 human genes have no counterpart in the mouse. That so few genes can produce such complex functions indicates that genome size and gene number may have little significance in the evolution of life. It also shows that human genes are more "effective" than those of other organisms, and that some human genes differ from those of other organisms in their functions and in their ability to control protein production. This poses great challenges to many current concepts and offers extraordinary new opportunities for the development of biomedicine in the post-genome era. However, because of gene splicing, redundancy in EST databases and some technical and methodological errors, the estimated number of human genes may exceed 40,000 in the future.

4. The rate of human single nucleotide polymorphism is about 1 per 1,250 bp. Different populations differ at only about 1.4 million nucleotides, and 99.99% of the genetic code is identical from person to person. It has also been found that individuals of different races can be genetically more similar than individuals of the same race. Across the whole genome sequence, the variation between people is only about one in ten thousand, which shows that there is no essential difference between human races.

5. There are "hot spots" and "deserts" in the human genome. On the chromosomes there are regions where genes are clustered and densely distributed, and there are also large regions of "junk DNA" that contain no genes or very few. About a quarter of the genome consists of regions without genes. Of all the DNA, only 1%-1.5% encodes protein. More than 98% of the sequence in the human genome is so-called "junk DNA", and there are more than 3 million long repetitive sequences. These repetitive "junk" sequences are by no means useless. They must hold new functions and mysteries of human genes, as well as information about human evolution and human differences. According to classical molecular biology, one gene expresses only one protein, but the human body contains many complex proteins, suggesting that one gene can encode several proteins and that proteins matter more than genes alone.

6. The gene mutation rate in men is twice that in women, so most of the mutations behind human genetic diseases arise in males. Men may therefore play a more important role in human genetics.

7. About 200 genes in the human genome come from bacterial genes inserted into the genome of human ancestors. Such inserted genes are rare in invertebrates, which means that they entered our genome late in evolution. It may be that, before our immune defense system was established, bacteria living in the body exchanged genes with the human genome in the course of their coexistence.

8. About 1.4 million single nucleotide polymorphisms have been found and accurately located, and more than 30 pathogenic genes have been preliminarily identified. With further analysis, we can not only determine the pathogenic genes of the diseases that most seriously endanger human life and health, such as genetic diseases, tumors, cardiovascular diseases and diabetes, but also find individualized drugs and methods for prevention and treatment, while furthering our understanding of human evolution.

9. The complete set of proteins (the proteome) encoded by the human genome is more complex than the proteome encoded by invertebrates. Humans and other vertebrates have rearranged existing protein domains into new architectures. In other words, human evolution and human traits depend not only on producing brand-new proteins, but also on rearranging and expanding existing ones, thereby achieving variety and functional diversity in proteins. It is estimated that one gene encodes 2-10 proteins on average, to meet the complex functional needs of humans.

Model organisms: The genome projects of model organisms such as yeast, Escherichia coli, Drosophila melanogaster, Caenorhabditis elegans, mice, Arabidopsis thaliana, rice and corn have also been completed or are progressing smoothly.

At present, there have been several changes in genomics research: one is functional genomics research that links the sequence and function of known genes; Secondly, gene separation based on mapping turned to gene separation based on sequence; Third, from studying the etiology to exploring the pathogenesis; Fourth, from disease diagnosis to disease susceptibility research.

In the post-genome era, by comparing and analyzing all the species whose genomes have been sequenced, we hope to understand the functional significance of the genome and the proteome on a whole-genome scale, including genome expression and regulation, genome diversification and evolution, and the roles of genes and their products in the growth, development, differentiation, behavior, aging and treatment of organisms. To do so, we must develop new algorithms and make full use of the computing power of supercomputers.

On May 8, 2006, American and British scientists published the sequence of the last human chromosome, chromosome 1, in the online edition of Nature.

Among the 22 pairs of human autosomes, chromosome 1 contains the largest number of genes, 3,141, which is twice the average. With more than 223 million base pairs, it was also the most difficult to decipher. It took a team of 150 British and American scientists 10 years to complete the sequencing of chromosome 1.

Scientists have announced the completion of the human genome project more than once, but they have not published the full text. This time, the book of life is more accurate, covering 99.99% of the human genome. The "Book of Life" to interpret the human genetic code was declared complete, and the last chapter of the human genome project, which lasted for 16 years, was written.

2. Localization and cloning of disease genes

The direct motivation of the human genome project is to solve the molecular genetic problems of human diseases, including tumors. More than 6,000 kinds of monogenic genetic diseases, polygenic genetic diseases and related genes that endanger human health in a large area represent an important part of the structural and functional integrity of human genes. Therefore, the cloning of disease genes occupies a core position in HGP, and it is also the most striking part since the implementation of the plan.

Driven by genetic and physical mapping, research on the location, cloning and identification of disease genes has taken shape, and the traditional route from phenotype to protein to gene has given way to the new approach of "reverse genetics" or "positional cloning". With the formation of the human genome map, more than 3,000 human genes have been accurately located in specific regions of the chromosomes. In the future, once a disease locus is mapped, the relevant genes can be selected from the local gene map for analysis. This strategy, called "positional candidate cloning", will greatly improve the efficiency of finding disease genes.

3. Research on polygenic diseases.

At present, genomics research on human diseases has reached its most difficult area, polygenic diseases. Because polygenic diseases do not follow Mendel's laws of inheritance, it is hard to make a breakthrough through ordinary family-based genetic linkage analysis. Research in this field requires hard work on the selection of populations and genetic markers, the construction of mathematical models and the improvement of statistical methods. Recently, some scholars have proposed identifying genes that are activated or repressed in disease states by comparing gene expression profiles. In fact, the Cancer Genome Anatomy Project (CGAP) represents an attempt in this direction.

Prospects

1. Formation of the life science industry

Because genome research is closely related to pharmaceutical, biotechnology, agriculture, food, chemistry, cosmetics, environment, energy and computer industries, and more importantly, genome research can be transformed into huge productivity, a group of large international pharmaceutical companies and chemical companies have invested heavily in genome research on a large scale, forming a new industrial sector, namely life science industry.

2. Functional genomics

What is the overall development trend of the human genome project at present? On the one hand, structural genomics is moving towards the goal of completing the complete nucleic acid sequence map of chromosomes after successfully making genetic maps and physical maps. On the other hand, functional genomics has been put on the agenda. The human genome project has begun to enter the transition and transformation process from structural genomics to functional genomics. In the research of functional genomics, the possible core issues are: genome expression and its regulation, genome diversity, genome research of model organisms and so on.

2) Proteomics research

Proteomics studies the levels and modifications of proteins at the level of the whole system. At present, a standardized and automated two-dimensional protein gel electrophoresis system is being developed. Proteins are first extracted from human cells by an automated system, the proteins in each fraction are then partially separated by chromatography and analyzed by mass spectrometry, and the resulting polypeptides are identified by matching their characteristics against a protein database.

Another important part of proteomics research is building a catalogue of protein-protein interactions. Interactions between biological macromolecules form the basis of life activities. Detailed mapping of the interactions among the components encoded by a genome has been achieved for T7 phage (55 genes). How to establish automated methods for model organisms (such as yeast) and for the human genome, and how to use them to understand different biochemical pathways, is a question worth exploring.

3) Application of Bioinformatics

At present, bioinformatics is widely used in gene discovery and prediction. It is even more important, however, to use bioinformatics to discover the functions of the protein products of genes. More and more protein-coding units have been identified in model organisms, which provides extremely valuable information for finding homology between genes and proteins and for classifying them into families. At the same time, the algorithms and programs of bioinformatics are constantly improving, making it possible to find homology not only from primary structure but also from predicted structure. However, theoretical results obtained by computer simulation still need to be verified and corrected by experiment.

⑵ Genome diversity research

Human beings are a polymorphic group. The differences in biological characteristics, susceptibility and resistance of different populations and individuals reflect the results of the interaction between genome and internal and external environment during evolution. Systematic research on human genome diversity will have a great impact on understanding the origin and evolution of human beings and biomedicine.

1) Resequencing of human DNA

It can be predicted that after the completion of the first human genome sequencing, there will inevitably be an upsurge of resequencing and fine genotyping of various races and groups. Combining these materials with anthropological and linguistic materials, it will be possible to establish a database resource for all mankind, so as to better understand human history and its own characteristics. In addition, the study of genome diversity will become one of the main contents of disease genomics, and population genetics will increasingly become the mainstream tool of biomedical research. It is necessary to resequence genes related to common multifactorial diseases (such as hypertension, diabetes and schizophrenia) and cancer-related genes on a large scale at the genome level to determine their variation sequences.

In short, the genome projects of model organisms provide a great deal of information for the study of the human genome. The future direction of model organism research is to assign most of the coding genes in the human genome to multi-component core mechanisms with known biochemical functions. Knowledge of the enzymes, of the evolutionarily conserved core mechanisms in humans, and of the various ways in which their disruption leads to disease can only come from the study of human beings themselves.

Through the study of functional genomics, human beings will eventually be able to understand which evolutionary mechanisms have actually operated and to consider what new potential the evolutionary process may hold. A new way of tackling developmental problems may be to combine protein functional domains and regulatory sequences to build new gene networks and morphogenetic pathways. In other words, the biological science of the future will not only understand how organisms form and evolve, but may also have the potential to build new organisms. This project has set a new milestone in the history of science. It is a feat that changes the world and affects human life, and its significance will become more and more apparent with time.
