Research related to the Human Genome Project

1. The formation of the life science industry

Genomic research is closely tied to industrial sectors such as pharmaceuticals, biotechnology, agriculture, food, chemicals, cosmetics, the environment, energy and computing. More importantly, genomic research can be converted into enormous productive capacity. A number of large international pharmaceutical and chemical companies have invested heavily in genome research, forming a new industrial sector: the life science industry.

Some of the world's largest pharmaceutical groups have invested in establishing genomic research institutes. Ciba-Geigy and Sandoz merged to form Novartis, which spent $250 million to establish an institute for genomic research. SmithKline spent $125 million to accelerate sequencing and based 25% of its drug development program on genomics. Glaxo Wellcome invested $47 million in genomic research and doubled its research staff.

Large chemical companies are moving into the life science industry. Monsanto began shifting toward the life sciences as early as 1985, and by 1997 its investment in biotechnology and genomic research had reached $6.6 billion. In April 1998, DuPont announced that it would reorganize into three business units, led by life sciences. In May 1998, it announced that it would divest its energy subsidiary Conoco and transform itself into a life sciences company. Dow Chemical spent US$900 million to buy out Eli Lilly's 40% stake in their joint venture DowElanco, which worked on cereal and food research, and later established a life sciences company. Hoechst sold its basic chemicals division in order to invest in biotechnology and pharmaceuticals.

The traditional agriculture and food sectors are also converging with biotechnology and pharmaceuticals. Genetically engineered goats developed by Genzyme Transgenics can produce antithrombin III at high yield: the protein output of a single herd is equivalent to that of a US$115 million factory. It is estimated that producing drugs in genetically modified animals costs one-tenth as much as large-scale cell culture. Some companies are also researching cereals that can help combat osteoporosis, as well as the large-scale production and processing of genetically engineered foods.

The energy, mining and environmental industries have also converged on genomic research at the molecular level. For example, the methanogen Methanobacterium is being studied as a new energy source. The radiation-resistant bacterium Deinococcus radiodurans is used to clean up radioactive contamination and, once the tod gene has been transferred into it, can also remove a variety of harmful chemicals in high-radiation environments.

2. Functional Genomics

What is the current overall direction of the Human Genome Project? On the one hand, having successfully produced genetic and physical maps, structural genomics is moving toward completing full nucleotide sequence maps of the chromosomes. On the other hand, functional genomics has been put on the agenda: the project has begun the transition from structural genomics to functional genomics. Likely core issues in functional genomics research include genome expression and its regulation, genome diversity, and the genomes of model organisms.

⑴ Genome expression and its regulation

1) Research on gene transcription expression profile and its regulation

A cell's gene transcription profile accurately and specifically reflects its type, developmental stage and response state, and studying it is one of the main topics of functional genomics. To evaluate the expression of all genes comprehensively, a new system of tools is needed. Its quantitative sensitivity should reach below 1 copy per cell, its qualitative sensitivity should be sufficient to distinguish splicing patterns, and it must be able to analyze single cells. DNA microarray technologies developed in recent years, such as DNA chips, have made this goal attainable.

The purpose of studying gene transcription is not merely to obtain genome-wide expression data for mathematical cluster analysis. The key issue is to dissect the mechanisms of the gene expression networks that control entire developmental processes or response pathways.
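The cluster analysis mentioned above can be illustrated with a minimal sketch. It assumes expression profiles are simple lists of measurements per gene and groups genes whose profiles have a Pearson correlation above a chosen threshold (single-linkage grouping). The gene names, values, threshold and function names are hypothetical and chosen only for illustration; real analyses use more robust methods and normalized data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sx * sy)

def cluster_by_correlation(profiles, threshold=0.9):
    """Single-linkage grouping: join genes whose correlation >= threshold."""
    genes = list(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path compression.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical expression levels measured at four time points.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
clusters = cluster_by_correlation(profiles)
print(clusters)
```

Here geneA and geneB rise together while geneC and geneD fall together, so the sketch places them in two groups. Grouping of this kind only describes co-expression; it does not by itself reveal the regulatory network behind it.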

The network concept is very important for understanding the regulation of gene expression under physiological and pathological conditions. On the one hand, the products of most genes in a cell interact with the products of other genes. On the other hand, during development most gene products are expressed and act at multiple times and places, giving rise to pleiotropy of gene expression. In a sense, the expression pattern of a gene only has real meaning when placed in the context of the regulatory network to which it belongs. Research in this area requires high-throughput in situ hybridization technology for mouse embryos.

2) Proteomics research

Proteomics studies the levels and modification states of proteins as a whole. A standardized, automated working system for two-dimensional protein gel electrophoresis is currently being developed. An automated system first extracts proteins from human cells; the proteins are then partially separated by chromatography, the proteins in each fraction are cleaved, the resulting peptides are analyzed by mass spectrometry, and the peptides are identified against a protein database.

Another important aspect of proteome research is building a catalog of protein interactions. Interactions between biological macromolecules form the basis of life processes. A detailed interaction map of the proteins encoded by a genome has already been produced for phage T7 (55 genes). How to establish automated methods for doing the same in model organisms (such as yeast) and in humans, and how to use them to understand different biochemical pathways, is a question worth exploring.

3) Application of bioinformatics

Bioinformatics is already widely used to discover and predict genes. Using it to discover the function of a gene's protein product is more important still. More and more coding units for protein building blocks have been identified in model organisms, which provides extremely valuable information for finding homology relationships between genes and proteins and for classifying them into families. At the same time, bioinformatics algorithms and programs keep improving, making it possible to detect homology not only from primary structure but also from predicted structure. However, theoretical results obtained by computer simulation still need to be verified and corrected by experiment.
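As a rough illustration of detecting similarity from primary structure, the sketch below computes a global alignment score between two sequences using the Needleman-Wunsch dynamic programming recurrence. The scoring values (match +1, mismatch -1, gap -2), the function name and the example sequences are assumptions made for this example; practical homology searches use substitution matrices and statistical significance estimates rather than a raw score.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score, keeping one DP row at a time."""
    # Row 0: aligning an empty prefix of `a` against prefixes of `b` costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` aligned against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = curr[j - 1] + gap  # gap inserted in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]

identical = global_alignment_score("GATTACA", "GATTACA")
one_change = global_alignment_score("GATTACA", "GATTTCA")
print(identical, one_change)
```

A higher score suggests greater sequence similarity, which is only a hint of homology and, as the text notes, still needs experimental confirmation.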

⑵ Research on genome diversity

Humans are a polymorphic population. Differences in biological traits and in susceptibility and resistance to disease among groups and individuals reflect the interaction between the genome and the internal and external environment over the course of evolution. Systematic research on human genome diversity will have a significant impact on our understanding of human origins and evolution, and on biomedicine.

1) Re-sequencing of human DNA

It can be predicted that once the first human genome sequence is complete, a wave of re-sequencing and fine genotyping across ethnic groups and populations will inevitably follow. Combining these data with anthropological and linguistic data will make it possible to build a database resource for all of humankind, and so to better understand human history and human characteristics. In addition, the study of genome diversity will become one of the main components of disease genomics, and population genetics will increasingly become a mainstream tool in biomedical research. Genes related to common multifactorial diseases (such as hypertension, diabetes and schizophrenia) and to cancer will need to be re-sequenced on a large scale at the genome level to identify their variant sequences.

2) Sequencing of other organisms

Systematic comparative DNA sequencing of organisms at various stages of evolution will reveal the 3.5 billion-year evolutionary history of life. Such studies would not only outline a detailed phylogenetic tree, but also show the timing and characteristics of the most important changes in evolution, such as the emergence of new genes and whole-genome duplications.

Understanding the conservation of gene sequences in different organisms will enable us to effectively understand the factors that constrain the functionality of genes and their products. The study of sequence differences helps to understand the basis for the diversity of nature. Establishing correlations between sequence variations and spatiotemporal differences in gene expression among different organisms will help reveal the network structure of genes.

⑶ Research on model organisms

1) Comparative genomic research

Model organisms play an extremely important role in the study of the human genome. Although their genomes are relatively simple in structure, their core cellular processes and biochemical pathways are largely conserved. Research on them is significant in three ways: 1) it helps to develop and test new technologies, such as large-scale sequencing, large-scale expression profiling and large-scale functional screening; 2) through comparison and identification, it reveals how genomes have evolved, thereby accelerating our understanding of the structure and function of the human genome; 3) comparative studies among model organisms provide important clues for elucidating the mechanisms of gene expression.

The current knowledge about the overall structural composition of the genome mainly comes from the genome sequence analysis of model organisms. Through computer analysis of gene regulatory sequences among different species, a certain proportion of conserved core regulatory sequences have been discovered. The expression pattern database established based on these sequences provides the necessary conditions for deciphering gene regulatory networks.

2) Research on loss-of-function mutations

The most effective way to identify a gene's function may be to observe the phenotypic changes in cells and in the whole organism after its expression is blocked. Gene knock-out is a particularly useful tool for this. At present, large-scale functional genomics research on yeast, nematodes and fruit flies is under way internationally, with yeast making the fastest progress. The European Community has established a research network called EUROFAN (European Functional Analysis Network) specifically for this purpose. The United States, Canada and Japan have launched similar programs.

With the genome sequencing of C. elegans and Drosophila complete, similar studies on these two organisms will become possible. The mutant strains and technical systems being established will not only be an effective means of studying the function of single genes, but will also lay the foundation for studying deeper questions such as gene redundancy and gene-gene interactions. As the representative model organism among mammals, the mouse plays a special role in functional genomics research. Homologous recombination can disrupt any gene in the mouse; the disadvantage of this method is its high cost. Random mutagenesis producing point, deletion and insertion mutations is another possible approach. For human cells, it may be more appropriate to establish systems in which antisense oligonucleotides and ribozymes transiently block gene expression. Knockouts at the protein level are perhaps the most powerful means of elucidating gene function, and combinatorial chemistry holds the promise of producing chemical knockout reagents that activate or inactivate individual proteins.

In short, the genome projects of model organisms have provided a large amount of information for the study of the human genome. The future direction of model organism research is to assign most of the estimated 80,000 to 100,000 coding genes in the human genome to multi-component core mechanisms with known biochemical functions. Knowledge of the fine-tuned pathways by which these evolutionarily conserved core mechanisms operate in humans, and of the various ways in which their disruption leads to disease, can only come from studies of humans themselves.

Through functional genomics, humans will eventually be able to understand which evolutionary mechanisms have actually operated and to consider what new potential the evolutionary process may hold. A new way to answer developmental questions may be to recombine protein functional domains and regulatory sequences to build new gene networks and morphogenesis pathways. In other words, future biological science will not only be able to understand how organisms are built and how they evolve but, more intriguingly, may be able to construct new organisms. The project has set a new milestone in the history of human science. It is a feat that changes the world and affects human life, and its significance will become more apparent with time.

Celera's Human Genome Project

In 1998, eight years after the launch of the International Human Genome Project (hereinafter the "International Project"), the American scientist Craig Venter founded a small private company, Celera Genomics, to carry out its own human genome project.

The company hoped to finish faster and with less investment than the International Project: about $300 million, only one-tenth of the cost of the international effort. The launch of Celera's project is regarded as having been good for the Human Genome Project, because competition from Celera pushed the International Project to improve its strategy and speed up its work, allowing the Human Genome Project to be completed ahead of schedule.

Celera used whole-genome shotgun sequencing, a faster but riskier technique. The idea of shotgun sequencing is to break the genome into millions of DNA fragments, sequence them, and then use an algorithm to reassemble the fragment sequences into the sequence of the whole genome. To make the method more efficient, sequencing and the assembly of fragment information were automated in the 1980s. Although the method had been used to sequence bacterial genomes of up to 6 million base pairs, it was not yet clear whether it could successfully sequence the roughly 3 billion base pairs of the human genome.
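The reassembly step can be sketched with a toy greedy overlap assembler: repeatedly merge the two fragments with the longest suffix-prefix overlap until no overlaps remain. This is an illustrative simplification, not the algorithm Celera used; the sequence, fragment boundaries, minimum overlap and function names are all hypothetical, and the sketch ignores sequencing errors and repeated regions, which are the main difficulties at real genome scale.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise by largest overlap until none overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments wholly contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # remaining fragments cannot be joined
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags

# Hypothetical 16-base "genome" cut into three overlapping, shuffled fragments.
genome = "ATGCGTACCTGAAGTC"
fragments = [genome[8:16], genome[0:8], genome[4:12]]
contigs = greedy_assemble(fragments)
print(contigs)
```

In this small case the three fragments overlap by four bases each and are merged back into a single contig equal to the original sequence. With repeats longer than the overlaps, a greedy strategy like this can join the wrong fragments, which is one reason the approach was considered risky for the human genome.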

Gene intellectual property rights dispute

Celera Genomics initially claimed that it would seek patent protection for only 200 to 300 genes, but later revised this to seeking intellectual property protection for "fully characterized important structures" amounting to 100 to 300 targets. In 1999, Celera filed preliminary patent applications on 6,500 whole or partial human genes; critics argued that this would hinder genetic research. In addition, when Celera was founded it agreed to share data with the International Project, but the agreement soon broke down when Celera refused to deposit its sequencing data in a freely accessible public database. Celera promised to publish updates on its progress quarterly (the International Project did so daily) in accordance with the 1996 Bermuda agreement, but, unlike the International Project, it did not allow others to redistribute or use its data freely.

In 2000, US President Clinton announced that human genome sequence data could not be patented and must be made available to all researchers. Celera was compelled to make its data public. The announcement also sent Celera's stock price plummeting and dealt a heavy blow to the biotechnology-heavy Nasdaq: within two days, the biotechnology sector lost approximately US$50 billion in market value.

Post-human genome project

The post-human genome project refers to the fields of research that follow the completion of the Human Genome Project (structural genomics); in effect, it is the set of further plans to be pursued once the sequence is finished. In substance it consists of bioinformatics and functional genomics. Its core issues are genome diversity, the causes of genetic diseases, the coordinated regulation of gene expression, and the functions of protein products.

The purpose of human genome research is not only to read out the entire DNA sequence but, more importantly, to understand the function of each gene and its relationship to particular diseases, so as to decode life in a truly systematic and scientific way. This would make it possible to understand, at a fundamental level, the origin of life, the reasons for differences between species and between individuals, the mechanisms of disease, and the most basic life phenomena that have long puzzled humankind, such as longevity and aging.