Study on the expression and purification system of plant oil body proteins

Research progress on the plant oil body expression system

And Jia's * *

Institute of Biotechnology, Chinese Academy of Agricultural Sciences, Beijing 100081

This paper describes the structural characteristics of oleosin in plant seeds and the regulation of the genes that encode it, and reviews the research progress and prospects of producing target proteins with the plant oil body expression system, a new type of plant bioreactor.

Keywords: oil body; oil body protein; expression system; plant bioreactor; target protein

Existing foreign gene expression systems mainly include bacteria, filamentous fungi, yeast, mammalian cells, animal mammary glands, insects (insect cells, insect baculoviruses and whole insects) and plant expression systems (whole plants, viral vectors and oil bodies). Each of these systems has its own advantages and disadvantages in terms of expression level, separation and purification of the expressed product, product activity and cost. Judging from research progress in recent years, large-scale production of target proteins in plant expression systems is still limited by many factors, of which low expression levels and the high cost of extraction and purification are the most important. In the plant oil body expression system, the coding sequence of the target protein is fused to the 3' end of the oleosin coding sequence, and the oleosin promoter drives specific expression of the oleosin-target protein fusion in the oil bodies of transgenic plants. After transgenic seeds are harvested, they are crushed, the oil phase and the aqueous phase are separated by centrifugation, and the upper oil phase is recovered. This removes most of the non-target components of the seed and thus significantly reduces the cost of separating and purifying the target protein. As a new type of plant bioreactor, the system offers a new route to producing foreign proteins in transgenic plants.

1 Oil body

Nutrients stored in plant seeds mainly include proteins, lipids and carbohydrates. Lipids are generally stored as triacylglycerols (TAG); jojoba, which stores wax esters, is an exception. The TAG molecules in seeds do not coalesce into a single mass but are dispersed as many small, stable subcellular droplets called oil bodies. As the smallest organelle in the organism, the oil body has its own structure and characteristics.

1.1 Oil body size

Oil bodies are spherical, with a diameter of 0.5-2.5 μm. Their size varies among plant species and is influenced by nutrition and environment; even within the same seed, oil bodies in different tissues differ in size. From a biological point of view, oil body size is determined mainly by two requirements: (1) providing the largest possible surface for lipase-catalyzed hydrolysis of TAG when the seed germinates; and (2) consuming as little oil body protein and phospholipid (PL) as possible. If the diameter were less than 0.2 μm, the oil body would offer a larger surface for lipase action but would require large amounts of PL and oleosin. Conversely, if the diameter were greater than 2.5 μm, PL and oleosin would be saved, but the small surface area would prevent lipase from hydrolyzing the lipids fast enough to supply the energy needed during seed germination and seedling growth.
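The size trade-off can be made concrete with a little geometry. The sketch below is an illustration only: it assumes an ideal sphere and computes the surface-area-to-volume ratio, which equals 6 divided by the diameter. The function name is hypothetical, and the diameters are the values quoted in the text.

```python
import math


def surface_to_volume(diameter_um: float) -> float:
    """Surface-area-to-volume ratio (per micrometre) of an ideal sphere."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    radius = diameter_um / 2.0
    area = 4.0 * math.pi * radius ** 2             # surface that lipase can reach
    volume = (4.0 / 3.0) * math.pi * radius ** 3   # TAG stored inside
    return area / volume                           # simplifies to 6 / diameter


if __name__ == "__main__":
    # Diameters quoted in the text: lower bound discussed, and the observed range.
    for d in (0.2, 0.5, 2.5):
        print(f"diameter {d} um: surface/volume = {surface_to_volume(d):.1f} per um")
```

Under this idealization, a 0.2 μm droplet exposes 12.5 times more surface per unit of stored TAG than a 2.5 μm droplet, and the same factor applies to the PL and oleosin needed to coat that surface.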

1.2 Oil body composition

Oil bodies consist of: (1) 92%-98% neutral lipids, mainly TAG (about 95%), with small amounts of diacylglycerol (DAG) and free fatty acids; (2) 1%-4% phospholipids (PL), mainly phosphatidylcholine (about 60%-70%), with small amounts of phosphatidylserine, phosphatidylethanolamine and phosphatidylinositol; (3) 1%-4% oil body proteins, of which 90% is oleosin and a small proportion is caleosin and cytochrome c reductase. Lipase and acyl lipase are still present in the oil body membrane of mature seeds of some plants (such as castor bean and soybean). No oleosin has been detected in the oil bodies of pollen grains, nor in the oil bodies of the fruit mesocarp of olive and avocado. Because the lipids in these oil bodies are not used for long-term storage, Murphy and Vance suggested that oleosins may be unique to the oil bodies of storage organs; however, Næsted et al. found oleosin in oil bodies of the root tip.

1.3 Basic structure of oil body

According to the oil body structure model proposed by Tzen et al., the interior of the oil body is a liquid TAG matrix, and the exterior is a half-unit membrane consisting of a monolayer of phospholipid molecules with oleosin embedded in it. The basic unit of this half-unit membrane consists of 13 PL molecules and 1 oleosin molecule (Figure 1). PL covers 80% of the oil body surface and oleosin the remaining 20%. The two hydrophobic acyl chains of each PL molecule face the hydrophobic TAG matrix inside and interact with the TAG molecules, while the hydrophilic head group of PL faces the cytoplasm. The central hydrophobic region of the oleosin molecule forms a stalk about 11 nm long, which accounts for 2/5 of the molecule and extends into the hydrophobic acyl layer of PL and the TAG inside the oil body. This part is a hairpin structure of 68-74 amino acids, with a "proline knot" of three prolines and one serine at its tip (Figure 1). The remaining 3/5 of the oleosin molecule covers the surface of the oil body and prevents external phospholipases from acting on the phospholipid half-unit membrane. Isoelectric focusing shows that the isoelectric point of oil bodies is 5.7-6.6; that is, at neutral pH the oil body surface is negatively charged. Even after long-term storage of seeds, oil bodies remain stable and do not coalesce. It is generally believed that the surface charge of the oil body and the presence of oleosin are the main factors maintaining its structural stability. Recent results show that, besides oleosin, the oil body surface also carries small amounts of other proteins such as caleosin, so the structure of the oil body, the smallest organelle in plants, may be more complex than the model above suggests. Oleosin and caleosin are the two oil body proteins that have been studied so far.

2 Oil body proteins

2.1 Oleosin and its structural characteristics

Oleosin was first identified in mustard. The gene and amino acid sequences of oleosins from many plants (such as sesame, rape, sunflower, carrot, maize, soybean, Arabidopsis and cotton) have since been reported. Oleosin is a highly hydrophobic basic protein with a molecular weight of 15-26 kDa that is expressed mainly in seeds. It was generally believed to be unique to oil bodies, but recently about 5% of oleosin was found in the endoplasmic reticulum near the oil bodies, and oleosin was also found in root tip oil bodies. Oleosin is synthesized by ribosomes bound to the endoplasmic reticulum. Oleosin embedded in the oil body surface is essential for oil body stability: it sterically hinders the coalescence of oil bodies, and it is also thought to be the binding site for lipase on the oil body during seed germination. Antibodies raised against the most abundant oleosin of a given plant also recognize oleosins of similar molecular weight from other plants of the same family, for example the 19-20 kDa oleosins of Cruciferae, the 20 kDa oleosins of Compositae and the 24 kDa oleosins of Leguminosae. Such cross-reactivity can also occur between oleosins from different families.

Oleosins from different plants share the same structural features and all have three basic domains. (1) An amphipathic (both hydrophilic and lipophilic) region of 40-60 amino acids at the N-terminus. This region lies on the cytoplasm-facing side of the oil body. (2) A highly hydrophobic central region of 68-74 amino acids. From the distribution of polar amino acid residues in this region, Tzen and Huang proposed that it forms an antiparallel β-sheet that extends into the TAG matrix, with a "proline knot" of three prolines and one serine at its tip. This region, and the "proline knot" in particular, is highly conserved among oleosins from different sources, so it may be of evolutionary importance to plants. (3) An α-helical domain formed by the C-terminal 40 amino acids. This domain is amphipathic: its positively charged residues face the negatively charged components of the PL layer (phosphatidylserine, phosphatidylinositol, free fatty acids, etc.), and its negatively charged residues face the outer surface of the oil body (Figure 2). For each oleosin molecule, about 20% of the amino acid residues are embedded in the PL layer, 30% are immersed in TAG, and the remaining 50% are exposed on the oil body surface. Biochemical analysis of oleosin secondary structure supports the model of Tzen and Huang. Lacey et al. proposed a different secondary structure model: the central hydrophobic region consists of two separate α-helices connected by the "proline knot", which forms a 180° turn; the N-terminus is a β-sheet; and the C-terminus is an amphipathic α-helix. Both models need further experimental verification.
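The three-domain layout described above, with amphipathic ends around a strongly hydrophobic centre, is the kind of pattern a sliding-window hydropathy profile brings out. The following is a minimal sketch using the Kyte-Doolittle scale. The function names and the toy sequence are assumptions made for illustration; the sequence is not a real oleosin, and the window length of 9 is an arbitrary choice.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean Kyte-Doolittle score of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophobic_window(seq, window=9):
    """Return (0-based start, mean score) of the most hydrophobic window."""
    profile = hydropathy_profile(seq, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


# Hypothetical toy sequence: polar flanks (10 residues each) around a
# 20-residue hydrophobic core. It only mimics the overall domain layout.
TOY = "MSDEQKRSTE" + "LAVLIVGAILPSPLVIAGLV" + "KEQRDSTKNE"

if __name__ == "__main__":
    start, score = most_hydrophobic_window(TOY)
    print(f"most hydrophobic window starts at residue {start + 1}, mean score {score:.2f}")
```

For a real oleosin one would expect a broad positive plateau covering the central 68-74 residues, flanked by lower-scoring N-terminal and C-terminal regions; the profile alone cannot distinguish between the β-sheet and α-helix models.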

2.2 Oleosin genes and the regulation of their expression

Gymnosperms have been found to contain only one oleosin, whereas oleosin genes in angiosperms usually form gene families, and a single plant often contains several oleosin isoforms (Table 1). The isoforms differ in expression level, site and timing. For example, in maize the 18 kDa oleosin is expressed at only 18%-20% of the level of the 16 kDa oleosin, and in rapeseed oil bodies a second isoform is expressed at only 10% of the level of the 20 kDa oleosin.

Characteristics of the regulation of oleosin gene expression:

(1) Oleosin genes are mainly developmentally regulated and are expressed during seed maturation. They are also induced by water stress, jasmonic acid, ABA and osmotic agents (such as sorbitol). For example, after ABA induction of the 20 kDa oleosin gene in rape, accumulation of mRNA and protein was detected within 0-4 h. After treatment with an osmotic agent (such as sorbitol), mRNA was detected at 1 h and accumulation of oleosin at 3-6 h. The promoter regions of the oleosin genes of rape and Arabidopsis thaliana contain an ABRE (ABA-responsive element) motif, (T/C)ACGTGGC, which responds specifically to ABA.

(2) Oleosin gene expression is tissue-specific, occurring mainly in the embryo (scutellum and embryonic axis) and the aleurone layer of seeds. De Oliveira et al. and Robert et al. reported a special type of oleosin in pollen. The 20 kDa oleosin can be detected in both globular and heart-shaped embryos derived from microspore culture of rape, and the corresponding mRNA can also be detected in heart-shaped embryos, indicating that oleosin is expressed early in seed development.

(3) The 5' upstream regions of oleosin genes contain other regulatory sequences. For example, CATGCANG, a regulatory element of cereal storage protein genes, is present upstream of the rice oleosin gene. The AATGCATG sequence in the rape oleosin promoter closely resembles the conserved RY motif (CATGCATG), which controls seed-specific expression of legume genes. The Arabidopsis oleosin promoter contains a CACA (TAACACA) sequence, which is common in legume seed protein genes.

(4) Although oleosin is specifically localized to oil bodies, its gene has no signal peptide sequence at the 5' end and the protein correspondingly has no signal sequence at its N-terminus. It is speculated that some internal sequence of oleosin, or a conformation that oleosin adopts, directs it to the oil body surface. For example, deletion of the central part of oleosin severely impairs its localization to the oil body, whereas deletion of the N-terminus or C-terminus has little or no effect. Mutation of the three prolines of the "proline knot" to leucine showed that the "proline knot" is necessary for localization of oleosin to the oil body. The central hydrophobic region may also be the signal that targets oleosin to the endoplasmic reticulum, but mutation of the "proline knot" does not affect targeting of oleosin to the endoplasmic reticulum in vitro.
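The promoter motifs quoted in point (1) and point (3) are short consensus sequences, so a first screen of a candidate promoter can be done with regular expressions. The sketch below is a rough illustration, not a validated promoter analysis: the function name and the promoter fragment are made up, only the given strand is scanned, and only the ABRE and RY motifs from the text are included.

```python
import re

# Motifs quoted in the text, written as regular expressions.
MOTIFS = {
    "ABRE": r"[TC]ACGTGGC",  # (T/C)ACGTGGC, ABA-responsive element
    "RY": r"CATGCATG",       # RY motif linked to seed-specific expression
}


def find_motifs(promoter):
    """Return {motif name: [0-based start positions]} on the given strand."""
    seq = promoter.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        hits[name] = [m.start() for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Made-up promoter fragment for illustration only (motifs in upper case).
TOY_PROMOTER = "ggatTACGTGGCaattcCATGCATGttagttCACGTGGCaa"

if __name__ == "__main__":
    for name, positions in find_motifs(TOY_PROMOTER).items():
        print(name, positions)
```

A fuller screen would also scan the reverse complement and allow for near matches such as the AATGCATG variant mentioned for the rape oleosin promoter.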

2.3 Caleosin

When oil body proteins from various plants were separated by different methods, small amounts of other proteins were found in addition to oleosin. In 1998, Chen et al. first identified three other seed oil body proteins, Sop1, Sop2 and Sop3, in sesame oil bodies by immunolabeling. Amino acid sequencing showed that Sop1 is homologous to a calcium-binding protein in rice, and it was therefore named caleosin. Caleosin is widespread in higher plants, and similar proteins occur in algae and fungi. Caleosins from different sources differ in their expression patterns. For example, rice caleosin is expressed mainly in late embryogenesis, and can also be induced by ABA or water stress in seedlings and vegetative tissues. In contrast to rice caleosin, sesame caleosin appears to be expressed only in seeds. In Arabidopsis thaliana, mRNA of a caleosin homolog can be detected after induction by ABA under drought conditions. Caleosins of higher plants are known to have three domains: (1) an N-terminal hydrophilic region containing an EF-hand that binds Ca2+. An EF-hand fusion protein expressed in E. coli binds Ca2+ in vitro, and caleosin isolated from sesame oil bodies also binds Ca2+. (2) A central hydrophobic region, including a membrane localization region on its N-terminal side and an adjacent proline-rich region. This structure is known only from some higher plants such as sesame, rice and Arabidopsis. (3) A C-terminal hydrophilic region, which in most plants contains four kinase phosphorylation sites. The structure and biological function of caleosins from different sources are still unclear; caleosin is speculated to be involved in lipid biosynthesis, intracellular transport and lipid metabolism.

3 Plant oil body expression system

3.1 Construction of the oleosin-target protein expression vector

As mentioned above, all oleosins have three domains, which together account for about 15 kDa, but the molecular weight of oleosins varies widely (15-26 kDa); the additional 1-11 kDa is present as C-terminal or N-terminal extensions. Apart from the highly conserved central hydrophobic region, the nucleotide sequences of oleosins from different sources differ considerably at the N-terminal and C-terminal ends. This suggested that inserting the coding sequence of a small foreign protein at the 5' or 3' end of the oleosin gene, constructing an "oleosin-target protein" plant expression vector driven by the oleosin promoter, and transforming recipient plants would not affect localization of oleosin to the oil body. Because oleosin is expressed specifically in seeds and is embedded in the oil body surface, the target protein is expressed in transgenic plants as a fusion protein that accumulates with oleosin specifically on oil bodies.
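A translational fusion of this kind only works if the carrier, any linker and the target coding sequence stay in one reading frame. The sketch below shows a minimal in-frame check for a 3' fusion. All names and all sequences are hypothetical placeholders, not real oleosin or target sequences, and the linker simply stands in for whatever spacer or protease cleavage site a real construct would carry.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive triplets (last one may be partial)."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def check_orf(seq):
    """Raise ValueError unless seq is one uninterrupted open reading frame."""
    if len(seq) % 3 != 0:
        raise ValueError("fusion is not a multiple of 3: parts are out of frame")
    triplets = codons(seq)
    if triplets[0] != "ATG":
        raise ValueError("fusion does not start with ATG")
    if triplets[-1] not in STOP_CODONS:
        raise ValueError("fusion does not end with a stop codon")
    if any(t in STOP_CODONS for t in triplets[1:-1]):
        raise ValueError("premature stop codon inside the fusion")


def build_fusion(carrier_cds, linker, target_cds):
    """Join carrier + linker + target (target at the 3' end) and verify the frame."""
    carrier, linker, target = (s.upper() for s in (carrier_cds, linker, target_cds))
    # Remove the carrier's own stop codon so translation continues into the target.
    if carrier[-3:] in STOP_CODONS:
        carrier = carrier[:-3]
    fusion = carrier + linker + target
    check_orf(fusion)
    return fusion


if __name__ == "__main__":
    # Placeholder sequences, far shorter than any real gene.
    toy_carrier = "ATGGCTGATCGTTAA"   # stands in for the oleosin coding sequence
    toy_linker = "GGTTCTGGT"          # stands in for a spacer or cleavage site
    toy_target = "GCTAAAGAATGA"       # stands in for the target coding sequence
    print(build_fusion(toy_carrier, toy_linker, toy_target))
```

A linker whose length is not a multiple of three, or a target that brings its own internal stop codon, makes `build_fusion` raise a `ValueError` instead of returning a sequence.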

3.2 Advantages of the plant oil body expression system

3.2.1 The fusion protein is easy to separate. Oleosin is a seed-specific protein embedded in the oil body surface, and fusing a foreign protein to its N-terminus or C-terminus does not change these properties. Therefore, by crushing the transgenic seeds, extracting with liquid, centrifuging and recovering the upper oil phase, the fusion protein can be separated from the other cell components, and more than 90% of the seed protein can be removed. When the target protein in the fusion is an enzyme, the oleosin fusion protein can be used directly as the enzyme, recovered after the reaction and reused in the next reaction (as an immobilized enzyme). In general it still retains strong enzyme activity after being reused 2-3 times. If the fusion protein is inactive, the target protein must be cleaved from oleosin. A protease cleavage site therefore has to be introduced between the oleosin and target protein coding sequences; hemolysin is commonly used for this purpose. After the fusion protein has been digested, the two products must then be separated.
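The enrichment gained from the flotation step can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope illustration only: the function name is hypothetical, the 90% removal figure comes from the text, and the example fusion share of 1% of total seed protein and the assumption of complete recovery are illustrative inputs.

```python
def purity_after_flotation(target_fraction, removed_fraction, target_recovery=1.0):
    """Share of the fusion protein among proteins left in the oil body fraction.

    target_fraction  : fusion protein as a fraction of total seed protein
    removed_fraction : fraction of total seed protein discarded with the
                       aqueous phase and pellet
    target_recovery  : fraction of the fusion protein that stays with the
                       oil bodies (assumed 1.0 by default)
    """
    if not 0.0 <= removed_fraction < 1.0:
        raise ValueError("removed_fraction must be in [0, 1)")
    if not 0.0 <= target_recovery <= 1.0:
        raise ValueError("target_recovery must be in [0, 1]")
    remaining = 1.0 - removed_fraction
    kept = target_fraction * target_recovery
    if kept > remaining:
        raise ValueError("inputs are inconsistent: more target kept than protein left")
    return kept / remaining


if __name__ == "__main__":
    # Assumed example: fusion at 1% of seed protein, 90% of seed protein removed.
    purity = purity_after_flotation(0.01, 0.90)
    print(f"fusion share after flotation: about {purity:.0%}")
```

With these inputs the fusion protein rises from 1% to roughly 10% of the remaining protein, a tenfold enrichment from a single centrifugation step; removing more than 90% would raise it further.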

3.2.2 The fusion protein can be stored stably in seeds for a long time. Hydrolase activity in mature seeds is low, so the fusion protein can be stored in seeds for long periods without degradation. Van Rooijen and Moloney reported that an oleosin-GUS (β-glucuronidase) fusion protein was stored at 4 °C for more than 1 year without degradation.

3.2.3 Seeds are easy to transport, which benefits industrial production. More than 95% of the water is lost during seed maturation, so seeds are easier to transport than other parts of the plant, which is convenient for large-scale production of the target protein.

3.2.4 Existing processing machinery can be used for seed crushing and oil body separation. Wet mills used in grain processing are suitable for crushing the seeds, and the separators used for dairy products in the dairy industry can be used for centrifugal separation of the oil phase after liquid extraction.

3.2.5 The added value of agricultural products is increased. After the target protein has been separated, the oil can still be used as edible or industrial oil, so the target protein greatly increases the added value of the crop.

3.3 Foreign proteins expressed with the plant oil body system

In 1991, Lee et al. first reported the transfer of a maize oleosin gene into rape. Maize oleosin mRNA was present only in the mature seeds of the transgenic plants, the protein accounted for 1% of total seed protein, and 90% of the expression product was in the oil bodies. The correct transcription, translation and localization of an oleosin gene from the monocot maize in transgenic rape showed that the oleosin gene of a monocot contains enough information to function in a dicot. In 1996, Holbrook et al. bombarded rape embryos with a gene gun and found in transient expression assays that an oleosin-GUS fusion protein was correctly localized to the oil bodies, which further confirmed the conclusion of Lee et al. Van Rooijen and Moloney inserted the GUS gene at the 3' end of the Arabidopsis oleosin gene, constructed a plant expression vector driven by the oleosin promoter, and transformed rape by the Agrobacterium-mediated method. In the transgenic plants, 80% of GUS activity was located in the oil bodies. They also found that the oleosin-GUS fusion protein itself has β-glucuronidase activity, so GUS does not need to be cleaved from oleosin. Because the fusion protein is bound to the oil body, it can be reused many times as an immobilized enzyme. In 1995, Parmenter et al. constructed an expression vector for an oleosin-hirudin fusion protein driven by the Arabidopsis oleosin promoter and transformed rape. Immunofluorescence showed that the fusion protein was located on the oil body surface, and it accounted for 1% of total seed protein. Cleaving the fusion protein with protease released biologically active hirudin. Thus, although the oleosin-GUS fusion protein is biologically active, the oleosin-hirudin fusion protein lacks the specific antithrombin activity of hirudin, so hirudin must be cleaved from the fusion protein enzymatically. In 1997, Liu et al. successfully expressed a rumen fungal xylanase in transgenic rapeseed using the plant oil body expression system.

3.4 Selection of recipient crops and target proteins

Recipient crops should have a high oil content and be easy to transform genetically. In selecting the target protein, the following should be considered: (1) its molecular weight should not be too large, so that correct localization of the fusion protein on the oil body surface is not affected; the largest target protein expressed on oil bodies so far is the 67 kDa GUS; (2) it should be hydrophilic, especially when it has to be cleaved from the fusion protein; (3) its function should be clear and its gene sequence known; (4) it should be a high-value product with good commercial prospects that can significantly increase the added value of agricultural products.

Based on these considerations, our laboratory chose rape and cotton as recipients. To test the applicability of the plant oil body expression system, we chose as the target protein calcitonin, whose C-terminus must be amidated for biological activity. Fourth-generation transgenic cotton lines and transgenic rape plants have now been obtained. PCR analysis of the transgenic rape showed that the calcitonin gene had been integrated into the rape genome. PCR-Southern and Western analysis showed that the target gene had been integrated into the cotton genome and was expressed in oil bodies. Analysis of the expression level and biological activity of the target protein in the transgenic plants is under way. A national patent application has been filed for "A new salmon calcitonin analogue and the method of producing foreign protein by using the plant oil body expression system" (to be reported separately).

4 Prospects and problems

The successful expression of foreign proteins such as GUS, hirudin, xylanase and calcitonin with the oil body expression system, and especially its use for immobilized enzymes, clearly shows the system's potential for industrial production of target proteins. However, as a new type of plant bioreactor it still raises many problems that need further study, including: (1) how to further increase expression of the target protein and reduce the cost of its separation and purification. This problem is most acute when the oleosin fusion protein has no biological activity: the cost of protease digestion and of separating and purifying the target protein is then high, which seriously limits practical application of the system. (2) Some proteins are known to be biologically active only after glycosylation or amidation. Whether, and to what extent, target proteins can be glycosylated and amidated in the oil body expression system requires further experimental study. A limitation of the oil body expression system is that it places certain requirements on the molecular weight and the hydrophilicity or hydrophobicity of the target protein. No expression system is universal. How to make full use of the advantages of different expression systems to express suitable target proteins for the benefit of mankind is a subject for further study.

Excerpted from: Agricultural Biotechnology 2003,11(5): 531-537.