User:VisvaM/Notebook/Applications of Cell-Free Protein Synthesis and Biosensors to Metabolic Engineering
Customize your entry pages![]() | ||||||||||||||||||||||||||||||
Project DescriptionApplications of Cell-Free Protein Synthesis to Metabolic EngineeringThe aim of this project is to explore how cell-free protein synthesis (CFPS) and cell free metabolic engineering (CFME) can be used to identify the concentrations of enzyme encoding DNA that lead to the maximum titre of a desired metabolite. Due to the COVID19 pandemic, this will be achieved by means of constructing computational models. Regarding the structure of the project work, the project will be split into two phases. Phase 1In phase 1, the team will carry out a library-based investigation into the different metabolic engineering approaches that have been developed and explore the advantages and limitations of each. Additionally, the team will explore the difference between CFPS and CFME and the different modelling approaches that are used to model cell-free systems. The group will also identify a biosynthetic pathway that involves three to five enzymes and yields a metabolite which has significant potential to contribute towards the bioeconomy. The decision was made amongst the group to investigate a metabolite that not only has potential to contribute to the Bioeconomy, but specifically towards the aim of ‘manufacturing medicines of the future and making existing ones more efficiently’ [1] outlined in the national bioeconomy strategy to 2030. Phase 2In phase 2, the group will carry out modelling to assess how variations in the concentrations of the enzyme encoding DNA will impact the yield of the desired metabolite in a cell-free system. The other objective of this phase will be to explore how the modelling of CFME can be used to inform the design of in vivo metabolic engineering projects. References
Introduction to the TeamOur team is composed of 6 members, five 3rd year molecular bioengineers and one intercalating medical student. Molecular Bioengineers
Intercalating Medical Students
Positions of Responsibility
Synthetic biologyWhat is synthetic biology?A report published by the Royal Academy of Engineering proposed the definition that ‘synthetic biology aims to design and engineer biologically based parts, novel devices and systems – as well as redesigning existing, natural biological systems’ [1]. Synthetic biology is an interdisciplinary field, encompassing engineering, biological sciences and modelling (Polizzi, 2013), and it is estimated that by 2025 the global market size for synthetic biology will be around $19.8 billion. [2] One of the main aims of synthetic biology is to make the engineering of biological systems easier [3]. Modularity is an example of one engineering principle which synthetic biology heavily relies on [3] and this principle is outlined in the abstraction hierarchy of synthetic biology: biological parts (bioparts) are assembled to make up devices, which come together to form systems [4]. Basic bioparts, which are the building blocks of synthetic biology, are defined as a functional unit of DNA that cannot be broken down into smaller parts [5]. An article in Nature titled the ‘Five hard truths for synthetic biology’ highlighted that many of these bioparts haven’t been characterised well and even if they have the synergistic behaviour of these parts may diverge from what is expected [6]. Thus, there is a big push to collect characterisation data for bioparts so that these parts are ‘fully characterised in the context of the hosts’ in which the experiments were carried out[3]. International Genetically Engineered Machine (iGEM) and the Registry of Standard Biological PartsiGEM is a non-profit organisation whose main aims are to further the advancement of synthetic biology, education and competition and develop a collaborative and open community [7]. iGEM hold a competition known as the iGEM competition which is aimed at students (teams are predominantly made up of university students). The Registry of Standard Biological Parts, run by iGEM, is a growing collection of more than 20,000 standard biological parts, which are referred to as BioBricks [7]. A BioBrick is a DNA sequence that is flanked on either side by standardised sequences [8] which encode restriction enzyme sites. References
Introduction to Metabolic Engineering (ME)What is Metabolic Engineering?Metabolic engineering (ME) is defined as the ‘directed improvement of product formation or cellular properties through the modification of specific biochemical reactions or the introduction of new genes with the use of recombinant DNA technology’ [1]. The applications of ME include the increased production of metabolites that are native to the host organism, the production of metabolites that are new to the host organism and the addition of pathways into the host organism that are able to degrade toxic chemicals [2]. One of the main applications for which metabolic engineering is used, is to increase the production of a target metabolite [3]. Prior to metabolic engineering, classical strain improvement was the method used to overexpress a desired metabolite; this involved mutagenesis carried out by a selection pressure such as UV light [4] followed by screening of the resulting mutant strains. However, developments in recombinant DNA technologies led to a more rational approach to strain improvement known as metabolic engineering, which allows the ‘introduction of targeted genetic changes’ [5]. In recent years, there has been a lot of attention towards the bio-based economy: a sustainable development concept that is concerned with using ‘renewable biological resources to replace fossil fuels in innovative products, processes and services’ [6]. Metabolic engineering is concerned with the optimisation of titer, productivity and rate of metabolic pathways to create cost effective means for the production of fuels, chemicals and pharmaceuticals [4], which will ultimately accelerate the transition to a bio-based economy. One example that highlights the role of metabolic engineering in developing the bioeconomy is the successful production of a precursor of the anti-malarial drug, Artemisinin, using Saccharomyces cerevisiae [7]. Artemisinin is derived from the leaves of the wormwood plant species Artemisia annua [8] and is extremely effective in the treatment of drug-resistant Malaria [9]. However, the Artemis annua plant only contains small quantities of Artemisinin and its chemical synthesis is not economically viable [9]. Paddon et al were able to demonstrate high yields of 25g/l of the precursor, artemisinic acid, as well as develop a means of the conversion of artemisinic acid to Artemisinin[7]. With support from the Bill and Melinda Gates foundation, this innovative application of metabolic engineering led to the delivery of 120 million treatments of Artemisinin-based combination therapies (ACTs) [10]. Traditional metabolic engineering involves the use of microbial cells as cellular factories [11]. The use of microbial cells in this way presents many challenges which include slow design-build-test cycles due to the constraints of cellular growth[11], toxicity of metabolites to the host organism [12] and an inability to steer resources towards the synthesis of the target metabolite [13]. One of the main disadvantages of in vivo metabolic engineering is the balance that exists between the engineer’s objectives (often the overexpression of a target metabolite) and the cellular growth objectives [14]. A nature communications article by Tsutomo et al, recently reported the development of a new metabolic engineering strategy called Parallel Metabolic Pathway Engineering (PMPE) [15]. PMPE was able to use the two different sugars found in Lignocellulosic biomass, xylose and glucose, for two different objectives: glucose was used for the production of the target metabolite, muconic acid, and xylose was used to propagation of the E. coli bacteria [16]. In this way, PMPE was able to independently control bacterial propagation and target metabolite production. Another metabolic engineering strategy that overcomes many of the aforementioned limitations of traditional in vivo metabolic engineering is Cell-Free Metabolic Engineering (CFME), which unlike PMPE, completely eliminates the concern regarding cellular growth. References
What is Cell Free Metabolic Engineering (CFME)?Cell-free metabolic engineering (CFME) enables the synthesis of metabolites outside the confines of the cell, by exploiting cellular machinery. This approach decouples cellular growth objectives from the engineer’s objective to optimise the production of a particular metabolite [1]. CFME offers many advantages over traditional in vivo metabolic engineering. These include enabling the synthesis of metabolites that are toxic to living cells, facilitating the real-time monitoring of intermediates and a greater ability to manipulate reaction conditions. Additionally, cell-free systems offer the potential to direct resources towards a desired metabolite [1].Within CFME, there exist two different synthesis systems: Crude Cell Lysate Systems and Purified Enzyme systems. Limitations of Cell Free systemsOne of the aims of the project is to explore how the modelling of cell-free metabolic engineering can be used to inform the design of in vivo metabolic engineering projects. However, cell free systems have many characteristics that make them inherently different from their in vivo counterparts. A perspective by Laohakunakorn highlights some of these characteristics, which include a lack of compartmentalisation, (due to the absence of cellular membranes in cell-free systems) the tendency of cell-free systems to approach equilibrium and the issue of resource depletion [2]. The absence of compartmentalisation results in a reduction of molecular crowding [2]. Molecular crowding limits diffusion of macromolecules [3], which suggests that the kinetics of a cell-free system and in vivo system will be different. This raises the question of to what extent can insight obtained from cell-free systems be meaningfully applied to in vivo systems? Crude Cell Lysate SystemsCrude Cell Lysate systems are prepared by cell lysis followed by the removal of cellular debris [4]. Cell lysates present some advantages over purified enzyme systems including the ability to recycle energy [5] and the preservation of native metabolic pathways from the host cell [4] which can support cofactor regeneration [4]. Additionally, cell lysates are very cheap to obtain [4] whereas the purification of enzymes required for purified enzyme systems are associated with high costs. Extensive research has been directed towards the optimisation of crude cell extracts. In particular, Jewett et Al have shown that more efficient protein synthesis can be achieved by removing unnatural components (e.g. pH buffers) from the cell lysate and more closely mimicking the cytoplasmic composition of the host cell [6]. Purified Enzyme SystemsPurified enzyme systems are obtained by an overexpression of enzymes in distinct cells, followed by the subsequent purification of these enzymes [1]. One advantage of purified enzyme systems is that they are not subject to the undesired ‘background metabolism’ that is present in crude extract systems and which can make the system difficult to model[4]. Purified enzyme systems offer a much greater degree of control over pathway fluxes due to the fact that the concentration of every component is known ([1]. However, the limitations of purified enzyme systems include the high cost of cofactors, which are required for enzyme activity, and the issues associated with cofactor regeneration[4]. References
Difference between Cell Free Protein Synthesis (CFPS) and CFMEThe distinction between CFPS and CFME is that the objective of CFPS is the engineered production of a protein. On the other hand, CFME is concerned with the production of target metabolites using in vitro combinations of enzymes, which have been obtained either from crude cell lysates or enzyme purification processes[1]. The aim of our project is to select a metabolic pathway that produces a metabolite that has the potential to contribute towards the field of healthcare within the context of the bioeconomy. Ideally, we are aiming to model CFPS that will produce the constituent enzymes that are involved in the metabolic pathway and then proceed to model the resulting metabolic network using CFME. This metabolic engineering approach is referred to in the literature as cell-free protein synthesis driven metabolic engineering (CFPS-ME) [2]. CFPS-ME further increases the DBT cycles [3] by synthesising the enzymes involved in the metabolic pathway in vitro using CFPS which eliminates the need to culture cells to individually overexpress each enzyme. Additionally, it eliminates the requirement of enzyme purification that is present in the construction of purified enzyme systems[3]. The ability to use CFPS-ME for rapid prototyping was highlighted in a study conducted by Jewett et al, who, in the space of a single day, were able to test several different enzyme homologs to try and optimise n-butanol production[2]. However, attempting to reconstruct the entire n-butanol synthesis pathway using CFPS-ME, they found that as they increased the number of enzymes produced by CFPS the amount of n-butanol produced decreased[2]. References
Metabolic BottlenecksMetabolic bottlenecks in metabolic engineering refers to the capacity of the system being limited by a single component ie. metabolite of the pathway. This term, when applied to metabolic engineering, suggests that a single synthetic pathway puts a constraint on the production of the end-product metabolite. In a given metabolic reaction network, there will exist bottlenecks which can massively decrease the yield of the final desired metabolite; Additionally, the presence of bottlenecks in the metabolic pathway can result in an overproduction of the intermediate metabolites, which could elicit a stress response in the cell[1]. This forms the basis of a major disadvantage of in vivo metabolic engineering. Common metabolic bottlenecks include the local accumulation of toxic intermediates or byproducts, cofactor imbalance and inefficient enzymatic activities[2]. By identifying these metabolic bottlenecks, one can attempt to design a metabolism pathway that overcomes these metabolic bottlenecks in order to increase the yield of the final desired metabolite. However, the complexities of these pathway designs remains a significant challenge for metabolic engineering[2]. References
MetabolitesAntiviralsSeasonal Influenza“Seasonal influenza can infect up to 20% of the population. Up to 650000 people die of respiratory diseases linked to seasonal influenza each year and up to 72000 of these deaths occur in the WHO European Region.” [1] In the context of the bioeconomy, the burden of influenza is tremendous in the form of medical care, healthcare utilization and lost working hours. [2] There are four types of influenza viruses with Influenza A and B viruses causing seasonal epidemics of disease. Structure of Influenza Virus and Viral life cycleAlthough the structure of the influenza virus can vary, there are two different glycoprotein spikes embedded on the viral envelope: hemagglutinin (HA) and neuraminidase (NA) proteins. Within the lipid envelope is the influenzas’ genome, organized into eight pieces of single stranded RNA.[3] [4] The viral life cycle can be segmented into the following stages: viral attachment to host cell, viral penetration into host cell, viral uncoating, viral genome replication and transcription, viral translation and assembly followed by viral progeny release. [5] Influenza virus first recognize and bind onto the sialic acid sugars on the surface of the host cell through HA, which triggers a merging of the viral envelope with the host’s endosomal membrane, allowing the viral ribonucleoprotein (RNP) complex which consists of viral RNA segments coated with nucleoprotein and RNA polymerase to enter the host’s cytoplasm through endocytosis. RNPs then enter the host cell nucleus where mRNA of viral origin can be exported and translated. After budding of the influenza virus is complete where mature virions bind on the sialic acid as before through the HA, they detach from the cell surface due to the sialidase activity of the NA protein. The NA protein cleaves terminal sialic acid residues from the cell surface to release progeny virus from the surface of the host cell as well as removes sialic acid residues from the virus envelope itself to prevent viral particle aggregation, enhancing infectivity. Lastly, NA also plays a role in allowing virions to enter respiratory epithelial cells by breaking down mucins in respiratory tract secretions. [6] Shikimic acid and TamifluOseltamivir (sold under Tamiflu) is in a class of medications called neuraminidase inhibitors and aids in preventing and treating influenza A and B. [8] It is ingested in the form of an oral prodrug, oseltamivir phosphate, a compound that is metabolized in the body by hepatic esterase into its pharmacologically active form, oseltamivir carboxylate (OC). It binds to and inhibits the active site of the neuraminidase enzymes that plays a significant role in the release of progeny viruses and enhancement of infectivity, as mentioned above. OC effectively reduces viral replication, limits viral load, drastically reducing influenza symptoms and associated complication. Because of the highly conserved active site of the neuraminidase enzyme, it is effective against all neuraminidase subtypes.[9] Shikimic acid is an important intermediate for the manufacturing of oseltamivir, along with other pharmaceutical compounds. These include Zeylenone and 3,4-Oxo-isopropylidene-SA that has antiviral, anticancer, antibiotic properties and antithrombotic and anti-inflammatory effects, respectively. [11] Shikimate, the more commonly known anionic form of shikimic acid, [12] is naturally extracted from the pods of Chinese star anise plants; however, inadequate raw feedstocks and the complex extraction that involves a 10-step process that takes approximately 6-8 months to complete have rendered it challenging to meet the worldwide demand for it.[13] Chemical- based synthesis are also commercially unattractive because of environmental concerns.[14]. Therefore, there is an ever -increasing demand for an alternative synthesis method for elevated shikimate production. The shikimate pathway is found in bacteria, fungi, plants and algae and is shown in the diagram below. It begins with the condensation of intermediates of the glycolytic pathway and the pentose phosphate (PP) pathway, phosphoenolpyruvate (PEP) and erythrose-4-phosphate (E4P), respectively. These two pathways are part of the central carbon metabolism (CCM), consisting of enzymatic reactions that convert carbon sources into biomass energy.[15] These two compounds are initially combined to produce 3-deoxy-D-arabino-heptulosonate-7-phosophate (DAHP) by DAHP synthase. In E.coli, there are three DAHP synthase isozymes – aroF, aroG and aroH. Hence the resultant DAHP is converted to shikimate with reactions catalyzed by these enzymes. These three isozymes are feedback inhibited by one of the end products, namely aromatic amino acids- tyrosine, phenylamine and tryptophan.[16] [17] Antivirals - Figure 1. Shows the Interaction between CCM (PP, TCA, Glycolysis) and the shikimate pathway (upper)[18] (lower)[19]
Metabolic EngineeringMetabolic Engineering offers a solution to enhanced shikimate biosynthesis by blocking pathways that consume precursors of shikimate as well as overexpressing key genes of enzymes that play a role in its biosynthesis, increasing and directing flux towards the production of shikimate. The main metabolic engineering approaches have been based on genetic manipulations of the CCM and shikimate pathways.[20] The fluxes of PEP and E4P can be increased through genetically engineering the glycolytic and PP pathways, respectively. Previous studies have shown the overexpression of transketolase (tktA) gene has contributed to an elevated concentration of E4P leading to a shikimate yield from 38 to 52 g/L. [21] Other studies have shown that the overexpression of phosphoenolpyruvate synthase (ppsA) involved in gluconeogenesis[22] have also increased shikimate level to 71 g/L. The flux of PEP directed towards the shikimate pathway is often constrained due to the phosphotransferase (PTS) system, the method used by bacteria for sugar uptake where the source of energy is PEP. [23] Hence, genetically modifying the PTS system or utilizing a different glucose transport system or other sources of energy such as pyruvate or glycerol have also been used to increase the yield.[24] [25] Another method that has been used is modifying the feedback resistance enzymes of the DAHP synthase isoenzymes. By eliminating the pathway regulators, the aromatic amino acids or by engineering the enzymes to be feedback-resistant, the yield can also be drastically increased. Overexpression of these three rate-limiting enzymes can also lead to a high level of synthesis of shikimate.[26] Potential of Cell-Free Metabolic EngineeringAlthough there have been no published studies on conducting cell- free metabolic engineering to increase the yield of shikimic acid. There are many advantages and disadvantages to employing a cell-free approach, outlined below. Cell- free systems bypass cell growth, permitting more DBT cycles and avoids the conflict of resource allocation between cell growth and biosynthesis of target products. The open environment allows for direct modification of the components in the reaction. [27] CFPS-based metabolic engineering where gene cloning, bacterial transformation, in-vivo protein expression, cell lysate preparation and protein purification are eliminated to further bypass cell growth, drastically shortening DBT cycles.[28] L-DOPASignificance of L-DOPA in the BioeconomyL-DOPA is commonly used as a Parkinson’s treatment. The production of L-DOPA has a substantial market size of 3.99 Billion USD as of 2016 [29]. This value has an expected Compound Annual Growth Rate (CAGR) of 6.5% from 2017 to 2022 to ultimately achieve a larger market size of 5.69 Billion USD by 2022[29]. Traditional Production of L-DOPAThe conventional commercial production of L-DOPA is chemical synthesis with asymmetric/ enantioselective hydrogenation[30]. L-DOPA - Figure 1. Enantioselective hydrogenation of C2-symmetric diphosphine with Rubidium catalyst to form L-DOPA[31]
The major limitations of this method is the poor conversion rate and low enantioselectivity[30]. The poor conversion rate is attributed to the complicated reaction procedure that has to be conducted under very strict and harsh operational conditions, with low substrate specificity[30]. Moreover, the reaction requires Rubidium as a metal catalyst, which is costly[30]. Biotechnology Production of L-DOPABiotechnological approaches have been explored as alternatives, with the goals of improving the reaction conversion rate and enantioselectivity, as well as economizing the production of L-DOPA. A method used is microbial fermentation[30]. However, there are many limitations with these methods. The following are the microbial fermentation methods and their limitations[30]:
The following are the overall limitations of biotechnological approaches[30]:
Metabolic Engineering of L-DOPAThe L-DOPA production depends on: availability of L-tyrosine & effective enzyme to catalyze the hydroxylation process. There has been an attempt of producing L-DOPA from L-tyrosine from glucose with glycerol using E. coli using metabolic engineering[33]. In order to overcome the metabolic bottleneck, there was an introduction of feedback resistant enzymes, deletion of competing pathways, enhancement of catalytic activity of hpaB, and overexpression of critical pathway enzymes[33]. To address the concerns about metabolic flux, carbon flux was channelled towards L-tyrosine production[33]. This method of production allowed for a three times higher titre of L-DOPA from glucose with glycerol compared to previous works[33]. However, the overexpression of relevant enzymes to increase the carbon flux in E.coli resulted in increased acetate production, which is toxic to the cellular health of E.coli[33]. This limitation calls for the need for cell free metabolic engineering approaches to synthesised L-DOPa, which has yet to be explored. CarotenoidsIntroductionAs outlined by the World Health Organization, “Vitamin A deficiency remains the single most important cause of childhood blindness in developing countries” – whilst also significantly contributing to morbidity and mortality rates infants of low-income countries [34] [35]. This health problem, which in fact is prevalent in more than half the countries of the world, is primarily caused by a lack of provitamin Vitamin A in the diet, and may be exacerbated by high rates of infection[34]. Indeed, as humans cannot naturally synthesise Vitamin A themselves, it must be supplied through the diet by compounds known as carotenes, a subclass of carotenoids. Carotenoids, also known as terpenoids, are a group of naturally occurring compounds. Initially identified as colourants, these compounds collectively form the basis for a class of red, orange and yellow pigments widely found in nature [36] [37]. Though often considered to be plant pigments, they are also of crucial significance to numeral biological functions in the human body [38]. Their physiological importance and their potential application for the prevention of conditions such as cancer, cardiovascular diseases, cataracts and other degenerative diseases have led to a growing demand for efficient and cost-effective production methods[38]. In 2019, the global carotenoids market was valued at USD 1.5 billion and it is projected to grow to USD 2.0 billion by 2026. This significant market growth may be linked to the growing use of carotenoids in preventative healthcare, the increased use of carotenoids as food colourants and the emergence of new technologies for the extraction and synthesis of natural carotenoids [39]. In the context of our project, the development of a model for cell-free synthesis of carotenoids could be of great potential value to the UK bioeconomy, given their vital significance in treating nutritional and age-related conditions. The Chemistry and Function of CarotenoidsTo date, over 1100 carotenoid compounds have been identified from various species of bacteria, eukaryotes and some archaea[36]. Some commonly known carotenoids include lycopene, β-carotene, lutein, astaxanthin and canthaxanthin. As a subclass of isoprenoids, carotenoids are built from C5 isoprene units and rearranged to form long-chain C40 carotenes or xanthophylls[36] – the two main groups of carotenoids. Carotenes are strictly hydrocarbons. Xanthophylls differ from carotenes in that they contain oxygen in addition to hydrogen and carbon, thus having functional groups such as alcohols or carbonyls[38]. Carotenoids can also be grouped according to their structure. As outlined by Perera, C. O. et al, “carotenoids may be acyclic (e.g. lycopene) or contain a ring of six carbons at one or both ends of the molecule (e.g. β-carotene)”. It is indeed these different characteristic chemical structures that give rise to the physiological functions of carotenoids as antioxidants, provitamin A nutrients, and UV protection agents[38]. Carotenoids - Figure 1 shows some major food sources of different carotenoids. Carotenoids - Figure 1. Carotenoid Food Sources[40]
Traditional synthesis of CarotenoidsPlant ExtractionCarotenoids were traditionally obtained through plant extraction methods based on physiochemical processes. Depending on the carotenoid of interest, different plants or vegetables may be used for extraction. As an example, the orange carrot is the most popular plant source for extraction of β-carotene, having up to 9.9932 mg beta-carotene per 100 g edible carrot[37]. However, there are several disadvantages associated with the use of physiochemical extraction methods for carotenoids, including high costs, seasonal and geographical availability of the raw plant materials and the low yield of carotenoid per kg of plant material processed. Chemical synthesisMethods for chemical synthesis of carotenoids were developed in the 1905s and have since been used as an alternative to plant extraction. For chemical carotenoid synthesis, either Witting reactions or Grignard compounds are used. Witting reactions involve the production of aldehydes from aldehydes or ketones, whereas synthesis through Grignard compounds makes use of organometallic Grignard compounds[37]. Though still widely used, some studies have shown that chemically produced carotenoids, such as synthetic beta-carotene, exhibit carcinogenic activity [41]. Metabolic Engineering for Carotenoid SynthesisThe limitations of natural extraction and chemical synthesis of carotenoids led to the application of metabolic engineering methods as an alternative production route to meet the growing demands from industry. As well as overcoming several of the challenges linked to natural extraction methods, the production of carotenoids through native microbial hosts also enables the use of low-cost substrates and the development of competitive bioprocesses[42]. Many efforts have been made to genetically engineer microbial hosts such as E.coli and C. cerevisiae for the production of various carotenoids including lycopene and β-carotene[36]. A review of the literature shows that many of these efforts involve optimising the biosynthetic pathway native to the host of choice. Others have successfully attempted to introduce the pathway into non-native hosts. To understand how metabolic engineering may be applied to carotenoid synthesis, a brief explanation of the carotenoid biosynthetic pathway is outlined below. Biosynthetic pathway of carotenoidsAs outlined by Wang. C et al., “the carotenoid biosynthetic pathway, native to several eukaryotes, bacteria and some archaea, can be divided into four steps: geranylgeranyl diphosphate (GGPP, C20) precursor supply, phytoene desaturation, lycopene cyclization, and carotene modifications”[36]. A summary of these steps can be seen in Carotenoids - Figure 2. Carotenoids - Figure 2. The Carotenoid Biosynthetic Pathway[43]
Supply of GGPP is achieved through two different isoprenoid pathways: the mevalonate (MVA) pathway native to eukaryotes and some bacteria and the non-mevalonate (MEP) pathway native to most bacteria and some plant plastids[36]. Both of these pathways are involved in the synthesis of the two isoprenoid precursors DMAPP (dimethylallyl diphosphate) and IPP (isopentenyl diphosphate), of which one DMAPP and three IPPs are condensed to form one GGPP in a series of reactions. Several subsequent steps involving phytoene desaturation lead to the synthesis of the first carotene: lycopene. From here, different lycopene cyclases catalyse the introduction of a β- or ε-ring at the chain ends of lycopene to form α-, β-, γ- or δ-carotene, depending on the position of the ionone ring[36] [44]. Finally, these carotenes can be hydroxylated at the terminal ionone ring by hydroxylase and ketolases to form various xanthophylls such as zeaxanthin, canthaxanthin or astaxanthin[36]. Metabolic Engineering of CarotenoidsIn efforts to produce carotenoids through in vivo metabolic engineering of microbial hosts, several methods have been tested and evaluated. As outlined in a review by Li, M. et al., efforts to produce lycopene through metabolic engineering of E. coli have involved a variety of optimisation strategies. These include but are not limited to:
Overall, the ease of genetic manipulation, rapid growth and simple media requirements make microbial hosts such as E.coli good candidates for the in vivo production of lycopene, as well as other isoprenoids. However, the extraction of carotenoids produced in microorganisms remains a complicated and expensive process. Long-chain hydrophobic carotenes such as lycopene typically accumulate in membrane compartments[45] [46], thus complicating the extraction process. Indeed, studies have shown that high lycopene contents lead to membrane stress as excess lycopene accumulates in the cell membrane. This challenge highlights the possible need for cell-free methods in the biosynthesis of natural carotenoids – where extraction processes are easier and the accumulation of lycopene in membranes would not occur. The Potential for Cell-Free Synthesis of CarotenoidsThough recent literature is scarce of studies that have attempted cell-free synthesis to produce and increase the yield of carotenoids, there may be several advantages of producing carotenoids through such an approach. Besides the aforementioned ease of extraction and overcoming the issue of cell membrane stress, CFME also allows for reduction of DBT (Design-Build-Test) cycles, operational simplicity, high conversion yield and productivity, quantitative and precise assessment of performance and greater design flexibility [47]. References
ModellingIntroductionThe goal of metabolic engineering is to develop targeted methods to improve the metabolic capacities of industrially relevant microorganisms, and mathematical modelling is a very successful scientific tool that can be used to achieve this [1]. A mathematical model is an ‘abstract representation of a target system’ [2] and one of its purposes is to ‘simulate, predict and optimize procedures, experiments and therapies[3]. These models are used to understand internal and external cell interactions and how they influence cell metabolism[4]. Specifically, processes such as metabolite concentrations and reaction fluxes of metabolic pathways regulated by enzymes under varying internal and external conditions are examined and translated to numerical problems with formal representations[4]. This resulting model should have a high level of accuracy and detail in order to represent the complexity of the behavior of a metabolic network[4]. Developing a computational model that simulates a metabolic pathway after, for example, the knockout of a component is very useful, since this process is much more difficult, time-consuming, and costly to carry out experimentally[5]. For cell free systems, we can use modelling to simulate the outcome of metabolic pathways in order to predict and optimize the yield of a desired metabolite. There are two approaches for mathematically representing biological networks: stationary modelling examines the system at an equilibrium point where metabolite concentrations are constant over time[4]. Dynamic modelling takes into account the changes in metabolite concentrations over time[4]. Overall, depending on the extent of a models’ abstraction, it has a particular granularity, which refers to the level of detail in the description of the system[2]. The choice of its level has many tradeoffs and considerations including cost and time-benefits[2]. The accuracy of a model is limited by its granularity, and as George Box said “Essentially, all models are wrong, but some are useful”[2]. Overall Framework for metabolic engineeringThe overall modelling framework begins with reconstructing the model from its genome sequence, information from biological databases and also literature. The model created can include stoichiometric information only or it can be incorporated with kinetics. Then the model is simulated for prediction of the phenotype of the system. For kinetic methods, steady and transient state behaviors of metabolite concentrations and fluxes are computed, while for stoichiometric methods, a flux distribution imposed by constraints is calculated. Lastly, the phenotype is evaluated and optimized until it meets termination criteria. The optimization includes integrating solutions as perturbations to the model, by changing the kinetic parameters or constraints so that a new phenotype can be simulated[4]. Constraint Based modelling (CBM)Constraint based modelling framework is an approach to predict metabolic phenotypes in the stationary case, and limits models to only encompass reaction stoichiometry and reversibility[4]. Thus, it is established on the assumption of steady-state operation and cannot express transient behaviours[4].The module formulations employed in this approach are based on linear equation systems that are normally undetermined, resulting in infinite solutions. For this reason, restrictions or ‘constraints’ and assumptions or ‘objective functions’ are implemented to identify the ideal solution[4]. These rely on physical and chemical restrictions including fluxes, mass balance and thermodynamics and produce solutions that are characterized by steady-state flux distributions within the space of feasible solutions known as the flux hyper cone[4]. The CBM method does not require any mechanistic knowledge, only minimum and maximum flux bounds are needed since changing metabolite concentrations over time and transient behaviors are not taken into account[4]. CBM thus circumvents difficulties associated with estimating kinetic rate equations and parameters but still produces results that are usually close to experimental observations[4]. The most common approach in identifying the optimal value for the flux distribution is Flux Balance Analysis (FBA), which examines the internal metabolite concentrations in a quasi-steady state and imposes specific constraints[4]. FBA defines a linear, biological objective function relevant to the problem and optimizes it through minimization or maximization to determine the optimal flux distribution[4]. FBA calculates the flow of metabolites through a metabolic network; thus, it can predict the growth rate of an organism or the rate of production of an important metabolite[6]. Specifically, the first step of FBA is to metabolically reconstruct the network that is made up of a list of stoichiometrically balanced biochemical equations. The reconstruction is then mathematically modelled, which is done by creating a stoichiometric matrix (S), representing the metabolic reactions, of size m*n, where every row m represents a metabolite or compound (the system has m compounds) and every column representing a reaction (the system has n reactions)[6]. The entries in each column are the stoichiometric coefficients of the metabolites engaging in each reaction that implement restrictions on the flow of metabolites through the network[6]. A negative coefficient is used for each metabolite that is consumed, a positive coefficient represents a metabolite that is produced and a coefficient of zero is employed for metabolites that do not take part in a certain reaction[6]. The flux through all the reactions in a network is captured by the vector v, with a length n and the concentrations of metabolites are represented by the vector x, with length m.[6] Since the system assumes internal concentrations of metabolites do not change over time [7], the system of mass balance equations at the steady state with dx/dt = 0, is Sv = 0.[6] These constraints result in a range of solutions, however since there are more reactions than metabolites in most realistic models, there is more than one unique solution to the equation[6]. Thus, the next step is to identify an objective function Z = cTv, c being a vector of weights representing how much each reaction contributes to the objective function [6]. The system is then optimized via linear programming and the output is a specific flux distribution v that maximizes the objective function[6]. The COBRA Matlab Toolbox can be used to carry out necessary calculations[6]. Dynamic ModellingDynamic modeling describes cell metabolism using methods based on the kinetics of metabolic processes and produces detailed, unique solutions in time for both the transient and equilibrium states, starting from any initial metabolite concentration[4]. It is based on knowledge of enzyme mechanisms and experimental data and can represent changes in metabolite concentrations over time through the use of ordinary differential equations (ODEs)[4]. The ODEs encompass initial metabolite concentration values, reaction rate equations and kinetic parameters[6]. The solution spaces of dynamic approaches are a subset of constraint-based solutions, as they impose kinetic derived constraints on top of the same core constraints[4]. In order to create a dynamic model, the first step involves establishing the interaction mechanism of the network and then secondly, the type of representation for the kinetic rate expressions needs to be decided [4]. After this, the model can be expressed by a set of ODEs that depict the time trajectories of the processes and output values that can be supplemented with experimental data[4]. In order to build dynamic models, there are two approaches taken: bottom-up modeling integrates isolated pathway components obtained through mechanistic descriptions of behaviors to build the model and then uses experimental data to validate the model, involving many kinetic parameters [4]. On the other hand, building the model via a top-down approach employs experimental data to improve pre-existing models, allowing for the integration of unidentified mechanisms, interactions and properties into the model[4]. Dynamic models can be subdivided into deterministic and non-deterministic models[4]. There are two kinds of deterministic models, mechanistic and approximate kinetic models[4]. Mechanistic modelsThese models are based off mass action law, which explains that the reaction rate is proportional to the concentration of the reactants. Mechanistic expressions are employed for one-step reaction or their combination of their mass actions into multi-step reactions[4]. Michaelis-Menten models are used to model enzymatic equations that do not have allosteric effects[5] i.e. when a ligand binds to the enzyme which changes the properties of another site on the protein. Assumptions that are made for this model include Quasi steady state approximation that states that the concentration of the enzyme substrate (ES) complex is constant, meaning we assume that the enzyme concentration is much smaller than the initial substrate concentration leading to saturation of the enzyme[4]. Conservation laws are also assumed, i.e. that no enzyme is consumed or produced. The following equation is used to express the rate of the enzymatic reaction[5]: [math]\displaystyle{ v_0 = \frac{d[P]}{dt} = v_{max} \frac{[s]}{[s]+k_m} }[/math]
where v is the reaction rate, [S] is the concentration of the substrate, [P] is the concentration of the product, Vmax is the maximum reaction rate reached by the system, and km is a constant that represents the substrate concentration where the reaction rate is half of its maximum rate[4]. Hill rate laws are used when cooperative binding is involved in the reaction, i.e when an enzyme has many binding sites and after binding of a molecule, the affinity of the enzyme changes[5]. The hill equation is the following: [math]\displaystyle{ v_0 = \frac{d[P]}{dt} = v_{max} \frac{[s]^n}{[s]^n+k_m} }[/math]
Thus, when the order n=1, the hill equation is the same as the Michaelis-Menten equation. Approximate Kinetic ModelsThese models aim to describe the behavior of biological systems using only a small number of defined parameters[4]. This is useful since reliable rate equations required by mechanistic models are not always known for certain reactants and the rate expression can be challenging to quantify[4]. These models are ‘approximate’ because they exhibit similar model behavior to mechanistic models with respect to transient and steady states but with fewer parameters[4]. Linear-Logarithm (Lin-Log) Kinetics are used in gene regulatory systems where rates of enzyme reaction are proportional to enzyme concentration and can produce analytical solutions for the steady states[4]. They work best modelling systems that undergo exponential growth and have been proved to model in-vivo systems accurately[4]. The equation for this can be seen in Modelling - Table 1. Log-Linear (Log-Lin) Kinetics are used in metabolic where parameters are influenced by spatiotemporal variations[4]. This approach is very similar to Lin-Log in that it results in analytic solutions by imposing linear and logarithmic expressions, however for Log-Lin the reaction rate is not proportional to the enzyme concentrations[4]. Thus, it can be employed to represent dynamic systems with strong nonlinearities. However, this model uses quasi steady state approximation, which assumes that all state variables are constant meaning it is not suitable for transient systems[4]. Power laws are used to simply model aggregation and consumption processes in enzyme catalyzed reactions and can calculate analytical solutions at the steady state[4]. It has the equation seen in Modelling - Table 1. Convenience rate laws are used to thermodynamically, independently represent enzyme saturation and regulation through activators and inhibitors[4]. It is a general form of Michaelis-Menten kinetic which incorporates stoichiometries and assumes a fast equilibrium between substrates, products and enzymes[4]. It assumes an enzyme mechanism of random order and can also be used for reaction with any number of substrates and products[4]. It only needs a small number of parameters which can easily be calculated via least-squares estimation[4]. Modular Rate Laws are used for reactions with arbitrary stoichiometry or various types of regulation. This model simplifies thermodynamic-kinetic modelling but is also less accurate than detailed kinetic equations[4]. The equation is found in Modelling - Table 1. Cooperativity and Saturation are used to fit experimental data using systems that have a saturated form [4]. It uses equations that are similar to the Hill rate laws, is accurate over a wider range if the approximated functions are saturated but therefore also needs more parameters[4].
Modelling - Table 1. Rate Law equations for lin-log kinetics, power laws and modular rate laws[4]
Comparing Constraint based Modelling and Dynamic ModellingOverall, models are defined by network complexity in size and the level of detail and accuracy[4]. This can be seen in Modelling - Figure 1, where models are characterized on a coordinate plane according to these features[4]. Constraint based models are placed in the top left as they can model large and complex networks but therefore do not include a lot of detail[4]. Kinetic models can be seen in the graph as split between deterministic and non-deterministic models, and overall, are much lower and further right; deterministic methods carry a medium complexity and degree of detail since they incorporate dynamic information[4]. Non-deterministic models can represent the system with the highest accuracy but are therefore hindered in the network size they are able to model[4]. Hybrid approaches, which combine the best features of both constraint based and dynamic methods and thus have a very high level of accuracy of modelling large networks[4]. Modelling - Figure 1. Comparing different modelling methods according to their accuracy and network complexity[10]
Modelling - Table 2. Advantages and Disadvantages of constraint based modelling and dynamic modelling[4]
When considering which modelling approach to use for modelling we can look at the advantages and disadvantages of both approaches depicted in table 2. There are two main reasons for why dynamic models are much more applicable for cell free systems than constraint-based methods. Firstly, within cells there are homeostasis regulation mechanisms allowing metabolite concentrations to remain relatively constant over time. For cell-free systems these homeostatic regulations do not exist, causing metabolite concentrations to vary over time. As seen in table 1, since constraint-based methods only employ stoichiometric measures, they cannot account for transient behaviors. Thus since it is a stationary modelling method working at the equilibrium point, it assumes metabolite concentrations are constant over time[7] and cannot accurately represent a cell free system. Dynamic modelling can take this into account. The other reason is that in cell free systems, resource competition is an important behaviour, since there are only a fixed amount of resources in the cell free extract. This has to be accounted for in modelling, and constraint based models cannot take this into account. For these reasons dynamic modelling can better represent cell free systems. Parameter EstimationA big challenge for dynamic modelling is the estimation of kinetic parameters. Since many issues arise when using parameters from different sources, parameter estimation techniques are used to find optimal values of parameters[4]. Parameter estimation uses an optimization algorithm that searches through many possible values, under specific constraints and non-linear structures, and then finds complex objective functions with many solutions through local optima[4]. Thus, the goal of optimization problems is to find a global optimum in a reasonable time employing local or global methods[4]. Local methods start the optimization with reference parameters measured or found in literature and then these are improved by each run of the algorithm. These algorithms are often based on the Hessian and gradient of the objective functions, computed numerically through methods like finite difference approximations. Local methods however do not always find the global solution depending on its neighborhood. Global methods use metaheuristics including simulated annealing and genetic or evolutionary algorithms. Combining global and local methods is the most successful method when the solutions are close to an optimal[4]. The most common approach to find the objective function is to minimize the difference between the model output and the experiment data[4]. This approach was used in the dynamic modelling by Wayman et. al[12] and Horvath et. al[13]. Databases and ToolsPathway Databases
Tools for Metabolic Modelling
ExamplesToward a genome scale sequence specific dynamic model of cell-free protein synthesis in Escherichia coli[13]In this study, a dynamic mathematical model of E.coli cell free protein synthesis was developed. CFPS model equations were defined using the hybrid cell-free modelling framework of ‘Wayman et.al’ that integrates kinetic modelling with a logical rule-based description of allosteric regulation. The model parameters were approximated from measurements of glucose, organic acids, energy species, amino acids and the protein product chloramphenicol acetyltransferase (CAT). A constrained Markov Chain Monte Carlo (MCMC) algorithm was employed to estimate a collection of 100 model parameter sets, which minimized the squared difference between model simulations and experimental measurements. A range for each kinetic parameter was established from the BioNumbers database. The constrained MCMC approach estimated parameter sets with a median error 2 times less than using random parameter sets generated from the bounds in literature. The kinetic models were then used to examine the performance of the CFPS system and to find the pathways most important to protein production. The metabolic network was divided into 19 reaction groups and group knockout analysis was used to estimate the effect of different network functions on protein synthesis. It was found that CAT was produced with an efficiency of 12%, which showed that a lot of the energy resources for protein synthesis were channelled to non-productive pathways. By simulating the knockout of metabolic enzyme groups, it was shown that metabolism and protein production were reliant on oxidative phosphorylation and glycolysis/gluconeogenesis. References
TX-TL ModellingN.B It should be noted that although primary sources are cited below, much of the general understanding stems from the mini review published by Koch et al., 2018[1]. What is TX-TL?Transcription-Translation (TX-TL) modelling extends the standard model of gene expression to cell-free systems. It takes into account, as its name suggests, the shared transcriptional and translational machinery in in-vitro cell-free systems. Typically, TX-TL models are classified as dynamic, deterministic models and hence involve modelling the kinetics of the cell-free system through non-linear ordinary differential equations. This involves breaking down the process of transcription and translation into two similar equations and relating these to the rate of protein production through various parameters, whilst adhering to the various conservation laws outlined below. Since these models use non-linear ODEs they cannot be solved analytically, parameters must either be measured experimentally or estimated using software. Generally, TX-TL models require few external parameters and those that are required generally have clear physical interpretations meaning most can be quantified experimentally. In the papers reviewed below the model parameters were tuned using parameter tuning software alongside real experimental data. Ideally this is the approach we would take in developing our model, however due to the current COVID-19 situation it is unlikely that we will be able to secure sufficient lab time to gather the in-vitro data. One approach could be to test and tune our model based on data which has already been collected for similar metabolites. Why use TX-TL for Cell-Free Systems?In-vivo modelling techniques are designed with several active cellular processes in mind, in cell-free systems many of these processes no longer continue to run. Additionally, several environmental differences exist for example molecular crowding [2], the significantly altered resource distribution [3] and the lack of resource competition with the host. It follows that to accurately model a cell-free system these standard modelling techniques must be adapted to take these environmental differences into account. What do TX-TL Models Predict?Cell-free systems are used for protein production, according to [4] the rate of protein production can be broken down into four phases: 1) Increasing rate of production, 2) Constant rate of production, 3) Decreasing rate of production, 4) Rate of production is zero. And for mRNA production: 1) Increasing concentration, 2) Constant concentration, 3) Decreasing concentration, 4) Concentration close to zero. These phases are summarised in TX-TL - Figure 1. TX-TL - Figure 1. From Koch et al., 2018, Summarises the four phases of protein and mRNA production[1]
TX-TL models are deterministic models which use ODEs to describe transcription, translation, mRNA and protein degradation processes and fit these to the phases described above using various parameters. The General AssumptionThe total DNA concentration is constant Usually, a protected linear template or plasmid DNA is inserted into the cell-free system for protein production, since no dNTPs are added[1] replication can be ignored. Since the half life of DNA is so long[5] DNA degradation is generally neglected too. General IssuesOne issue which is consistently outlined in papers is that batch-to-batch variations in systems can often have a significant impact on how the system behaves. This means that the values of parameters can also vary widely between batches, as well as from in vivo to cell-free[1]. A Basic ModelThe following model by Karzbrun et al.[6] requires ten parameters and fits the protein and mRNA dynamics described above whilst remaining coarse grained. Parameters were derived based on experiments which used cell extract solely comprising of Escherichia Coli machinery. Transcription Dynamics [math]\displaystyle{ \dot{m}(t) = k_{TX} \cdot N^{-1}_m [R_pD]- \frac{m(t)}{\tau_m}, \forall t \gt \tau_0 }[/math]
Translation Dynamics [math]\displaystyle{ \dot{p}_{syn} = k_{TL} \cdot N^{-1}_p \cdot \frac{m(D, t-\tau_f)}{1 + K_{TL}/R} }[/math]
The authors found that the model was able to accurately predict TX/TL and degradation dynamics for phases 1 and 2, however the model did not reach a steady state. Additionally, the degradation modelled occurred at a constant rate rather than exponentially. This points to the model being appropriate for use in a cell-free system prior to resource consumption or waste accumulation cause the reaction to halt[1], this is approximately the first 60 minutes[7]. A TX-TL Model Based on Michaelis-Menten KineticsStögbauer et al.[7] established a rate equation model with eight free parameters which fit the experimental data they gathered. Experimentally the concentration of GFP present in the reaction mixture was measured using a spectrometer. A cell-free expression system which consisted of reconstituted components was used, the system was RNAse and protease free (neglecting degradation through proteins) and contained 50 purified components. The model builds on using deterministic rate equations to capture transcription and translation, specifically Michaelis-Menten and Hill kinetics were tested. Transcription Dynamics [math]\displaystyle{ \dot{mRNA} = \frac{k_ts \cdot TsR \cdot DNA}{K_s + DNA} - \delta_{mRNA} \cdot mRNA }[/math]
Translation Dynamics [math]\displaystyle{ \dot{GFP} = \frac{k_tl \cdot TlR \cdot mRNA}{K_l + mRNA} - k_{mat} \cdot GFP }[/math]
[math]\displaystyle{ \dot{GFP}^* = k_mat \cdot GFP }[/math]
Prefactors [math]\displaystyle{ \dot{TsR} = - \frac{k_Cs \cdot TsR \cdot DNA}{K_s + DNA} }[/math]
[math]\displaystyle{ \dot{TlR} = - \frac{\delta_{TlR} \cdot TlR}{K_{TlR} + TlR} }[/math]
The transcription and translation dynamics are represented through Michaelis-Menten like reactions. The authors note that since no proteases were present in the lysate GFP degradation has been ignored, so applying this model to a normal cell-free system would require the addition of a protein degradation term. Models of this type fit the early and late phases of biosynthesis reactions whilst remaining coarse grained (i.e. simple). Variations of TX-TL Models Involving Michaelis-Menten KineticsAuthors have built upon the model above leading to more detailed descriptions of the TX-TL processes[1]. Tuza et al.[8] added reversible binding of RNA Polymerase, a reversible binding of the first amino acid as well as an irreversible elongation step. Similar reversible binding reactions were applied to the translational equation. Moore et al.[9] also developed a model which took the above into account, as well as the sharing of the NTP energy source (i.e. an energy source for cellular reactions). Both models mentioned in this section led to more accurate predictions of cell-free systems due to the additional properties of the cell-free system that they captured. Resource CompetitionThe following resources are limited in a typical cell-free system[1]: NTPs, ribosomes, RNA polymerase, elongation factors, chaperone proteins, tRNA synthase. It follows that in a typical cell-free system the reason for phase 2 (constant rate of protein production) is because of a limitation of at least one of the mentioned resources. The models mentioned below describe competition for fixed amounts of transcriptional and translational resources only, i.e. they do not take into account metabolite production or consumption. TX-TL Models Which Consider Resource CompetitionAlgar, Ellis and Stan[10] produced a model which focuses on translational elongation in chassis cells, assuming that the resources for transcription are far less restricted than those for translation. The model, which is derived from first principles in their paper, is a probabilistic model which uses deterministic nonlinear ordinary differential equations. In a separate paper[11], some of the same authors tested the existing model in a cell-free system. The paper focussed on using cell-free assay to predict the demands of expressing a protein in Escherichia Coli (in vivo), their findings showed that the in-vitro system could accurately predict the in vivo results. Given the current global situation, this paper suggests that when testing our in-silico model, we can use similar in vivo data as a suitable alternative to gathering in vitro measurements. However, this raises the question that if in vitro models can be tuned using in vivo data from the equivalent system, why can the parameters of the standard model of gene expression not be tuned to suit in vitro systems? Borkowski et al. also found that increasing DNA concentration only increases protein production until a saturation point has been reached. They suggest that this could owe to the higher transcriptional rate using up resources shared between the translation and transcription portions of the system, one such example is GTP. GTP is used as a nucleotide during transcription as well as an energy source when elongation occurs during translation. Gyorgy and Murray[12] proposed a dynamic model which focused on resource competition, namely the limited availability of RNA polymerase and ribosomes. The equations are outlined below: Transcription Dynamics [math]\displaystyle{ b_i + p \underset{k_i^+}{\stackrel{k_i^-}{\rightleftharpoons}} c_i \qquad and \qquad c_i \stackrel{\alpha_i}{\rightarrow} b_i + p + m_i }[/math]
Translation Dynamics [math]\displaystyle{ m_i + r \underset{k_i^+}{\stackrel{k_i^-}{\rightleftharpoons}} d_i \qquad and \qquad d_i \stackrel{\alpha_i}{\rightarrow} m_i + r + x_i }[/math]
Degradation Dynamics [math]\displaystyle{ m_i \stackrel{\delta_i}{\rightarrow} 0 }[/math]
[math]\displaystyle{ d_i \stackrel{\delta_i}{\rightarrow} r }[/math]
[math]\displaystyle{ x_i \stackrel{\gamma_i}{\rightarrow} 0 }[/math]
Resulting model [math]\displaystyle{ \dot{b}_i = -(k_i^+ pb_i - k_i^-c_i) + \alpha_i c_i, }[/math]
[math]\displaystyle{ \dot{c}_i = -(k_i^+ pb_i - k_i^-c_i) - \alpha_i c_i, }[/math]
[math]\displaystyle{ \dot{m}_i = \alpha_i c_i - \delta_i m_i - (k_i^+ m_ir - k_i^-d_i) + \beta_i d_i, }[/math]
[math]\displaystyle{ \dot{d}_i = (k_i^+ m_ir - k_i^-d_i) - \beta_i d_i - \delta_i d_i, }[/math]
[math]\displaystyle{ \dot{x}_i = \beta_i d_i - \gamma_i x_i, }[/math]
This model reaches a quasi-steady state but only through assuming that the rates of protein production and decay are far slower than the rates of the binding and unbinding reactions. The authors also note that this model is only accurate until phase 2 (constant rate of protein production). MetabolismKoch et al.[1] mentions that the approaches discussed above can be less accurate when used on cell-free metabolic engineering systems. This is because the dynamic models do not consider metabolites as part of the limited resources, cell-free metabolic engineering systems (CFME) often compete for metabolites. Constraint-based models are used to predict the metabolic production of the entire CFME system and dynamic models are used for the transcription and translation dynamics of individual proteins[13]. It is important to consider metabolic activity on a system level since each protein can have varying dynamics along the pathway, but the rate of metabolite production is still governed by the rate limiting step. In their review Koch et al.[1] notes that the field of modelling CFME systems is in its infancy and so very few models exist at the moment. They also note that despite these being constraint based models, only a few constraints have been identified, this raises the question of how accurate these models are in general systems. Implementing ModelsA handful of ‘toolboxes’ have been created by research groups to aid the implementation of models. These contain pre-written functions, usually in MATLAB, which represent each protein in the system[8] [14]. Obtaining Parameter ValuesParameters that cannot be measured through experiments must be estimated by using software to fit the parameters of the model to the relevant experimental data. The simplest algorithm is the Random Walk. This takes random steps over a given range of numbers and uses this value for the parameter. The error between the prediction and the experimental data is then measured. This repeats and the algorithm is stopped after a set time and the best parameters are taken. Moore et al.[9] mentions two other methods: Bayesian Statistical Inference and Maximum likelihood point estimates. The first involves treating each parameter as a random variable. This means that each parameter will have a probability density function, these become narrower as more data is added, pointing to the best values. The author states that maximum likelihood point estimates have some shortcomings in complex parameter estimation situations. Model Types SummaryModels based on Michaelis-Menten kinetics fit the early and late stages of cell-free systems well. They remain accurate and coarse-grained throughout this. Models which focus on resource competition in cell-free are accurate until phase 2 (constant rate of protein production). These usually only require parameters which can be measured experimentally, and in some cases remove the need for parameter estimation. References
Paper ReviewsIntroductionThe aim of this section is to provide a high level overview of papers relating to the project. The views outlined here are purely the interpretation of the group. Quantitative Modelling of Transcription and Translation of an all-E. coli cell-free systemIntroduction
TX: Transcription
Calculating the mRNA steady state
TL: TranslationThe following section contains a summary of the translation part of this TXTL system; the parameters, how they were measured/estimated/determined and the derived equations.
Rate of deGFPdark production: Steady state for deGFPdark: For low plasmid concentrations [P70] < 1 nM, one can expect KM, R/R0 << 1. Hence, we get: A simple expression for the linear accumulation of deGFPmat at low plasmid concentrations is then:
(taking kdeg,m = 8.25 10−4 s−1 for the deGFP mRNA)
Parts Combination and Sensitivity Analysis
Strengths of synthetic vs natural regulatory elements
TXTL load calculator
Running the model derived in this paper using the group's own codeThe commented code for this paper can be found in the following Google Colab Notebook: https://colab.research.google.com/drive/1FfpIHa75qd5xZmYeDokivv2ZSVVmwu21?usp=sharing Version 1The group was advised to follow the industry standard of TX-TL simulation and code the model using Python. Google Collaboratory was used to enable several group members easily to work on the same code in a Jupyter Notebook style document. In the first version of the code (Noireaux(V1).ipynb) a function was written to calculate constants E0 (concentration of free RNAP), R0 (concentration of free ribosomes) and mSS the mRNA concentrations at the steady state. The naming convention used was “calc_” followed by the name of the constant. It was often necessary to solve an equation which was set equal to zero to obtain the value of these constants, to do this the solveset function from the sympy library was used. It was necessary to isolate the positive real roots following the use of the solveset function. The three ODEs (equations 17, 18 and 19) in the paper were modelled using a function representing the 3-equation system (function model) and the function was solved (function txtl) using the odeint function from the scipy library. A second ODE system was added to the model (functions degrad_model and degrad_txtl), in which a linear degradation term was added to equation 19. This meant that the concentration of deGFPmat no longer increased infinitely. This system of equations was also solved using the odeint function. The matplotlib library was used to plot the various graphs found in the paper. Comparisons of the group’s code and the figures from the Noireaux paper can be found in the figures below: Whilst the code captures the correct shape of the relevant figures in the paper, the values predicted by the code are not correct. Version 2 of the code will address this. We have confirmed that the graphs in the paper are indeed experimental data and not from the TXTL model Version 2, 3 and 4In later versions of the code, a parameter vector was used for clarity. This included all the previously used parameters. The ODEs were rewritten into sums of linear rates, where each rate represents a different chemical reaction. Sensitivity analysis was also added by incorporating parameters for P70b and its corresponding UTR2 and P70c and corresponding UTR3. General code readability improvements were also carried out: including more functions and comprehensions. Five different functions were created to compute the concentrations: E0, R0, Mss and deGFPss. A further new function was created to modify the parameters corresponding to the chosen promoter, which changes the kcat,m and kcat,p values in the parameter vector. Finally, a new parameter was added to the TXTL model to incorporate degradation in the sum of linear rates.
| ||||||||||||||||||||||||||||||
Recently Edited Notebook Pages |