Metabolic engineering, which draws upon the key engineering principles of integration and quantification, is a platform technology that provides solutions to practical problems in the context of systems and synthetic biology. In particular, we are interested in developing and applying systematic and combinatorial methods for strain improvement. We would also like to extend these methods to the study of fundamental biology while still meeting practical demands in microbial metabolic engineering. Ultimately, these new methods and the knowledge gained from them will contribute to the emerging field of biological engineering. Among the many applications of metabolic engineering, the production of bioenergy and biochemicals is the primary application of our research program.
The overall goals of our research are (1) to develop useful and efficient computational and experimental tools for dissecting complex metabolic networks in microbial cells, and (2) to create optimal strains for biotechnological processes using these tools. Our current research program can be summarized as follows:
Metabolic engineering 2.0
Metabolic engineering is an optimization process for strain improvement. Because no naturally occurring microorganism exhibits an optimal phenotype for a designed bioconversion process, the metabolic activities of the microbial strain used in that process must be optimized for the purpose. To this end, recombinant DNA technology (genetic engineering) has been employed to elicit optimal phenotypes by amplifying or reducing specific metabolic activities. With rapid advances in molecular biology, it has become relatively easy to introduce heterologous genes or delete endogenous genes to implement an intended phenotype. However, the resulting phenotypes are often suboptimal because of distant effects of the genetic modifications or unknown regulatory interactions. For instance, three genes (XYL1, XYL2, and XYL3) from Pichia stipitis, a xylose-fermenting yeast, were introduced into S. cerevisiae to add the xylose metabolic pathway. The engineered S. cerevisiae strain was able to assimilate xylose as intended, but ethanol production from xylose was unexpectedly insignificant. This is a typical pitfall of metabolic engineering 1.0, in which target genes were engineered on the basis of rational, hypothesis-driven methods. We now understand that overexpression of the three genes (XYL1, XYL2, and XYL3) is a necessary but not a sufficient condition for efficient ethanol production from xylose. As such, an open question in metabolic engineering 2.0 is how to identify gene targets that have a direct or indirect impact on a particular phenotype of interest. To address this problem, we propose two complementary approaches: systematic and combinatorial search of gene targets. In systematic search, global stoichiometric modeling can be employed to analyze the stoichiometric interactions of putative gene targets using appropriate objective functions and constraints.
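One common way to write such a stoichiometric model is as a linear program, often called flux balance analysis. The formulation below is a generic sketch, not the specific model used in this work. The symbols S (stoichiometric matrix), v (vector of reaction fluxes), c (objective weights), and the flux bounds are introduced here purely for illustration.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& v^{\min}_j \le v_j \le v^{\max}_j \quad \text{for every reaction } j .
\end{aligned}
```

Under these assumptions, deleting a candidate gene target can be simulated by setting both bounds to zero for the reactions that depend on that gene and solving the problem again. Comparing the resulting optimal objective values gives one way to rank putative targets.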
Systematic search is hampered by the lack of comprehensive and accurate metabolic models that capture both reaction kinetics and genetic regulation. An alternative method for identifying gene targets, termed combinatorial search, takes advantage of recent advances in high-throughput screening and traceable genetic perturbation. In this method, a perturbation library (either overexpression or knockout) is first screened for a desirable phenotype, and the genetic modifications responsible for that phenotype are then traced by sequencing or microarray-based methods. We have exemplified both methods in the context of xylose fermentation by recombinant S. cerevisiae and lycopene production by recombinant E. coli. We will continue to identify novel gene targets and genetic (metabolic) networks that affect the yield and rate of xylose fermentation in recombinant S. cerevisiae through systematic and combinatorial search.
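The screen-then-trace logic of the combinatorial search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the clone identifiers, gene names, phenotype values, and the 1.25-fold selection threshold are all hypothetical placeholders, and the lookup from clone to perturbed gene stands in for the sequencing or microarray step.

```python
from collections import Counter

# Hypothetical screening data: clone id -> (perturbed gene, measured phenotype).
# Gene names and values are placeholders, not results from this research program.
LIBRARY = {
    "clone01": ("geneA", 1.02),
    "clone02": ("geneB", 1.45),
    "clone03": ("geneC", 0.97),
    "clone04": ("geneB", 1.38),
    "clone05": ("geneD", 1.31),
    "clone06": ("geneA", 0.99),
}
CONTROL_PHENOTYPE = 1.00  # phenotype of the unperturbed parent strain (assumed)


def screen(library, control, min_fold=1.25):
    """Return clones whose phenotype is at least min_fold times the control."""
    return {
        clone: (gene, value)
        for clone, (gene, value) in library.items()
        if value >= min_fold * control
    }


def trace_targets(hits):
    """Count how often each perturbed gene appears among the selected clones."""
    return Counter(gene for gene, _ in hits.values())


hits = screen(LIBRARY, CONTROL_PHENOTYPE)
targets = trace_targets(hits)
for gene, count in targets.most_common():
    print(gene, count)
```

Genes that recur among independently selected clones are stronger candidates than genes seen only once, which is why the sketch counts occurrences instead of simply listing the hits.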
Multi-dimensional search of genetic perturbations
The above-mentioned studies relied on one-dimensional searches, in which only one type of perturbation was probed to improve a metabolic phenotype. One of the open questions in the search for genetic targets for metabolic engineering is how to identify multiple targets. In principle, there should be a set of genes that confers the optimal phenotype. However, such an optimal set cannot be discovered experimentally because it is impossible to construct and evaluate all possible combinations of these genes. A proper search strategy is therefore required to identify multiple targets. We demonstrated a sequential search method in which additional gene targets are explored after the targets identified in previous rounds of the search have been implemented. Although successful in improving product synthesis, such methods are still limited in that they probe the sequence space along a pre-specified, sequential, gene-by-gene trajectory. There is therefore no assurance that such searches can reach a global phenotypic maximum. To expand the sequence space over which gene target searches are conducted, we propose a hybrid approach that combines systematic and combinatorial search methods. We have demonstrated the improvement of lycopene production in Escherichia coli using a strategy based on a two-dimensional search. The search could be extended into three or four dimensions by adding further axes. For instance, knockdown, rather than deletion, of the knockout gene targets, or more precise modulation of the overexpression targets, could be included as a new axis. By adding another layer of complexity through multi-dimensional search, we will be able to explore a larger mutant phenotype space and even understand complex interactions that might be undetectable in one-dimensional searches.
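The limitation of a gene-by-gene trajectory can be illustrated with a toy example. The sketch below is not the search procedure used in this work: the four gene names, their individual effects, and the two pairwise interaction terms are invented for illustration, and the assumed genome size of 4,000 genes only conveys the scale of the combinatorial problem.

```python
from itertools import combinations
from math import comb

GENES = ["g1", "g2", "g3", "g4"]  # hypothetical candidate targets

# Toy phenotype model (assumed): each gene has an individual effect, and one
# pair (g3, g4) interacts synergistically although neither helps much alone.
SINGLE_EFFECT = {"g1": 3.0, "g2": 2.0, "g3": 0.5, "g4": 0.5}
PAIR_EFFECT = {frozenset(["g3", "g4"]): 6.0, frozenset(["g1", "g2"]): -2.0}


def phenotype(genes):
    """Score a collection of perturbed genes under the toy model."""
    score = sum(SINGLE_EFFECT[g] for g in genes)
    for pair, effect in PAIR_EFFECT.items():
        if pair <= set(genes):
            score += effect
    return score


def sequential_search(candidates, rounds):
    """Add the single best gene per round, keeping earlier choices fixed."""
    chosen = []
    for _ in range(rounds):
        remaining = [g for g in candidates if g not in chosen]
        best = max(remaining, key=lambda g: phenotype(chosen + [g]))
        chosen.append(best)
    return chosen


def exhaustive_search(candidates, size):
    """Evaluate every combination of the given size and return the best one."""
    return max(combinations(candidates, size), key=phenotype)


greedy = sequential_search(GENES, 2)
best = exhaustive_search(GENES, 2)
print("sequential:", greedy, phenotype(greedy))
print("exhaustive:", list(best), phenotype(best))

# Number of triple combinations for an assumed genome of 4,000 genes.
n_triples = comb(4000, 3)
print("triple combinations:", n_triples)
```

In this toy landscape the sequential search commits to g1 in the first round and never finds the synergistic pair, while the exhaustive search does. The exhaustive route is only feasible here because there are four candidates; the count of triple combinations shows why it does not scale to a whole genome.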
Recently, we have investigated the relationship between genotype and phenotype in the context of heterologous protein overexpression in E. coli using systematically perturbed mutant collections. Interestingly, the multi-dimensional search revealed many previously unknown gene targets that significantly increase the soluble protein fraction and minimize inclusion body formation in E. coli. We would like to apply this search method to other phenotypes relevant to the overproduction of biomolecules and biochemicals.
Global perturbation of gene expression
Desired phenotypes of strains for bioconversion processes, such as solvent tolerance and resistance to the inhibitors produced during biomass pretreatment, are often multigenic. Because no single gene can confer ethanol resistance or resistance to growth inhibitors through simple perturbation, simultaneous and concerted manipulation of the expression of multiple genes is necessary to create a multigenic phenotype of interest. To this end, we are using artificial transcription factor (ATF) libraries based on zinc finger proteins. We are currently screening ATF libraries to isolate ATFs that elicit beneficial phenotypes in S. cerevisiae, such as sugar tolerance, ethanol tolerance, and resistance to growth inhibitors from biomass pretreatment. The next step is to investigate changes in the transcriptome, proteome, and metabolome upon introduction of the isolated ATFs into yeast. Through this systems-level investigation, we will determine a minimal set of gene targets to be perturbed to create the desired phenotype.