- Assistant Professor
- Department of Biology
- University of North Dakota
- 10 Cornell Street Stop 9019
- Grand Forks, ND 58203
- United States
- Email: manu_dot_manu_at_und_dot_edu
- 2007, Ph.D., Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- 2000, M.B.A., Indian Institute of Foreign Trade, New Delhi, India
- 1997, B.Sc. (Hons) Physics, S.G.T.B. Khalsa College, University of Delhi, Delhi, India
I am interested in understanding how cell fate - the future identity of a cell - is specified during development. Many molecular processes, such as transcriptional regulation, intracellular signaling, chromatin modification, and RNA regulation, are known to be involved in cell-fate specification. In many developmental systems, such as Drosophila segmentation and mammalian hematopoiesis, cell-fate specification is largely governed by networks of cross-regulating transcription factors (TFs). This simplification makes the analysis of such developmental systems more tractable than others.
Transcriptional networks have a multi-level organization. At the network level, each TF can potentially activate or repress many others in the network. At the level of DNA sequence, the expression of each TF gene is driven by many cis-regulatory modules (CRMs) - 500-2000 bp sequences - that cause the gene to be expressed in particular tissues or at specific times. The CRMs themselves have sub-structure - each is composed of several TF binding sites. Understanding cell-fate specification thus requires that we understand the complex behavior of transcriptional networks at each level of detail.
Data-driven mathematical modeling is a powerful tool for gaining insight into how transcriptional networks control cell-fate specification. Most of the underlying biophysical or biochemical parameters of models of gene regulation are hard to measure. To be data driven means that we constrain these parameters from quantitative gene expression data using global nonlinear optimization techniques on a parallel computer. Models inferred in this manner can then be further analyzed or used to simulate conditions not part of the training data to gain insight into network behavior. Below are brief descriptions of my past research and current work.
Stability of gene-network output to gene expression variation
How do continuously varying and noisy inputs elicit discrete and stable responses from gene networks? Cell types are specified by the establishment of different transcriptional programs in cells. In the case of regulative development, cell-cell communication or morphogen gradients instruct cells to activate lineage-specific transcriptional programs. The prevailing view of gradient-directed cell-fate specification, proposed by L. Wolpert as the French-Flag model, is that target genes are turned on at fixed threshold concentrations of the morphogen gradient. This model would explain not only how cell fates are specified, but also why they are discrete. Fixed thresholds are, however, incompatible with the considerable cell-to-cell and individual-to-individual variation in transcription factor concentrations observed in quantitative assays.
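The incompatibility can be made concrete with a minimal sketch. It assumes, purely for illustration, an exponentially decaying gradient and hypothetical values for the decay length and threshold; none of these numbers come from the work described here. Under a fixed threshold, any change in gradient amplitude moves the position at which the target gene switches on.

```python
import math

def boundary_position(amplitude, threshold, decay_length):
    """Position x at which amplitude * exp(-x / decay_length) equals the threshold."""
    return decay_length * math.log(amplitude / threshold)

# Hypothetical illustration values (not measured quantities).
decay_length = 0.2   # in fractions of embryo length
threshold = 0.1      # fixed activation threshold, arbitrary units

reference = boundary_position(1.0, threshold, decay_length)
shifts = {}
for amplitude in (0.8, 1.0, 1.2):
    x = boundary_position(amplitude, threshold, decay_length)
    shifts[amplitude] = x - reference
    print(f"amplitude {amplitude:.1f}: boundary at x = {x:.3f}, shift {shifts[amplitude]:+.3f}")
```

In this toy setting the boundary shift equals the decay length times the logarithm of the relative amplitude change, so embryo-to-embryo variation in the gradient translates directly into variation in boundary position.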
We studied a specific example of the robust specification of cell fates by a noisy morphogen gradient during Drosophila segmentation. Maternal gradients such as Bicoid, thought to activate downstream gap gene targets at a fixed threshold, vary considerably from embryo to embryo. The French-Flag model predicts that gap gene expression patterns should exhibit the same level of variability as Bicoid. The gap gene expression patterns are, however, highly reproducible. Moreover, early gap gene expression patterns exhibit high variability, similar to that of the maternal gradients, which decreases over time.
One of the limitations of traditional genetic analysis is that although one can remove genes one at a time from a gene network, it is hard to deduce how the network works as a whole to produce dynamic phenomena such as the one described above. We adopted a hybrid theoretical and experimental approach designed to overcome such limitations. We built a mathematical model comprising over 200 coupled ordinary differential equations and inferred its parameters from quantitative gene expression data using a global optimization algorithm, simulated annealing. The model reproduced the observed variation of expression patterns without having been provided any prior information about it. This powerful tool allowed us to analyze the relative regulatory contributions of individual genes in the context of the full genetic system. We showed that upstream variation is filtered out by negative feedback loops in the gap gene network. The expression of target genes depends on multiple activating and repressing inputs that respond adaptively and dynamically to upstream variation, leading to reproducible output.
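How negative feedback can filter upstream variation is easiest to see in a toy example. The sketch below is not the gap gene model; it assumes a single hypothetical gene whose product represses its own synthesis through a Hill function, with illustrative parameter values, and compares it with the same gene without feedback.

```python
def steady_output(upstream, feedback, k=0.1, n=2, dt=0.01, steps=2000):
    """Integrate dx/dt = production - x with forward Euler and return the final x.

    With feedback, production = upstream / (1 + (x / k)**n) (self-repression);
    without feedback, production = upstream. All parameters are hypothetical.
    """
    x = 0.0
    for _ in range(steps):
        production = upstream / (1.0 + (x / k) ** n) if feedback else upstream
        x += dt * (production - x)
    return x

def relative_spread(values):
    """Range of the values divided by their mean."""
    return (max(values) - min(values)) / (sum(values) / len(values))

inputs = [0.8, 0.9, 1.0, 1.1, 1.2]   # upstream input varying by +/- 20%
open_loop = [steady_output(u, feedback=False) for u in inputs]
closed_loop = [steady_output(u, feedback=True) for u in inputs]

print("relative spread of input:      ", round(relative_spread(inputs), 3))
print("relative spread, no feedback:  ", round(relative_spread(open_loop), 3))
print("relative spread, with feedback:", round(relative_spread(closed_loop), 3))
```

Without feedback the output simply tracks the input, whereas with self-repression the output varies proportionally less than the input. In the gap gene network the feedback arises from cross-regulation among several genes rather than from a single self-repressing gene.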
To gain a deeper understanding of how these feedback loops work, we used nonlinear systems theory to show that gene expression variation was reduced over time because the developmental trajectory of each cell was attracted to stable steady states or a stable trajectory encoded by the feedback loops. These results provided a theoretical insight into cell type specification in the segmentation system. The complex process of the initial specification of about 60 cells fated to be part of the future thorax and abdomen, involving seven genes and approximately 30 regulatory interactions, could be understood in terms of just three dynamical mechanisms: selection of stable states, a bifurcation, and a stable trajectory [3, 4].
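The first of these mechanisms, attraction to a stable steady state, can be illustrated with a minimal sketch. It assumes a generic two-gene mutual-repression circuit with hypothetical parameters, not the seven-gene model itself, and follows several trajectories that start from different initial expression levels.

```python
def simulate_toggle(u0, v0, a=4.0, n=2, dt=0.01, steps=3000):
    """Forward-Euler integration of a symmetric two-gene mutual-repression circuit.

    du/dt = a / (1 + v**n) - u,  dv/dt = a / (1 + u**n) - v
    Returns the list of (u, v) states, including the initial one.
    """
    u, v = u0, v0
    history = [(u, v)]
    for _ in range(steps):
        du = a / (1.0 + v ** n) - u
        dv = a / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
        history.append((u, v))
    return history

# Hypothetical initial conditions: variable u, identical v.
initial_u = [1.5, 2.0, 2.5, 3.0]
trajectories = [simulate_toggle(u0, 0.5) for u0 in initial_u]

def spread_at(step):
    """Range of u across all trajectories at a given integration step."""
    values = [trajectory[step][0] for trajectory in trajectories]
    return max(values) - min(values)

for step in (0, 100, 500, 3000):
    print(f"t = {step * 0.01:5.1f}: spread in u = {spread_at(step):.6f}")
```

All trajectories start on the same side of the circuit's separatrix, so they converge to the same attractor and the initial variation in u is lost. The bifurcation and the stable trajectory mentioned above are not captured by this toy circuit.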
Stability conferred by cis-regulatory sequences and dosage compensation
We experimentally investigated the question of whether, besides network regulation, cis regulation is also optimized to produce stable gene expression. We took the approach of perturbing the cis regulation of the segmentation gene even-skipped (eve), a model for eukaryotic gene regulation, at the resolution of single binding sites. One major goal was to not just detect the small quantitative effects of such perturbation on gene expression, but also to characterize the functional impact on downstream development.
To achieve these ends, we replaced the endogenous locus with a precisely engineered one, generated using recombineering. The engineered version lacked certain binding sites and was tagged with a fast-maturing (∼10 min half-life) version of Yellow Fluorescent Protein. We developed a protocol to acquire gene expression movies from living embryos and an image-processing algorithm for automated extraction of quantitative expression data. Using these technologies, we showed that enhancer structure is, in fact, optimized to buffer genetic and environmental perturbations.
Because we could track the secondary effects of the mutation on eve’s targets and on viability, and could make precise measurements, we made a surprising discovery. We found that eve, an autosomal gene, is expressed in a sex-specific manner even though segmentation itself is not sex specific. Using genetic analysis, we were able to show that this difference arises from the incomplete dosage compensation of an X-linked transcription factor. This work implies that the upregulation of transcription on the X chromosome is not sufficient for complete dosage compensation, and that additional pathway-specific autosomal regulation is necessary.
Reverse engineering gene regulation from high-throughput data
Most recently I have been developing methodologies in collaboration with Eric Bertolino at the University of Chicago for reverse engineering gene regulation. Despite the availability of genome-wide gene expression and ChIP datasets for the past decade or so, the inference of cis regulation remains a low-throughput and time-consuming process. I have developed a new modeling-based approach – which takes into account biophysical and phenomenological rules of transcription factor binding and interaction – to infer cis-regulatory logic from genomic high-throughput datasets. This methodology is quite general and should be applicable in a wide variety of biological contexts.
The approach combines a mechanistic model of transcription with high-throughput gene expression and reporter-activity data. It can be broken down into three steps. 1) Identify putative CRMs using evolutionary conservation and ChIP-seq data and assay their activity quantitatively using reporter assays. 2) Construct a series of mathematical models that can predict the activity of CRMs by calculating TF occupancy using the principles of statistical thermodynamics and then simulating mechanisms of activation and repression. 3) Reverse engineer cis-regulatory logic by identifying the model that best explains the activity data. This is achieved by determining model parameters using simulated annealing.
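Steps 2 and 3 can be sketched in miniature. The example below is an illustration under strong simplifying assumptions, not the models used in this work: a single binding site whose occupancy follows from a two-state partition function, CRM activity taken to be proportional to occupancy, synthetic noise-free data, and a basic simulated annealing loop with hypothetical settings for the cooling schedule and step size.

```python
import math
import random

def occupancy(conc, k):
    """Fractional occupancy of one site: statistical weight k*c over the partition function 1 + k*c."""
    weight = k * conc
    return weight / (1.0 + weight)

def predicted_activity(conc, log_k, log_a):
    """Toy CRM activity: maximal activity exp(log_a) scaled by site occupancy."""
    return math.exp(log_a) * occupancy(conc, math.exp(log_k))

def cost(params, data):
    """Sum of squared differences between predicted and observed activity."""
    log_k, log_a = params
    return sum((predicted_activity(c, log_k, log_a) - y) ** 2 for c, y in data)

def anneal(data, start, steps=5000, t0=0.1, cooling=0.999, step_size=0.2, seed=0):
    """Basic simulated annealing with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    current = list(start)
    current_cost = cost(current, data)
    best, best_cost = list(current), current_cost
    temperature = t0
    for _ in range(steps):
        candidate = [p + rng.gauss(0.0, step_size) for p in current]
        candidate_cost = cost(candidate, data)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temperature *= cooling
    return best, best_cost

# Synthetic "reporter" data generated from hypothetical true parameters.
true_k, true_a = 2.0, 1.0
concentrations = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
data = [(c, true_a * occupancy(c, true_k)) for c in concentrations]

start = [math.log(0.1), math.log(0.5)]   # deliberately poor initial guess
initial_cost = cost(start, data)
best, best_cost = anneal(data, start)
print("initial cost:", initial_cost)
print("best cost:   ", best_cost)
print("fitted k, a: ", math.exp(best[0]), math.exp(best[1]))
```

Parameters are searched in log space so that the binding constant and maximal activity stay positive. A realistic application would involve many sites, interactions between bound factors, and competing model structures compared by how well each explains the activity data.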
We have applied this approach to decode the cis regulation of three hematopoietic genes, C/EBPα, Egr1, and Egr2, during the differentiation of macrophages and neutrophils from progenitors in an inducible cell differentiation system. The model indicates that C/EBPα has strikingly complex regulation, with multiple activating and repressing elements. It identifies distal CRMs. Surprisingly, the model indicates that the C/EBPα locus harbors a distal element mediating repression by lymphoid factors. Although cross-lineage antagonism is well documented in hematopoiesis, this is the first instance of it being mediated by long-distance repression.
- PMID 18067886
- PMID 19750121
- PMID 19282965
- PMID 21794172
- PMID 22102826
- PMID 23410834
- PMID 23333947
- PMID 23290311
- PMID 20636356
- PMID 19911861
- PMID 15342511
- PMID 15254541