Tessa A. Morris Week 10

Article

Transcriptional Regulatory Networks in Saccharomyces cerevisiae

10 Biological Terms

Nucleate: To form a nucleus; to act as a nucleus (for). source
Chromatin: A complex of nucleic acids (e.g. DNA or RNA) and proteins (histones), which condenses to form a chromosome during cell division. In eukaryotic cells, it is found within the cell nucleus whereas in prokaryotic cells, it is found within the nucleoid. Its functions are to package DNA into a smaller volume to fit in the cell, strengthen the DNA to allow mitosis and meiosis, and to serve as a mechanism to control expression. source
Motifs: The smallest group of atoms in a polymer that, when under the influence of a rotation-translation operator, will assemble the rest of the atoms in the chain. source
Genome-wide location analysis: a tool for identifying protein–DNA interaction sites on a genomic scale source
myc epitope tag: Epitope tagging is a technique in which a known epitope is fused to a recombinant protein by means of genetic engineering. By choosing an epitope for which an antibody is available, the technique makes it possible to detect proteins for which no antibody is available. This is especially useful for the characterization of newly discovered proteins and proteins of low immunogenicity. By selection of the appropriate epitope and antibody pair, it is possible to find a combination with properties that are suitable for the desired experimental application, such as Western blot analysis, immunoprecipitation, immunochemistry, and affinity purification. source
Immunoblot analysis (western blotting): A rapid and sensitive assay for the detection and characterization of proteins that works by exploiting the specificity inherent in antigen-antibody recognition. It involves the solubilization and electrophoretic separation of proteins, glycoproteins, or lipopolysaccharides by gel electrophoresis, followed by quantitative transfer and irreversible binding to nitrocellulose, PVDF, or nylon. source
Peptone: The soluble and diffusible substance or substances into which albuminous portions of the food are transformed by the action of the gastric and pancreatic juices. Peptones are also formed from albuminous matter by the action of boiling water and boiling dilute acids. Collectively, in a broader sense, all the products resulting from the solution of albuminous matter in either gastric or pancreatic juice. In this case, however, intermediate products (albumose bodies), such as antialbumose, hemialbumose, etc., are mixed with the true peptones. Also termed albuminose. Pure peptones are of three kinds, amphopeptone, antipeptone, and hemipeptone, and, unlike the albumose bodies, are not precipitated by saturating their solutions with ammonium sulphate. source
Dextrose: a syrupy, or white crystalline, variety of sugar, C6H12O6 (so called from turning the plane of polarization to the right), occurring in many ripe fruits. Dextrose and levulose are obtained by the inversion of cane sugar or sucrose, and hence called invert sugar. Dextrose is chiefly obtained by the action of heat and acids on starch, and hence called also starch sugar. It is also formed from starchy food by the action of the amylolytic ferments of saliva and pancreatic juice. The solid products are known to the trade as grape sugar; the syrupy products as glucose, or mixing syrup. These are harmless, but are only about half as sweet as cane or sucrose. source
Chromatin immunoprecipitation: detecting interactions between a protein and a DNA sequence in vivo source
Thiamine: chemical name: Thiazolium, 3-((4-amino-2-methyl-5-pyrimidinyl)methyl)-5-(2-hydroxyethyl)-4-methyl- chloride. A B vitamin that prevents beriberi and maintains appetite and growth. More commonly known as vitamin B1 and found commonly in cereals, thiamine acts as a coenzyme used to break down sugars. source
Fhl1 function: Regulator of ribosomal protein (RP) transcription source
Locus: The location of a gene (or of a significant sequence) on a chromosome, as in genetic locus. source
Epitope: That part of an antigenic molecule to which the T-cell receptor responds; a site on a large molecule against which an antibody will be produced and to which it will bind. source
Abf1: functions in transcription, replication, gene silencing, and NER (nucleotide excision repair) in yeast source

Outline

Abstract

The authors determined how most of the transcriptional regulators encoded in Saccharomyces cerevisiae associate with genes across the genome in living cells; this information can be used to describe potential pathways that yeast cells use to regulate global gene expression programs
The scientists used this information to identify network motifs (the simplest units of network architecture) and showed that an automated process can use motifs to assemble a transcriptional regulatory network structure
Results: eukaryotic cellular functions are highly connected through networks of transcriptional regulators that regulate other transcriptional regulators

Introduction

The aim of the paper is to understand how cells control global gene expression programs
Each cell is the product of specific gene expression programs involving regulated transcription of thousands of genes
Transcriptional programs are modified as cells progress through the cell cycle, in response to changes in environment, and during organismal development
Gene expression programs are dependent on the recognition of specific promoter sequences by transcriptional regulatory proteins
Regulatory proteins recruit and regulate chromatin-modifying complexes and components of the transcriptional apparatus
- Knowledge of the sites bound by all the transcriptional regulators encoded in a genome can provide the information necessary to nucleate models for transcriptional regulatory networks
With the availability of complete genome sequences and development of a method for genome-wide binding analysis (genome-wide location analysis), investigators can identify the set of target genes bound in vivo by each of the transcriptional regulators that are encoded in a cell’s genome.

Experimental Design

Used genome-wide location analysis to investigate how yeast transcriptional regulators bind to promoter sequences across the genome
- Figure 1-A:
  - Yeast transcriptional regulators were tagged by introducing the coding sequence for a c-myc epitope tag into the normal genomic locus for each regulator.
  - 106 of the yeast strains contained a single epitope-tagged regulator whose expression could be detected in rich growth conditions.
  - Chromatin immunoprecipitation (ChIP) was performed on each of these 106 strains.
  - Promoter regions enriched through the ChIP procedure were identified by hybridization to microarrays containing a genome-wide set of yeast promoter regions.
Studied all 141 transcription factors listed in the Yeast Proteome Database and reported to have DNA binding and transcriptional activity
Yeast strains were constructed so that each of the transcription factors contained a myc epitope tag.
Epitope tag coding sequences were introduced into the genomic sequences encoding the COOH terminus of each regulator to increase the likelihood that tagged factors were expressed at physiologic levels
Appropriate insertion of the tag and expression of the tagged protein were confirmed by polymerase chain reaction and immunoblot analysis.
Introduction of an epitope tag might have affected the function of some transcriptional regulators
- For 17 of the 141 factors, they were not able to obtain viable tagged cells, despite three attempts to tag each regulator.
Not all the transcriptional regulators were expected to be expressed at detectable levels when yeast cells were grown in rich medium, but immunoblot analysis showed that 106 of the 124 tagged regulator proteins could be detected under these conditions.
Performed genome-wide location analysis experiment for the 106 yeast strains that expressed epitope-tagged regulators.
Each tagged strain was grown in three independent cultures in rich medium (yeast extract, peptone, and dextrose).
Genome-wide location data were subjected to quality control filters and normalized, then the ratio of immunoprecipitated to control DNA was determined for each array spot.
Confidence value (P-value) for each spot from each array was calculated using an error model.
- Data for each of the three samples in an experiment were combined by a weighted average method
- Each ratio was weighted by P-value and then averaged.
- Final P values for these combined ratios were then calculated.
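The replicate-combination step above can be sketched in a few lines. This is only an illustration: the notes say each ratio was "weighted by P-value" but do not give the weighting function, so the use of -log10(P) as the weight is an assumption, and the function name `combine_replicates` is hypothetical. The sketch does not attempt the final P value for the combined ratio, because that depends on the error model, which is not described here.

```python
import math

def combine_replicates(ratios, p_values):
    """Weighted average of replicate IP/control ratios for one array spot.

    ASSUMPTION: each ratio is weighted by -log10(P), so more significant
    replicates count for more. The source only says "weighted by P-value".
    P values must lie in (0, 1].
    """
    weights = [-math.log10(p) for p in p_values]
    total_weight = sum(weights)
    if total_weight == 0:
        # Every replicate had P = 1: fall back to a plain mean.
        return sum(ratios) / len(ratios)
    return sum(w * r for w, r in zip(weights, ratios)) / total_weight

# Toy values (made up): two replicates of one spot.
combined = combine_replicates([1.0, 4.0], [0.1, 0.001])
print(combined)  # about 3.25, pulled toward the more significant replicate
```

With this choice of weight, a replicate with P = 0.001 counts three times as much as one with P = 0.1.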
Error models were used to obtain a probabilistic assessment of regulator location data because of the properties of the biological system of study (cell populations, DNA binding factors capable of binding to both specific and nonspecific sequences) and the expectation of noise in microarray-based data
- Figure 1-B: The total number of protein-DNA interactions in the location analysis data set, using a range of P value thresholds
  - Effect of P-value threshold.
  - The sum of all regulator-promoter region interactions is displayed as a function of varying P value thresholds applied to the entire location data set for the 106 regulators.
  - More stringent P values reduce the number of interactions reported but decrease the likelihood of false-positive results.
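The kind of tally plotted in Figure 1-B can be sketched as follows. The regulator and promoter names and all P values below are made-up toy data, the function name `count_interactions` is hypothetical, and treating the threshold as inclusive (P <= threshold) is an assumption.

```python
def count_interactions(p_values, thresholds):
    """Count regulator-promoter pairs called 'bound' at each threshold.

    p_values maps (regulator, promoter_region) -> P value.
    ASSUMPTION: a pair counts as bound when P <= threshold.
    """
    return {
        t: sum(1 for p in p_values.values() if p <= t)
        for t in thresholds
    }

# Toy data set (hypothetical names and values, not from the paper).
toy_p_values = {
    ("RegA", "promoter_1"): 0.0002,
    ("RegA", "promoter_2"): 0.0008,
    ("RegB", "promoter_1"): 0.004,
    ("RegB", "promoter_3"): 0.03,
    ("RegC", "promoter_2"): 0.2,
}

counts = count_interactions(toy_p_values, [0.05, 0.01, 0.001])
print(counts)  # {0.05: 4, 0.01: 3, 0.001: 2}
```

Tightening the threshold from 0.05 to 0.001 halves the number of reported interactions in this toy set, which is the trade-off the figure illustrates.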

Specific P value thresholds were selected to facilitate discussion of a subset of the data at a high confidence level, but this artificially imposes a “bound or not bound” binary decision for each protein-DNA interaction.
The results obtained were described at a P value threshold of 0.001 because the analysis indicated that this threshold maximizes inclusion of legitimate regulator-DNA interactions and minimizes false positives.
Various experimental and analytical methods indicate that the frequency of false positives in the genome-wide location data at the 0.001 threshold is 6% to 10%
- Conventional, gene-specific chromatin immunoprecipitation experiments have confirmed 93 of 99 binding interactions (involving 29 different regulators) that were identified by location analysis data at a threshold P-value of 0.001.
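As a quick arithmetic check, the confirmation counts above are consistent with the lower end of the 6% to 10% false-positive estimate. The variable names are illustrative only.

```python
confirmed = 93  # interactions confirmed by gene-specific ChIP
tested = 99     # interactions tested at the 0.001 threshold

unconfirmed_fraction = (tested - confirmed) / tested
print(f"{unconfirmed_fraction:.1%}")  # 6.1%
```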
Use of a high-confidence threshold should underestimate the regulator-DNA interactions that actually occur in these cells.
Estimated that about one-third of the actual regulator-DNA interactions in cells are not reported at the 0.001 threshold.

Regulator Density

There were nearly 4000 interactions observed between regulators and promoter regions at a P value threshold of 0.001.
The promoter regions of 2343 of 6270 yeast genes (37%) were bound by one or more of the 106 transcriptional regulators in yeast cells grown in rich medium.
Many yeast promoters were bound by multiple transcriptional regulators (Fig. 2A), a feature previously associated with gene regulation in higher eukaryotes, suggesting that yeast genes are also frequently regulated through combinations of regulators.
- Figure 2-A:
  - Plot of the number of regulators bound per promoter region.
  - The distribution for the actual location data (red circles) is shown alongside the distribution expected from the same set of P values randomly assigned among regulators and intergenic regions (white circles).
  - At a P value threshold of 0.001, significantly more intergenic regions bind four or more regulators than expected by chance.
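The comparison in Figure 2-A can be sketched as below: count the regulators bound per promoter region, then repeat the count after randomly reassigning the same P values among regulators and regions. The matrix is made-up toy data, the function names are hypothetical, and a single shuffle stands in for whatever randomization procedure the authors actually used.

```python
import random
from collections import Counter

def regulators_per_promoter(p_matrix, threshold=0.001):
    """p_matrix[i][j] is the P value for regulator i at promoter region j.

    Returns a Counter mapping 'number of regulators bound' to
    'number of promoter regions with that many regulators'.
    """
    n_regions = len(p_matrix[0])
    bound_counts = [
        sum(1 for row in p_matrix if row[j] <= threshold)
        for j in range(n_regions)
    ]
    return Counter(bound_counts)

def shuffled_matrix(p_matrix, seed=0):
    """Reassign the same P values at random among all matrix cells."""
    rng = random.Random(seed)
    flat = [p for row in p_matrix for p in row]
    rng.shuffle(flat)
    n_cols = len(p_matrix[0])
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]

# Toy matrix (hypothetical): 3 regulators x 4 promoter regions.
toy = [
    [0.0001, 0.5, 0.5, 0.5],
    [0.0002, 0.5, 0.5, 0.5],
    [0.0003, 0.5, 0.5, 0.0004],
]

observed = regulators_per_promoter(toy)
expected = regulators_per_promoter(shuffled_matrix(toy))
print("observed:", dict(observed))
print("shuffled:", dict(expected))
```

Shuffling keeps the total number of interactions fixed, so any excess of heavily bound regions in the observed counts reflects how the interactions are distributed rather than how many there are.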
More than one-third of the promoter regions that are bound by regulators were bound by two or more regulators (P value threshold = 0.001), and, relative to the expected distribution from randomized data, a disproportionately high number of promoter regions were bound by four or more regulators.
Because of the stringency of the P value threshold, this represents an underestimate of regulator density.
- Figure 2-B:
  - Distribution of the number of promoter regions bound per regulator.
The number of different promoter regions bound by each regulator in cells grown in rich medium ranged from 0 to 181 (P value threshold = 0.001), with an average of 38 promoter regions per regulator.
The regulator Abf1 bound the largest number (181) of promoter regions.
Regulators that should be active under growth conditions other than yeast extract, peptone, and dextrose were typically found, as expected, to bind the smallest number of promoter regions.
- Thi2 (which activates transcription of thiamine biosynthesis genes under conditions of thiamine starvation) was among the regulators that bound the smallest number (3) of promoters.
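The per-regulator counts discussed in this section can be tallied with a short sketch. The regulator and promoter names are made-up toy data and the function name `promoters_per_regulator` is hypothetical. The last lines check that the reported average (38 promoter regions across 106 regulators) is consistent with the "nearly 4000" interactions noted earlier, given that 38 is a rounded value.

```python
def promoters_per_regulator(regulators, bound_pairs):
    """Count distinct promoter regions bound by each regulator.

    regulators: all regulators tested (so unbound ones report 0).
    bound_pairs: (regulator, promoter_region) pairs passing the threshold.
    """
    bound = {reg: set() for reg in regulators}
    for reg, promoter in bound_pairs:
        bound[reg].add(promoter)
    return {reg: len(promoters) for reg, promoters in bound.items()}

# Toy data (hypothetical names, not from the paper).
counts = promoters_per_regulator(
    ["RegA", "RegB", "RegC"],
    [("RegA", "p1"), ("RegA", "p2"), ("RegA", "p3"), ("RegB", "p1")],
)
print(counts)  # {'RegA': 3, 'RegB': 1, 'RegC': 0}

mean_bound = sum(counts.values()) / len(counts)

# Consistency check using the figures reported in the notes.
implied_total = 106 * 38
print(implied_total)  # 4028
```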
Identification of the set of promoter regions bound by specific regulators allowed the authors to predict the sequence motifs that are bound by these regulators

Network Motifs

STOP FOR MY PART OF THE PROJECT (finish later)

Assembling Motifs into Network Structures

Coordination of Cellular Processes

Significance of Regulatory Network Information

Questions

What is the main result presented in this paper?
- Created a model of the network of transcriptional regulators for the cell cycle, based on the binding data and the timing of peak expression
- The computational approach correctly assigned all the regulators to stages of the cell cycle, where they were shown to function in previous studies
- Two regulators that have been implicated in cell cycle control but whose functions were ill-defined (35–37) could be assigned within the network on the basis of direct binding data.
- Reconstruction of the regulatory architecture was automatic and required no prior knowledge of the regulators that control transcription during the cell cycle
What is the importance or significance of this work?
- This represents a general method for constructing other regulatory networks
Briefly describe their methods, including the following information. A flow chart may be helpful here.
1. How did they treat the cells (what experiment were they doing?)
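  - Each regulator was tagged with a myc epitope in its own yeast strain (106 strains expressed a detectable tagged regulator). Chromatin immunoprecipitation was used to enrich the promoter regions bound by each regulator in vivo, and hybridization to microarrays was then used to identify those promoter regions.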
2. What strain(s) of yeast did they use? Was the strain haploid or diploid?
  - They used 106 yeast strains, each expressing a single myc epitope-tagged regulator
3. What media did they grow them in? Under what conditions and temperatures?
  - Rich medium (yeast extract, peptone, and dextrose)
4. What controls did they use?
  - Error models were used to obtain a probabilistic assessment of regulator location data because of the properties of the biological system of study (cell populations, DNA binding factors capable of binding to both specific and nonspecific sequences) and the expectation of noise in microarray-based data
5. How many replicates did they perform per condition?
  - Three independent cultures
6. What mathematical/statistical method did they use to analyze the data?
  - A P value for each spot was calculated using an error model; the ratios from the three replicates were weighted by P value and averaged, and final P values were then calculated for the combined ratios
7. What transcription factors did they talk about?
  - Abf1 and Thi2

Presentation

Partners: Kristen M. Horstmann and Lucia I. Ramirez

Biomathematical Modeling Navigation

User Page: Tessa A. Morris
Course Page: Biomathematical Modeling
