LauraTerada Individual Journal Assignment Week 12

From OpenWetWare

Clustering and Gene Ontology Analysis with STEM

Downloading and extracting the STEM software

  1. Follow the download link, register, and download the file.
  2. Unzip the file.
  3. Go to Programs > Accessories > Command Prompt, type: cd Desktop\stem , then press Enter.
  4. Then type: java -mx512M -jar stem.jar -d defaults.txt , then press Enter.

Prepare your microarray data to fit STEM

  1. Insert a new worksheet called "stem" into the Week 9 Assignment Excel workbook and copy the data from "final" into "stem."
  2. Rename "MasterIndex" to "SPOT" and "ID" to "Gene Symbol."
  3. Delete all data columns except the AvgLogFC columns. Rename these columns with the time and units (15m, 30m, etc.). Save as .txt.
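
The worksheet preparation above could also be scripted. The following is a minimal Python sketch, not part of the assignment protocol. It assumes the "final" worksheet has been exported as tab-delimited text and that the averaged columns have names beginning with "AvgLogFC" (for example the hypothetical "AvgLogFC_t15"). The function name `to_stem_table` and the sample row are illustrative only.

```python
import csv
import io


def to_stem_table(tsv_text, time_labels):
    """Convert a tab-delimited export of the "final" worksheet to the STEM layout.

    Assumption: averaged columns start with "AvgLogFC"; every other data
    column is dropped, mirroring the manual steps above.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Keep only the averaged log fold change columns, in their original order.
    avg_cols = [c for c in reader.fieldnames if c.startswith("AvgLogFC")]
    if len(avg_cols) != len(time_labels):
        raise ValueError("need exactly one time label per AvgLogFC column")

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Rename MasterIndex -> SPOT and ID -> Gene Symbol in the header row.
    writer.writerow(["SPOT", "Gene Symbol"] + list(time_labels))
    for row in reader:
        writer.writerow([row["MasterIndex"], row["ID"]] + [row[c] for c in avg_cols])
    return out.getvalue()


# Hypothetical one-gene example with made-up values.
sample = (
    "MasterIndex\tID\tLogFC_t15-1\tAvgLogFC_t15\tAvgLogFC_t30\n"
    "1\tYAL001C\t0.5\t0.42\t-0.10\n"
)
print(to_stem_table(sample, ["15m", "30m"]))
```

The returned text can be written to a .txt file and loaded in STEM the same way as the file saved from Excel.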


  1. Section 1: Click Browse in the main STEM interface window and select the file. Click the radio button No normalization/add 0.
  2. Section 2: select Saccharomyces cerevisiae (SGD). Select No cross references and select No Gene Locations.
  3. Section 3: Clustering Method should say "STEM Clustering Method."
  4. Section 4: Click yellow Execute button to run STEM.

View and Save STEM Results

  1. Click on Interface Options and click the radio button that says "Based on real time." Close the Interface Options window.
  2. Take a screenshot of the window and paste it into PowerPoint.
  3. Click on each of the colored profiles, take a screenshot, and save it in PowerPoint.
  4. Click on "Profile Gene Table" and "Profile GO Table" for each colored profile. Press "Save Table" for each and rename the files with descriptive names. Upload these files to LionShare and provide the link to Dr. Dahlquist and Dr. Fitzpatrick.

Analyzing and Interpreting STEM Results

  1. Select one of the profiles you saved in the previous step for further interpretation of the data.
    • Why did you select this profile? In other words, why was it interesting to you?
      • I chose profile 2 because it shows distinct down regulation up to 60 min (first three time points), and after that there is clear up regulation.
    • How many genes belong to this profile?
      • 126.0 genes
    • How many genes were expected to belong to this profile?
      • 49.8 genes
    • What is the p value for the enrichment of genes in this profile?
      • 2.9E-20 (significant)
    • How many GO terms are associated with this profile at p < 0.05?
      • 4
    • How many GO terms are associated with this profile with a corrected p value < 0.05?
      • 0
    • Select 10 Gene Ontology terms from your filtered list (either p < 0.05 or corrected p < 0.05). Look up the definitions for each of the terms. Write a paragraph that describes the biological interpretation of these GO terms. In other words, why does the cell react to cold shock by changing the expression of genes associated with these GO terms?
      • GO:0005886 (plasma membrane): "The membrane surrounding a cell that separates the cell from its external environment. It consists of a phospholipid bilayer and associated proteins." [[1]]
      • GO:0071944 (cell periphery): "The part of a cell encompassing the cell cortex, the plasma membrane, and any external encapsulating structures." [[2]]
      • GO:0031225 (anchored to membrane): "Tethered to a membrane by a covalently attached anchor, such as a lipid group, that is embedded in the membrane. When used to describe a protein, indicates that none of the peptide sequence is embedded in the membrane." [[3]]
      • GO:0031224 (intrinsic to membrane): "Located in a membrane such that some covalently attached portion of the gene product, for example part of a peptide sequence or some other covalently attached group such as a GPI anchor, spans or is embedded in one or both leaflets of the membrane." [[4]]
    • Explanation
      • I chose a profile that was down regulated until 60 minutes and up regulated afterward. Only 4 GO terms were significant at p < 0.05, and none remained significant at a corrected p < 0.05. All 4 of these GO terms, defined above, describe the cell membrane or proteins associated with it. Cold shock alters membrane composition and function and reduces membrane fluidity, so genes involved in membrane composition, function, and fluidity are expected to change expression. During the first 60 minutes of cold shock, the expression of these genes decreased. After 60 minutes, the cells acclimated to the temperature and the expression of these genes increased. This shift from down regulation to up regulation shows how yeast cells adjust their gene expression during prolonged exposure to cold temperatures.
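
To put the assigned and expected gene counts reported above side by side, a simple ratio can be computed. This is only a descriptive sketch: the function name `fold_enrichment` is hypothetical, and the ratio is not the statistic STEM uses for its p value.

```python
def fold_enrichment(assigned, expected):
    """Ratio of genes assigned to a profile to genes expected in it."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return assigned / expected


# Values reported for profile 2 in this journal entry.
ratio = fold_enrichment(126.0, 49.8)
print(f"{ratio:.2f}")  # prints 2.53
```

For profile 2, roughly two and a half times as many genes were assigned as expected.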

Using YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes

  1. Open the Excel file of the cluster from above. Launch the YEASTRACT database and click on the link to group by TF.
  2. Paste list of genes into ORFs/Genes box. Check box for Check for all TFs. Uncheck box for Indirect Evidence. Click Search.
  3. What are the top 10 transcription factors in your results? List them on your wiki page with the percent of the genes in your cluster that they each regulate.
    • Ste12p 42.4 %; Sok2p 27.2%; Rap1p 25.6%; Skn7p 19.2%; Fhl1p 16.8%; Mbp1p 16.0%; Yap5p 15.2%; Cin5p 15.2%; Tec1p 14.4%; Pdr1p 14.4%
  4. Are Cin5, Gln3, Hmo1, and Zap1 on the list? What percentage of the genes in the cluster do they each regulate? How many genes do they each regulate?
    • Cin5: 15.2%, 19 genes; Gln3: 2.4%, 3 genes; Zap1: 1.6%, 2 genes
  5. Which transcription factors do you want to add to the model and why?
    • Ste12p, Sok2p, Rap1p, Yap5p, and Tec1p, because these transcription factors are not on the given list but are among those in the YEASTRACT results that regulate the highest percentage of genes in profile 2.
  6. Go to the YEASTRACT database and select Generate Regulation Matrix. Paste the list above plus the additional transcription factors. Uncheck the box for Indirect Evidence and select JPEG from Output Image. Generate.
  7. Click link to "Regulation Matrix" and save. Click on Image link and save to Powerpoint.
  8. Open Regulation Matrix file in Excel. Select Column A, select Data tab, select Text to Columns, select Delimited, press Next, then select Semicolon and press Next. Press Finish. Save as .xls.
  9. Make sure there is at least one "1" in the row/column for each transcription factor. If not, choose other transcription factors.
  10. Insert a new worksheet and copy the data into it using Paste Special with the Transpose box checked.
  11. Upload files:
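
Steps 8 through 10 above (splitting on semicolons, checking for at least one "1", and transposing) could also be done in a short script. The sketch below is an assumption-laden illustration, not the assignment's method: it assumes the saved Regulation Matrix is semicolon-delimited text with labels in the first row and first column and 0/1 cells, and that every row has the same number of cells. The function names and the sample labels are hypothetical.

```python
import csv
import io


def parse_matrix(text):
    """Split semicolon-delimited text into a list of rows (Text to Columns)."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    return [[cell.strip() for cell in row] for row in reader if row]


def labels_without_ones(rows):
    """Return (row labels, column labels) that contain no "1" at all."""
    header, body = rows[0], rows[1:]
    empty_rows = [r[0] for r in body if "1" not in r[1:]]
    empty_cols = [
        header[j] for j in range(1, len(header))
        if all(r[j] != "1" for r in body)
    ]
    return empty_rows, empty_cols


def transpose(rows):
    """Swap rows and columns, like Paste Special with Transpose checked."""
    return [list(col) for col in zip(*rows)]


# Made-up three-factor example; real labels come from the YEASTRACT file.
sample = "TF;CIN5;GLN3;ZAP1\nCin5p;1;0;0\nGln3p;0;0;0\nZap1p;1;0;1\n"
rows = parse_matrix(sample)
print(labels_without_ones(rows))
print(transpose(rows))
```

Any label reported by `labels_without_ones` corresponds to a transcription factor with no "1" in its row or column, which step 9 says should be replaced.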