Carmen E. Castaneda: Week 12

Clustering and Gene Ontology Analysis with STEM
Analyzing and Interpreting STEM Results

Select one of the profiles you saved in the previous step for further interpretation of the data. We suggest that you choose one that has a pattern of up- or down-regulated genes at the early (first three) timepoints. Answer the following:

I picked this profile because it showed an interesting pattern of fluctuation at its early timepoints. I wanted to see how the cold shock affected these genes and what kinds of genes were involved. There are 128.5 genes in this profile. There were 82 genes expected to belong to this profile. The p value for the enrichment of genes in this profile is 7.07 x 10^(-7). There are 24 GO terms associated with this profile at p < 0.05. There are zero GO terms associated with this profile at a corrected p value < 0.05.
 * Why did you select this profile? In other words, why was it interesting to you?
 * How many genes belong to this profile?
 * How many genes were expected to belong to this profile?
 * What is the p value for the enrichment of genes in this profile? Bear in mind that in last week's assignment, you computed p values to determine whether each individual gene had a significant change in gene expression at each time point. This p value determines whether the number of genes that show this particular expression profile across the time points is significantly more than expected.
 * Open the GO list file you saved for this profile in Excel. This list shows all of the Gene Ontology terms that are associated with genes that fit this profile.  Select the third row and then choose from the menu Data > Filter > Autofilter.  Filter on the "p-value" column to show only GO terms that have a p value of < 0.05.  How many GO terms are associated with this profile at p < 0.05?
 * The GO list also has a column called "Corrected p-value". This correction is needed because the software has performed thousands of significance tests.  Filter on the "Corrected p-value" column to show only GO terms that have a corrected p value of < 0.05.  How many GO terms are associated with this profile with a corrected p value < 0.05?
 * Select 10 Gene Ontology terms from your filtered list (either p < 0.05 or corrected p < 0.05). Look up the definitions for each of the terms at http://geneontology.org.  Write a paragraph that describes the biological interpretation of these GO terms.  In other words, why does the cell react to cold shock by changing the expression of genes associated with these GO terms?
 * sulfur compound biosynthetic process is the set of chemical reactions that form compounds containing sulfur, such as some amino acids. When the cell is exposed to cold shock, these amino acids no longer need to be made, since there is little cell growth during cold shock.
 * metal ion transport moves metal ions (any metal ion with an electric charge) into and out of the cell. Since the cell is not working at full capacity, there is less need to transport metal ions into or out of the cell.
 * sulfur compound metabolic process is the set of chemical reactions involving the nonmetal sulfur and its compounds. When the cell is cold shocked, these reactions are again less essential, since the cell is in an environment in which it does not require as much sulfur.
 * ion transport moves atoms and small charged particles into, out of, and around the cell through a transporter or pore. In cold shock the cell saves its energy for essential functions, and in that kind of environment the movement of atoms or charged particles might not be crucial.
 * aspartate family amino acid metabolic process is the set of chemical reactions involving amino acids of the aspartate family. Again, since the cell is not growing in cold shock, these amino acids do not need to be produced as much.
 * aspartate family amino acid biosynthetic process is the set of chemical reactions that produce amino acids of the aspartate family. Since the metabolic process is affected, this production is also hindered by the cold shock.
 * transmembrane transport moves a solute from one side of a membrane to the other. Because the cell is trying to survive rather than grow or progress through its life cycle, this function is repressed when the cell is exposed to cold shock.
 * amine metabolic process is the set of chemical reactions involving any weakly basic organic compound that contains an amino group, which includes amino acids. The cell would therefore repress this process in the absence of cell growth.
 * protein folding assists in assembling single polypeptide chains into their correct structures. Since the cell is not growing, it does not need as many new polypeptides, so the genes in this process are affected.
 * cellular ketone metabolic process is the set of chemical reactions involving ketones, a class of organic compounds that contain the carbonyl group. Because carbon is limited during cold shock, the cell does not treat this process as vital, so its genes are repressed.
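
The Excel filtering described in the questions above (p < 0.05, then corrected p < 0.05) could also be done with a short script. This is only a minimal sketch in Python, not part of the assignment. It assumes the GO list was saved as tab-delimited text with columns named "p-value" and "Corrected p-value", and that every value in those columns is a plain number. The "Category Name" column name and all of the numbers in the sample are hypothetical and made up for illustration; they are not my actual STEM output.

```python
import csv
import io

# Hypothetical sample standing in for a GO list saved as tab-delimited text.
# The first three term names come from the list above; the fourth term and
# all p values are invented for illustration only.
SAMPLE = (
    "Category Name\tp-value\tCorrected p-value\n"
    "sulfur compound biosynthetic process\t0.001\t0.35\n"
    "metal ion transport\t0.004\t0.62\n"
    "protein folding\t0.03\t1\n"
    "hypothetical unrelated term\t0.4\t1\n"
)

def filter_terms(handle, column, cutoff=0.05):
    """Return the rows whose numeric value in `column` is below `cutoff`."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if float(row[column]) < cutoff]

# Same two filters as in Excel: raw p value, then corrected p value.
raw = filter_terms(io.StringIO(SAMPLE), "p-value")
corrected = filter_terms(io.StringIO(SAMPLE), "Corrected p-value")

print(len(raw), "terms at p < 0.05")
print(len(corrected), "terms at corrected p < 0.05")
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by `open(path, newline="")`. If the file has extra title rows above the header (the instructions say to select the third row before filtering), those rows would need to be skipped first.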

Using YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes

 * What are the top 10 transcription factors in your results? List them on your wiki page with the percent of the genes in your cluster that they each regulate.


 * Ste12p 31.0 %
 * Rap1p 16.7 %
 * Ino4p 15.9 %
 * Fhl1p 13.5 %
 * Sok2p 11.9 %
 * Phd1p 11.9 %
 * Gcn4p 11.1 %
 * Mbp1p 10.3 %
 * Yap1p 9.5 %
 * Tec1p 9.5 %
 * Abf1p 8.7 %

Yes, Gln3 is on the list. It regulates 3.2 % of the genes in the cluster, which is 4 genes: ERG26, SLT2, DCG1, and MRPL4.
 * Is Gln3 on the list? What percentage of the genes in the cluster does it regulate?  How many genes does it regulate?  What are the names of the genes?

CIN5 CUP9 FHL1 GTS1 HSF1 MSN1 MSN4 NRG1 RAP1 RCS1 REB1 ROX1 RPH1 YAP1 YAP6

I added the top five transcription factors because they had the highest percentages: Ste12p, Rap1p, Ino4p, Fhl1p, and Sok2p.

I tried to get the matrix for this list: CIN5 CUP9 FHL1 GTS1 HSF1 MSN1 MSN4 NRG1 RAP1 RCS1 REB1 ROX1 RPH1 YAP1 YAP6 Ste12 Rap1 Ino4 Fhl1 Sok2 Gln3

but kept getting the same error:

Considering documented regulations suported by direct evidence.

Processing regulations...Could not open file 'http://www.yeastract.com/tmp/RegulationTwoColumnTable_Documented_2011412_732_1633345783.tsv' Generating Transcriptional Regulatory Network static image...

After removing some duplicates, my new list of genes is: CIN5 CUP9 FHL1 GTS1 HSF1 MSN1 MSN4 NRG1 RAP1 RCS1 REB1 ROX1 RPH1 YAP1 YAP6 Ste12 Ino4 Sok2 Gln3 Phd1 Gcn4
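
The duplicates in the first list (for example RAP1 and Rap1, or FHL1 and Fhl1) differ only in capitalization, so they are easy to miss by eye. Below is a minimal sketch in Python of one way to find them, assuming that names differing only in case refer to the same transcription factor. The function name `drop_duplicates` is my own, and I do not know whether the duplicates were actually what caused the YEASTRACT error.

```python
def drop_duplicates(names):
    """Keep the first spelling of each name, comparing case-insensitively."""
    seen = set()
    kept = []
    for name in names:
        key = name.upper()  # RAP1 and Rap1 map to the same key
        if key not in seen:
            seen.add(key)
            kept.append(name)
    return kept

# The list that produced the error, copied from above.
attempted = (
    "CIN5 CUP9 FHL1 GTS1 HSF1 MSN1 MSN4 NRG1 RAP1 RCS1 REB1 ROX1 "
    "RPH1 YAP1 YAP6 Ste12 Rap1 Ino4 Fhl1 Sok2 Gln3"
).split()

cleaned = drop_duplicates(attempted)
print(" ".join(cleaned))
```

This drops the second spellings Rap1 and Fhl1 and keeps everything else in its original order. Phd1 and Gcn4 in my new list were then added by hand to replace them.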