Lucia I. Ramirez Week 12

From OpenWetWare
Revision as of 22:03, 13 April 2015 by Lucia I. Ramirez (talk | contribs) (make changes)

Using YEASTRACT to Infer which Transcription Factors Regulate a Cluster of Genes

  1. Opened the gene list (file name: wt_profile37_genelist) in Excel for the profile/cluster that I analyzed for the Week 11 Assignment.
    • Copied the list of gene IDs onto my clipboard.
  2. Launched a web browser and went to the YEASTRACT database.
    • On the left panel of the window, clicked on the link to Rank by TF.
    • Pasted my list of genes from my cluster into the box labeled ORFs/Genes.
    • Checked the box for Check for all TFs.
    • Accepted the defaults for the Regulations Filter ("Documented", "DNA binding plus expression evidence").
    • Did not apply a filter for "Filter Documented Regulations by environmental condition".
    • Ranked genes by TF using: The % of genes in the list and in YEASTRACT regulated by each TF.
    • Clicked the Search button.
  3. Questions:
    • In the results window that appears, the p values colored green are considered "significant", the ones colored yellow are considered "borderline significant" and the ones colored pink are considered "not significant". How many transcription factors are green or "significant"?
      • There are 26 significant transcription factors.
    • List the "significant" transcription factors, along with the corresponding "% in user set", "% in YEASTRACT", and "p value".
      • CIN5, GLN3, HMO1, and ZAP1 are not on the list.
    • How many of the transcription factors appear in both of your lists?
      • SFP1P, YHP1P, YOX1P, YLR278C, MSN2P, and MSN4P appear in both lists, for a total of 6 in common. My wt_list generated 26 transcription factors and Lauren's HMO1_list generated 11. We decided to focus on the mutant list (Lauren's HMO1_list), to which we added CIN5, GLN3, HMO1, and ZAP1 for a total of 15 transcription factors (11 + 4).
    • Went back to the YEASTRACT database and followed the link to Generate Regulation Matrix.
    • Copied and pasted the list of transcription factors I identified (plus CIN5, GLN3, HMO1, and ZAP1) into both the "Transcription factors" field and the "Target ORF/Genes" field.
    • Generated several regulation matrices, with different "Regulations Filter" options.
      • First one: accepted the defaults: "Documented", "DNA binding plus expression evidence"
      • Clicked the "Generate" button.
      • In the results window, clicked on the link to the "Regulation matrix (Semicolon Separated Values (CSV) file)" and saved the file to my Desktop. Renamed this file with a meaningful name.
      • Repeated these steps to generate a second regulation matrix, this time applying the Regulations Filter "Documented", "Only DNA binding evidence".
      • Repeated these steps a third time to generate a third regulation matrix, this time applying the Regulations Filter "Documented", "DNA binding and expression evidence".
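
The comparison of the two "significant" lists and the addition of CIN5, GLN3, HMO1, and ZAP1 can also be checked with a short script. This is only a sketch: the helper names (normalize_tf, shared_tfs, add_required) are hypothetical, the rule for stripping the trailing "p" is an assumption about how the names are written, and the two "only" entries in the toy input are placeholders rather than real YEASTRACT results.

```python
import re

def normalize_tf(name):
    """Upper-case a name and drop a protein-style trailing 'P' (e.g. 'Sfp1p' -> 'SFP1')."""
    name = name.strip().upper()
    # Assumption: only a final 'P' that follows a digit is a protein suffix,
    # so systematic ORF names such as YLR278C are left untouched.
    return re.sub(r"(?<=\d)P$", "", name)

def shared_tfs(list_a, list_b):
    """Return the sorted names that appear in both lists after normalization."""
    return sorted({normalize_tf(n) for n in list_a} & {normalize_tf(n) for n in list_b})

def add_required(tfs, required=("CIN5", "GLN3", "HMO1", "ZAP1")):
    """Return the normalized list plus the required factors, without duplicates."""
    return sorted({normalize_tf(n) for n in tfs} | set(required))

if __name__ == "__main__":
    # Toy input: four of the shared names from this page plus made-up placeholders.
    some_shared = ["SFP1P", "YOX1P", "YLR278C", "MSN4P"]
    wt_list = some_shared + ["WTONLY1P"]     # placeholder, not a real result
    hmo1_list = some_shared + ["MUTONLY1P"]  # placeholder, not a real result

    common = shared_tfs(wt_list, hmo1_list)
    print(len(common), "in common:", common)
    print("list for the regulation matrix:", add_required(hmo1_list))
```

Pasting the full lists exported from the two Rank by TF results in place of the toy lists gives the overlap count and the combined list to paste into Generate Regulation Matrix.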

Analyzing and Visualizing Your Gene Regulatory Networks

I analyzed the regulation matrix files generated above in Microsoft Excel and visualized them using GRNsight to determine which one would be appropriate to pursue further in the modeling.

  1. First, formatted the output files from YEASTRACT. Repeated these steps for each of the three files generated above.
    • Opened file in Excel.
      • The file did not open properly in Excel because a semicolon was used as the column delimiter instead of a comma. To fix this, selected the entire Column A, then went to the "Data" tab and selected "Text to columns". In the Wizard that appears, selected "Delimited" and clicked "Next". In the next window, selected "Semicolon" and clicked "Next". In the last window, left the data format at "General" and clicked "Finish". The sheet then looked like a table with the names of the transcription factors across the top and down the first column, and zeros and ones distributed throughout the rows and columns. This is called an "adjacency matrix." A "1" in a cell means there is a connection between the transcription factor in that row and the one in that column.
    • Saved this file in Microsoft Excel workbook format (.xlsx).
    • Checked to see that all of the transcription factors in the matrix are connected to at least one of the other transcription factors by making sure that there is at least one "1" in a row or column for that transcription factor.
      • If a factor is not connected to any other factor, delete its row and column from the matrix. Make sure that you still have somewhere between 15 and 30 transcription factors in your network after this pruning.
    • For this adjacency matrix to be usable in GRNmap (the modeling software) and GRNsight (the visualization software), the matrix needs to be transposed. Inserted a new worksheet into the Excel file and named it "network". Went back to the previous sheet, selected the entire matrix, and copied it. Went to the new worksheet and clicked on the A1 cell in the upper left. Selected "Paste special" from the "Home" tab and, in the window that appears, checked the box for "Transpose". This pastes the data with the columns transposed to rows and vice versa.
    • Made sure the labels for the genes in the columns and rows match (deleted the "p" from each of the gene names in the columns). Adjusted the case of the labels to make them all upper case.
    • In cell A1, copied and pasted the text "rows genes affected/cols genes controlling".
  2. Looked at network properties. Repeated these steps for each of the three gene regulatory matrices generated above. See this file for an example of how to do the following instructions.
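
The Excel formatting in step 1 (split on semicolons, prune unconnected factors, transpose, clean the labels, fill in cell A1) can be sketched in a few lines of Python as a cross-check. This is a minimal sketch under stated assumptions, not the course's method: it assumes the YEASTRACT file has a header row of target names and one row per transcription factor, with regulators in rows before transposing; it treats a factor that only regulates itself as unconnected; and the function names and the XXA1/XXB2/XXC3 labels are made-up placeholders.

```python
import csv
import io
import re

A1_TEXT = "rows genes affected/cols genes controlling"

def clean_label(name):
    """Upper-case a label and drop a trailing protein-style 'P' (assumed to follow a digit)."""
    return re.sub(r"(?<=\d)P$", "", name.strip().upper())

def load_matrix(text):
    """Parse semicolon-delimited text into (row_labels, col_labels, values)."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=";")
            if any(cell.strip() for cell in r)]       # skip blank lines
    cols = [clean_label(c) for c in rows[0][1:] if c.strip()]
    regs = [clean_label(r[0]) for r in rows[1:]]
    # Slice to the header width so a trailing semicolon does not add an empty cell.
    values = [[int(c) for c in r[1:1 + len(cols)]] for r in rows[1:]]
    return regs, cols, values

def to_network_sheet(text):
    """Return the pruned, transposed matrix as a list of rows for a 'network' sheet."""
    regs, targets, values = load_matrix(text)
    # An edge (regulator, target) exists wherever the raw matrix holds a 1.
    edges = {(r, t) for r, row in zip(regs, values)
             for t, v in zip(targets, row) if v == 1}
    # Keep only factors linked to at least one *other* factor.
    connected = {n for r, t in edges if r != t for n in (r, t)}
    keep = [n for n in regs if n in connected]
    header = [A1_TEXT] + keep
    # Transposed layout: each row is an affected gene, each column a controlling gene.
    body = [[t] + [1 if (r, t) in edges else 0 for r in keep] for t in keep]
    return [header] + body

if __name__ == "__main__":
    # Made-up example: XXA1 regulates XXB2; XXC3 only regulates itself.
    raw = ";XXA1;XXB2;XXC3\nXxa1p;0;1;0\nXxb2p;0;0;0\nXxc3p;0;0;1\n"
    for row in to_network_sheet(raw):
        print(row)
```

Reading one of the saved YEASTRACT files as text and passing it to to_network_sheet should give rows comparable to the hand-built "network" worksheet, which makes it easy to spot a missed label or a factor that should have been pruned.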
