Katherine Grace Johnson Electronic Lab Notebook: Difference between revisions

From OpenWetWare
(Edited April 14, added notes for April 29)

Revision as of 23:32, 29 April 2015

This is my lab notebook

February 6, 2015

Repeated the microarray data normalization for the Ontario and GCAT chips, following the protocol Dahlquist:Microarray Data Processing in R. The data were first processed on 1/30/15; the normalization was repeated today so that the protocol could be recorded in this notebook. Both normalized Excel data sheets will be compared to each other and to Natalie's to determine whether normalization differs from computer to computer.

R version 3.1.0 (x64) was used.

Within Array Normalization for the Ontario Chips and Within Array Normalization for the GCAT Chips (includes between chip normalization)

  • Change Directory: scroll down to "User" to locate kjohn102, then select the folder "Microarray Data"
  • To unzip files: right-click, 7-Zip, Extract Here. This places the unzipped file in the current folder
  • R asks for the data file (.script), then for an Excel target file (.csv) in which to put the normalized data. Both must be in the same folder (Microarray Data) and must be downloaded before R is run
  • Excel files are not generated until both normalizations have been run
  • Two Excel files are generated: GCAT_and_Ontario_Within_Array_Normalization.csv and GCAT_and_Ontario_Final_Normalized_Data.csv. The desired file is GCAT_and_Ontario_Final_Normalized_Data.csv. Rename it with the suffix _date_GJ
  • Created an Excel file, Comparison_Finalized_Normalized_Data_GJNW_20150206.csv, to compare three sets of normalized data: GJ1, GJ2, and NW
    • GJ1 vs NW results: average difference of 10^-11
    • GJ1 vs GJ2 results: 0 difference
    • Another normalization was run and named GJ3. It was compared to GJ2 in the Excel comparison document. The computer was then restarted and another normalization, GJ4, was created
    • GJ2 vs GJ3 results: 0 difference
    • GJ3 vs GJ4 results: 0 difference

Conclusions: Data normalization did not change from trial to trial on the paradoxus computer, regardless of when the normalization was run. Normalization produced a slight difference between the boulardii and paradoxus computers.
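
The comparison above was done in an Excel workbook. As a rough sketch only, the same kind of check could be scripted. The example below uses Python's standard library and tiny made-up values, not the actual chip data. The function name mean_abs_difference, the column names, and the assumed layout (a header row, an ID in the first column, rows in the same order in both files) are illustrative assumptions and are not part of the protocol.

```python
import csv
import io


def mean_abs_difference(csv_text_a, csv_text_b):
    """Average absolute difference between matching numeric cells of two CSV tables.

    Assumes (hypothetically) a header row, an ID in the first column,
    and rows in the same order in both tables. Non-numeric cells are skipped.
    """
    rows_a = list(csv.reader(io.StringIO(csv_text_a)))[1:]
    rows_b = list(csv.reader(io.StringIO(csv_text_b)))[1:]
    diffs = []
    for row_a, row_b in zip(rows_a, rows_b):
        if row_a[0] != row_b[0]:
            raise ValueError(f"ID mismatch: {row_a[0]} vs {row_b[0]}")
        for cell_a, cell_b in zip(row_a[1:], row_b[1:]):
            try:
                diffs.append(abs(float(cell_a) - float(cell_b)))
            except ValueError:
                continue  # skip blank or non-numeric cells such as "NA"
    return sum(diffs) / len(diffs) if diffs else 0.0


# Made-up illustrative values, not real normalized data.
run_1 = "ID,t15,t30\nGENE_A,0.25,-1.5\nGENE_B,NA,0.75\n"
run_2 = "ID,t15,t30\nGENE_A,0.25,-1.5\nGENE_B,NA,0.75\n"
print(mean_abs_difference(run_1, run_2))
```

Two identical runs give an average difference of zero, which corresponds to the "0 difference" results recorded for the repeated runs on the same computer.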

April 14, 2015

Completing Week 11 and Week 12 assignments from [BIOL398]. I will complete statistical testing of the wild type data and generate a network from these data.

Notes for improvement:

  • Use the COUNTIF function instead of filtering the numbers when looking at p-values
  • To prepare for analysis in STEM, columns containing #VALUE! had to be removed by using the custom filter "does not equal #VALUE!". The remaining number values had to be copied and pasted into a new sheet.
  • On Macs, cluster files from STEM are not recognized by Excel. TextEdit files must be converted to CSV by the following procedure:
    • Select and copy a tab character, press Command-F, and paste it into the top (find) bar
    • Click Replace, then type a comma into the replace bar. Click Replace All.
    • Save with the file extension .csv (type it manually if it is not a drop-down option)
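
The three notes above describe manual steps in Excel and TextEdit. As a hedged sketch, they could also be scripted with Python's standard library. The function names (tab_to_csv, drop_error_rows, count_below), the column names, and the sample values are made up for illustration. The sketch assumes the effect of the custom filter is to drop every row that holds a #VALUE! cell, which may not match exactly how the filter was applied in the workbook.

```python
import csv
import io


def tab_to_csv(tab_text):
    """Convert tab-delimited text to comma-separated text."""
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


def drop_error_rows(rows, marker="#VALUE!"):
    """Keep only rows in which no cell equals the Excel error marker."""
    return [row for row in rows if marker not in row]


def count_below(p_values, threshold):
    """Count p-values below a threshold, similar to COUNTIF(range, "<0.05")."""
    return sum(1 for p in p_values if p < threshold)


# Made-up illustrative table, not real STEM output.
cluster_text = "SPOT\tt15\tt30\nGENE_A\t0.5\t-1.25\nGENE_B\t#VALUE!\t0.75\n"
csv_text = tab_to_csv(cluster_text)
rows = list(csv.reader(io.StringIO(csv_text)))
clean_rows = drop_error_rows(rows)
significant = count_below([0.001, 0.04, 0.2], 0.05)
```

Unlike a plain find-and-replace of tabs with commas, the csv writer quotes any field that already contains a comma, so such fields are not split into extra columns.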

YEASTRACT analysis of profile cluster #45

  • 19 significant transcription factors
    • Sfp1
    • Fkh2
    • Yhp1
    • Yox1
    • Cyc8
    • YLR278C
    • Ace2
    • Rif1
    • Msn2
    • Stb5
    • Asg1
    • Msn4
    • Mig2
    • Swi5
    • Snf6
    • Pdr1
    • Gcr2
    • Gat3
    • Mcm1
  • Our transcription factors from deletion strains (CIN5, GLN3, HMO1, ZAP1) are not included on this list.
  • Use "Only DNA binding evidence" selection choice when generating networks in YEASTRACT
    • Network should have 40-60 edges
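
The 40-60 edge target can be checked by counting edges once a network has been generated. The sketch below is a minimal illustration that represents a network as a dictionary mapping each regulator to the set of its targets. That representation, the function names, and the placeholder node names TF_A, TF_B, and TF_C are assumptions for the example and are not taken from YEASTRACT output.

```python
def count_nodes_and_edges(adjacency):
    """Return (node count, edge count) for a regulator -> targets mapping."""
    nodes = set(adjacency)
    for targets in adjacency.values():
        nodes.update(targets)
    edges = sum(len(targets) for targets in adjacency.values())
    return len(nodes), edges


def edge_count_in_range(edge_count, low=40, high=60):
    """True if the edge count falls within the target range (inclusive)."""
    return low <= edge_count <= high


# Placeholder network with made-up edges, not real regulatory relationships.
toy_network = {
    "TF_A": {"TF_B", "TF_C"},
    "TF_B": {"TF_A"},
    "TF_C": set(),
}
n_nodes, n_edges = count_nodes_and_edges(toy_network)
print(n_nodes, n_edges, edge_count_in_range(n_edges))
```

The toy network is far too small to fall in the target range; a network with, for example, 46 edges would pass the check.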

April 29, 2015

Completing Week 13 and Week 14 assignments from [BIOL398]. I will use the network generated in YEASTRACT from profile #45 as the basis for the network to be run through GRNmap. Including the four transcription factors from the deletion strains (CIN5, GLN3, HMO1, ZAP1) along with the 19 significant transcription factors, this network has 23 nodes and 46 edges.

  • Protocol for Week 13 and 14 assignments was followed to produce:
    • Outputs keeping the b parameter fixed (i.e. fix_b is set to 1 on the optimization_parameters sheet of the input workbook)
    • Outputs allowing b to be estimated (i.e. fix_b is set to 0 on the optimization_parameters sheet of the input workbook)
    • Outputs for both runs include:
      • Estimation Excel sheet containing estimated production rates, estimated b values (if applicable), and optimized weights for each transcription factor in the network
      • Output graphs for each transcription factor in the network
  • Both networks were visualized using GRNsight:
    • The output Excel sheet can be used in GRNsight with one minor edit: change the name of "out_network_optimized_weights" to "network_optimized_weights"