Katherine Grace Johnson Electronic Lab Notebook: Difference between revisions
(Edited April 14, added notes for April 29)
Revision as of 23:32, 29 April 2015
This is my lab notebook
February 6, 2015
Repeat microarray data normalization for the Ontario and GCAT chips, following the protocol Dahlquist:Microarray Data Processing in R. The data were first processed on 1/30/15; the normalization was repeated today so that the protocol could be recorded in this notebook. Both normalized Excel data sheets will be compared to each other and to Natalie's to determine whether normalization differs from computer to computer.
R version 3.1.0 (x64) used
Within Array Normalization for the Ontario Chips and Within Array Normalization for the GCAT Chips (includes between chip normalization)
- Change Directory - Must scroll down to "User" to locate kjohn102, then select folder "Microarray Data"
- to unzip files - right click, 7Zip, Extract here - this will place the unzipped file in the folder you are currently in
- R asks you to call the data file (.script), then an Excel target file (.csv) in which to put the normalized data. These must both be in the same folder (Microarray data), and downloaded before R is run
- Excel files are not generated until both normalizations are run
- Two Excel files generated: GCAT_and_Ontario_Within_Array_Normalization.csv and GCAT_and_Ontario_Final_Normalized_Data.csv. The desired file is GCAT_and_Ontario_Final_Normalized_Data.csv. Rename with suffix _date_GJ
- Created Excel file Comparison_Finalized_Normalized_Data_GJNW_20150206.csv to compare three sets of normalized data: GJ1, GJ2, and NW
- GJ1 vs NW results - average difference on the order of 10^-11
- GJ1 vs GJ2 results - 0 difference
- Another normalization was run, named GJ3. This was compared to GJ2 in the Excel comparison document. Computer restarted, another normalization created - GJ4
- GJ2 vs GJ3 results - 0 difference
- GJ3 vs GJ4 results - 0 difference
Conclusions: Data normalization did not change from trial to trial on the paradoxus computer, regardless of when the normalization was run. Normalization produced a slight difference between the boulardii and paradoxus computers.
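The run-to-run comparison above was done in Excel. As a rough cross-check, the same comparison could be sketched in Python. This is only an illustration under stated assumptions: the first column is taken to be a row ID and every other column a numeric value, and the function names and sample gene IDs are hypothetical, not part of the protocol.

```python
import csv
import io
import math

def read_numeric_cells(csv_text):
    """Parse CSV text into {(row_id, column_name): float} for numeric cells.

    Assumes (hypothetically) that column 1 holds a row ID and the
    remaining columns hold normalized values.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cells = {}
    for row in reader:
        if not row:
            continue  # skip empty lines
        row_id = row[0]
        for name, value in zip(header[1:], row[1:]):
            try:
                number = float(value)
            except ValueError:
                continue  # skip blanks and text such as "NA"
            if not math.isnan(number):
                cells[(row_id, name)] = number
    return cells

def compare_runs(csv_a, csv_b):
    """Return (mean, max) absolute difference over cells present in both runs."""
    a = read_numeric_cells(csv_a)
    b = read_numeric_cells(csv_b)
    diffs = [abs(a[key] - b[key]) for key in a.keys() & b.keys()]
    if not diffs:
        raise ValueError("no shared numeric cells to compare")
    return sum(diffs) / len(diffs), max(diffs)

# Toy data only; these values are not from the actual normalization.
run_1 = "ID,chip1,chip2\nYAL001C,0.5,-1.25\nYAL002W,NA,2.0\n"
run_2 = "ID,chip1,chip2\nYAL001C,0.5,-1.25\nYAL002W,NA,2.5\n"
mean_diff, max_diff = compare_runs(run_1, run_2)
print(mean_diff, max_diff)
```

A mean and maximum of exactly 0 would correspond to the "0 difference" results above, while a mean near 10^-11 would correspond to the difference seen between computers.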
April 14, 2015
Completing Week 11 and Week 12 assignments from [BIOL398]. I will complete statistical testing of wild type data, and generate a network from this data.
Notes for improvement:
- use COUNTIF function instead of filtering the numbers when looking at p-values
- To prepare for analysis in STEM, columns containing #VALUE! had to be removed by using custom filter: does not equal #VALUE!. Remaining number values had to be copied and pasted into a new sheet.
- On Macs, cluster files from STEM are not recognized by Excel. The TextEdit files must be converted to csv by the following procedure:
- Select and copy a tab character, press Command-F, and paste the tab into the top (search) bar
- Click Replace, then type a comma into the replace bar. Click Replace All.
- Save with file extension .csv (type manually if it is not a drop down option)
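The tab-to-comma replacement above could also be scripted. The following is a minimal sketch, assuming the STEM cluster file is plain tab-delimited text; the function names and the sample row are hypothetical.

```python
import csv
import io

def tabs_to_csv(tab_text):
    """Convert tab-delimited text to CSV text."""
    rows = csv.reader(io.StringIO(tab_text, newline=""), delimiter="\t")
    out = io.StringIO(newline="")
    # csv.writer quotes any field that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

def convert_file(src_path, dst_path):
    """Read a tab-delimited file and write it back out as a .csv file."""
    with open(src_path, newline="") as src:
        text = src.read()
    with open(dst_path, "w", newline="") as dst:
        dst.write(tabs_to_csv(text))

# Toy input only; not taken from an actual STEM cluster file.
print(tabs_to_csv("gene\tp_value\nYAL001C\t0.01\n"))
```

Unlike a plain find-and-replace, this quotes fields that already contain commas, so such fields are not split into extra columns when the file is opened in Excel.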
YEASTRACT analysis of profile cluster #45
- 19 significant transcription factors
Sfp1 Fkh2 Yhp1 Yox1 Cyc8 YLR278C Ace2 Rif1 Msn2 Stb5 Asg1 Msn4 Mig2 Swi5 Snf6 Pdr1 Gcr2 Gat3 Mcm1
- Our transcription factors from deletion strains (CIN5, GLN3, HMO1, ZAP1) are not included on this list.
- Use "Only DNA binding evidence" selection choice when generating networks in YEASTRACT
- Network should have 40-60 edges
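The node and edge counts for a candidate network can be checked with a short script. This sketch uses the transcription factor names listed above; the helper names are hypothetical, the adjacency matrix is a toy stand-in rather than the real network, and an edge is assumed to be any nonzero matrix entry.

```python
# Names copied from the YEASTRACT list and the deletion strains above.
SIGNIFICANT_TFS = [
    "Sfp1", "Fkh2", "Yhp1", "Yox1", "Cyc8", "YLR278C", "Ace2", "Rif1",
    "Msn2", "Stb5", "Asg1", "Msn4", "Mig2", "Swi5", "Snf6", "Pdr1",
    "Gcr2", "Gat3", "Mcm1",
]
DELETION_TFS = ["CIN5", "GLN3", "HMO1", "ZAP1"]

def network_nodes(significant, added):
    """Merge two name lists case-insensitively, keeping order, dropping repeats."""
    seen, nodes = set(), []
    for name in list(significant) + list(added):
        key = name.upper()
        if key not in seen:
            seen.add(key)
            nodes.append(key)
    return nodes

def count_edges(adjacency):
    """Count nonzero entries in an adjacency matrix given as a list of rows."""
    return sum(1 for row in adjacency for weight in row if weight != 0)

def edges_in_target(adjacency, low=40, high=60):
    """Check the edge count against the 40-60 edge guideline."""
    return low <= count_edges(adjacency) <= high

nodes = network_nodes(SIGNIFICANT_TFS, DELETION_TFS)
print(len(nodes))  # 19 listed factors + 4 deletion-strain factors

# Toy 3x3 matrix for illustration only.
toy_matrix = [[0, 1, 0], [1, 0, 1], [0, 0, 1]]
print(count_edges(toy_matrix), edges_in_target(toy_matrix))
```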
April 29, 2015
Completing Week 13 and Week 14 assignments from [BIOL398]. I will use profile #45 from the YEASTRACT database as the basis for the network to be run through GRNmap. Including the four deletion strains, this network has 23 nodes and 46 edges.
- Protocol for Week 13 and 14 assignments was followed to produce:
- Outputs keeping b parameter fixed (i.e. fix_b is set to 1 on the optimization_parameters sheet of input workbook)
- Outputs allowing b to be estimated (i.e. fix_b is set to 0 on the optimization_parameters sheet of input workbook)
- Outputs for both runs include:
- Estimation Excel sheet containing estimated production rates, estimated b values (if applicable), and optimized weights for each transcription factor in the network
- Output graphs for each transcription factor in the network
- Both networks were visualized using GRNsight:
- The output Excel sheet can be used in GRNsight with one minor edit: rename "out_network_optimized_weights" to "network_optimized_weights"