Nick Rohacz: Summer 2011 Week 1
May 20, 2011
After attempting to find a match between each .gpr file and the individual LogFC columns in the original raw data some problems occurred. A single .gpr file was chosen to run a few tests on to try and match with one of the columns. The file that was selected in this case was t15/t0 for flask 3 of the Dahlquist wildtype strain. Flask 3 was chosen for a dye swap of Red/Green to Green/Red, however, when computing the LogFC for each spot on the chip, it was discovered that the LogFC was performed using the Red intensity divided by the Green intensity. This will have to be confirmed with Dr. Dahlquist to see if this should be changed at all or if the Excel sheets obtained from May 19th, 2011, a journal for this day does not exist. However, using Red/Green for the fold change, a column was obtained that matched the LogFC originally performed in the .gpr file. The next step was to try and perform this in the statistics processing software R. The original code given was changed to focus on the median, due to the fact that R was formatted to process the mean foreground intensity over the median background intensity. Once this was done, R successfully created a list of LogFC, of which the first five were focused on to compare to the original .gpr file. Of these five, two were duplicates giving three individual genes to be focused on in the file. The first five data points given by R matched up with the first five genes given in the .gpr file, so we know that this was functioning correctly. The main problem that arose was that when the LogFC given by the .gpr file and R and the LogFC found in the raw data were compared, an inconsistency occurred. These numbers did not match, and after running a seperate test on the .gpr file for t30/t0 for the same culture, the same problem occurred.
Nicholas A. Rohacz 13:39, 20 May 2011 (EDT)