Katrina Sherbina: Week 3

06/01/2011
A master Excel workbook was created containing four spreadsheets: the GCAT chip normalizations that a coworker completed today, the within-array normalizations done last week for all of the Ontario chips, the between-array normalizations for the Ontario chips, and a spreadsheet integrating the GCAT chip normalizations with the within-array normalizations for the Ontario chips.

To create the integrated spreadsheet mentioned above, Microsoft Access was used to merge the normalized data from the GCAT and Ontario chips, eliminating any genes in the GCAT chips that were not also in the Ontario chips. The steps are as follows.
 * 1) Save the Excel files of the data for the GCAT and Ontario chips as tab-delimited files.
 * 2) Create a new database in Access.
 * 3) Import the data (File->Get External Data->Import)
 * 4) Go through the Import Wizard: specify the data as delimited, keep the delimiter as tab and the text qualifier as none, indicate that the first row contains field names, and choose the ID column (the gene names) as the primary key. Run the Import Wizard twice, once with the GCAT data and once with the Ontario data.
 * 5) In the window for the current database, go to queries and select "Create query in Design View".
 * 6) Add both imported tables (the GCAT and Ontario).
 * 7) In the "Select Query" window, join GCAT ID and Ontario ID by dragging a line between them. Right-click on the line, choose "Join Properties", and select the third option (to include all records from the Ontario data and only those records from the GCAT data that are also in the Ontario data).
 * 8) Select all of the fields in the GCAT query window and drag into the first box in the "Field" row in the table below.
 * 9) Select all of the fields in the Ontario query window and drag into the next free box in the "Field" row in the table below.
 * 10) Create a new table for this joined data (Query->Make-Table Query).
 * 11) Copy and paste the new table into a new Excel spreadsheet.
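
The same kind of join can also be sketched in R, as a rough alternative to the Access steps above. The sketch assumes two small, made-up data frames (`ontario` and `gcat`) that share an `ID` column of gene names; the gene names, column names, and values are all hypothetical and only stand in for the imported tab-delimited files.

```r
# Hypothetical toy data standing in for the two tab-delimited files.
# With real files, read.delim() reads a tab-delimited table with a header row.
ontario <- data.frame(ID = c("geneA", "geneB", "geneC"),
                      ontario_M = c(0.12, -0.40, 1.05),
                      stringsAsFactors = FALSE)
gcat <- data.frame(ID = c("geneB", "geneC", "geneX"),
                   gcat_M = c(-0.35, 0.98, 0.20),
                   stringsAsFactors = FALSE)

# Keep every Ontario gene and attach GCAT values where the ID matches.
# GCAT-only genes (here "geneX") are dropped, and Ontario genes that are
# missing from GCAT get NA in the GCAT column.
merged <- merge(ontario, gcat, by = "ID", all.x = TRUE)
print(merged)
```

Setting `all.x = TRUE` plays the role of the third "Join Properties" option: all rows of the first table are kept, and only matching rows of the second. The result could then be written out with `write.table()` using `sep = "\t"` and opened in Excel.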

Katrina Sherbina 20:27, 1 June 2011 (EDT)

06/02/2011
It was found that the MA plots generated before and after within-array normalization corresponded only to the first microarray chip (the first GPR file) in the targets file imported into R. New code was written to generate MA plots before and after within-array normalization for all of the GPR files in the targets file. The number of rows and columns of MA plots, the number of iterations of the for loop, and the limits of the y-axis were altered for each strain.

par(mfrow=c(3,5))
# MA plots before within-array normalization
for (i in 1:14) {plotMA(RG[,i],ylim=c(-4,4))}
# MA plots after within-array normalization
for (i in 1:14) {plotMA(MA[,i])}
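
For reference, the quantities shown in an MA plot can be computed by hand. The sketch below is a base-R illustration that does not use limma: the matrices `R` and `G` are made-up intensities (one column per chip), and the grid size is chosen from the number of chips with `n2mfrow()` instead of being typed in for each strain.

```r
# Hypothetical red and green intensities: 4 spots (rows) x 3 chips (columns).
R <- matrix(c(400, 800, 1600,  200,
              500, 250, 1000, 2000,
              300, 300,  600, 1200), nrow = 4)
G <- matrix(c(200, 800,  400,  800,
              500, 500,  250, 1000,
              600, 150,  600,  300), nrow = 4)

M <- log2(R / G)          # log ratio (y-axis of an MA plot)
A <- 0.5 * log2(R * G)    # average log intensity (x-axis)

n <- ncol(M)
par(mfrow = n2mfrow(n))   # grid layout chosen from the number of chips
for (i in seq_len(n)) {
  plot(A[, i], M[, i], xlab = "A", ylab = "M",
       ylim = c(-4, 4), main = paste("Chip", i))
  abline(h = 0, lty = 2)  # reference line at M = 0
}
```

Looping over `seq_len(n)` avoids having to change the loop bound by hand when the number of GPR files differs between strains.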

Originally, individual boxplot graphs were generated side by side, before and after between-array normalization, for each GPR file in the targets file. New code was written to draw the boxplots for all of the GPR files in one graph. The limits of the y-axis were changed for each strain.

# Before between-array normalization
x<-as.matrix(MA$M)
boxplot(x[,1],x[,2],x[,3],x[,4],x[,5],x[,6],x[,7],x[,8],x[,9],x[,10],x[,11],x[,12],x[,13],x[,14],ylim=c(-6,6))
# After between-array normalization
y<-as.matrix(MAScale$M)
boxplot(y[,1],y[,2],y[,3],y[,4],y[,5],y[,6],y[,7],y[,8],y[,9],y[,10],y[,11],y[,12],y[,13],y[,14],ylim=c(-6,6))
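
Because `boxplot()` draws one box per column when it is given a matrix, a figure of this kind can also be produced without listing each column. A minimal sketch, using a made-up matrix `x` of random values in place of `as.matrix(MA$M)`; the column names are hypothetical:

```r
# Hypothetical log ratios: 50 spots (rows) x 14 chips (columns).
set.seed(1)
x <- matrix(rnorm(50 * 14), nrow = 50, ncol = 14)
colnames(x) <- paste0("chip", 1:14)

# One box per column, all in a single graph; las = 2 turns the labels sideways.
bp <- boxplot(x, ylim = c(-6, 6), las = 2, ylab = "M")
```

With this form the call does not need to be edited when the number of chips changes, and the column names label the boxes.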

In addition, boxplots were generated in one graph for all the GPR files in the targets file before any kind of normalization. The code was similar to the boxplot code above, except that in place of the x or y values, the log base 2 ratio of (red minus red background) to (green minus green background) was calculated for each individual microarray chip. For chips that were dye-swapped, the negative of the log base 2 ratio was used.
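
The calculation described above can be sketched as follows. The intensities and the `dye_swap` vector are made up for illustration; with a limma `RGList` named `RG`, the four matrices would correspond to `RG$R`, `RG$Rb`, `RG$G`, and `RG$Gb`, and which chips are dye-swapped would have to come from the targets file.

```r
# Hypothetical foreground and background intensities: 3 spots x 2 chips.
R  <- matrix(c(500, 900, 300,
               260, 1100, 700), nrow = 3)
Rb <- matrix(100, nrow = 3, ncol = 2)
G  <- matrix(c(300, 300, 500,
               420, 350, 250), nrow = 3)
Gb <- matrix(100, nrow = 3, ncol = 2)
dye_swap <- c(FALSE, TRUE)   # hypothetical: the second chip is dye-swapped

# Log base 2 ratio of background-subtracted red to background-subtracted green.
M <- log2((R - Rb) / (G - Gb))

# Flip the sign for dye-swapped chips so all ratios point the same way.
M[, dye_swap] <- -M[, dye_swap]

boxplot(M, ylim = c(-6, 6))
```

Spots whose background is at least as high as the foreground would give `NaN` or infinite values here, so real data may need those spots filtered first.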

Katrina Sherbina 19:34, 2 June 2011 (EDT)