Salomon Garcia: Lab notebook Week 7

Procedures for Excel calculations

 * The original data was uploaded with no significant changes as of 6:08pm on 10/19/10. Link to the assignment: [[Media:Merrell_Compiled_Raw_Data_Vibrio_SGV.xls | Dataset for assignment]]


 * For the first portion of this assignment, averages and standard deviations were calculated in Excel. Link to the uploaded file: [[Media:Raw_Data_Vibrio_Salomon_Garcia_Valencia.xls | Vibrio data continued]]


 * After the averages and standard deviations were calculated, the numbers were scaled and centered for each of the replicates. The result is in the file linked here: [[Media:Raw_Data_Vibrio_Salomon_Garcia_Valencia1.xls | Data has been scaled and centered for each of the replicates]]
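The scaling and centering step can be sketched in Python. This is a minimal sketch, not the exact Excel formulas used in the worksheet: it assumes that each replicate column is centered by subtracting its mean and scaled by dividing by its sample standard deviation (the equivalent of Excel's AVERAGE and STDEV). The function name and the sample values are made up for illustration.

```python
import statistics

def scale_and_center(log_ratios):
    """Subtract the replicate's mean, then divide by its sample standard deviation."""
    mean = statistics.mean(log_ratios)
    stdev = statistics.stdev(log_ratios)  # sample SD, comparable to Excel's STDEV
    return [(x - mean) / stdev for x in log_ratios]

# Made-up log ratios for one replicate (not taken from the Vibrio dataset).
replicate = [0.5, -1.2, 0.3, 2.0, -0.6]
scaled = scale_and_center(replicate)

# After this step the replicate has mean 0 and standard deviation 1.
print(round(statistics.mean(scaled), 6), round(statistics.stdev(scaled), 6))
```

Applying the same function to every replicate column puts the replicates on a common scale before they are averaged.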


 * After some of the statistical analysis was done, further work was still needed to finish the Excel file. Here is the link: [[Media:Raw_Data_Vibrio_Salomon_Garcia_Valencia1.5.xls | continued version]]


 * The file uploaded here includes the p-values and the TSTAT. The data still needs to be formatted before it is ready to be used in GenMAPP. This is the state of the data as of 11:52am on 10/23/10: [[Media:Raw_Data_Vibrio_Salomon_Garcia_Valencia1.75.xls|File is ready but further analysis is still required]]
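The TSTAT and p-value columns can be illustrated with a short Python sketch. It assumes a one-sample t-test of each gene's replicate log ratios against zero, with a two-tailed p-value (what Excel's TDIST with two tails would return); the notebook does not spell out the exact formulas, so treat this as an illustration. The function names and the sample values are hypothetical, and the p-value is obtained by numerically integrating the t density so that only the standard library is needed.

```python
import math
import statistics

def t_statistic(values):
    """One-sample t statistic for the hypothesis that the mean is zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value, using Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

# Made-up replicate log ratios for a single gene.
log_ratios = [0.9, 1.4, 0.7, 1.1]
t = t_statistic(log_ratios)
p = two_tailed_p(t, df=len(log_ratios) - 1)
print(t, p)
```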


 * For the next portion of the analysis, some of the cells were cut and rearranged to suit the layout that GenMAPP expects. A new column was then inserted to hold a new name (this can be seen in the files linked at the end of this entry). Here is a copy of the Excel worksheet: [[Media:Raw_Data_Vibrio_Salomon_Garcia_Valencia1.8.xls|Data that has been manipulated for usage of GenMAPP]] and here is a copy of the text file required by GenMAPP: [[Media:Raw_Data_Vibrio_Salomon_Garcia_Valencia1.8.txt| Data for GenMAPP usage is ready]]


 * How many genes have a p value < 0.05? 944 genes have p-values less than 0.05.
 * What about p < 0.01? 233 genes have p-values less than 0.01.
 * What about p < 0.001? 23 genes have p-values less than 0.001.
 * What about p < 0.0001? 2 genes have p-values less than 0.0001.
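The counts above came from Excel filters. The same tally can be sketched in Python as a cross-check. The gene names and p-values below are invented for illustration and are not from the Vibrio spreadsheet.

```python
# Hypothetical p-values keyed by made-up gene names.
p_values = {
    "geneA": 0.2,
    "geneB": 0.04,
    "geneC": 0.03,
    "geneD": 0.008,
    "geneE": 0.0005,
    "geneF": 0.00005,
    "geneG": 0.6,
}

thresholds = [0.05, 0.01, 0.001, 0.0001]

# For each cut-off, count the genes whose p-value is strictly below it.
counts = {t: sum(1 for p in p_values.values() if p < t) for t in thresholds}

for t in thresholds:
    print(f"p < {t}: {counts[t]} genes")
```

Because each threshold is stricter than the one before it, the counts can only stay the same or shrink as the cut-off decreases, which matches the pattern in the answers above.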


 * Keeping the "Pvalue" filter at p < 0.05, filter the "Avg_LogFC_all" column to show all genes with an average log fold change greater than zero. How many are there? 353 genes meet this criterion.


 * Keeping the "Pvalue" filter at p < 0.05, filter the "Avg_LogFC_all" column to show all genes with an average log fold change less than zero. How many are there? 591 genes meet this criterion.


 * What about an average log fold change of > 0.25 or < -0.25? (This is a more realistic value for the fold change cut-off because it represents about a 20% change in expression, which is about the level of detection of this technology.) 918 genes meet this criterion.
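The combined filters can be sketched in Python. The column names "Pvalue" and "Avg_LogFC_all" are taken from the questions above; the gene names and values are made up, and the sketch assumes the log fold changes are base 2, which the notebook does not state.

```python
# Hypothetical rows; only the column names come from the spreadsheet.
rows = [
    {"ID": "g1", "Pvalue": 0.01, "Avg_LogFC_all": 0.8},
    {"ID": "g2", "Pvalue": 0.04, "Avg_LogFC_all": 0.1},
    {"ID": "g3", "Pvalue": 0.02, "Avg_LogFC_all": -0.5},
    {"ID": "g4", "Pvalue": 0.30, "Avg_LogFC_all": 1.2},
    {"ID": "g5", "Pvalue": 0.049, "Avg_LogFC_all": -0.2},
]

# Keep the p-value filter fixed at p < 0.05 for every question.
significant = [r for r in rows if r["Pvalue"] < 0.05]

increased = [r for r in significant if r["Avg_LogFC_all"] > 0]
decreased = [r for r in significant if r["Avg_LogFC_all"] < 0]
beyond_cutoff = [r for r in significant if abs(r["Avg_LogFC_all"]) > 0.25]

# Assuming base-2 logs, a log fold change of 0.25 is a ratio of 2**0.25.
ratio_at_cutoff = 2 ** 0.25

print(len(increased), len(decreased), len(beyond_cutoff), ratio_at_cutoff)
```

Under the base-2 assumption, the ratio at the cut-off is about 1.19, which is where the "about 20%" figure in the question comes from.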


 * What criteria did Merrell et al. (2002) use to determine a significant gene expression change? How does it compare to our method?


 * Merrell et al. (2002) report that genes with IDs: VC0028, VC0941, VC0869, VC0051, VC0647, VC0468, VC2350, and VCA0583 were all significantly changed in their data. Look these genes up in your spreadsheet. What are their fold changes and p values? Are they significantly changed in our analysis?


 * VC0028 shows a p-value of 0.3587
 * VC0941 shows a p-value of 0.3587
 * VC0869 shows a p-value of 0.3587
 * VC0051 shows a p-value of 0.3587
 * VC0647 shows a p-value of 0.3587
 * VC0468 shows a p-value of 0.3587
 * VC2350 shows a p-value of 0.3587
 * VCA0583 shows a p-value of 0.3587
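Looking up a list of gene IDs can also be sketched in Python. This is an illustrative sketch only: the table, the gene names, and the values are hypothetical, and `look_up` is a helper invented here, not part of Excel or GenMAPP. IDs that are absent from the table are reported separately so that they are not silently skipped.

```python
def look_up(gene_ids, table):
    """Return the rows found for the requested IDs and the IDs that were not found."""
    found = {g: table[g] for g in gene_ids if g in table}
    missing = [g for g in gene_ids if g not in table]
    return found, missing

# Hypothetical table: gene name -> (average log fold change, p-value).
table = {
    "geneA": (1.1, 0.02),
    "geneB": (-0.3, 0.40),
    "geneC": (0.6, 0.004),
}

wanted = ["geneA", "geneB", "geneX"]
found, missing = look_up(wanted, table)

# A gene counts as significantly changed here if its p-value is below 0.05.
significant = [g for g, (log_fc, p) in found.items() if p < 0.05]

print(found, missing, significant)
```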