Evan Montz Week 9

GenMAPP Expression Dataset Manager
Evan: number of errors from old dataset = 772

Claudia: number of errors from new dataset = 121

The difference in error counts was assumed to come from the updates made to the data over the last two years: since the new data differs from the old, the number of errors would be expected to differ as well.

MAPPFinder Procedure
MAPPFinder was used to examine the expression data set and yielded the following results:

Top Ten GO Terms
 * 1) Protein Folding
 * 2) Aromatic amino acid family biosynthetic process
 * 3) Chorismate metabolic process
 * 4) Unfolded protein binding
 * 5) Cytoplasm
 * 6) Protein-N(PI)-phosphohistidine-sugar phosphotransferase activity
 * 7) Zinc ion binding
 * 8) Intracellular part
 * 9) Cation: sugar symporter activity
 * 10) Sugar: hydrogen symporter activity

Again, all of these terms were different from Claudia's because I was using the old database, which hasn't been updated for two years. It is becoming clear that a lot can change in these gene expression databases in a relatively short period of time.

While searching for the specific genes that were mentioned by Merrell et al. (2002), the program only identified two genes using the older database: VC0647 and VCA0583. The GO terms associated with these genes are listed below.

VC0647
 * 1) mRNA catabolic process
 * 2) RNA Processing
 * 3) Cytoplasm
 * 4) RNA Binding
 * 5) 3'-5' exoribonuclease activity
 * 6) Transferase activity
 * 7) Nucleotidyltransferase activity
 * 8) Polyribonucleotide nucleotidyltransferase activity

VCA0583
 * 1) Transport
 * 2) Outer-membrane-bounded periplasmic space
 * 3) Transporter activity
 * These appear to have a parent-child relationship.

All of Claudia's associated genes were different from mine, and her search with the new dataset yielded many more matches.

Spreadsheet
When comparing the Excel spreadsheets, all of the data again differed between the two databases (old and new). This reconfirmed that frequent changes occur within these databases. [[Media:Decreased_EPM-Criterion1-GO-Sorted.xls | Filtered Spreadsheet]]

Calculation summary
 * 578 probes met the [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05 criteria.
 * 473 probes meeting the filter linked to a UniProt ID.
 * 254 genes meeting the criterion linked to a GO term.
 * 5221 probes in this dataset.
 * 4449 probes linked to a UniProt ID.
 * 1990 genes linked to a GO term.
 * The z score is based on an N of 1990 and an R of 254 distinct genes in the GO.
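
To make the numbers above easier to follow, here is a minimal Python sketch of the two calculations involved: the filter criterion and a per-term z score. It is not MAPPFinder's own code. The z score formula is the hypergeometric-style one that MAPPFinder is generally described as using, which I am assuming applies here. The function names, the probe values, and the example GO term (n = 20, r = 8) are all hypothetical; only N = 1990 and R = 254 come from the summary.

```python
import math


def meets_criterion(avg_log_fc, p_value, fc_cutoff=-0.25, p_cutoff=0.05):
    """True if a probe passes [Avg_LogFC_all] < -0.25 AND [Pvalue] < 0.05."""
    return avg_log_fc < fc_cutoff and p_value < p_cutoff


def go_term_z_score(r, n, R, N):
    """z score for one GO term (assumed MAPPFinder-style formula).

    r: genes in the term that meet the criterion
    n: genes in the term that were measured
    R: all genes meeting the criterion
    N: all genes measured
    """
    fraction = R / N
    expected = n * fraction
    # Variance includes the finite-population correction (1 - (n-1)/(N-1)).
    variance = n * fraction * (1 - fraction) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# Hypothetical probes, NOT values from the real dataset.
probes = [
    {"id": "probe_a", "avg_log_fc": -0.40, "p_value": 0.01},
    {"id": "probe_b", "avg_log_fc": -0.10, "p_value": 0.01},
    {"id": "probe_c", "avg_log_fc": -0.60, "p_value": 0.20},
]
passing = [p["id"] for p in probes if meets_criterion(p["avg_log_fc"], p["p_value"])]

# N and R are from the calculation summary; n and r are made up.
z = go_term_z_score(r=8, n=20, R=254, N=1990)
print(passing, round(z, 2))
```

A positive z score means the term contains more genes meeting the criterion than expected by chance, given R out of N overall.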

VCA0583

 * I chose this gene to examine more closely.
 * This gene is actively involved in the transport of solutes across the cytoplasmic membrane.
 * The GO database was not functioning, so I was unable to draw any conclusions from it. However, the UniProt database was helpful for finding out more about this specific gene. From this information, it appears that the gene's only function is the transport of solutes across the cytoplasmic membrane. With my limited background in biology, I do not see how this function would relate to the pathogenicity of the bacterium.

Week 9 Files
[[Media:Evan_Montz_Week_9_Files.zip | Evan Montz Week 9 Files]]

Evan Montz 01:12, 1 November 2010 (EDT)