Aherman week 9
- Downloaded Andrew Forney's txt file version of the VC data.
- Downloaded and uploaded new data set for strain of interest
- Opened GenMAPP and loaded our expression dataset through the dataset manager; 121 errors were recorded on upload. (The other Andrew recorded over 772 errors with the old data set.) The most likely explanation is that my updated version of the database is formatted in a way that more closely matches the available data: more genes have been written into the updated database, so it more closely matches the modern dataset.
- Opened the Excel file created by GenMAPP and filtered the errors column to identify the specific genes that could not be matched by the database. Recorded this information (most errors were recorded as the gene not being found in the database).
- Created a new color set to match the new dataset, named lab vs pathogenic. Set the coloration parameters: increased is light blue (Avg log fold change greater than 0.25 and P value less than 0.05), while decreased is yellow (Avg log fold change less than -0.25 and P value less than 0.05). Saved the new parameters.
- Added gene VC0941 to the GenMAPP page and applied the new expression dataset to it; the box remained grey because neither criterion was met.
- Opened MAPPFinder and calculated new results for the increased data; named the file Mapp_finder_increased_AH.
- MAPPFinder spent a few minutes computing and then opened a representative tree of the new gene ontology for this VC data.
- Opened the ranked list to view the top ten most affected ontology pathways in this analysis.
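As a rough illustration of the color set criteria recorded above, the following Python sketch classifies a single probe as increased, decreased, or neither. The function name, the labels, and the example values are hypothetical and are not part of GenMAPP, which applies these criteria internally.

```python
def classify_probe(avg_log_fc, p_value, fc_cutoff=0.25, p_cutoff=0.05):
    """Return the color-set category for one probe.

    Hypothetical helper mirroring the 'lab vs pathogenic' criteria:
    increased (light blue): avg log fold change >  0.25 and P value < 0.05
    decreased (yellow):     avg log fold change < -0.25 and P value < 0.05
    anything else stays grey (no criteria met).
    """
    if p_value < p_cutoff and avg_log_fc > fc_cutoff:
        return "increased"
    if p_value < p_cutoff and avg_log_fc < -fc_cutoff:
        return "decreased"
    return "no criteria met"


# Example values below are made up for illustration only.
print(classify_probe(0.60, 0.01))   # passes both increased conditions
print(classify_probe(-0.40, 0.03))  # passes both decreased conditions
print(classify_probe(0.60, 0.20))   # fold change is large but P value is too high
```

A probe that passes the fold change cutoff but not the P value cutoff stays grey, which is one way a gene box can remain uncolored.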
The Top Ten Ontology Pathways Include
- Branched chain family amino acid metabolic process
- Branched chain family amino acid biosynthetic process
- IMP biosynthetic process
- IMP metabolic process
- Purine nucleoside monophosphate metabolic process
- Purine nucleoside monophosphate biosynthetic process
- Purine ribonucleoside monophosphate biosynthetic process
- Purine ribonucleoside monophosphate metabolic process
- 'de novo' IMP biosynthetic process
- Cellular nitrogen compound biosynthetic process
- These results are very different from those of the older version because the new gene database more closely matches the information in this dataset, i.e. it includes newly discovered information on the roles of these genes, which in turn changes the effects the data have on the various pathways. This affects the rankings in the list and the roles that each gene plays in specific pathways.
- When searching the ontology tree for the genes listed in the Merrell paper, these were the results:
- VC0028- metal ion binding, magnesium ion binding, iron-sulfur cluster binding (4 iron/4 sulfur cluster), catalytic activity, lyase activity, dihydroxy-acid dehydratase activity
- VC0941- transferase activity, glycine hydroxymethyltransferase activity
- VC0869- ATP binding, catalytic activity, ligase activity, phosphoribosylformylglycinamidine synthase activity
- VC0051- nucleotide binding, ATP binding, catalytic activity, lyase activity, carboxy-lyase activity, phosphoribosylaminoimidazole carboxylase activity
- VC0647- transferase activity, nucleotidyltransferase activity, polyribonucleotide nucleotidyltransferase activity
- VC0468- nucleotide binding, ATP binding, catalytic activity, ligase activity, glutathione synthase activity
- VC2350- catalytic activity, lyase activity, deoxyribose-phosphate aldolase activity
- VCA0583- outer membrane-bounded periplasmic space
- Clicked on the GO term for metal ion binding for VC0028 (ILVD_VIBCH); the result showed a significant increase (P value 0.0474).
- A link to the EMBL database indicates that this gene is involved in catalytic activity, dihydroxy-acid dehydratase activity, metabolic processes, cellular amino acid biosynthetic process, branched chain family amino acid biosynthetic process, lyase activity, metal ion binding, iron-sulfur cluster binding.
- Compared the criterion results of the two MAPPFinder analyses:
The Comparison Between the Old and New Version of the Database
- My data, with the other Andrew's values from the older version in parentheses:
- 339 probes met the [Avg_LogFC_all] > 0.25 AND [Pvalue] < 0.05 criteria. (339)
- 338 probes meeting the filter linked to a UniProt ID. (291)
- 219 genes meeting the criterion linked to a GO term. (184)
- 5221 probes in this dataset. (5221)
- 5100 probes linked to a UniProt ID. (4449)
- 2475 genes linked to a GO term. (1990)
- The z score is based on an N of 2475 and an R of 219 distinct genes in the GO. (N of 1990 and R of 184)
- In his data, the number of probes meeting the Avg log fold change and P value criteria was the same (339), and the total number of probes was equal (5221). With the older version, however, fewer probes meeting the filter linked to a UniProt ID (291 versus 338) and fewer genes meeting the criterion linked to a GO term (184 versus 219).
- These results again support the idea that the new version matches the data more closely, giving pathways that are more representative of the data.
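To show how N and R enter the z score, here is a minimal Python sketch. It assumes the z score formula described for MAPPFinder, z = (r - n(R/N)) / sqrt(n(R/N)(1 - R/N)(1 - (n-1)/(N-1))), where n is the number of measured genes in a GO term and r is the number of those genes meeting the criterion. The N and R values are taken from the counts above; the n and r values are hypothetical, since the per-term counts are not recorded in these notes.

```python
import math


def mappfinder_z(r, n, R, N):
    """z score for one GO term (sketch of the assumed MAPPFinder formula).

    N: genes measured and linked to a GO term
    R: genes meeting the criterion and linked to a GO term
    n: genes measured in this GO term
    r: genes meeting the criterion in this GO term
    """
    expected = n * R / N  # genes expected to meet the criterion by chance
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# N and R come from the two analyses; n=20 and r=8 are made-up values.
z_new = mappfinder_z(r=8, n=20, R=219, N=2475)  # new version of the database
z_old = mappfinder_z(r=8, n=20, R=184, N=1990)  # older version
print(round(z_new, 2), round(z_old, 2))
```

Because N and R differ between the two versions, the same term can receive a different z score and therefore a different rank, even when its own n and r are unchanged.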
- When filtered these GO terms were observed:
- purine nucleotide biosynthetic process - no relation to other processes
- arginine metabolic process - see glutamine family amino acid biosynthetic process
- ribonucleoside monophosphate metabolic process - parent of purine ribonucleoside monophosphate metabolic process
- ribonucleoside monophosphate biosynthetic process - parent of purine ribonucleoside monophosphate biosynthetic process
- arginine biosynthetic process - child of glutamine family amino acid biosynthetic process
- glutamine family amino acid metabolic process - parent of arginine biosynthetic process
- purine nucleotide metabolic process - see ribonucleotide biosynthetic process
- glutamine family amino acid biosynthetic process - the parent process for arginine biosynthetic process
- purine ribonucleotide metabolic process - see ribonucleoside monophosphate metabolic process
- cell projection organization - parent of flagellum organization
- purine ribonucleotide biosynthetic process - parent of purine nucleotide metabolic process
- extracellular region - no relations
- flagellum organization - see cell projection organization
- These results indicate that VC0028 is related to cellular organization, including flagellum organization, and is also heavily involved in the synthesis of critical amino acids, including arginine and glutamine. There is also a relationship to purine ribonucleotide biosynthesis. Taken in perspective, these results may indicate that this gene is heavily involved in the repression of cellular reproduction and development, two critical components in the pathogenesis of the strain. The development of receptors for specific ions in the environment helps the cell sense when it should become pathogenic (water versus the human body for VC), and the development of the flagellum also contributes to the strain's ability to become a pathogen. The rest of the information indicates an ability to divide and reproduce at a higher rate. This relates to my partner's results, which showed a relationship to external environmental cues and the development of the flagellum. Because all of these processes are increased when VC0028 is removed from the genome, we may be able to conclude that it contributes to the inhibition of pathogenesis in the Vibrio cholerae strain of interest.