Harvard:Biophysics 101/2007/Notebook:HRH/2007-5-3

Summary of code (to be inserted at the top of the code).

This code takes a codon sequence from the user and searches it against the human database for SNPs (Single Nucleotide Polymorphisms). The sequence is looked up with BLAST SNP, and any SNPs found are reported as RS numbers*. The query is then compared against the sequence in dbSNP to determine whether it really carries a mutation; if this test passes, the RS number is used to retrieve MeSH terms from PubMed, and the code determines which MeSH terms are the most relevant. The potential disease and its prevalence (derived from the California State Prevalence data) are extracted from the most pertinent MeSH terms. These MeSH terms are then used to provide updated news regarding the disease.


 * Aside: one portion of the code accesses BLAST SNP without using RS numbers. The information acquired from the BLAST SNP website is used to query OMIM, and the output is as follows: disease name, mutation, and name of the mutation. The disease name is then used to search different websites for drugs, procedures, and experts related to the disease, and to provide a list of web pages that match the disease name.