Harvard:Biophysics 101/2007/Project in Progress
- ATTENTION: Everyone needs to post their code in one place. Please post a working link to your code from this page, and then I'll be able to combine it all. --Katie Fifer
- Could someone who typed this up today please add the other sections that are being worked on? --TChan, 12:47 20 March 2007
5:32 20 March 2007
- Edited to add some of my own notes and to reflect some semblance of order. --Cchi 10:00, 22 March 2007 (EDT)
- PM / encourage documentation
Sequence to BLAST SNP to rs#
- Zach, Mike, and Tiffany
- Parsing XML of Biopython BLAST - Deniz
- Relevant file: Python25/Lib/site-packages/Bio/Blast/NCBIWWW.py
- Discussion on BLAST SNP can proceed on the discussion page.
Accessing BLAST SNP using URLAPI
- To access the SNP BLAST databases using the BLAST URLAPI, you only need to set the "DATABASE" parameter to an appropriate value.
- The paths and names of the SNP BLAST databases available to URLAPI and the blastcl3 client are documented at http://www.ncbi.nih.gov/staff/tao/URLAPI/remote_accessible_blastdblist.html#8
- For more information on URLAPI, please see: http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/new/
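A minimal sketch of the point above: a SNP search differs from an ordinary URLAPI request only in its DATABASE value. The endpoint URL and the helper name build_snp_blast_url are assumptions made for illustration and do not come from these notes; the default database name is the one quoted in the BioPython notes on this page and may not match what NCBI currently serves. The code only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed URLAPI endpoint; confirm against the URLAPI documentation linked above.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_snp_blast_url(sequence, database="snp/human_9606/human_9606"):
    """Build (but do not send) a URLAPI 'Put' request against a SNP database.

    build_snp_blast_url is a hypothetical helper name used only in this sketch.
    """
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": "blastn",   # nucleotide query vs. nucleotide database
        "DATABASE": database,  # the only SNP-specific setting
        "QUERY": sequence,
    }
    return BLAST_URL + "?" + urlencode(params)


url = build_snp_blast_url("ACGTACGTACGT")
# Decode the query string again to show which parameters were set.
decoded = parse_qs(urlparse(url).query)
print(decoded["DATABASE"])
```

Pointing the same request at a different organism's SNP database would only mean passing a different database argument.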
Info from BioPython (via Zach)
- If you know the name of the database (here "snp/human_9606/human_9606"), you can run, for example:
    from Bio.Blast import NCBIWWW
    result_handle = NCBIWWW.qblast("blastn", "snp/human_9606/human_9606", seq)
  and then parse the results as usual (see section 3.4 in the Biopython tutorial).
- Database names can be found here: http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/remote_accessible_blastdblist.html
- The result_handle should give you the same information as the web page, and Biopython's parser should parse all information from result_handle correctly. If you find that some information seems to be missing, please let us know.
--smd 18:05, 22 March 2007 (EDT)
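To illustrate the last step of "Sequence to BLAST SNP to rs#", here is a rough sketch that pulls rs numbers out of BLAST XML output. It deliberately uses the standard library's ElementTree instead of Biopython's parser, so it is not the approach described above, only a stand-in that runs without Biopython. The sample XML is hypothetical and heavily trimmed, the rs numbers in it are placeholders, and the assumption that hit IDs look like "gnl|dbSNP|rs..." should be checked against real output. The function name and the E-value cutoff are also illustrative choices.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed BLAST XML; real output carries many more fields.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>gnl|dbSNP|rs1111</Hit_id>
          <Hit_def>rs=1111|taxid=9606</Hit_def>
          <Hit_hsps><Hsp><Hsp_evalue>1e-50</Hsp_evalue></Hsp></Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>gnl|dbSNP|rs2222</Hit_id>
          <Hit_def>rs=2222|taxid=9606</Hit_def>
          <Hit_hsps><Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp></Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

# Assumed identifier shape: the letters "rs" followed by digits.
RS_PATTERN = re.compile(r"\brs(\d+)\b")


def rs_numbers_from_blast_xml(xml_text, max_evalue=1e-10):
    """Return (rs id, best E-value) pairs for hits at or below max_evalue."""
    root = ET.fromstring(xml_text)
    found = []
    for hit in root.iter("Hit"):
        text = (hit.findtext("Hit_id") or "") + " " + (hit.findtext("Hit_def") or "")
        match = RS_PATTERN.search(text)
        if match is None:
            continue  # hit does not look like a dbSNP record
        evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
        if evalues and min(evalues) <= max_evalue:
            found.append(("rs" + match.group(1), min(evalues)))
    return found


strong_hits = rs_numbers_from_blast_xml(SAMPLE_XML)
print(strong_hits)
```

Filtering on the best E-value per hit is one simple way to discard weak matches before looking the rs numbers up in OMIM; a stricter or looser cutoff may suit real queries better.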
OMIM XML Parse
- Xiaodi - completed? Yup: here
- rs -> OMIM XML parse -> phenotype text
- Resmi, Cynthia, and Hetmann
- Handling the text from the parse: OMIM output to list of PubMed Review Articles. The code on this page will print out review articles that might be relevant to the disease returned by OMIM.
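The linked code is not reproduced here. As a hedged sketch of the general idea only, one way to go from a disease name to PubMed review articles is an E-utilities esearch request restricted to the Review publication type. The helper name build_review_search_url, the retmax default, and the example disease name are all illustrative assumptions, and the code only builds the URL without sending it.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_review_search_url(disease_name, retmax=20):
    """Build (but do not send) a PubMed search limited to review articles.

    build_review_search_url is a hypothetical helper used only in this sketch.
    """
    term = f"{disease_name} AND Review[Publication Type]"
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


# The disease name would come from the OMIM parse; this one is just an example.
review_url = build_review_search_url("cystic fibrosis")
review_query = parse_qs(urlparse(review_url).query)
print(review_query["term"][0])
```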
Controlled Vocabulary for parsing OMIM records
- Masseroli et al.: "Our efforts to derive from the OMIM entries a controlled vocabulary of phenotype locations and descriptions enabled us to normalize and structure the valuable OMIM phenotypic data according to the obtained vocabulary and make them suitable for computational use. Although detailed phenotype descriptions could be further homogenized and standardized, their subdivision in hierarchical levels of detail that we performed allows to group specific phenotypes according to their common general traits, without loosing their specific characteristics. So, for example "Mental retardation, moderate" and "Mental retardation, nonspecific" can be both generally considered as "Mental retardation" and at the same time they can be treated as different types of mental defects. This provides the chance to modulate analysis granularity when searching for phenotypic traits shared among multiple diseases or genotypes. It also ensures more significant and clear results when categorical statistical analyses are performed at lower granularity levels of detail. Such interesting feature, proper of the hierarchical structure and hence belonging also to the defined phenotype location hierarchy, is exploited in the new GFINDer Genetic Disorders modules implemented for the study of genetic disorder related genes."
—smd 13:19, 22 March 2007 (EDT)
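The two-level grouping in the quoted example can be illustrated with a toy sketch. This is not the GFINDer vocabulary or Masseroli et al.'s method; it simply assumes, for illustration, that the general trait is whatever precedes the first comma in a phenotype string. The function name is hypothetical.

```python
from collections import defaultdict


def group_by_general_trait(phenotypes):
    """Group phenotype strings by the text before their first comma.

    Assumption for this sketch only: "General trait, detail" formatting.
    """
    groups = defaultdict(list)
    for phenotype in phenotypes:
        general, _, detail = phenotype.partition(",")
        # Keep the detail so specific phenotypes stay distinguishable.
        groups[general.strip()].append(detail.strip() or None)
    return dict(groups)


grouped = group_by_general_trait([
    "Mental retardation, moderate",
    "Mental retardation, nonspecific",
])
print(grouped)
```

Analyses at coarse granularity would use the dictionary keys, while finer analyses would use the stored details.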
- Tiffany, Resmi, Deniz, Xiaodi, Mike, Chris (note: ask if API exists)
- Chris, Deniz
- figure out which of multiple SNPs are relevant
- not in SNP db... then what? - I'd like to point out new efforts that aim to replace OMIM, called the "Human Variome Project" -- Deniz
- OMIM DOA
- systematically nonsyn. -> mutation not in OMIM or dbSNP?
- other dbs: GeneCards (spec. conservation, pop. freq)
- looking into linking gene expression w/ GEO?
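For the "nonsyn." item above, a minimal sketch of how a single-codon change could be classified as synonymous or nonsynonymous when the mutation is in neither OMIM nor dbSNP. It assumes the standard genetic code and that the reading frame and the reference and variant codons are already known; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_nonsynonymous(ref_codon, alt_codon):
    """True if the two DNA codons translate to different amino acids (or stop)."""
    return CODON_TABLE[ref_codon.upper()] != CODON_TABLE[alt_codon.upper()]


print(is_nonsynonymous("GAG", "GTG"))  # Glu vs. Val
print(is_nonsynonymous("CTT", "CTC"))  # Leu vs. Leu
```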