Harvard:Biophysics 101/Notebook:ZS/2007-3-20

From OpenWetWare
Jump to: navigation, search

Assignment for 20th

Xiaodi's discussion page has most of the interesting stuff... I found code from the biopython cookbook which retrieves a BLAST ID, and then based on http://biopython.org/DIST/docs/api/public/Bio.Blast.Record-module.html one can get interesting information. For a CDS, the GI# can be fed into dbSNP which will then find the SNP in question... if not a CDS though the chromosome position doesn't come up for the hit in any useable form...

from Bio import SeqIO
from Bio.Blast import NCBIWWW
from Bio.Blast import NCBIXML

file_handle = open("070313_smd.fasta")#put sequence here
records = SeqIO.parse(file_handle, format="fasta")
record = records.next()

sequence = record.seq.data

print sequence
result_handle = NCBIWWW.qblast("blastn", "nr",sequence)

blast_results = result_handle.read()
save_file = open("my_blast.xml","w")

result_handle = open("my_blast.xml")
blast_records = NCBIXML.parse(result_handle)
for blast_record in blast_records:
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect < 0.04:
                print '**allign***'
                print 'seq', alignment.title
                print 'leng', alignment.length
                print 'e value:', hsp.expect
                print hsp.query[0:75] + '...'
                print hsp.match[0:75] + '...'
                print hsp.sbjct[0:75] + '...'