Harvard:Biophysics 101/2007/Notebook:Michael Wang/2007-5-2

Polyphen code: This has been a rather frustrating search for something I know already exists. The pseudocode is as follows:

Input: Coding sequence, mutation position, original amino acid, mutated amino acid

Step 1: Use the Python equivalent of an HTTP GET to submit to PolyPhen and download the results locally

Step 2: Parse the results to retrieve the prediction and the PSIC score. Ideally, this could return a table similar to the one in the PolyPhen documentation

Problems:

Input: Zach has gotten pretty far on this, but I still need to retrieve the actual amino acid sequence and translate the SNP position into an amino acid position
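A minimal sketch of that missing input step, assuming the SNP position is 1-based and counted from the first base of an in-frame coding sequence. The function names `translate` and `snp_to_residue` are made up for illustration and are not part of Zach's code; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA coding sequence; trailing partial codons are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def snp_to_residue(cds, snp_pos, alt_base):
    """Map a 1-based nucleotide position in the CDS to
    (1-based amino acid position, original amino acid, mutated amino acid)."""
    cds = cds.upper()
    codon_start = (snp_pos - 1) // 3 * 3   # 0-based index of the codon's first base
    offset = (snp_pos - 1) % 3             # position of the SNP inside the codon
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    return codon_start // 3 + 1, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
```

For example, `snp_to_residue("ATGGCCTAA", 5, "T")` changes the second codon from GCC to GTC, i.e. alanine to valine at amino acid position 2.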

Step 1: This seems to be fairly easy to implement in Perl, but it's been impossible to find the equivalent functions in Python. I've been messing with the urllib module, but I'm still stuck on URL encoding. Also, it seems the server automatically rejects a straight GET request in place of a POST data submission, so that may not work anyway.
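A rough sketch of the URL encoding and POST submission using the standard library in current Python 3 (in Python 2 the corresponding calls were `urllib.urlencode` and `urllib2.urlopen`). The URL and the form field names below are placeholders, not the real PolyPhen ones; the actual names would have to be read off the submission form's HTML.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder address: NOT the real PolyPhen submission URL.
POLYPHEN_URL = "http://example.org/polyphen"


def build_request(sequence, position, ref_aa, alt_aa, url=POLYPHEN_URL):
    """Build a POST request carrying the URL-encoded form fields."""
    # Hypothetical field names; replace with the names used by the real form.
    fields = {
        "sequence": sequence,
        "position": str(position),
        "aa1": ref_aa,
        "aa2": alt_aa,
    }
    body = urlencode(fields).encode("ascii")  # urlencode handles the escaping
    # Supplying data= makes urllib send a POST instead of a GET.
    return Request(url, data=body)


def submit(request, out_path):
    """Send the request and save the raw response to a local file."""
    with urlopen(request) as response, open(out_path, "wb") as handle:
        handle.write(response.read())
```

Keeping `build_request` separate from `submit` means the encoded body can be inspected without contacting the server.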

Step 2: I've been working with Beautiful Soup to parse the downloaded HTML file. Unfortunately it's a mess, and I may just end up using a straight regular expression to pull out the "This variant is predicted to be... "
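The regular-expression fallback could look roughly like this. It assumes the prediction is one of "benign", "possibly damaging", "probably damaging" or "unknown", and that the phrase may be interrupted by HTML tags, which are stripped first. The function name `extract_prediction` is made up for illustration, and the real layout of the results page has not been checked against this.

```python
import re

# Assumed set of prediction labels following the fixed phrase.
PREDICTION_RE = re.compile(
    r"This variant is predicted to be\s+"
    r"(probably damaging|possibly damaging|benign|unknown)",
    re.IGNORECASE,
)


def extract_prediction(html):
    """Return the lower-cased prediction string, or None if the phrase is absent."""
    text = re.sub(r"<[^>]+>", " ", html)   # crude tag removal
    text = re.sub(r"\s+", " ", text)       # collapse whitespace left by the tags
    match = PREDICTION_RE.search(text)
    return match.group(1).lower() if match else None
```

Pulling out the PSIC score would need a second pattern written against the actual results page.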