Harvard:Biophysics 101/Notebook:ZS/2007-3-15

From OpenWetWare
Revision as of 20:37, 14 March 2007 by Zsun (talk | contribs) (→‎Zach's Mystery Solution)
DONT LOOK AT ME

I'm supposed to be private :D

BLAST Results

>ref|NT_030059.12|Hs10_30314  Homo sapiens chromosome 10 genomic contig, reference assembly
Length=44617998

 Features flanking this part of subject sequence:
   3895 bp at 5' side: hypothetical protein
   425 bp at 3' side: HtrA serine peptidase 1


 Score =  736 bits (398),  Expect = 0.0
 Identities = 400/401 (99%), Gaps = 0/401 (0%)
 Strand=Plus/Plus

Query  1         CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG  60
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42968870  CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG  42968929

Query  61        CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA  120
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42968930  CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA  42968989

Query  121       CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC  180
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42968990  CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC  42969049

Query  181       CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT  240
                 |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Sbjct  42969050  CGCGGACGCTGCCTTCGTCCGGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT  42969109

Query  241       ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG  300
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42969110  ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG  42969169

Query  301       CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC  360
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42969170  CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC  42969229

Query  361       GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC  401
                 |||||||||||||||||||||||||||||||||||||||||
Sbjct  42969230  GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC  42969270

OMIM Results

Looking up the HtrA serine peptidase 1 protein in OMIM, I found: http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=602194&a=602194_AllelicVariant0001

Then, the NCBI SNP page for rs11200638 (http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs11200638) gives:

>gnl|dbSNP|rs11200638|allelePos=201|totalLen=401|taxid=9606|snpclass=1|alleles='A/G'|mol=Genomic|build=120
 CACCCTCGCC AGTTACGAGC TGCCGAGCCG CTTCCTAGGC TCTCTGCGAA TACGGACACG
 CATGCCACCC ACAACAACTT TTTAAAAGAA TCAGACGTGT GAAGGATTCT ATTCGAATTA
 CTTCTGCTCT CTGCTTTTAT CACTTCACTG TGGGTCTGGG CGCGGGCTTT CTGCCAGCTC
 CGCGGACGCT GCCTTCGTCC
 R
 GCCGCAGAGG CCCCGCGGTC AGGGTCCCGC GTGCGGGGTA CCGGGGGCAG AACCAGCGCG
 TGACCGGGGT CCGCGGTGCC GCAACGCCCC GGGTCTGCGC AGAGGCCCCT GCAGTCCCTG
 CCCGGCCCAG TCCGAGCTTC CCGGGCGGGC CCCCAGTCCG GCGATTTGCA GGAACTTTCC
 CCGGCGCTCC CACGCGAAGC

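Note that this matches the BLAST result exactly: the only mismatch in the alignment is at query position 201 (A in the query, G in the reference), which is the A/G allele position (R) of this SNP.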
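The position check can be done mechanically instead of by eye. The following is a minimal Python sketch, not a full BLAST parser: it assumes the dbSNP FASTA header and the one alignment row containing the mismatch (query 181 to 240) have been pasted in by hand, and the helper names `parse_dbsnp_header` and `find_mismatches` are made up for illustration.

```python
# Sketch only: header and alignment row are copied by hand from the results above.
HEADER = (">gnl|dbSNP|rs11200638|allelePos=201|totalLen=401|taxid=9606|"
          "snpclass=1|alleles='A/G'|mol=Genomic|build=120")

# Query and Sbjct rows for query positions 181..240 of the BLAST alignment.
QUERY_ROW = "CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT"
SBJCT_ROW = "CGCGGACGCTGCCTTCGTCCGGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT"


def parse_dbsnp_header(line):
    """Split a pipe-delimited dbSNP FASTA header into its key=value fields."""
    fields = {}
    for part in line.lstrip(">").split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key] = value.strip("'")
    return fields


def find_mismatches(query, subject, query_start=1):
    """Return (query_position, query_base, subject_base) for every mismatch."""
    return [
        (query_start + i, q, s)
        for i, (q, s) in enumerate(zip(query, subject))
        if q != s
    ]


info = parse_dbsnp_header(HEADER)
mismatches = find_mismatches(QUERY_ROW, SBJCT_ROW, query_start=181)

for pos, q_base, s_base in mismatches:
    at_snp = pos == int(info["allelePos"])
    in_alleles = {q_base, s_base} == set(info["alleles"].split("/"))
    print(pos, q_base, s_base, "matches SNP" if at_snp and in_alleles else "other")
```

Because the query was retrieved as the 401-base dbSNP flanking sequence, query coordinates and `allelePos` share the same numbering; for an arbitrary query the two would first have to be mapped onto a common coordinate system.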

Therefore, quoting OMIM:

MOLECULAR GENETICS 

From a cohort of Southeast Asians in Hong Kong, DeWan et al. (2006) identified 96 patients who had been previously diagnosed with wet age-related macular degeneration (see 610149) and 138 matched control individuals who were ARMD-free. Because the putative locus on 10q26 in which a previously identified SNP with significant association with ARMD had been removed from GenBank (see 610149), DeWan et al. (2006) sequenced the entire local genomic region, including promoters, exons, and intron-exon junctions of PLEKHA1 (607772) and HTRA1, in search of the functional variant. They found that 1 SNP in the promoter region of HTRA1, rs11200638 (602194.0001), located 512 base pairs upstream of the HTRA1 putative transcriptional start site and 6,096 base pairs downstream of the previously identified SNP, exhibited a complete linkage disequilibrium pattern with the previously identified SNP. The SNP rs11200638 resides within putative binding sites for the transcription factors adaptor-related protein complex 2-alpha (AP2-alpha; 107580) and serum response factor (SRF; 600589). Preliminary results showed higher HTRA1 expression correlated with the risk (AA) compared with the wildtype (GG) genotype in in vitro transfection assays. 

Yang et al. (2006) independently identified the same SNP in the HTRA1 promoter region as causative of age-related macular degeneration in a Caucasian cohort in Utah. The authors suggested that the estimated population-attributable risk for the SNP is 49.3%. Consistent with an additive effect, the estimated population-attributable risk from a joint model with CFH Y402H (134370.0008) (i.e., for a risk allele at either locus) is 71.4%. 


ALLELIC VARIANTS
(selected examples) 


.0001 MACULAR DEGENERATION, AGE-RELATED, 7 [HTRA1, -512G-A]
MACULAR DEGENERATION, AGE-RELATED, NEOVASCULAR TYPE, SUSCEPTIBILITY TO, INCLUDED
DeWan et al. (2006) identified a SNP (rs11200638) for which homozygosity for the AA genotype results in a 10-fold (confidence intervals 4.38 to 22.82) increased risk of wet age-related macular degeneration (see ARMD7, 610149) in a Southeast Asian population identified in Hong Kong. Yang et al. (2006) independently identified this variant as conferring risk in a Caucasian cohort from Utah. 

Conclusion for this sequence

As a doctor, I would say that the patient may be at increased risk for wet age-related macular degeneration, and that more tests are necessary to determine whether he/she is homozygous or heterozygous for the risk allele, which would affect the patient's level of risk. I would suggest possible preventative treatment for the disease, and genetic screening of the husband to determine the risk of passing on the trait.

What I did/would have done in Python

  1. BLAST the sequence
  2. Search for the protein of interest in OMIM and see if the BLAST hit matches any results
    • 'Wouldn't this be hard to implement?'
  3. Run the sequence through a written program to determine the reading frame and amino acid (AA) mutation
    • If in an intron, then probably not a big deal
    • If silent or the same amino acid type, likewise
    • If a different AA type, or if it causes a frameshift/missense/nonsense mutation, then it may be a big deal.
  4. Research relevant literature to reach a final conclusion
    • 'Python can only tell you which literature to look at for a given sequence, if one is a good programmer...'
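Step 3 could look roughly like the sketch below. It is only an illustration under several assumptions: the input is a coding sequence with a known reading frame, the standard genetic code applies, and the four-way grouping of amino acids into "types" (`AA_GROUPS`) is one common textbook grouping picked here for convenience. The function name `classify_substitution` and the toy sequence are hypothetical. A promoter variant such as rs11200638 lies outside the coding sequence, so a classifier like this cannot assess it.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)

# Assumed grouping of amino acid "types"; other groupings are possible.
AA_GROUPS = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
AA_CLASS = {aa: name for name, members in AA_GROUPS.items() for aa in members}


def classify_substitution(cds, pos, alt, frame=0):
    """Classify a single-base substitution in a coding sequence.

    cds   -- coding DNA sequence (uppercase A/C/G/T)
    pos   -- 0-based index of the substituted base
    alt   -- the new base at that position
    frame -- 0-based index of the first base of the first codon
    """
    offset = (pos - frame) % 3          # position of the base within its codon
    start = pos - offset                # index of the codon's first base
    ref_codon = cds[start:start + 3]
    if start < 0 or len(ref_codon) < 3:
        raise ValueError("position is not inside a complete codon")
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]

    if alt_aa == ref_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"
    if AA_CLASS[ref_aa] == AA_CLASS[alt_aa]:
        return "conservative missense"
    return "non-conservative missense"


def is_frameshift(ref_len, alt_len):
    """An insertion or deletion shifts the frame unless its size is a multiple of 3."""
    return (alt_len - ref_len) % 3 != 0


# Toy example: ATG GAA GTT encodes Met-Glu-Val.
toy = "ATGGAAGTT"
print(classify_substitution(toy, 5, "G"))  # GAA -> GAG
print(classify_substitution(toy, 3, "T"))  # GAA -> TAA
```

The returned label maps onto the bullets above: "silent" and "conservative missense" are the probably-not-a-big-deal cases, while "nonsense", "non-conservative missense" and a frameshift are the ones to follow up in the literature.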

Zach's Mystery Sequence

>Bob's sequence - is he going to die?
 TACAAACATAACAAATCCATGTGGCTCTTCCACTGGAAAGACTTCAGGTAAAGAATCCAT
 AACCAGCCCCTGGTGGCTTGCGTGACAGTCTCCACACATCTCCAAAATTAACACTTTAGC
 TAAATTCTCTCCACATGGGATAGAATTGCTCTGCCACTGTCTGAAAATGCGTATGTATGT
 CTTCTTTCTTTGTGAGAAGTTTCTACATGAGATGCAGAAATTACCATCTCTACGGCTATT
 GGAAAATGTTGGCTCATTTTATAGCTGCTGGCATAGATTTTTCCAGAGATTTAAGTTTCA
 CTCCTCCGGTCATAATGAATCACATCTGTGTATTTGTACAACATTTCTCCCTTCTTCCAT
 TCACTGAAAGTATCCCTGCCTGTGAATGTGTAGGTTTA 

Zach's Mystery Solution

DONT CLICK HERE IF YOURE SOLVING IT OR A PIG WILL LAND ON YOU AND SQUASH YOU ALIVE. click here