AninditaVarshneya BIOL368 Week 9: Difference between revisions

From OpenWetWare
==Electronic Lab Notebook==
===Purpose===
To learn more about the structure of the gp120 envelope protein in HIV-1.
===Methods and Results===
* Convert DNA sequence of subject 4 during visit one into a protein sequence using [https://www.ncbi.nlm.nih.gov/orffinder/ NCBI ORF Finder]
** The frame without the stop codon is the true open reading frame.
** My translated protein sequence: E V V I R S E N F T N N A K I I I V Q L N E S V E I N C T R P N N N T R K S I H I G P G R A F Y T T G D I I G D I R Q A Y C N I S R A E W N N T L K H I V I K L R E H F G N K T I V F N H S S  
[[Image:AV-translate-tool.PNG | 500px | center]]
** Markham et al. protein sequence: EVVIRSENFTNNAKIIIVQLNKSVEINCTRPNNNTIRRIPIGPGRAFYTTGRIGDIRPAHCNISRTKWNNALKLIVNKLREQFRNKTIIFNQSS
* Learn more about the gp120 protein using [http://www.uniprot.org/ UniProt Knowledgebase (UniProt KB)]
** Searching "HIV gp120" returned 206,278 results
** I selected the entry with the accession number: P04578
** The types of information provided include:
*** Function, Names and Taxonomy, Subcellular location, Pathology/Biotechnology, PTM/Processing, Interaction, Structure, Family and Domains, Sequence, Cross-references, Entry information, Similar proteins, and other miscellaneous information.
* Use the [https://ppopen.informatik.tu-muenchen.de/ Predict Protein Server] to analyze the V3 region of Subject 4, Visit 1-1
[[Image:AV-Protein-predictor.PNG | 500px | center]]
** Copy and paste their sequence from the [http://bioquest.org/bedrock/problem_spaces/hiv/amino_acid_sequences.php Bedrock HIV Problem Space] into PPS
** Information provided here includes summary data of the sequence (length, aligned proteins, matched PDB structures) and amino acid composition. The UniProt website provides much more information about the sequence, but the PredictProtein server provides more visual representations of similar information.
*** I appreciate that the PredictProtein server also explains how it collects certain data.
*** The PredictProtein server does not seem to have as much information as the UniProt website.
* Download the HIV Protein Structure from [https://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=8099 Kwong et al. (1998) structure 1GC1.]
** Open the structure in Cn3D ([https://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml Cn3D software site]), which is available on the computers in the lab.
[[Image:AV-Cn3D_protein_structure.PNG | 500px | center]]
** The N and C termini can be found by selecting the first and last amino acid of each of the 4 chains. They are the yellow highlighted ends of each protein in the model.
[[Image:AV-N-c-termini.PNG | 500px | center]]
** The secondary structures are shown below, but I ran out of time in class to compare them with the results presented by Markham et al.
[[Image:AV-Secondary-structures.PNG | 500px | center]]
===Conclusion===
The correct open reading frame of the Markham et al. sequences can be found by identifying the reading frame that does not contain any stop codons. This works because the DNA sequence provided covers the V3 region, which lies in the middle of the gp120 protein, so the true frame should not contain a stop codon. Methionine, which is encoded by the start codon, may still appear because it also occurs at internal positions in protein sequences. Working through several different analysis websites and services made it evident just how variable the V3 region of gp120 is. The PredictProtein server provides an adjacency matrix that quantifies how likely each amino acid in the V3 sequence is to mutate into a different amino acid. This is of particular interest to our research group because we will focus on the effects of nonconservative point mutations on the function of the gp120 protein. Overall, this exercise provided much of the information we needed to set up our upcoming research project. More details on that project can be found below.
===Data and Files===
*[[Media:AV-Mmdb_1GC1-structure-data.cn3 | gp120 protein structure data from Kwong et al.]]


==Defining the Research Project==
#What is your question?
#*Which types of amino acid changes most affected the success of viral membrane fusion?
#Make a prediction about the answer to your question before you begin your analysis.
#*Changes from hydrophobic to charged hydrophilic will have the greatest effect on fusion.
#Which subjects, visits, and clones will you use to answer your question?
#*Subjects 4, 11, 9, and 14. These four subjects have identical dS/dN ratios but are in different progressor groups. Sequences will be taken from visits 1 and 4, because subjects 2 and 11 have only 4 visits in total.
==Acknowledgements==
[[User: Matthew_R_Allegretti| Matt Allegretti]] and I worked together to define our research project during lab on 10/25. While I worked with the people noted above, this individual journal entry was completed by me and not copied from another source.
==References==
*[[BIOL368/F16:Week_9 | Week 9 Assignment Instructions]]


{{Template: Anindita Varshneya}}

Latest revision as of 13:53, 27 October 2016

Electronic Lab Notebook

Purpose

To learn more about the structure of the gp120 envelope protein in HIV-1.

Methods and Results

  • Convert DNA sequence of subject 4 during visit one into a protein sequence using NCBI ORF Finder
    • The frame without the stop codon is the true open reading frame.
    • My translated protein sequence: E V V I R S E N F T N N A K I I I V Q L N E S V E I N C T R P N N N T R K S I H I G P G R A F Y T T G D I I G D I R Q A Y C N I S R A E W N N T L K H I V I K L R E H F G N K T I V F N H S S
    • Markham et al. protein sequence: EVVIRSENFTNNAKIIIVQLNKSVEINCTRPNNNTIRRIPIGPGRAFYTTGRIGDIRPAHCNISRTKWNNALKLIVNKLREQFRNKTIIFNQSS
  • Learn more about the gp120 protein using UniProt Knowledgebase (UniProt KB)
    • Searching "HIV gp120" returned 206,278 results
    • I selected the entry with the accession number: P04578
    • The types of information provided include:
      • Function, Names and Taxonomy, Subcellular location, Pathology/Biotechnology, PTM/Processing, Interaction, Structure, Family and Domains, Sequence, Cross-references, Entry information, Similar proteins, and other miscellaneous information.
  • Use the Predict Protein Server to analyze the V3 region of Subject 4, Visit 1-1
    • Copy and paste their sequence from the Bedrock HIV Problem Space into PPS
    • Information provided here includes summary data of the sequence (length, aligned proteins, matched PDB structures) and amino acid composition. The UniProt website provides much more information about the sequence, but the PredictProtein server provides more visual representations of similar information.
      • I appreciate that the PredictProtein server also explains how it collects certain data.
      • The PredictProtein server does not seem to have as much information as the UniProt website.
  • Download the HIV Protein Structure from Kwong et al. (1998) structure 1GC1.
    • The N and C termini can be found by selecting the first and last amino acid of each of the 4 chains. They are the yellow highlighted ends of each protein in the model.
    • The secondary structures are shown below, but I ran out of time in class to compare them with the results presented by Markham et al.
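
The translated sequence and the Markham et al. sequence listed above may differ in length, so a position-by-position comparison needs some form of alignment first. The following is a minimal sketch of one way to list the differences, assuming Python and the standard-library `difflib.SequenceMatcher` as a stand-in for a real alignment tool. It was not part of the original workflow and is not a biological aligner; the function name `list_differences` is illustrative only.

```python
from difflib import SequenceMatcher

# Sequences copied from the notebook entry above (spaces removed).
my_seq = ("EVVIRSENFTNNAKIIIVQLNESVEINCTRPNNNTRKSIHIGPGRAFYTT"
          "GDIIGDIRQAYCNISRAEWNNTLKHIVIKLREHFGNKTIVFNHSS")
markham_seq = ("EVVIRSENFTNNAKIIIVQLNKSVEINCTRPNNNTIRRIPIGPGRAFYTT"
               "GRIGDIRPAHCNISRTKWNNALKLIVNKLREQFRNKTIIFNQSS")


def list_differences(a, b):
    """Return (operation, segment_from_a, segment_from_b) for each mismatch."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


differences = list_differences(my_seq, markham_seq)
for tag, mine, theirs in differences:
    # "-" marks a segment that is absent from one of the two sequences.
    print(f"{tag:8s} mine={mine or '-':6s} Markham={theirs or '-'}")
```

Each printed row is a replaced, inserted, or deleted segment. For a real analysis, a dedicated alignment program with a substitution matrix would be the more defensible choice.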

Conclusion

The correct open reading frame of the Markham et al. sequences can be found by identifying the reading frame that does not contain any stop codons. This works because the DNA sequence provided covers the V3 region, which lies in the middle of the gp120 protein, so the true frame should not contain a stop codon. Methionine, which is encoded by the start codon, may still appear because it also occurs at internal positions in protein sequences. Working through several different analysis websites and services made it evident just how variable the V3 region of gp120 is. The PredictProtein server provides an adjacency matrix that quantifies how likely each amino acid in the V3 sequence is to mutate into a different amino acid. This is of particular interest to our research group because we will focus on the effects of nonconservative point mutations on the function of the gp120 protein. Overall, this exercise provided much of the information we needed to set up our upcoming research project. More details on that project can be found below.
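
The stop-codon rule described in this conclusion can be sketched in a few lines of Python. This is only an illustration of the idea, not how NCBI ORF Finder works internally: it checks the three forward frames only, uses the standard genetic code, and runs on a made-up 18-base fragment (`toy_dna`) rather than the subject's actual sequence. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna, frame):
    """Translate one forward reading frame (0, 1, or 2); '*' marks a stop."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons with ambiguous bases are reported as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def stop_free_frames(dna):
    """Return {frame: protein} for the forward frames with no stop codon."""
    translations = {frame: translate(dna, frame) for frame in range(3)}
    return {f: p for f, p in translations.items() if "*" not in p}


# Hypothetical fragment made up for illustration (not subject data).
toy_dna = "GAAGTAGTAATTAGAAGT"
print(stop_free_frames(toy_dna))  # {0: 'EVVIRS'}
```

In this toy case frames 1 and 2 both hit a stop codon, so only frame 0 survives. With a short internal fragment, more than one frame could in principle be stop-free, and the candidates would then need to be compared against a known protein sequence.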

Data and Files

Defining the Research Project

  1. What is your question?
    • Which types of amino acid changes most affected the success of viral membrane fusion?
  2. Make a prediction about the answer to your question before you begin your analysis.
    • Changes from hydrophobic to charged hydrophilic will have the greatest effect on fusion.
  3. Which subjects, visits, and clones will you use to answer your question?
    • Subjects 4, 11, 9, and 14. These four subjects have identical dS/dN ratios but are in different progressor groups. Sequences will be taken from visits 1 and 4, because subjects 2 and 11 have only 4 visits in total.
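
One possible way to tally the kind of change named in the prediction (hydrophobic to charged) is sketched below in Python. The side-chain grouping is an assumed, simplified textbook-style classification, not one taken from the assignment or from Markham et al., and other groupings are equally defensible. The function names and the example substitutions are hypothetical.

```python
# Assumed, simplified side-chain grouping (an illustration, not a standard).
PROPERTY_CLASSES = {
    "hydrophobic": "AVLIMFWPG",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
CLASS_OF = {aa: name
            for name, members in PROPERTY_CLASSES.items()
            for aa in members}


def classify_change(old, new):
    """Return the (before, after) property classes of a substitution."""
    return CLASS_OF[old], CLASS_OF[new]


def is_hydrophobic_to_charged(old, new):
    """True when a hydrophobic residue is replaced by a charged one."""
    before, after = classify_change(old, new)
    return before == "hydrophobic" and after in ("positive", "negative")


# Hypothetical substitutions, written as (old residue, new residue).
example_changes = [("I", "K"), ("V", "I"), ("S", "N")]
for old, new in example_changes:
    print(old, new, classify_change(old, new), is_hydrophobic_to_charged(old, new))
```

Counting how many substitutions per subject fall into each (before, after) pair would give a simple table to compare across the progressor groups.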

Acknowledgements

Matt Allegretti and I worked together to define our research project during lab on 10/25. While I worked with the people noted above, this individual journal entry was completed by me and not copied from another source.

References
