Matthew R Allegretti Week 9

From OpenWetWare

Week 9 Assignment


  • To familiarize ourselves with protein sequence tools.

Methods and Results

  1. Convert one of your DNA sequences into protein sequences using either the NCBI Open Reading Frame Finder or the ExPASy Translate tool.
    1. How do you know which of the six frames is the correct reading frame (without looking up the answer)?
      • The correct reading frame is the one without a stop codon: the sequences come from the middle of the protein, so the true frame should translate without interruption.
    2. Once you answered the question above, you can check your answer and obtain the rest of the protein sequences from the BEDROCK HIV Problem Space.
      • My answer was confirmed in class by Dr. Dahlquist.
  2. Find out what is already known about the HIV gp120 envelope protein in the UniProt Knowledgebase (UniProt KB). UniProt KB has two parts: Swiss-Prot, which contains entries for proteins that have been manually reviewed, and TrEMBL (which stands for "Translated EMBL"), which contains automated translations of all DNA sequences in the EMBL/GenBank/DDBJ databases.
    1. If you search on the keywords "HIV" and "gp120" in the main UniProt search field, how many results do you get?
      • 206,278 results
    2. Use the entry with accession number "P04578" which corresponds to the reference entry for HIV gp120. What types of information are provided about this protein in this database entry?
      • Function, Names and Taxonomy, Subcellular location, Pathology and Biotech, PTM/Processing, Interaction, Structure, Family and Domains, Sequence, Cross-references, Entry information, Miscellaneous, Similar proteins
  3. We are going to use the PredictProtein server to analyze just the V3 region from Markham et al. (1998).
    1. Paste one of the amino acid sequences from Markham et al. (1998) into the input field and submit. Explore the types of information provided. How does this information relate to what is stored in the UniProt database?
      • The UniProt database provides much more textual, overview-style information about a protein, while the PredictProtein server provides many graphical representations of the protein's structure.
  4. Download the structure file for the paper we read in journal club from the NCBI Structure Database.
    1. Kwong et al. (1998) structure 1GC1.
    2. You may also be interested in these other structures from related gp120 structure articles:
  5. These files can be opened with the Cn3D software that is installed on the computers in the lab (this software is free, so you can download it and use it at home, too). Alternatively, you may choose to use the Star Biochem program to do this portion of your work. Answer the following:
    1. Find the N-terminus and C-terminus of each polypeptide tertiary structure.
    2. Locate all the secondary structure elements. Do these match the predictions made by the PredictProtein server?
    3. Locate the V3 region and figure out the location of the Markham et al. (1998) sequences in the structure.
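
The reading-frame reasoning in step 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not what the NCBI Open Reading Frame Finder or the ExPASy Translate tool actually run. The frame labels ("+1" to "-3"), the function names, and the example DNA string are all made up for this sketch; the example is not one of the Markham et al. (1998) sequences. The sketch assumes the standard genetic code and an input that contains only A, C, G, and T.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for codons with other symbols
        for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate all six frames; keys '+1'..'+3' are forward, '-1'..'-3' reverse."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def open_frames(dna):
    """Return the labels of frames whose translation contains no stop codon."""
    return [name for name, protein in six_frames(dna).items() if "*" not in protein]


# Made-up example sequence, chosen so that only one frame is free of stops.
example = "GTAATTCCTGACTCAGCTAGGGGTCAG"
for name, protein in six_frames(example).items():
    print(name, protein)
print("Frames without a stop codon:", open_frames(example))
```

With a short fragment more than one frame may happen to lack a stop codon, so the check narrows the candidates but does not always identify a single frame.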

Data and Files

Kwong Structure File


Defining Your HIV Structure Research Project

  1. What is your question?
    • Which types of amino acid changes most affected the success of viral membrane fusion?
  2. Make a prediction about the answer to your question before you begin your analysis.
    • Changes from hydrophobic to charged hydrophilic will have the greatest effect on fusion.
  3. Which subjects, visits, and clones will you use to answer your question?
    • Subjects 4, 11, 9, and 14. These four subjects have identical dS/dN ratios but are in different progressor groups. Sequences will be taken from visits 1 and 4, as subjects 2 and 11 only have 4 visits total.
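
One possible way to look for the hydrophobic-to-charged changes named in the prediction is to compare two aligned protein sequences position by position and report only the substitutions that switch residue class. The sketch below is a hypothetical starting point, not the analysis actually performed. The four-class grouping is one common textbook scheme (other schemes place residues such as G, P, C, Y, and H differently), and the function name and the two short peptides are invented for illustration.

```python
# Assumed residue classes; this grouping is one of several in common use.
CLASSES = {
    "hydrophobic": "AVLIMFWPG",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
RESIDUE_CLASS = {aa: name for name, residues in CLASSES.items() for aa in residues}


def class_changes(seq_a, seq_b):
    """List substitutions between two aligned sequences that change residue class.

    Returns tuples of (1-based position, residue in seq_a, residue in seq_b,
    class in seq_a, class in seq_b). Same-class substitutions are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1):
        if a == b:
            continue
        class_a = RESIDUE_CLASS.get(a, "unknown")  # gaps and odd symbols
        class_b = RESIDUE_CLASS.get(b, "unknown")
        if class_a != class_b:
            changes.append((pos, a, b, class_a, class_b))
    return changes


# Invented peptides: A->D switches class, L->I stays hydrophobic.
early = "GAVLK"
late = "GDVIK"
for change in class_changes(early, late):
    print(change)
```

Counting how often each class switch occurs per subject would then allow the progressor groups to be compared.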



While I worked with the people noted above, this individual journal entry was completed by me and not copied from another source.

  • Dr. Dahlquist

Methods section pulled heavily from Dr. Dahlquist's Week 9 instructions.

Useful links

Course Home Page