BIOL368/F14:Nicole Anguiano Week 8
- 1 Defining Your HIV Structure Research Project
- 2 Working with Protein Sequences In-class Activity
- 2.1 Reading a SWISS-PROT Entry
- 2.2 ORFing your DNA sequence
- 2.3 Working with a single protein sequence
- 2.4 Predicting the Secondary Structure of a Protein
- 2.5 Crystal Structure Comparison
- 3 Links
Defining Your HIV Structure Research Project
This project will be carried out in conjunction with Isabel Gonzaga and Chloe Jones. The text below is taken from Isabel Gonzaga's Week 8 page; our project uses the same question, hypothesis, and subject data.
How does HIV status (diagnosed, progressing or non-trending) affect the structure of the V3 protein region?
We hypothesize that the diagnosed groups will show greater variability in the protein structure of the V3 region than the non-trending groups. Initial comparisons show that the diagnosed and progressing groups expressed greater genetic variability than the non-trending groups. These changes may affect the third variable region, affecting the host's ability to adapt to the changes and generate a sufficient immune response.
According to the BEDROCK HIV Sequence Data Table, I was able to determine which of the subjects used within my study actually developed AIDS. All 3 AIDS Diagnosed subjects were confirmed to have the disease by their final visit. In the AIDS Progressing group, subjects developed AIDS within 1 year after their final visit. The Non-Trending subjects all maintained CD4 T cell counts above the threshold, even after the study was conducted. Sequences for each visit and subject were chosen using a Random Integer Generator, to eliminate selection bias.
The following sequences were taken from the BEDROCK HIV Problem Space Database, from the Markham et al. (1998) study.
Table 1: Sequences analyzed
|AIDS Diagnosed|| 3
| 1, 2, 4|
3, 4, 5
3, 6, 7
2, 4, 8
2, 3, 4
5, 8, 10
|AIDS Progressing|| 7
| 2, 3, 9|
2, 8, 9
1, 4, 5
1, 6, 7
2, 3, 4
9, 10, 11
|No Trend|| 5
| 1, 3, 8|
4, 5, 2
1, 2, 3
6, 7, 9
1, 3, 4
3, 5, 4
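The random selection step described above can be sketched in a few lines of Python. This is only an illustration, not the Random Integer Generator that was actually used: the function name `pick_clones`, the number of available clones (12), and the seed are all assumptions made for the example.

```python
import random

def pick_clones(n_available, k=3, seed=None):
    """Pick k distinct clone numbers from 1..n_available, without replacement.

    n_available, k and seed are hypothetical parameters for this sketch.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(range(1, n_available + 1), k)

# Example: choose 3 of 12 hypothetical clones for one subject/visit.
chosen = pick_clones(n_available=12, k=3, seed=368)
print(chosen)
```

Sampling without replacement guarantees that the same clone is never picked twice for a given visit, and recording the seed lets someone else reproduce the same picks.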
Working with Protein Sequences In-class Activity
Reading a SWISS-PROT Entry
- I navigated to UniProt. I searched for "Q75760". Here is a portion of the results from the protein that came up from the search.
Entry Name: Q75760_9HIV1
Primary (citable) accession number: Q75760
Integrated into UniProtKB/TrEMBL: November 1, 1996
Last sequence update: November 1, 1996
Last modified: October 1, 2014
Names & Taxonomy
Protein names: Envelope glycoprotein gp160
Gene names: env
Organism: Human immunodeficiency virus 1
Taxonomic identifier: 11676
Taxonomic lineage: Viruses › Retro-transcribing viruses › Retroviridae › Orthoretrovirinae › Lentivirus › Primate lentivirus group
The envelope glycoprotein gp160 precursor down-modulates cell surface CD4 antigen by interacting with it in the endoplasmic reticulum and blocking its transport to the cell surface.
The gp120-gp41 heterodimer allows rapid transcytosis of the virus through CD4 negative cells such as simple epithelial monolayers of the intestinal, rectal and endocervical epithelial barriers. Both gp120 and gp41 specifically recognize glycosphingolipids galactosyl-ceramide (GalCer) or 3' sulfo-galactosyl-ceramide (GalS) present in the lipid rafts structures of epithelial cells. Binding to these alternative receptors allows the rapid transcytosis of the virus through the epithelial cells. This transcytotic vesicle-mediated transport of virions from the apical side to the basolateral side of the epithelial cells does not involve infection of the cells themselves.
|P84801||2||EBI-8453491,EBI-8453570||From a different organism.|
|ath||Q9KWN0||2||EBI-8453491,EBI-8453511||From a different organism.|
|UDA1||P11218||2||EBI-8453491,EBI-8453649||From a different organism.|
Protein-protein interaction databases
|IntAct||Q75760. 3 interactions.|
Virion membrane; Single-pass type I membrane protein. Host cell membrane; Single-pass type I membrane protein. Host endosome membrane; Single-pass type I membrane protein
PTM / Processing
Amino Acid modifications
Keywords - Technical term
- This section contained a variety of references. There were sequence databases (EMBL, GenBank, DDBJ, PIR), 3D structure databases (PDBe, RCSB PDB, PDBj, ProteinModelPortal, SMR, ModBase, MobiDB), protein-protein interaction databases (DIP, IntAct, MINT), protocols and materials databases (Structural Biology Knowledgebase), miscellaneous databases (EvolutionaryTrace), and family and domain databases (Gene3D, InterPro, Pfam, SUPFAM, ProtoNet).
See table under PTM / Processing.
- If you search on the keywords "HIV" and "gp120", how many results do you get?
- Searching "hiv" returns 600,415 results. Searching "gp120" returns 182,286 results. Searching "hiv AND gp120" returns 180,227 results.
ORFing your DNA sequence
- I chose to ORF the DNA sequence from subject 15, visit 4, clone 3. Using Translate, I inputted the DNA sequence and obtained the 6 possible open reading frames.
- Comparing against the FASTA sequence of the UniProt protein above, I can see that the first open reading frame is most likely the correct one. Its amino acid sequence, "EVVIRSENFTNNAKIIIVHLNESVVINCTRPNNNTRRKIPIGPGSSFYTTGIIGDIRQAHCNISGSKWNNTLKQIVNKLREQFVNKTIIFNQSS", is extremely similar to the corresponding sequence in the UniProt protein, "EVVIRSDNFTNNAKTIIVQLKESVEINCTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHCNISRAKWNDTLKQIVIKLREQFENKTIVFNHSS". There are very few differences between them, which indicates that the clone likely corresponds to this region of the env protein.
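The two steps above (translating a DNA sequence in all six reading frames, then checking a frame against the UniProt sequence) can be sketched as follows. This is a minimal illustration, not what the Translate tool does internally: it assumes the standard genetic code, the function names are made up for the example, and `difflib.SequenceMatcher` is used as a rough stand-in for a proper pairwise alignment. The clone's DNA sequence is not reproduced on this page, so a short toy sequence is used for the translation.

```python
from difflib import SequenceMatcher
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna):
    """Return the three forward and three reverse-strand translations."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

# Toy DNA sequence (an assumption, not the S15V4C3 clone).
for name, protein in six_frames("ATGAAATAG").items():
    print(name, protein)

# Frame 1 of the clone and the UniProt fragment, as quoted above.
clone = ("EVVIRSENFTNNAKIIIVHLNESVVINCTRPNNNTRRKIPIGPGSSFYTTGIIGDIRQAHCNISGSKW"
         "NNTLKQIVNKLREQFVNKTIIFNQSS")
uniprot = ("EVVIRSDNFTNNAKTIIVQLKESVEINCTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHCNISRAKW"
           "NDTLKQIVIKLREQFENKTIVFNHSS")
similarity = SequenceMatcher(None, clone, uniprot, autojunk=False).ratio()
print(f"approximate similarity: {similarity:.2f}")
```

The similarity score is only a quick sanity check; the two fragments differ in length by one residue, so a real alignment tool would be the better choice for counting substitutions.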
Working with a single protein sequence
- I navigated to ProtParam, and inputted the sequence from the clone above, then selected "Compute Parameters". The result was as follows:
Number of amino acids: 94
Molecular weight: 10625.1
Theoretical pI: 10.14
Amino acid composition:
Total number of negatively charged residues (Asp + Glu): 5
Total number of positively charged residues (Arg + Lys): 12
Total number of atoms: 13540
Extinction coefficients are in units of M-1 cm-1, at 280 nm measured in water.
Ext. coefficient 7115
Abs 0.1% (=1 g/l) 0.670, assuming all pairs of Cys residues form cystines
Ext. coefficient 6990
Abs 0.1% (=1 g/l) 0.658, assuming all Cys residues are reduced
The N-terminal of the sequence considered is M (Met).
The estimated half-life is: 1 hours (mammalian reticulocytes, in vitro), 30 min (yeast, in vivo), >10 hours (Escherichia coli, in vivo).
The instability index (II) is computed to be 45.96
This classifies the protein as unstable.
Aliphatic index: 94.26
Grand average of hydropathicity (GRAVY): -0.362
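Several of the values reported above can be recomputed by hand as a cross-check. The sketch below is not ProtParam's code: it assumes the Kyte-Doolittle hydropathy scale for GRAVY, the usual aliphatic index formula (Ala + 2.9 × Val + 3.9 × (Ile + Leu), in mole percent), and extinction contributions of 5500 per Trp, 1490 per Tyr, and 125 per cystine. The function name `protparam_like` is made up for the example.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

# Frame 1 translation of the S15V4C3 clone, as quoted earlier on this page.
SEQ = ("EVVIRSENFTNNAKIIIVHLNESVVINCTRPNNNTRRKIPIGPGSSFYTTGIIGDIRQAHCNISGSKW"
       "NNTLKQIVNKLREQFVNKTIIFNQSS")

def protparam_like(seq):
    """Recompute a few ProtParam-style parameters for a protein sequence."""
    n = len(seq)
    counts = Counter(seq)

    def mole_pct(aa):
        return 100.0 * counts[aa] / n

    ext_reduced = 1490 * counts["Y"] + 5500 * counts["W"]
    return {
        "length": n,
        "negative": counts["D"] + counts["E"],
        "positive": counts["R"] + counts["K"],
        "ext_reduced": ext_reduced,                            # all Cys reduced
        "ext_cystine": ext_reduced + 125 * (counts["C"] // 2),  # Cys paired
        "aliphatic_index": mole_pct("A") + 2.9 * mole_pct("V")
                           + 3.9 * (mole_pct("I") + mole_pct("L")),
        "gravy": sum(KD[aa] for aa in seq) / n,
    }

params = protparam_like(SEQ)
for key, value in params.items():
    print(key, round(value, 3))
```

Run on the clone sequence, this should agree with the length, charged-residue counts, extinction coefficients, aliphatic index, and GRAVY listed above; molecular weight, pI, and the instability index need additional lookup tables and are left out.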
- I navigated to ProtScale and entered the amino acid sequence. I changed the "Window Size" dropdown to 19, then hit Submit. I saved the image as a .gif (Fig. 2).
- Next, I navigated to TMHMM, pasted in the sequence, then hit submit, then saved the image (Fig. 3).
- I navigated to ScanProsite and inputted the amino acid sequence. I deselected "Exclude motifs with a high probability of occurrence from the scan", and then hit "START THE SCAN".
- I navigated to InterProScan and inputted the amino acid sequence and hit submit.
- I navigated to CD Server and inputted the amino acid sequence. Then I changed the Expect Value Threshold to 1, and hit submit.
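The ProtScale step above is essentially a sliding-window average over a per-residue scale. The sketch below shows the idea with a window of 19; it assumes the Kyte-Doolittle hydropathy scale (the scale chosen in ProtScale is not recorded above) and a plain unweighted mean, and the function name `hydropathy_profile` is made up for the example.

```python
# Kyte-Doolittle hydropathy values (assumed scale).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
      "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
      "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
      "Y": -1.3, "V": 4.2}

# Frame 1 translation of the S15V4C3 clone, as quoted earlier on this page.
SEQ = ("EVVIRSENFTNNAKIIIVHLNESVVINCTRPNNNTRRKIPIGPGSSFYTTGIIGDIRQAHCNISGSKW"
       "NNTLKQIVNKLREQFVNKTIIFNQSS")

def hydropathy_profile(seq, window=19):
    """Unweighted mean hydropathy for every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]

profile = hydropathy_profile(SEQ, window=19)
half = 19 // 2
for i, score in enumerate(profile):
    # Report each window by the 1-based position of its central residue.
    print(i + half + 1, round(score, 3))
```

A window of 19 is roughly the length of a membrane-spanning helix, which is why it is a common choice when looking for transmembrane segments; sustained peaks in the profile are the regions worth comparing against the TMHMM output.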
Predicting the Secondary Structure of a Protein
- I navigated to PsiPred. I inputted the amino acid sequence and gave it the identifier "S15V4C3", then hit Predict. I waited about 15 minutes until it finished the prediction.
- I navigated to Predict Protein. I created an account so I could utilize the service. Then I validated my account and returned to the site. I logged in and inputted the amino acid sequence. I then resubmitted the job to get current results. The detailed results are visible here.
Crystal Structure Comparison
- I navigated to NCBI and downloaded the structure as a Cn3D file. I opened the file in Cn3D (Fig. 11) and selected the residues corresponding to the sequence that Translate returned for the clone (Fig. 1). I selected "Show Selected Residues" to display only that selection (Fig. 12). A smaller alpha helix appears in both the PsiPred (Fig. 8) and PredictProtein (Fig. 10) predictions, which indicates that mutations in the protein may have caused an alpha helix to form. The one large alpha helix, however, is likely the alpha helix present in the original crystal structure. The many beta sheets in the structure are consistent with the beta sheets predicted by PsiPred (Fig. 8).
Links
- Week 1 Assignment
- Week 2 Assignment
- Week 3 Assignment
- Week 4 Assignment
- Week 5 Assignment
- Week 6 Assignment
- Week 7 Assignment
- Week 8 Assignment
- Week 9 Assignment
- Week 10 Assignment
- Week 11 Assignment
- Week 12 Assignment
- Week 13 Assignment
- Week 15 Assignment
- Individual Journal Week 2
- Individual Journal Week 3
- Individual Journal Week 4
- Individual Journal Week 5
- Individual Journal Week 6
- Individual Journal Week 7
- Individual Journal Week 8
- Individual Journal Week 9
- Individual Journal Week 10
- Individual Journal Week 11
- Individual Journal Week 12
- Individual Journal Week 13
- Individual Journal Week 15