Shivum Desai Journal Week 4

From OpenWetWare

Exploring HIV Evolution: An Opportunity For Research- Week 4 Continuation


The purpose of this study is to determine whether the HIV sequences sampled from four randomly chosen subjects originated from a common source of the virus.


  • Activity 2/Part 1
  1. First, the data files "visit_1_S1_S9.txt" and "visit_1_S10_S15.txt" were uploaded to the nucleic acid section of the Biology Workbench.
  2. Next, 3 clones were selected from each of 4 different subjects, giving 12 sequences in total.
  3. These sequences were then analyzed with the ClustalW tool, which performed a multiple sequence alignment and generated a distance tree comparing all 12 sequences.
  4. The main result of this analysis was the distance tree, which shows the relationships among the four subjects and their clones.
  5. The tree was analyzed and recorded.
  • Activity 2/Part 2
  1. The first step in part two was to select three different subjects and determine, for each one, the number of clones, the "S" value (the number of positions that have nucleotide differences across all clones), and theta, an estimate of the average pairwise genetic distance.
  2. This was repeated for a total of 3 subjects and their respective clones.
  3. Then the Clustdist tool was used to calculate the minimum and maximum differences for each of these subjects. A distance matrix was generated for each subject, and its lowest and highest values were multiplied by the sequence length (285 bp) and rounded to the nearest integer to obtain the minimum and maximum.
  4. This information was recorded.
  5. Next, a second Clustdist analysis was run on the clones from the same three subjects, this time comparing the subjects to each other rather than the clones within a subject. The minimums and maximums were again calculated and recorded.
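
The quantities in Part 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Biology Workbench implementation: the function names and the toy sequences are hypothetical, the clones are assumed to be already aligned to equal length, and theta is assumed to be S divided by the harmonic sum 1 + 1/2 + ... + 1/(n-1) for n clones (Watterson's estimator), a formula the entry itself does not spell out.

```python
from itertools import combinations, product


def pairwise_differences(a, b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def segregating_sites(clones):
    """S: number of alignment columns that contain more than one nucleotide."""
    return sum(1 for column in zip(*clones) if len(set(column)) > 1)


def theta(clones):
    """Assumed formula: S / (1 + 1/2 + ... + 1/(n-1)) for n clones."""
    n = len(clones)
    harmonic = sum(1.0 / i for i in range(1, n))
    return segregating_sites(clones) / harmonic


def min_max_within(clones):
    """Smallest and largest difference count among clones of one subject."""
    diffs = [pairwise_differences(a, b) for a, b in combinations(clones, 2)]
    return min(diffs), max(diffs)


def min_max_between(clones_a, clones_b):
    """Smallest and largest difference count between clones of two subjects."""
    diffs = [pairwise_differences(a, b) for a, b in product(clones_a, clones_b)]
    return min(diffs), max(diffs)


def distance_to_count(distance, length=285):
    """Convert a fractional distance into a whole number of differing sites."""
    return round(distance * length)


# Toy data (hypothetical, 8 bp instead of 285 bp)
subject_a = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
subject_b = ["TCGTACGG", "TCGTACGT"]

print(segregating_sites(subject_a))           # S for subject A
print(theta(subject_a))                       # theta for subject A
print(min_max_within(subject_a))              # within-subject min and max
print(min_max_between(subject_a, subject_b))  # between-subject min and max
```

`distance_to_count` mirrors step 3: a fractional distance from the matrix is multiplied by the sequence length and rounded to the nearest integer.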


Activity 2/Part 1

The results from this portion of the project show that the clones from each subject do cluster together. However, subjects S12 and S15 show far more diversity than S11 and S14. Additionally, none of the subjects cluster together except for S14 and S15. This is peculiar, yet it supports the idea that some of the subjects may carry a virus from a common source. Overall, the tree shows that subjects 15, 14, and 12 are much more closely related to each other than subject 11 is to any of the other subjects. At first it might seem that the sequences are related through the order in which the subjects were sampled, because all four are within five subjects of each other. However, that cannot be the explanation, because S11 is very different from S12. The only conclusion that can be drawn from the tree is that S12, S15, and S14 carry more closely related versions of HIV than S11 does.

Activity 2/Part 2

From the calculations made, the only noticeable difference was in the sigma and the resulting minimums and maximums for subject 1's data. This is most likely due to the larger number of clones in subject 1 (13), compared with subjects 2 and 3, which had 6 and 4 clones, respectively. The chart below shows the data recorded when subjects 1, 2, and 3 were analyzed. The chart below that presents the minimums and maximums obtained when the subjects were compared to each other as wholes, not as individual clones.


The purpose of today's procedure was to find out whether the subjects carried viruses that possibly came from the same source. Based on my findings, this experiment showed that the viruses in subjects 1, 2, and 3 did not arise from the same source. The distance tree and the other quantitative data all support this conclusion. Thus, today's purpose was fulfilled, because the analysis did show that the subjects' viruses were not related. The purpose of this experiment was to determine whether the viruses shared an origin, not to assume in advance that they did or did not.

Data Files



HIV Evolution Project

  1. What is your question?
    • Does classifying subjects (2 groups) by dS/dN ratio instead of T cell count clarify the type of selection acting for or against certain types of mutations in the viral strains?
  2. Make a prediction as to the answer of your question before you begin your analysis.
    • It is hypothesized that a comparison of dS/dN values will show a correlation between the presence of nonsynonymous mutations and the difficulty or inability of the human immune system to prevent disease progression.
  3. Which subjects, visits, and clones will you use to answer your question?
    • For this experiment, 54 sequences were selected in total, 18 from each of the three categories of progressor: rapid progressor, moderate progressor, and slow progressor. This ensures that there is an adequate pool for comparing the dS/dN ratios and seeing the differences among the three progressor groups. The following sequences are going to be used (accession numbers):
    • Rapid progressors
    AF016760, AF016761, AF016762-subject 1 visit 1
    AF016773, AF016774, AF016775-subject 1 visit 2
    AF089109, AF089110, AF089111-subject 3 visit 1
    AF089148, AF089149, AF089150-subject 4 visit 1
    AF089151, AF089152, AF089153-subject 4 visit 2
    AF089164, AF089165, AF089166-subject 4 visit 3
    • Moderate progressors
    AF089195, AF089196, AF089197-subject 5 visit 1
    AF089203, AF089204, AF089205-subject 5 visit 2
    AF089238, AF089239, AF089240-subject 6 visit 1
    AF089241, AF089242, AF089243-subject 6 visit 2
    AF089292, AF089293, AF089294-subject 7 visit 1
    AF089335, AF089336, AF089337-subject 8 visit 1
    • Non-progressors
    AF089529, AF089530, AF089531-subject 12 visit 1
    AF089533, AF089534, AF089535-subject 12 visit 2
    AF089566, AF089567, AF089568-subject 13 visit 1
    AF089572, AF089573, AF089574-subject 13 visit 3
    AF089579, AF089580, AF089581-subject 13 visit 4
    AF089586, AF089587, AF089588-subject 13 visit 5
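
As a rough illustration of the synonymous versus nonsynonymous distinction behind the dS/dN question, the sketch below classifies each differing codon between two aligned sequences using the standard genetic code. It is a simplification under stated assumptions, not the method the project will necessarily use: the function name `count_codon_changes` and the example sequences are hypothetical, the sequences are assumed to be aligned, in frame, and a multiple of three in length, and the raw counts it returns are not the rate-normalized dS and dN values, which also require counting the synonymous and nonsynonymous sites available in each sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons contribute no change
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base: cannot be classified
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical example: CTT -> CTC keeps leucine, AAA -> AGA changes K to R
print(count_codon_changes("CTTGGAAAA", "CTCGGAAGA"))
```

Running this over every pair of clones within a progressor group would give a first, unnormalized view of how often differences change the encoded amino acid.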


Subramaniam, S. (1998) The Biology Workbench--a seamless database and analysis environment for the biologist. Proteins, 32, 1-2.

Donovan, S., & Weisstein, A. (2003). Exploring HIV Evolution: An Opportunity for Research.

Vlahov, D., Anthony, J.C., Munoz, A., Margolick, J., Nelson, K.E., Celentano, D.D., Solomon, L., Polk, B.F. (1991). The ALIVE study, a longitudinal study of HIV-1 infection in intravenous drug users: description of methods and characteristics of participants. NIDA Res Monogr 109, 75-100.

Week 4 Assignment Page

  • This link contained all the essential information to complete this assignment

HIV Sequence Database. (2016, March 15). Retrieved September 24, 2016.

Useful Links


I would like to thank Dr. Dahlquist for her help on this assignment. I would also like to thank Zachary T. Goldstein, my partner in this experiment, with whom I collaborated to finish our lab assignment for the day and to prepare the beginnings of our future investigation into HIV. I would also like to acknowledge the Biology Workbench, which helped me to complete my data analysis. I also used HIV sequences stored in the Los Alamos National Lab database, for which I am grateful. Lastly, while I worked with the people noted above, this individual journal entry was completed by me and not copied from another source.