Zachary T. Goldstein Week 6

From OpenWetWare

HIV Evolution Project Electronic Notebook


  • The purpose of this week's investigations was to work on our HIV project in class. We were supposed to select our subjects, run sequence analysis, and create trees to identify relationships between our new progressor and nonprogressor groups, which were derived from dN/dS ratios.


  • The dN/dS ratios and subjects from the Markham paper were extracted to a new Excel file.
  • We decided to exclude the subjects with intermediate dN/dS ratios because they are not relevant to the trends we will be observing; excluding them also limits the number of sequences we will be aligning.
  • We decided to include a total of 8 subjects in our study: the 4 with the lowest ratios and the 4 with the highest ratios.
  • Located the Markham paper in PubMed, scrolled down, and selected the link to the subject and visit summary information (PubMed Link)
  • Selected all visits and clones from subject 2 and clicked "download sequence" at the top of the page
  • Sequences were downloaded in "FASTA" format and were "squeezed" so that the files would appear properly in a Word document
  • The Word document was then uploaded under "Nucleotide Tools" on the Biology Workbench website
  • All visits and clones from subject 2 were reselected and the sequences were aligned using the "CLUSTALW" function
  • Differences between sequences (highlighted in blue) were counted, S values were determined, and θ calculations were performed
  • The same procedure was performed for all subjects in our study: 2, 4, 5, 7, 9, 11, 13, 14
  • Theta values were calculated using the Math is Fun website
  • In calculating theta values we included all visits and clones from the desired subjects
  • In constructing the rooted and unrooted tree we used only the final visit of the desired subjects
  • A 2-way t-test was run to compare theta values between the proposed groups
  • P-value = 0.146; because this is greater than 0.05, there is no statistically significant difference between the theta values of the two groups
  • Data was shared between my partner and me using email
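
The S and θ steps above can be sketched in a few lines of Python. This is only an illustration, not the procedure we actually ran (we counted differences by eye in the CLUSTALW output and used an online calculator). It assumes that S is the number of alignment columns with more than one distinct base, that gap characters are ignored, and that θ is Watterson's estimator, θ = S / (1 + 1/2 + ... + 1/(n-1)), where n is the number of aligned sequences. The function names and the toy alignment are hypothetical.

```python
def count_segregating_sites(aligned_seqs):
    """Count alignment columns that contain more than one distinct base (S)."""
    lengths = {len(seq) for seq in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    s = 0
    for column in zip(*aligned_seqs):
        # Assumption: gap characters ('-') are ignored when comparing bases.
        bases = {base.upper() for base in column if base != "-"}
        if len(bases) > 1:
            s += 1
    return s


def watterson_theta(s, n):
    """Watterson's estimator: S divided by the sum of 1/i for i = 1 .. n-1."""
    if n < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n))
    return s / a_n


# Toy alignment (made up for illustration, not data from our subjects).
toy_alignment = ["ACGTAC", "ACGTTC", "ATGTAC", "ACG-AC"]
toy_s = count_segregating_sites(toy_alignment)
toy_theta = watterson_theta(toy_s, len(toy_alignment))
print(f"S = {toy_s}, theta = {toy_theta:.2f}")
```

Because the denominator grows with the number of sequences, two subjects with the same S can have different θ values if they have different numbers of clones.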


Subject 2

  • S value: 36
  • θ value: 9.6

Subject 4

  • S value: 70
  • θ value: 15.8

Subject 5

  • S value: 58
  • θ value: 13.6

Subject 7

  • S value: 55
  • θ value: 12.7

Subject 9

  • S value: 69
  • θ value: 14.6

Subject 11

  • S value: 41
  • θ value: 10.2

Subject 13

  • S value: 25
  • θ value: 6.6

Subject 14

  • S value: 77
  • θ value: 15.7

Presentation Slides:Media:HIV1Presentaionzachandshivum.pdf

File:Rooted tree shiv.pdf

File:Unrooted tree shiv.pdf

File:Zach and shivum excel HIV1.xlsx

Scientific Conclusion

Throughout our research we concluded that reorganizing groups based on dN/dS ratios did not define virus progression better than CD4 T cell count. Although there was evidence of a potential relationship, t-test analysis of theta values produced a p-value > 0.05 (p = 0.146), indicating that there was no statistically significant difference between our groups. Reorganizing groups based on genetic diversity or divergence could potentially lead to better categorization; however, further studies would need to be completed. Overall, a lot of work was invested in the project only to prove our initial hypothesis wrong, but a lot of knowledge was gained and a good (hopefully) presentation has been prepared for class.
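
For reference, the comparison of theta values between two groups can be sketched as follows. This is a minimal sketch under stated assumptions, not a record of the tool we used: it assumes a two-sample Student's t-test with pooled (equal) variance and a two-tailed p-value, which may differ from the exact test settings behind our p-value. The function names are hypothetical, the p-value is approximated by numerically integrating the t density with Simpson's rule, and the example numbers are made up rather than taken from our subjects.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 minus twice the area under the density on [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def pooled_t_test(group_a, group_b):
    """Return (t, df, p) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # zero if both groups are constant
    t = (mean(group_a) - mean(group_b)) / se
    return t, df, two_tailed_p(t, df)


# Made-up theta values for two hypothetical groups of four subjects each.
low_group = [8.0, 10.5, 9.0, 12.0]
high_group = [13.0, 11.5, 15.0, 14.0]
t_stat, dof, p_value = pooled_t_test(low_group, high_group)
print(f"t = {t_stat:.3f}, df = {dof}, p = {p_value:.3f}")
```

A p-value above the 0.05 cut-off means only that the difference was not shown to be significant; with four subjects per group the test has little power to detect a real difference.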


  • I would like to acknowledge my homework partner User: Shivum A Desai for his help in completing this project. Together we worked in class to run sequence analysis, calculate S and θ values, and create trees to show potential relationships between groups. We also met outside of class in the computer lab to complete our presentation and research.
  • I would also like to acknowledge User: Kam D Dahlquist for her assistance in class with explaining how to interpret rooted trees and with the selection of our subjects' ratio cut-offs. We also received help from her via email on formatting slides and references.
  • While I received help on this assignment, everything completed was my own work and was not copied from anyone else.

Zachary T. Goldstein 19:42, 4 October 2016 (EDT)


  • Markham, R. B., Wang, W., Weisstein, A. E., Wang, Z., Munoz, A., Templeton, A., . . . Yu, X. (1998). Patterns of HIV-1 evolution in individuals with differing rates of CD4 T cell decline. Proceedings of the National Academy of Sciences, 95(21), 12568-12573. doi:10.1073/pnas.95.21.12568
  • Spielman, S. J., & Wilke, C. O. (2015). The Relationship between dN/dS and Scaled Selection Coefficients. Molecular Biology and Evolution, 32(4), 1097-1108. doi:10.1093/molbev/msv003
  • Subramaniam, S. (1998) The Biology Workbench--a seamless database and analysis environment for the biologist. Nucleic Acid Tools

