AninditaVarshneya BIOL368 Week 4

From OpenWetWare
Revision as of 16:01, 23 September 2016 by Anindita Varshneya (talk | contribs) (→‎Part 2: added table 2 and 3)

Electronic Lab Notebook

Purpose

This activity introduced the S and theta values and gave us more practice with ClustalW Multiple Alignment and ClustalWDist. Experience with these tools is especially valuable because we will use them in the following weeks as part of our research project on HIV sequences.

Methods and Results

Part 1

  • Navigate back to the Biology Workbench website.
  • Download the "visit_1_S1_S9.txt" and "visit_1_S10_S15.txt" files from the Week 4 Assignment Page.
  • Upload both files to the Biology Workbench website.
    • Select "Add Nucleic Sequences" and hit Run
    • Hit "Run" from the resulting page
  • Generate a multiple sequence alignment and distance tree for 12 of these sequences (3 clones from each of 4 subjects)
Subjects and clone numbers used for the distance tree.
    • Select 12 sequences. Select ClustalW Multiple Sequence Alignment from the scrolling menu and select "Run"
    • On the next page, make sure the tree is set to "Unrooted Trees" and hit Submit
  • Clones from subject 3 and subject 5 each cluster together and apart from the other clones, whereas clones from subjects 1 and 2 cluster with one another.
  • Clones from subject 3 are more genetically diverse than clones from other subjects as they have the greatest genetic distance from all other subjects. In contrast, the clones from subject 2 are all very closely related and therefore do not have as much genetic diversity.
  • Analyzing this tree makes it clear that the clones from subject 5 and subject 3 have the greatest genetic distance from those of subjects 1 and 2, and therefore share a much older common ancestor with them than the clones from subject 1 and subject 2 share with each other. This means that the HIV sequences in subject 1 and subject 2 are more closely related, as their most recent common ancestor appears to be relatively recent. We can therefore expect clones from subject 1 and subject 2 to have more genetic similarities with each other than clones from either subject 5 or subject 3 have with any of the other clones in this analysis.
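
The clustering described above depends on pairwise distances between clones: clones from one subject group together when their distances to each other are smaller than their distances to other subjects' clones. The sketch below illustrates that idea with a simple proportion-of-differences distance. It is not the calculation that ClustalW or the Biology Workbench performs, and the clone labels and sequences are hypothetical toy data, not the Week 4 sequences.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


# Hypothetical toy alignment: labels and bases are invented for illustration.
toy_alignment = {
    "S1-1": "ACGTACGTAC",
    "S1-2": "ACGTACGTAT",
    "S3-1": "ACTTACGGAT",
    "S3-2": "TCTTACGGAT",
}

# Distance for every unordered pair of clones.
distances = {
    (a, b): p_distance(toy_alignment[a], toy_alignment[b])
    for a, b in combinations(toy_alignment, 2)
}

# Print pairs from most similar to least similar.
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 2))
```

In this toy data the two within-subject pairs have the smallest distances, which is the pattern that would make each subject's clones cluster together on a tree.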

Part 2

  • Select all of the clones for one subject and align them using the ClustalW tool as described earlier.
  • To calculate S, count the number of positions where at least one clone has a different nucleotide from the others (the number of columns with black font).
  • Select "Import alignment" to save these alignments for the next step.
  • Return to the "Nucleic Tools" tab.
  • Repeat this procedure for 2 more subjects.
  • Calculate theta with Wolfram Alpha using the following formula, where S is the number calculated in the previous step and n is the total number of clones for that subject:
   theta = S/((1/n) + (1/(n-1)) + ... + (1/1))
  • Switch to the "Alignment Tools" tab.
  • Select one of the alignments and run the ClustalWDist alignment tool
  • To calculate the min and max difference, find the lowest and highest pairwise values and multiply each by the length of the sequence.
    • The length of the sequence can be found by scrolling past the pairwise numbers and is reported as the number of base pairs in the clone. Round each product to the nearest integer.
Table 2. Data collected using the ClustalW multiple alignment tool. S was calculated by counting the number of positions where at least one nucleotide differs among the clones. Theta was calculated as S/((1/n) + (1/(n-1)) + ... + (1/1)), where n is the total number of clones collected from that subject. Min and max differences were calculated from the sequence length in base pairs and the pairwise values reported by ClustalWDist.
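
The S and theta calculations described above can be sketched in a few lines of Python instead of counting by hand and using Wolfram Alpha. This is only an illustration under stated assumptions: the function names and the toy clones are hypothetical, and a gap character is treated like any other differing character. By default the denominator follows the formula as written in this notebook (1/n down to 1/1). The standard Watterson estimator stops the sum at 1/(n-1), so a flag is included to switch to that form.

```python
def count_segregating_sites(aligned_seqs):
    """Count alignment columns in which the clones are not all identical (S)."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for column in zip(*aligned_seqs) if len(set(column)) > 1)


def theta(s, n, include_1_over_n=True):
    """Divide S by a harmonic sum.

    include_1_over_n=True  -> 1/n + 1/(n-1) + ... + 1/1 (as written in this notebook)
    include_1_over_n=False -> 1/(n-1) + ... + 1/1 (standard Watterson estimator)
    """
    last = n if include_1_over_n else n - 1
    if last < 1:
        raise ValueError("not enough clones for this denominator")
    return s / sum(1 / i for i in range(1, last + 1))


# Hypothetical toy clones, invented for illustration.
toy_clones = [
    "ACGTACGT",
    "ACGTACGA",
    "ACGAACGT",
    "ACGTACGT",
]

s_value = count_segregating_sites(toy_clones)
n_clones = len(toy_clones)
print("S =", s_value)
print("theta (notebook formula) =", round(theta(s_value, n_clones), 3))
print("theta (Watterson, sum to n-1) =", round(theta(s_value, n_clones, False), 3))
```

Two columns vary in the toy alignment, so S is 2 with n = 4. The two denominators give slightly different theta values, so it is worth noting which form was used when filling in a table.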
  • Create new alignments, each containing all of the sequences from 2 subjects, using the ClustalW tool in the "Nucleic Tools" tab.
  • Use ClustalWDist to generate another pairwise distance matrix and calculate the min and max differences as done in the previous steps.
Table 3. Data collected using ClustalWDist. Min and max differences are based on data from both of the subjects indicated in the first column and were calculated with the number of base pairs and the pairwise numbers as presented by ClustalWDist.
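
The min and max difference calculation used for Tables 2 and 3 amounts to scaling the smallest and largest pairwise values by the sequence length and rounding. A minimal sketch follows, assuming the pairwise values have already been copied out of the ClustalWDist output by hand. The function name, the pairwise values, and the sequence length shown are hypothetical, not results from this notebook.

```python
def min_max_differences(pairwise_values, sequence_length):
    """Convert the smallest and largest pairwise values to nucleotide counts.

    pairwise_values: distances between different clones only
                     (leave out the zero diagonal of the matrix).
    sequence_length: number of base pairs reported below the matrix.
    """
    if not pairwise_values:
        raise ValueError("need at least one pairwise value")
    smallest = min(pairwise_values) * sequence_length
    largest = max(pairwise_values) * sequence_length
    # Python's round() sends exact .5 ties to the nearest even integer.
    return round(smallest), round(largest)


# Hypothetical values for illustration only.
example_values = [0.004, 0.021, 0.035]
example_length = 285

min_diff, max_diff = min_max_differences(example_values, example_length)
print("min difference:", min_diff)
print("max difference:", max_diff)
```

The same function works for a single subject's alignment or for a combined alignment of two subjects, since only the list of pairwise values changes.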

Conclusion

Data and Files

Defining the Research Project

Acknowledgements

Thank you to Mia Huddleston for working with me on the procedures outlined above. While I worked with the person noted above, this individual journal entry was completed by me and not copied from another source.

References

  1. Donovan S and Weisstein AE (2003) Exploring HIV Evolution: An Opportunity for Research. In Jungck JR, Fass MR, and Stanley ED, eds. Microbes Count! West Chester, Pennsylvania: Keystone Digital Press.
  2. Markham, R.B., Wang, W.C., Weisstein, A.E., Wang, Z., Munoz, A., Templeton, A., Margolick, J., Vlahov, D., Quinn, T., Farzadegan, H., & Yu, X.F. (1998). Patterns of HIV-1 evolution in individuals with differing rates of CD4 T cell decline. Proc Natl Acad Sci U S A. 95, 12568-12573. doi: 10.1073/pnas.95.21.12568
  3. Vlahov, D., Anthony, J.C., Munoz, A., Margolick, J., Nelson, K.E., Celentano, D.D., Solomon, L., Polk, B.F. (1991). The ALIVE study, a longitudinal study of HIV-1 infection in intravenous drug users: description of methods and characteristics of participants. NIDA Res Monogr 109, 75-100.
  4. Week 4 Assignment Page

Other Links

User Page: Anindita Varshneya

Bioinformatics Lab: Fall 2016

Class Page: BIOL 368-01: Bioinformatics Laboratory, Fall 2016

Weekly Assignments Individual Journal Assignments Shared Journal Assignments

SURP 2015

Links: Electronic Lab Notebook