Activity 1: Looking at the NCBI Resources and HIV sequence data
Part 1: PubMed
How did you search for the PubMed entry?
Next to the search bar, I selected "PubMed" from the drop-down menu of databases. I then typed "Markham et al. HIV-1 evolution" into the search bar and scrolled through the results until I found the article we were assigned in class.
What other ways might you have searched?
I could have used the "Advanced" search option to narrow the results by combining two search fields, such as "Author," "Date," "Journal," "Title," or "ISBN." For the Markham paper, I could have searched by "Author" (Markham et al.) and "Journal" (PNAS).
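The fielded search described above can also be written as a single query string. The sketch below only builds the query and the matching NCBI E-utilities ESearch URL, and sends no request. The helper names are made up for illustration, and the author and journal strings ("Markham RB", "Proc Natl Acad Sci U S A") are assumptions about how PubMed indexes this paper.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(author, journal, title_words=None):
    """Join fielded terms with AND, as the Advanced search builder does.

    Hypothetical helper; the field tags [Author], [Journal], and [Title]
    are standard PubMed search tags.
    """
    terms = [f"{author}[Author]", f"{journal}[Journal]"]
    if title_words:
        terms.append(f"{title_words}[Title]")
    return " AND ".join(terms)


def build_esearch_url(term):
    """Return the ESearch URL for a PubMed query (no request is sent)."""
    params = {"db": "pubmed", "term": term, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Assumed author and journal spellings for the Markham paper.
query = build_pubmed_query("Markham RB", "Proc Natl Acad Sci U S A", "HIV-1 evolution")
print(query)
print(build_esearch_url(query))
```

Pasting the printed query into the PubMed search bar should behave like the Advanced search with the same fields selected.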
What other types of related information are available?
On the article page, other information available to browse includes:
"similar articles" (though I am unsure what similar means...similar in terms of topic? In terms of findings?),
other articles that cited this one,
related information (where I found the nucleotide sequences),
images from the publication in a magnified view,
a glossary of terms used in the paper,
and my recent PubMed activity, including searches for articles for other classes.
Part 2: GenBank
What was the accession number of the sequence you chose?
I chose HIV-1 isolate S3V3-9.
Which subject of the study was that HIV sequence from?
This HIV sequence is from Subject 3 on their third visit. This is their ninth clone.
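The isolate name encodes the subject, visit, and clone, so it can be split automatically. This is a minimal sketch that assumes every label follows the S<subject>V<visit>-<clone> pattern seen in S3V3-9; the function name is hypothetical.

```python
import re

# Assumed label layout: S<subject>V<visit>-<clone>, e.g. "S3V3-9".
LABEL_PATTERN = re.compile(r"^S(\d+)V(\d+)-(\d+)$")


def parse_isolate_label(label):
    """Split an isolate label into subject, visit, and clone numbers."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognized isolate label: {label!r}")
    subject, visit, clone = (int(group) for group in match.groups())
    return {"subject": subject, "visit": visit, "clone": clone}


print(parse_isolate_label("S3V3-9"))
```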
Which section of the record contains information about who the HIV was collected from?
The "definition" section of the record identifies whom the HIV was collected from.
Activity 2: Looking at the sources of HIV across subjects
Part 1: Looking at clustering across subjects
Table 1. Four subjects and three of their clones from Markham et al.
Subject    Clone #
2          1, 2, 3
5          4, 5, 6
9          2, 3, 4
2          1, 2, 3
Do the clones from each subject cluster together?
Yes, the clones from each subject appear to cluster together.
Do some subjects' clones show more diversity than others?
The clones of Subjects 9 and 15 show more branching than those of the other subjects, which indicates greater diversity. The more the clones branch off, the more they have diverged from the original sequence.
Do some of the subjects cluster together?
Subjects 2, 5, and 9 appear to cluster together, while Subject 15's clones are pretty spread apart.
Figure 2. Phylogenetic tree for twelve sequences (three clones from each of four subjects).
Write a brief description of your tree and how you interpret the clustering pattern with respect to the similarities and potential evolutionary relationships between subjects' HIV sequences.
This phylogenetic tree shows clustering and the distance between sequences, as well as genetic diversity. Subject 15's clones do not cluster, and a greater genetic distance implies a longer time since the sequences shared a common ancestor. The clones of Subjects 2, 5, and 9, by contrast, are fairly clustered together, which suggests that they shared a common ancestor quite recently. Subjects 9 and 15 branch off more than the other subjects, which also shows that their clones have become more genetically diverse than those of Subjects 2 or 5.
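One way to make the idea of genetic distance concrete is the proportion of aligned sites at which two sequences differ (the p-distance). The sketch below uses short made-up sequences, not data from Markham et al., and the subject labels are placeholders. It is not necessarily the distance measure used to build the tree in Figure 2.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def mean_pairwise_distance(sequences):
    """Average p-distance over all pairs in a list of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy, made-up alignments used only to illustrate the calculation.
toy_clones = {
    "tight_subject": ["ACGTACGTAC", "ACGTACGTAT"],
    "spread_subject": ["ACGAACCTAC", "TCGTAGGTCC"],
}

for subject, clones in toy_clones.items():
    print(subject, mean_pairwise_distance(clones))
```

A subject whose clones sit close together on the tree would be expected to give a small average distance, and a subject with widely spread clones a larger one.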
Part 2: Quantifying diversity within and between subjects
Table 2. Quantifying diversity within and between Subjects 2, 4, and 5
Subject    Number of Clones    S     Theta
2          24                  36    8.63
4          47                  62    14.0
5          43                  58    13.3
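For reference, a common way to get theta from S is Watterson's estimator, which divides the number of segregating sites S by the sum 1/1 + 1/2 + ... + 1/(n-1) for n sequences. The sketch below assumes that this is the estimator behind Table 2. That is an assumption, so its output need not reproduce every tabulated value exactly.

```python
def watterson_theta(segregating_sites, n_sequences):
    """Watterson's estimator: S divided by the sum of 1/i for i = 1..n-1."""
    if n_sequences < 2:
        raise ValueError("At least two sequences are needed")
    denominator = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / denominator


# (subject, number of clones, S) as listed in Table 2.
table_rows = [(2, 24, 36), (4, 47, 62), (5, 43, 58)]

for subject, n_clones, s in table_rows:
    print(f"Subject {subject}: theta = {watterson_theta(s, n_clones):.2f}")
```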
Activity 3: Defining your HIV evolution research project
What is your question?
We are interested in determining whether there is a relationship between CD4 levels and average pairwise genetic distance (theta) and, furthermore, whether that relationship is statistically significant within each progressor group and between groups.
Make a hypothesis before analysis.
We predict a relationship between genetic diversity and CD4 levels that is significant between progressor groups, though it may not be significant within groups.
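A first look at this hypothesis could use a Pearson correlation between CD4 count and theta. The sketch below is one possible starting point, not our planned analysis. The numbers are invented placeholders, not values from Markham et al., and a significance test would still be a separate step.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("Correlation is undefined when one variable is constant")
    return covariance / sqrt(spread_x * spread_y)


# Invented placeholder values, one pair per visit (not real study data).
cd4_counts = [900, 700, 500, 300]
theta_values = [5.0, 8.0, 11.0, 14.0]

print(pearson_r(cd4_counts, theta_values))
```

Running the same function once per progressor group, and once on all subjects pooled, would give the within-group and overall comparisons named in the question.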
Which subjects, visits, and clones will you use?
Subjects
We will use subjects 2, 12, 13, 6, 5, 8, 1, 15, 10, 9, 11, and 4.
Visits
We will use all visits for each subject.
Clones
We will use all clones for each subject.
We chose these subjects, visits, and clones to ensure that each progression group was represented and that enough visits and clones were included.
To prepare for analyzing the genetic data from Markham et al., it was crucial in this week's lab to learn how to manipulate phylogenetic trees, cluster sequences, and calculate statistical averages. In the future, these skills will be important for understanding how the original authors analyzed and discussed their findings.
Markham, R.B., Wang, W.C., Weisstein, A.E., Wang, Z., Munoz, A., Templeton, A., Margolick, J., Vlahov, D., Quinn, T., Farzadegan, H., & Yu, X.F. (1998). Patterns of HIV-1 evolution in individuals with differing rates of CD4 T cell decline. Proc Natl Acad Sci U S A. 95, 12568-12573. doi: 10.1073/pnas.95.21.12568 (PubMed ID: 9770526)