Alex J. George Week 4

Part 2: GenBank
[[Media:FASTA sequences.doc| FASTA sequences]]
 * Accessed the GenBank records from the right-hand links on the PubMed page for the article.
 * Accession Number of sequence: AF016767.2
 * This sequence was taken from the first visit of Subject 1. This information is given in the title of the sequence.

Part 3: Biology Workbench

 * Registered for Workbench, followed "Nucleic Tools" link to add a sequence.
 * Uploaded the sequences successfully; it works best to keep the file saved with a ".fasta" extension.
 * The CLUSTALW tool performs multiple sequence alignment on protein or nucleic acid sequences.
 * Example: gi:33187149 and gi:33187151 are on the same side of the tree because they differ from the other sequences in the same ways. At positions 21 and 22, for instance, these two sequences show "AA" whereas the other three show "CC".
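
The position-by-position comparison described above can be sketched in Python. This is only an illustration: the function name `distinguishing_columns` and the toy sequences are made up for the example and are not the actual GenBank records or anything Biology Workbench provides.

```python
def distinguishing_columns(alignment, group):
    """Return 1-based alignment columns where every sequence in `group`
    shares a single base that no sequence outside the group has."""
    length = len(next(iter(alignment.values())))
    columns = []
    for i in range(length):
        inside = {alignment[name][i] for name in group}
        outside = {seq[i] for name, seq in alignment.items() if name not in group}
        if len(inside) == 1 and inside.isdisjoint(outside):
            columns.append(i + 1)
    return columns


# Toy, made-up aligned sequences (hypothetical, NOT the real records).
toy = {
    "seqA": "GTAACT",
    "seqB": "GTAACT",
    "seqC": "GTCCCT",
    "seqD": "GTCCCT",
    "seqE": "GTCCCT",
}

print(distinguishing_columns(toy, {"seqA", "seqB"}))
```

Columns reported this way are the ones that pull the chosen pair of sequences onto the same side of the tree.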

Activity 2: Looking at sources of HIV across subjects

 * Goal: Determine if the isolated HIV from subgroups comes from a common source
 * Multiple sequence alignment: comparing nucleotide positions for many sequences
 * Unrooted Trees:
 * Represent genetic distance between pairs of sequences: the total branch length between two nodes represents the genetic distance between those 2 sequences
 * Because the tree is unrooted, can't make inferences about the direction of evolutionary change
 * Long internal branches separate clusters of sequences that are dissimilar
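
As a rough sketch of what "genetic distance between pairs of sequences" means, the snippet below computes a simple proportion of differing positions (p-distance) for every pair in an alignment. This is an assumed simplification: the Workbench tree may use a different distance measure, and the names `p_distance`, `pairwise_distances`, and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions that differ, skipping columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def pairwise_distances(alignment):
    """Map each pair of sequence names to its p-distance."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Toy, made-up aligned sequences (hypothetical).
toy_alignment = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "TCGAACGA",
}

for pair, dist in pairwise_distances(toy_alignment).items():
    print(pair, dist)
```

Pairs with small distances would sit close together on the tree, while pairs with large distances would be separated by longer total branch length.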

Part 1: Clustering across subjects
 * Selected Sequences: S1V1-3, S1V1-4, S1V1-5, S7V1-3, S7V1-4, S7V1-5, S8V1-3, S8V1-4, S8V1-5, S9V1-3, S9V1-4, S9V1-5


 * Resulting Unrooted Tree: [[Media:Unrooted Tree- V1-S1,S7,S8,S9-Clones3,4,5.pdf]]
 * The clones from each subject cluster together, but no two subjects cluster with each other. Each subject appears to be about equally distant from the others.
 * The clones from Subject 7 clearly show more diversity than those from the other subjects. Subject 9 shows the least diversity among its clones.
 * My tree shows that all of the subjects are equally distinct from one another. Subject 8's clones 3 and 4 are nearly identical to each other, overlapping on the tree. This most likely indicates that the two clones differ by only a minor mutation.

Part 2: Quantifying Diversity
Values of Subjects 6,8,13 for Visit 1 Comparison of Difference Between Pairs of Subjects
 * S= number of positions that vary across all sequences
 * Theta= estimate of average pairwise genetic distance
 * Min. and Max. Differences= look at extremes of similarity and difference
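
The three measures above can be sketched in Python for a small set of aligned sequences. The notes do not give a formula for theta, so this sketch assumes Watterson's estimator (S divided by the sum 1 + 1/2 + ... + 1/(n-1), where n is the number of sequences). The function names and toy sequences are made up for illustration, and gaps are treated like any other character.

```python
from itertools import combinations


def segregating_sites(seqs):
    """S: number of alignment columns where not all sequences agree."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def theta_watterson(seqs):
    """Theta, ASSUMED here to be S / (1 + 1/2 + ... + 1/(n-1))."""
    n = len(seqs)
    harmonic = sum(1 / i for i in range(1, n))
    return segregating_sites(seqs) / harmonic


def min_max_differences(seqs):
    """Smallest and largest count of differing positions over all pairs."""
    counts = [
        sum(x != y for x, y in zip(a, b))
        for a, b in combinations(seqs, 2)
    ]
    return min(counts), max(counts)


# Toy, made-up aligned clones (hypothetical, not the study data).
clones = ["ACGTAC", "ACGTAT", "ACCTAT", "TCGTAC"]

print("S =", segregating_sites(clones))
print("theta =", round(theta_watterson(clones), 3))
print("min, max =", min_max_differences(clones))
```

For these toy clones, three columns vary, so S is 3; the pairwise difference counts range from 1 to 3.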