BIOL368/F14:Isabel Gonzaga Week 4

From OpenWetWare
Revision as of 16:44, 17 September 2014 by Isabel Gonzaga (talk | contribs) (Published Page - inputted answers to activity)

Exploring HIV Evolution In-Class Activity

Methods

Activity 1 // Part 2: GenBank

The GenBank database on the NCBI website was searched for the Markham et al. (1998) HIV-1 sequences. One sequence was selected at random, and its full record and FASTA-formatted sequence were examined.
Five sequences were then selected from the Summary search view and downloaded as a single file in FASTA format. This file was opened in Microsoft Word to screen for errors.
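The screening step above was done by eye in a word processor. As a rough illustration only, the same kind of check could be sketched in Python. The set of allowed characters, the function names, and the example records below are all assumptions made for this sketch, not part of the activity.

```python
# Minimal sketch: parse a multi-record FASTA text and flag suspicious records.
# VALID_BASES is an assumption (IUPAC nucleotide codes plus the gap character).
VALID_BASES = set("ACGTUNRYKMSWBDHV-")


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def screen(records):
    """Return a list of human-readable problems found in the records."""
    problems = []
    for header, seq in records:
        if not seq:
            problems.append(f"{header}: empty sequence")
        bad = sorted(set(seq.upper()) - VALID_BASES)
        if bad:
            problems.append(f"{header}: unexpected characters {bad}")
    return problems


# Hypothetical two-record file; the second record contains an invalid 'X'.
example = ">clone_a hypothetical record\nACGTAC\nGGTA\n>clone_b hypothetical record\nACGTXC\n"
records = parse_fasta(example)
print(len(records))
print(screen(records))
```

A check like this only catches formatting problems such as stray characters or empty records; it does not confirm that the right sequences were downloaded.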

Activity 1 // Part 3: Introduction to Biology Workbench

The five randomly selected sequences from Activity 1 Part 2 were uploaded to Biology Workbench and saved under the Nucleic Tools tab. All five sequences were then selected, and ClustalW was used to perform a multiple sequence alignment.

Activity 2 // Part 1: Looking at Clustering Across Subjects

The provided Visit 1 sequence files were downloaded and uploaded separately to Biology Workbench. ClustalW was used to perform a multiple sequence alignment and generate a distance tree for 12 sequences, three clones from each of the first four subjects: S1V1-1, S1V1-2, S1V1-3, S2V1-1, S2V1-2, S2V1-3, S3V1-1, S3V1-2, S3V1-3, and three clones from subject 4. The resulting unrooted tree was analyzed for potential evolutionary relationships.

Activity 2 // Part 2: Quantifying Diversity Within and Between Subjects

ClustalW was used to perform a multiple sequence alignment of all clones from subject 1. The S value was determined by counting the number of alignment positions at which the clones differ in nucleotide. Theta was determined by dividing the subject's S value by the harmonic sum 1 + 1/2 + ... + 1/(n-1), where n is the number of clones. The alignment was then imported for further analysis.
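The S and theta calculations described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in Biology Workbench: the function names are made up for the sketch, the toy alignment is hypothetical, and a gap character is counted as a difference, which may not match how positions were counted by hand.

```python
def segregating_sites(aligned):
    """Count alignment columns in which the sequences are not all identical."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*aligned) walks the alignment column by column.
    return sum(1 for column in zip(*aligned) if len(set(column)) > 1)


def harmonic_sum(n_clones):
    """Return 1 + 1/2 + ... + 1/(n_clones - 1)."""
    return sum(1.0 / i for i in range(1, n_clones))


def theta(s_value, n_clones):
    """Divide S by the harmonic sum over n_clones - 1 terms."""
    if n_clones < 2:
        raise ValueError("at least two clones are needed")
    return s_value / harmonic_sum(n_clones)


# Hypothetical toy alignment: columns 3 and 6 differ, so S = 2.
toy_alignment = ["ACGTAC", "ACGTAT", "ACCTAC"]
print(segregating_sites(toy_alignment))

# Subject 1 values from this page: S = 26 with n = 13 clones.
print(round(theta(26, 13), 1))
```

With S = 26 and 13 clones, the harmonic sum has 12 terms and is roughly 3.10, so theta comes out at about 8.4.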
Under the Alignment tab in Biology Workbench, the alignment for subject 1 was selected and the Clustaldist tool was used to generate a distance matrix. The minimum and maximum percentage values in this matrix were identified, and each was multiplied by the total number of base pairs (285) to determine the raw number of differences. This process was repeated for the clones of subjects 2 and 3.
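The conversion from distance-matrix values to raw differences can be sketched as follows. The sketch assumes each matrix entry is the fraction of sites that differ between two clones (so that multiplying by 285 gives a count); the matrix values and the function name are hypothetical and chosen only to show the arithmetic.

```python
from itertools import combinations

ALIGNMENT_LENGTH = 285  # total number of base pairs, as stated in the text


def raw_difference_range(dist, length=ALIGNMENT_LENGTH):
    """Return (min, max) raw differences over all distinct pairs of clones.

    dist is a square, symmetric matrix of fractional distances.
    """
    pairs = [dist[i][j] for i, j in combinations(range(len(dist)), 2)]
    if not pairs:
        raise ValueError("need at least two clones")
    # Scale each fraction by the alignment length and round to whole bases.
    return round(min(pairs) * length), round(max(pairs) * length)


# Hypothetical distance matrix for three clones (not data from this page).
toy_matrix = [
    [0.0,    0.0,    0.0175],
    [0.0,    0.0,    0.0491],
    [0.0175, 0.0491, 0.0],
]
print(raw_difference_range(toy_matrix))
```

Only the pairs above the diagonal are used, so the zero distance of each clone to itself does not affect the minimum.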

Differences between subjects were then compared. Under the Nucleic Tools tab, all sequences for subjects 1 and 2 were selected for a ClustalW multiple sequence alignment. This alignment was then imported, and Clustaldist was run on it. The minimum and maximum percentage values were determined and multiplied by the total number of base pairs (285) to determine the raw minimum and maximum differences between subjects. This process was repeated to compare subjects 1 and 3, and then subjects 2 and 3.
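For a between-subject comparison, only pairs made of one clone from each subject are relevant. A minimal sketch of that filtering step is given below. It assumes sequence labels follow the SxVy-z pattern used on this page, that matrix entries are fractions of differing sites, and that the toy values are hypothetical; the function names are made up for the sketch.

```python
from itertools import combinations


def subject_of(label):
    """Extract the subject ID from a label such as 'S1V1-2' (gives 'S1')."""
    return label.split("V", 1)[0]


def between_subject_range(labels, dist, subject_a, subject_b, length=285):
    """Return (min, max) raw differences over pairs spanning both subjects."""
    wanted = {subject_a, subject_b}
    values = [
        dist[i][j]
        for i, j in combinations(range(len(labels)), 2)
        if {subject_of(labels[i]), subject_of(labels[j])} == wanted
    ]
    if not values:
        raise ValueError("no pairs found for the requested subjects")
    return round(min(values) * length), round(max(values) * length)


# Hypothetical labels and distances (not data from this page).
toy_labels = ["S1V1-1", "S1V1-2", "S2V1-1", "S2V1-2"]
toy_dist = [
    [0.0,    0.0,    0.0035, 0.0386],
    [0.0,    0.0,    0.0211, 0.0140],
    [0.0035, 0.0211, 0.0,    0.0070],
    [0.0386, 0.0140, 0.0070, 0.0],
]
print(between_subject_range(toy_labels, toy_dist, "S1", "S2"))
```

The within-subject pairs (the two S1 clones, and the two S2 clones) are skipped, so a zero distance between identical clones of one subject does not pull the between-subject minimum down to zero.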

Results

Activity 1 // Part 2: GenBank

  1. Accession Number: AF089142
  2. Subject: 3 (determined from the 'Definition' section of the record)

Activity 1 // Part 3: Introduction to Biology Workbench

Activity 2 // Part 1: Looking at Clustering Across Subjects

Figure: Unrooted tree of HIV-1 viral clones from subjects 1, 2, 3, and 4 at visit 1

  1. Yes, the clones from each subject cluster together.
  2. Subject 3 shows some viral diversity within its clones: the branch separating clone 3 from clones 1 and 2 is fairly long compared with branch lengths in the other subjects.
  3. Viral clones for subjects 1 and 2 clustered together, showing similar viral genetic identities between the two subjects.
  4. This unrooted phylogenetic tree displays the evolutionary relationships among three viral clones taken from each of subjects 1, 2, 3, and 4 at their first visit. Unrooted trees allow conclusions to be drawn about the genetic distances between sequences. Based on branch lengths, subjects 3 and 4 are genetically distinct, whereas subjects 1 and 2 are more similar. Subject 3 shows the greatest distance from the rest of the sequences, implying a greater difference in sequence. Additionally, because a long branch separates clone 3 from clones 1 and 2 in subject 3, subject 3 has the greatest diversity among its HIV-1 clones. The clustering of subjects 1 and 2 may indicate infection by the same (or similar) strains of HIV-1. Because these viruses were sequenced at the first visit (after initial seroconversion), they are likely to be more similar to the initially introduced forms of the virus, as they have had less time to mutate in response to selection factors.

Activity 2 // Part 2: Quantifying Diversity Within and Between Subjects

Table 1. Clustaldist Analysis of Distance Within Subjects

Subject   Number of Clones   S    Theta   Min Difference   Max Difference
1         13                 26   8.4     0                14
2         6                  5    2.4     0                3
3         4                  6    2.6     0                5



Table 2. Clustaldist Analysis of Distance Between Subject Pairs

Subjects Compared   Min Difference   Max Difference
1 & 2               1                11
1 & 3               34               41
2 & 3               35               40

Conclusion

Defining Your Research Project

  1. What is your question?
  2. Make a prediction (hypothesis) about the answer to your question before you begin your analysis.
  3. Which subjects, visits, and clones will you use to answer your question?

You should choose a combination of subjects, visits, and clones that adds up to approximately 50 sequences. You will need about that many sequences to answer a reasonably complex question. However, you cannot use more, because the multiple sequence alignment tool cannot handle more than that many sequences.

  4. Justify why you chose the subjects, visits, and clones you did.
