Ryan N. Willhite Week 4

In-Class Activity
BIOL398-01/S10:Week 4

Activity 1, Part 2

 * The accession number of the sequence I chose was AF089153.
 * The subject of the study was the envelope glycoprotein of HIV-1 isolate S4V2-3 from the USA. The information about who the HIV was collected from was in the subject section of the record.
 * To get the FASTA format, I used the Display drop-down menu on the NCBI website, which let me view the selected sequences in FASTA format.
 * I uploaded the file as a .txt file. It contains the FASTA sequences that I downloaded to the local hard drive.
 * 1) [[Media:Wk4sequences.txt|FASTA sequences]]
 * I ran into some trouble opening the FASTA file because it had no file type that the computer recognized. To get around this, I saved it with a .txt file extension and then opened it in Notepad.
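Since a FASTA file is plain text (a header line starting with ">" followed by sequence lines), it can also be read with a short script instead of Notepad. The following is only a minimal sketch in Python. The function name parse_fasta and the two example records are made up for illustration and are not the sequences in my uploaded file.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header to its full sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)  # a sequence can span several lines
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical two-record example, not real HIV data
example = """>seq1 made-up first record
ACGT
ACGT
>seq2 made-up second record
TTGA
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq))
```

To use it on a saved file, the file contents could be read with open("Wk4sequences.txt").read() and passed to parse_fasta, assuming the file sits in the working directory.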

Activity 1, Part 3

 * 1) [[Image:ClustWweek4treediagram.gif|Clustal W tree diagram]]

Activity 2, Part 1

 * To find the first-visit information, go to the sequence data under Bedrock.




 * The tree looks crowded where many branches are bunched together because those particular sequences are much more similar to each other than to the sequences across from them or farther away on the tree.




 * 1) Do the clones from each subject cluster together? Yes
 * 2) Do some subjects' clones show more diversity than others? Yes
 * 3) Do some of the subjects cluster together? Yes

Activity 2, Part 2
This part of the activity focuses on ways to quantify sequence similarity and difference. An S statistic is used to quantify the diversity of sequences in a population. This part was extremely confusing, and I need more help in this area, mainly with building a pairwise distance matrix.
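To help me work through the idea, here is a rough sketch in Python of how a pairwise distance matrix and S might be computed by hand. It assumes that S means the number of positions in the alignment where at least one clone differs (segregating sites), that distance means the raw count of differing positions between two clones, and that the sequences are already aligned to the same length. The clone names and sequences are invented toy data, and a gap character would be counted like any other difference.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Count the positions where two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(seqs):
    """Build a symmetric matrix (dict of dicts) of pairwise difference counts."""
    names = list(seqs)
    matrix = {n: {m: 0 for m in names} for n in names}
    for n, m in combinations(names, 2):
        d = pairwise_differences(seqs[n], seqs[m])
        matrix[n][m] = d
        matrix[m][n] = d
    return matrix


def segregating_sites(seqs):
    """Count alignment columns that contain more than one distinct character."""
    columns = zip(*seqs.values())
    return sum(1 for col in columns if len(set(col)) > 1)


# Invented toy alignment, not real clone data
toy = {
    "clone1": "ACGTACGT",
    "clone2": "ACGTACGA",
    "clone3": "ACCTACGA",
}

matrix = distance_matrix(toy)
for name in toy:
    print(name, [matrix[name][other] for other in toy])

print("S =", segregating_sites(toy))  # S = 2 (columns 3 and 8 vary)
```

In this toy case clone1 and clone3 differ at two positions, while each of the other pairs differs at one, so the minimum and maximum pairwise differences would be 1 and 2.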

[[media:2.doc|Tables]]