KP Ramirez Week 4

Come up with a question to email about something you felt should have been asked in this paper.

I began by loading the Markham paper from PubMed and then clicking on the nucleotide links. From there I could access GenBank. I noted that the links seemed to be invalid; however, I was able to access them through a different browser.

Assignment
• What was the accession number of the sequence you chose?
• Which subject of the study was that HIV sequence from? Which section of the record contains information about who the HIV was collected from?
• Download several (4 to 6) sequences in FASTA format to your local hard drive by selecting several at the same time in the summary view so they are saved into a single text file. Be careful to remember where you put the file and what you name it so that you can find it later.
• Open the file that you saved with a word processor to confirm that you have the sequences and that they are in the FASTA format. In the FASTA format each sequence is preceded by a label which begins with the greater than sign (>).
 * AF016768 (number 1)
 * AF089153 (number 2)
 * AF089140 (number 7)
 * AF089540 (number 41)
 * Subject 1 visit 1 clone 9
 * Subject 4 visit 2-3
 * Subject 3 visit 5-9
 * Subject 12 visit 3-4
 * Completed
 * Completed
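
The FASTA check in the last step could also be done with a short script instead of a word processor. This is only a minimal sketch: the function name `read_fasta` and the two example records are made up for illustration and are not sequences from the study. It relies only on the rule stated above, that each sequence is preceded by a label line beginning with `>`.

```python
from io import StringIO


def read_fasta(handle):
    """Return a list of (label, sequence) pairs from FASTA-formatted text."""
    records = []
    label, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new label closes the previous record, if there was one.
            if label is not None:
                records.append((label, "".join(chunks)))
            label, chunks = line[1:], []
        elif label is not None:
            chunks.append(line)  # sequence data may span several lines
        else:
            raise ValueError("sequence data found before the first '>' label")
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


# Hypothetical two-record file contents (not real study data).
example = StringIO(
    ">seq_A hypothetical label\nACGT\nTTGA\n"
    ">seq_B hypothetical label\nGGCC\n"
)
records = read_fasta(example)
for label, sequence in records:
    print(label, len(sequence))
```

To check a saved download, the same function could be called on `open("my_sequences.txt")`, where the file name is whatever was chosen when saving. If the number of records matches the number of sequences selected, the file is most likely complete.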

Activity 2 Part 1
This process involved working with the Biology Workbench and adding the first-visit nucleic acid sequences.


 * Data Table
 * Subject: Clones
 * Subject 1: S1V1-1, S1V1-2, S1V1-3
 * Subject 4: S4V1-1, S4V1-2, S4V1-3
 * Subject 3: S3V1-1, S3V1-2, S3V1-3
 * Subject 12: S12v12-1, S12V12-2, S12v12-3



The clones from each of these subjects cluster together. Subject 12 has the shortest branch of the three closest. This shows that my subjects are distinct from one another, even if the distance between them makes them appear similar.

Activity 2 Part 2


This part involved transferring the raw Clustal data matrices into an .xls spreadsheet. However, because I was working on a Mac, this was difficult, as TextEdit didn't allow this to run smoothly. The Excel file of the raw data is here: [[Media:Bioinformatics.xls]]
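
One way to avoid the TextEdit step might be to convert the matrix text with a short script. The sketch below is not what I actually did, and it rests on an assumption about the layout: each matrix row is a clone label followed by whitespace-separated numbers. The function names, the labels `cloneA` and `cloneB`, and the values are all made up for illustration. It produces CSV text, which Excel can open, rather than a true .xls file.

```python
import csv
from io import StringIO


def matrix_text_to_rows(text):
    """Turn whitespace-separated matrix text into [label, number, ...] rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and single-value lines
        label, values = fields[0], fields[1:]
        try:
            numbers = [float(value) for value in values]
        except ValueError:
            continue  # skip header or other non-numeric lines
        rows.append([label] + numbers)
    return rows


def rows_to_csv(rows):
    """Return the rows as CSV text that a spreadsheet program can open."""
    out = StringIO(newline="")
    csv.writer(out).writerows(rows)
    return out.getvalue()


# Hypothetical matrix text (assumed layout, invented values).
sample = """\
cloneA  0.000  0.012
cloneB  0.012  0.000
"""
rows = matrix_text_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Writing `csv_text` to a file ending in `.csv` would give something that opens directly in Excel on a Mac, with one clone per row and one distance per column.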