Taylor Makela Journal Week 4
Taylor Makela Week 4
Purpose
- The purpose of this assignment was to practice using GenBank to find specific genetic sequences. After exploring and analyzing our assigned nucleotide sequence, we were then able to create a phylogenetic tree of the class data using Phylogeny.fr in order to compare our results to those reported in the Wan et al. (2020) paper.
Methods and Results
Part 1: GenBank
- First, I chose the following GenBank records:
- Complete Genome Accession Number: MN908947 MN908947: Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, complete genome
- Spike Protein Accession Number: QHD43416.1 QHD43416.1: spike protein
- The GenBank records provide information regarding various viruses such as the locus, gene sequence, and protein and translation information.
- I then downloaded the complete nucleotide sequence in FASTA format to my computer's hard drive.
- To download the sequence I first clicked the "Send to:" drop down towards the top right of the page.
- I then set the destination to "File", set the format to "FASTA", and clicked "Create File".
- The file was then opened to confirm that I had the sequence and that it was in the FASTA format.
- In the FASTA format each sequence is preceded by a label which begins with the greater than sign (>).
- I then searched through the GenBank records to find my assigned nucleotide sequence
- Accession Number: KC881005
- KC881005: Bat SARS-like coronavirus RsSHC014, complete genome
- Next, I located the spike protein within the record and clicked on the link for this protein sequence
- I then downloaded the spike protein sequence in FASTA format to my computer's hard drive.
- To download the sequence I first clicked the "Send to:" drop down towards the top right of the page.
- I then set the destination to "File", set the format to "FASTA", and clicked "Create File".
- The file was then opened to confirm that I had the sequence and that it was in the FASTA format.
- In the FASTA format each sequence is preceded by a label which begins with the greater than sign (>).
- Spike Protein Accession Number: AGZ48806
- AGZ48806: spike protein [Bat SARS-like coronavirus RsSHC014]
- Spike Protein Sequence:
>AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014]
MKLLVLVFATLVSSYTIEKCLDFDDRTPPANTQFLSSHRGVYYPDDIFRSNVLHLVQDHFLPFDSNVTRF
ITFGLNFDNPIIPFRDGIYFAATEKSNVIRGWVFGSTMNNKSQSVIIMNNSTNLVIRACNFELCDNPFFV
VLKSNNTQIPSYIFNNAFNCTFEYVSKDFNLDLGEKPGNFKDLREFVFRNKDGFLHVYSGYQPISAASGL
PTGFNALKPIFKLPLGINITNFRTLLTAFPPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCS
QNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPSVYAWERKRISNCV
ADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFLGC
VLAWNTNSKDSSTSGNYNYLYRWVRRSKLNPYERDLSNDIYSPGGQSCSAVGPNCYNPLRPYGFFTTAGV
GHQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQFGRDVSDFT
DSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAIHADQLTPSWRVYSTGNNVF
QTQAGCLIGAEHVDTSYECDIPIGAGICASYHTVSSLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTN
FSISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
MYKTPTLKDFGGFNFSQILPDPLKPTKRSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNG
LTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQIANQF
NKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLI
TGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTY
VPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGSCDVVIGIINNTV
YDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYE
QYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
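The FASTA check described in Part 1 (a ">" label line followed by the sequence) can also be done with a short script instead of opening the file by eye. The following is only a minimal sketch in Python and was not part of the assignment; the function name read_fasta is hypothetical, and the example text is just the label and first two sequence lines of the AGZ48806.1 record above.

```python
from typing import Iterator, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (label, sequence) pairs from FASTA-formatted text.

    Hypothetical helper for illustration: a label line starts with '>',
    and every following non-empty line belongs to that sequence until
    the next label line.
    """
    label = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label = line[1:]  # drop the leading '>'
            chunks = []
        elif label is not None:
            chunks.append(line)
        else:
            raise ValueError("sequence data found before the first '>' label")
    if label is not None:
        yield label, "".join(chunks)


# Label plus the first two sequence lines of the spike protein record.
example = """>AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014]
MKLLVLVFATLVSSYTIEKCLDFDDRTPPANTQFLSSHRGVYYPDDIFRSNVLHLVQDHFLPFDSNVTRF
ITFGLNFDNPIIPFRDGIYFAATEKSNVIRGWVFGSTMNNKSQSVIIMNNSTNLVIRACNFELCDNPFFV
"""

records = list(read_fasta(example))
for label, sequence in records:
    accession = label.split()[0]  # first word of the label line
    print(accession, len(sequence))
```

To check a downloaded file, the same function could be given the text of that file (for example, the result of reading it with open(...).read()); the file name would depend on what was chosen in the "Send to:" dialog.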
Part 2: Creating a phylogenetic tree with Phylogeny.fr
- I first went to the website www.phylogeny.fr and scrolled down on the page to the section labeled ‘Phylogeny analysis’, and clicked on the text ‘One Click’.
- I then clicked in the large text field labeled ‘Upload your set of sequences in FASTA, EMBL, or NEXUS format’, pasted (command-V) the list of sequences that I had copied from the talk page, and clicked the “Submit” button.
- Once the results finished rendering, I clicked on the tab labeled 3. Alignment, found at the top of the page.
- I then clicked on Alignment in Clustal format found under Outputs. This displayed my alignment in a text-only format in which each position's conservation is indicated by a symbol underneath the alignment block (“*” for invariant, “:” for highly conserved, “.” for weakly conserved, and a space for not conserved).
- Alignment:
CLUSTAL FORMAT: MUSCLE (3.8) multiple sequence alignment ALK02457.1 ---------MFIFLF------FLTLTSGSDLESCTT-------FDDVQAPNYPQHSSSRR AAS10463.1 ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR AAP13441.1 ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR AAP13567.1 ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR AGZ48806.1 --------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR QDF43825.1 --------MKLLVLV------FATLVSSYTIEKCTD-------FDDRTPPSNTQFLSSHR AGZ48818.1 --------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR QHD43416.1 ---------MFVFLV------LLPLVSS----QCVN-------LTTRTQLPPAYTNSFTR AVP78031.1 ----------MLFFL------FLQFALVN--SQCVN-------LTGRTPLNPNYTNSSQR ABD75323.1 --------MKILIFA------FL-VTLVKAQEGCGV-------INLRTQPKLTQVSSSRR QDF43835.1 --------MKVLIVL------LC-LGLVTAQDGCGH-------ISTKPQPLLDKFSSSRR ABD75332.1 --------MKVLIFA------LL-FSLAKAQEGCGI-------ISRKPQPKMEKVSSSRR QDF43820.1 --------MKILIFA------FL-VTLVEAQEGCGI-------ISRKPQPKMAQVSSSRR AAZ67052.1 --------MKILILA------FL-ASLAKAQEGCGI-------ISRKPQPKMAQVSSSRR AFS88936.1 ----MIHSVFLLMFLLTPTESYVDVGPDSVKSACIEVDIQQT-FFDKTWPRPIDVSKAD- YP_0010399 MTLLMCLLMSLLIFVRGCDSQFVDMSPASNTSECLESQVDAAAFSKLMWPYPIDPSKVD- .:.. * : . 
ALK02457.1 GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHR----------------FDNPVIP AAS10463.1 GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FDDPVIP AAP13441.1 GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FGNPVIP AAP13567.1 GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FDNPVIP AGZ48806.1 GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP QDF43825.1 GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP AGZ48818.1 GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP QHD43416.1 GVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGT---------KRFDNPVLP AVP78031.1 GVYYPDTIYRSDTLVLSQGYFLPFYSNVSWYYSLTTN-NAAT---------KRTDNPILD ABD75323.1 GVYYNDDIFRSDVLHLTQDYFLPFHSNLTQYFSLNIE-SDKI---------VYFDNPILK QDF43835.1 GVYYNDDIFRSDVLHLTQDYFLPFDTNLTRYLSFNMD-SATK---------VYFDNPTLP ABD75332.1 GVYYNDDIFRSDVLHLTQDYFLPFDSNLTQYFSLNID-SNKY---------TYFDNPILD QDF43820.1 GVYYNDDIFRSDVLHLTQDYFLPFDSNLTQYFSLNVD-SDRY---------TYFDNPILD AAZ67052.1 GVYYNDDIFRSNVLHLTQDYFLPFDSNLTQYFSLNVD-SDRF---------TYFDNPILD AFS88936.1 GIIYPQGRTYSNITITYQGLF-PYQGDHGDMYVYSAGHATGT--TPQKLFVANYSQDVKQ YP_0010399 GIIYPLGRTYSNITLAYTGLF-PLQGDLGSQYLYSVSHAVGHDGDPTKAYISNYSLLVND *: * *. . * * : . 
ALK02457.1 FKDGVYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN AAS10463.1 FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN AAP13441.1 FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN AAP13567.1 FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN AGZ48806.1 FRDGIYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN QDF43825.1 FRDGVYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN AGZ48818.1 FKDGIYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN QHD43416.1 FNDGVYF----ASTEKSNIIRG-------------WIFGTTLDSKTQS-LL--IVNNATN AVP78031.1 FKDGIYF----AATEHSNIIRG-------------WIFGTTLDNTSQS-LL--IVNNATN ABD75323.1 FGDGVYF----AATEKSNVIRG-------------WVFGSTFDNTTQS-AI--IVNNSTH QDF43835.1 FGDGIYF----AATEKSNVVRG-------------WIFGSTMDNTTQS-AI--IVNNSTH ABD75332.1 FGDGVYF----AATEKSNVIRG-------------WIFGSSFDNTTQS-AI--IVNNSTH QDF43820.1 FGDGVYF----AATEKSNVIRG-------------WIFGSTFDNTTQS-AV--IVNNSTH AAZ67052.1 FGDGVYF----AATEKSNVIRG-------------WIFGSTFDNTTQS-AV--IVNNSTH AFS88936.1 FANGFVVRIGAAANSTGTVIISPSTSATIRKIYPAFMLGSSVGNFSDG-KMGRFFNHTLV YP_0010399 FDNGFVVRIGAAANSTGTIVISPSVNTKIKKAYPAFILGSSLTNTSAGQPL--YANYSLT * :*. . *:.. ..:: . :::*::. . : . 
: * : ALK02457.1 VVIRACNFELCDNPFFAVSKP-TGTQTHTM----IFDNAFNCTFEYISDS----FSLDVA AAS10463.1 VVIRACNFELCDNPFFVVSKP-MGTRTHTM----IFDNAFNCTFEYISDA----FSLDVS AAP13441.1 VVIRACNFELCDNPFFAVSKP-MGTQTHTM----IFDNAFNCTFEYISDA----FSLDVS AAP13567.1 VVIRACNFELCDNPFFAVSKP-MGTQTHTM----IFDNAFNCTFEYISDA----FSLDVS AGZ48806.1 LVIRACNFELCDNPFFVVLKS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDLG QDF43825.1 LVIRACNFELCDNPFFVVLRS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDIG AGZ48818.1 LVIRACNFELCDNPFFVVLKS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDLG QHD43416.1 VVIKVCEFQFCNDPFLGVYYH-KNNKSWMESEFRVYSSANNCTFEYVSQP----FLMDLE AVP78031.1 VIIKVCNFDFCYDP-YLSGYY-HNNKTWSIREFAVYSSYANCTFEYVSKS----FMLNIS ABD75323.1 IIIRVCYFNLCKDPMYTVSAG-TQKSSW------VYQSAFNCTYDRVEKS----FQLDTS QDF43835.1 IIIRVCYFNLCKEPMYAISNE-QHYKSW------VYQNAYNCTYDRVEQS----FQLDTA ABD75332.1 IIIRVCNFNLCKEPMYTVSKG-TQQSSW------VYQSAFNCTYDRVEKS----FQLDTA QDF43820.1 IIIRVCNFNLCKEPMYTVSRG-TQQSSW------VYQSAFNCTYDRVERS----FQLDTA AAZ67052.1 IIIRVCNFNLCKEPMYTVSRG-AQQSSW------VYQSAFNCTYDRVEKS----FQLDTA AFS88936.1 LLPDGCGTLLR--AFYCILEPRSGNHCPAGNSYTSFATYHTPATDCSDGN----YNRNAS YP_0010399 IIPDGCGTVLH--AFYCILKPRTVNRCPSGT---GYVSYF--IYETVHNDCQSTINRNAS :: * : . : . 
: : ALK02457.1 EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NILKPIFKL AAS10463.1 EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL AAP13441.1 EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL AAP13567.1 EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL AGZ48806.1 EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL QDF43825.1 EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL AGZ48818.1 EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL QHD43416.1 GKQ-GNFKNLREFVFKNIDG--------YFKIYSKHTPINLVRDLPQGF--SALEPLVDL AVP78031.1 GNG-GLFNTLREFVFRNVDG--------HFKIYSKFTPVNLNRGLPTGL--SVLQPLVEL ABD75323.1 PKT-GNFTDLREFVFKNRDG--------FFTAYQTYTPVNLLRGLPSGL--SVLKPILKL QDF43835.1 PQT-GNFKDLREYVFKNKDG--------FLSVYNAYSPIDIPRGLPVGF--SVLKPILKL ABD75332.1 PKT-GNFKDLREYVFKNKGG--------FLRVYQTYTAVNLPRGFPAGF--SVLRPILKL QDF43820.1 PKT-GNFKDLREYVFKNRDG--------FLSVYQTYTAVNLPRGLPIGF--SVLRPILKL AAZ67052.1 PKT-GNFKDLREYVFKNRDG--------FLSVYQTYTAVNLPRGLPIGF--SVLRPILKL AFS88936.1 LNSFKEYFNLRNCTFMYTYNITEDEILEWFGITQTAQGVHLFSSRYVDLYGGNMFQFATL YP_0010399 LNSFKSFFDLVNCTFFNSWDITADETKEWFGITQDTQGVHLYSSRKGDLYGGNMFRFATL : : * : .* . : . : . .: . 
: : * ALK02457.1 PLGINITNFRAILTAF------LPAQDTWGTSAAAYFVGYLKPATFMLKYDENGTITDAV AAS10463.1 PLGINITNFRAILTAF------SPAQDTWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV AAP13441.1 PLGINITNFRAILTAF------SPAQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV AAP13567.1 PLGINITNFRAILTAF------SPAQDTWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV AGZ48806.1 PLGINITNFRTLLTAF------PPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV QDF43825.1 PLGINITNFRTLLTAF------PPNPGYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV AGZ48818.1 PLGINITNFRTLLTAF------PPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV QHD43416.1 PIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAV AVP78031.1 PVSINITKFRTLLTIHRGD---PMPNNGWTAFSAAYFVGYLKPRTFMLKYNENGTITDAV ABD75323.1 PFGINITSFRVVMAMF------SKTTSNYVPESAAYYVGNLKQSTFMLSFNQNGTIVDAV QDF43835.1 PIGINITSFKVVMSMF------SRTTSNFLPEVAAYFVGNLKYSTFMLNFNENGTITDAI ABD75332.1 PFGINITSYRVVMTMF------SQFNSNFLPESAAYYVGNLKYTTFMLSFNENGTITDAV QDF43820.1 PFGINITSYRVVMAMF------SQTTSNFLPESAAYYVGNLKYTTFMLRFNENGTITDAI AAZ67052.1 PFGINITSYRVVMAMF------SQTTSNFLPESAAYYVGNLKYTTFMLSFNENGTITNAI AFS88936.1 PVYDTIKYYSIIPHSIRSI---QSDRKAW----AAFYVYKLQPLTFLLDFSVDGYIRRAI YP_0010399 PVYEGIKYYTVIPRSFRSK---ANKREAW----AAFYVYKLHQLTYLLDFSVDGYIRRAI *. *. : : : **::* *: *::* :. 
:* * *: ALK02457.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS AAS10463.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS AAP13441.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS AAP13567.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS AGZ48806.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS QDF43825.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS AGZ48818.1 DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS QHD43416.1 DCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFAS AVP78031.1 DCALDPLSETKCTLKSLTVQKGIYQTSNFRVQPTQSVVRFPNITNVCPFHKVFNATRFPS ABD75323.1 DCSQDPLAELKCTTKSFNVSKGIYQTSNFRVSPVTEVVRFPNITNLCPFDKVFNATRFPS QDF43835.1 DCAQNPLSELKCTIKNFNVSKGIYQTSNFRVSPTHEVIRFPNITNRCPFDKVFNASRFPN ABD75332.1 DCSQNPLAELKCTIKNFNVSKGIYQTSNFRVTPTQEVVRFPNITNRCPFDKVFNASRFPN QDF43820.1 DCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVVRFPNITNRCPFDKVFNASRFPN AAZ67052.1 DCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVIRFPNITNRCPFDKVFNATRFPN AFS88936.1 DCGFNDLSQLHCSYESFDVESGVYSVSSFEAKPSGSVVEQAEGVE-CDFSPLLSGTP-PQ YP_0010399 DCGHDDLSQLHCSYTSFEVDTGVYSVSSYEASATGTFIEQPNATE-CDFSPMLTGVA-PQ **. : *:: :*: .: :..*:*..*.: . . .: .: .: * * ::.. .. 
ALK02457.1 VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ AAS10463.1 VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ AAP13441.1 VYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ AAP13567.1 VYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ AGZ48806.1 VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ QDF43825.1 VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ AGZ48818.1 VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ QHD43416.1 VYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQ AVP78031.1 VYAWERTKISDCIADYTVFYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRFSEVRQ ABD75323.1 VYAWERTKISDCVADYTVFYNSTSFSTFNCYGVSPSKLIDLCFTSVYADTFLIRFSEVRQ QDF43835.1 VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ ABD75332.1 VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ QDF43820.1 VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ AAZ67052.1 VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ AFS88936.1 VYNFKRLVFTNCNYNLTKLLSLFSVNDFTCSQISPAAIASNCYSSLILDYFSYPLSMKSD YP_0010399 VYNFKRLVFSNCNYNLTKLLSLFAVDEFSCNGISPDSIARGCYSTLTVDYFAYPLSMKSY ** ::* :::* : : : . .. *.* :*. : *::.: * * . 
ALK02457.1 IAPGQTGVIADYNYKLPDDFTGC-VLAWNTRNIDATQTGNYNYKYRSLRHGKLRPFER-D AAS10463.1 IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D AAP13441.1 IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D AAP13567.1 IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D AGZ48806.1 IAPGQTGVIADYNYKLPDDFLGC-VLAWNTNSKDSSTSGNYNYLYRWVRRSKLNPYER-D QDF43825.1 IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRSLRHGKLRPFER-D AGZ48818.1 IAPGQTGVIADYNYKLPDDFTGC-VLAWNTRNIDATQTGNYNYKYRSLRHGKLRPFER-D QHD43416.1 IAPGQTGKIADYNYKLPDDFTGC-VIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFER-D AVP78031.1 VAPGQTGVIADYNYKLPDDFTGC-VIAWNTAKQD---VGNYF--YRSHRSTKLKPFER-D ABD75323.1 VAPGQTGVIADYNYKLPDDFTGC-VIAWNTAKQD---VGSYF--YRSHRSSKLKPFER-D QDF43835.1 VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---QGQYY--YRSSRKTKLKPFER-D ABD75332.1 VAPGETGVIADYNYKLPDDFTGC-VIAWNTAQQD---QGQYY--YRSYRKEKLKPFER-D QDF43820.1 VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---TGHYY--YRSHRKTKLKPFER-D AAZ67052.1 VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---QGQYY--YRSHRKTKLKPFER-D AFS88936.1 LSVSSAGPISQFNYKQSFSNPTC-LILATVPHNLTTITKPLKYSYIN-KCSRLLSDDRTE YP_0010399 IRPGSAGNIPLYNYKQSFANPTCRVMASVLANVTITKPHAYG--YIS-KCSRLTGANQ-D : ..:* *. :*** . * :: * . .* :. 
: ALK02457.1 ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFYITNGIGYQPYRVVVLS AAS10463.1 ISNVPFSPDGK--PCTPP-APNCYW-----------PLNGYGFYTTSGIGYQPYRVVVLS AAP13441.1 ISNVPFSPDGK--PCTPP-ALNCYW-----------PLNDYGFYTTTGIGYQPYRVVVLS AAP13567.1 ISNVPFSPDGK--PCTPP-ALNCYW-----------PLNDYGFYTTTGIGYQPYRVVVLS AGZ48806.1 LSNDIYSPGGQ--SCSAV-GPNCYN-----------PLRPYGFFTTAGVGHQPYRVVVLS QDF43825.1 ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFFTTNGIGYQPYRVVVLS AGZ48818.1 ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFYITNGIGYQPYRVVVLS QHD43416.1 ISTEIYQAGST--PCNGVEGFNCYF-----------PLQSYGFQPTNGVGYQPYRVVVLS AVP78031.1 LSS----------------DENGVR-----------TLSTYDFNPNVPLEYQATRVVVLS ABD75323.1 LSS----------------EENGVR-----------TLSTYDFNQNVPLEYQATRVVVLS QDF43835.1 LTS----------------DENGVR-----------TLSTYDFYPNVPIEYQATRVVVLS ABD75332.1 LSS----------------DENGVY-----------TLSTYDFYPSIPVEYQATRVVVLS QDF43820.1 LSSD---------------DGNGVY-----------TLSTYDFNPNVPVAYQATRVVVLS AAZ67052.1 LSS----------------DENGVR-----------TLSTYDFYPSVPVAYQATRVVVLS AFS88936.1 VPQLVNANQYS--PCVSI-VPSTVWEDGDYYRKQLSPLEGGGWLVASGSTVAMTEQLQMG YP_0010399 VETPLYINPGEYSICRDF-SPGGFSEDGQVFKRTLTQFEGGGLLIGVGTRVPMTDNLQMS : . : . : :. 
ALK02457.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF AAS10463.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF AAP13441.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF AAP13567.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF AGZ48806.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF QDF43825.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF AGZ48818.1 FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF QHD43416.1 FELL----HAPATVC-----GPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQF AVP78031.1 FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTDSSKRFQSFQQF ABD75323.1 FELL----NAPATVC-----GPKLSTSLVKNQCVNFNFNGFKGTGVLTDSSKTFQSFQQF QDF43835.1 FELL----NAPATVC-----GPKLSTGLVKNQCVNFNFNGLRGTGVLTDSSKRFQSFQQF ABD75332.1 FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLRGTGVLTTSSKRFQSFQQF QDF43820.1 FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTDSSKRFQSFQQF AAZ67052.1 FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTESSKRFQSFQQF AFS88936.1 FGITVQYGTDTNSVCPKLEFANDTKIASQLGNCVEYSLYGVSGRGVFQNCTAVGVRQQRF YP_0010399 FIISVQYGTGTDSVCPMLDLGDSLTITNRLGKCVDYSLYGVTGRGVFQNCTAVGVKQQRF * : . :** . . . .:**::.: *. * **: .. 
*.* ALK02457.1 GRDVLD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI AAS10463.1 GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTLI AAP13441.1 GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAI AAP13567.1 GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAI AGZ48806.1 GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI QDF43825.1 GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI AGZ48818.1 GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI QHD43416.1 GRDIAD-TTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAI AVP78031.1 GKDASD-FIDSVRDPQTLEILDITPCSFGGVSVITPGTNTSLEVAVLYQDVNCTDVPTTI ABD75323.1 GRDASD-FTDSVRDPQTLRILDISPCSFGGVSVITPGTNTSSAVAVLYQDVNCTDVPRTI QDF43835.1 GRDTSD-FTDSVRDPQTLEILDITPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTAI ABD75332.1 GRDTSD-FTDSVRDPQTLEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTSI QDF43820.1 GRDTSD-FTDSVRDPQTLEILDITPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTAI AAZ67052.1 GRDTSD-FTDSVRDPQTLEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPAAI AFS88936.1 VYDAYQNLVGYYSDDGNYYCLR--ACVSVPVSVIY--DKETKTHATLFGSVACEHISSTM YP_0010399 VYDSFDNLVGYYSDDGNYYCVR--PCVSVPVSVIY--DKSTNLHATLFGSVACEHVTTMM * : . * . : .* **** : : *.*: .* * :. 
: ALK02457.1 -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS AAS10463.1 -HAEQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS AAP13441.1 -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSL AAP13567.1 -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSL AGZ48806.1 -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS QDF43825.1 -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS AGZ48818.1 -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS QHD43416.1 -HADQLTPT-WRVYSTGSNVFQTRAGCLIGAEHVNNSY---ECDIPIGAGICASYQTQTN AVP78031.1 -HADQLTPA-WRIYATGTNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTASI ABD75323.1 -QADQLAPS-WRVYTTGPYVFQTQAGCLIGAEHVNASY---QCDIPIGAGICASYHTASH QDF43835.1 -RADQLTPA-WRVYSTGINVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST ABD75332.1 -HADQLTPA-WRVYSTGVNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTASV QDF43820.1 -RADQLTPA-WRVYSTGVNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST AAZ67052.1 -HADQLTPA-WRVYSTGTNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST AFS88936.1 SQYSRSTRSMLKRRDSTYGPLQTPVGCVLGL--VNSSLFVEDCKLPLGQSLCALPDTPST YP_0010399 SQFSRLTQS-NLRRRDSNIPLQTAVGCVIGLS--NNSLVVSDCKLPLGQSLCAVPP-VST . .. 
: : :** .**::* : * :*.:*:* .:** : ALK02457.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM AAS10463.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM AAP13441.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM AAP13567.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM AGZ48806.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM QDF43825.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM AGZ48818.1 ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM QHD43416.1 SPRRARSVA----SQSI--------IAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEIL AVP78031.1 ----LRSTS----QKAI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM ABD75323.1 ----LRSTG----QKSI--------VAYTMSLGAENSVAYANNSIAIPTNFSISVTTEVM QDF43835.1 ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM ABD75332.1 ----LRSTG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM QDF43820.1 ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM AAZ67052.1 ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM AFS88936.1 ----LTPRS----VRSVPGEMRLASIAFNHPIQVDQ-LNSSYFKLSIPTNFSFGVTQEYI YP_0010399 ----FRSYSASQFQLAV--------LNYTSPI-VVTPINSSGFTAAIPTNFSFSVTQEYI . . :: : :. .: . : : . 
:*****::.:* * : ALK02457.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ AAS10463.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCRQLNRALSGIAAEQDRNTREVFVQVKQ AAP13441.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQ AAP13567.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQ AGZ48806.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ QDF43825.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ AGZ48818.1 PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ QHD43416.1 PVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQ AVP78031.1 PVSMAKTSVDCTMYICGDSIECSNLLLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQ ABD75323.1 PVSMAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALSGIAVEQDKNTQEVFAQVKQ QDF43835.1 PVSMSKTSVDCTMYICGDSQECSNLLLQYGSFCTQLNRALTGIAIEQDKNTQEVFAQVKQ ABD75332.1 PVSIAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALTGIAIEQDKNTQEVFAQVKQ QDF43820.1 PVSMAKTSVDCTMYICGDSQECSNLLLQYGSFCTQLNRALTGVALEQDKNTQEVFAQVKQ AAZ67052.1 PVSMAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQ AFS88936.1 QTTIQKVTVDCKQYVCNGFQKCEQLLREYGQFCSKINQALHGANLRQDDSVRNLFASVKS YP_0010399 ETSIQKVTVDCKQYVCNGFTRCEKLLVEYGQFCSKINQALHGANLRQDESVYSLYSNIKT .:: *.:***. *:*.. * :** :**.** ::*.** * ** .. 
.:: .:* ALK02457.1 MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AAS10463.1 MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AAP13441.1 MYKTPTLKYFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AAP13567.1 MYKTPTLKYFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AGZ48806.1 MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- QDF43825.1 MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AGZ48818.1 MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- QHD43416.1 IYKTPPIKDFGG-FNFSQILPDPSKPSKRSF---IEDLLFNKVTLADAGFIKQYGDCL-- AVP78031.1 IYKTPPIKDFGG-FNFSQILPDPSKPSKRSF---IEDLLFNKVTLADAGFIKQYGDCL-- ABD75323.1 MYKTPTIRDFGG-FNFSQILPDPLKPTKRSF---IEDLLYNKVTLADAGFMKQYADCL-- QDF43835.1 MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- ABD75332.1 MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- QDF43820.1 MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AAZ67052.1 MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL-- AFS88936.1 SQSSPIIPGFGGDFNLTLLEPVSISTGSRSARSAIEDLLFDKVTIADPGYMQGYDDCMQQ YP_0010399 TSTQTLEYGLNGDFNLTLLQVPQIGGSSSSYRSAIEDLLFDKVTIADPGYMQGYDDCMKQ . . :.* **:: : . 
* *****::***:**.*::: * :*: ALK02457.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ AAS10463.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ AAP13441.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ AAP13567.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ AGZ48806.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ QDF43825.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ AGZ48818.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ QHD43416.1 GDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQ AVP78031.1 GGISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALISGTATAGWTFGAGAALQIPFAMQ ABD75323.1 GGINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALISGTATAGWTFGAGAALQIPFAMQ QDF43835.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ ABD75332.1 GDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQ QDF43820.1 GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ AAZ67052.1 GDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQ AFS88936.1 GPASARDLICAQYVAGYKVLPPLMDVNMEAAYTSSLLGSIAGVGWTAGLSSFAAIPFAQS YP_0010399 GPQSARDLICAQYVSGYKVLPPLYDPNMEAAYTSSLLGSIAGAGWTAGLSSFAAIPFAQS * ******** . * .***** :* * **::*:.. *** * .: **** . 
ALK02457.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AAS10463.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AAP13441.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AAP13567.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AGZ48806.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT QDF43825.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AGZ48818.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT QHD43416.1 MAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNT AVP78031.1 MAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQESLTSTASALGKLQDVVNQNAQALNT ABD75323.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAITQIQESLTTTSTALGKLQDVVNQNAQALNT QDF43835.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT ABD75332.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT QDF43820.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AAZ67052.1 MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT AFS88936.1 IFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTTNEAFQKVQDAVNNNAQALSK YP_0010399 MFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTSNLAFSKVQDAVNANAQALSK : **:**:*:**:** **** ***:**.*: :* .:::: *: *:**.** *****.. 
ALK02457.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AAS10463.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AAP13441.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AAP13567.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AGZ48806.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS QDF43825.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AGZ48818.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS QHD43416.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AVP78031.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS ABD75323.1 LVKQLSSNFGAISSALNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS QDF43835.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS ABD75332.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS QDF43820.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AAZ67052.1 LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS AFS88936.1 LASELSNTFGAISASIGDIIQRLDVLEQDAQIDRLINGRLTTLNAFVAQQLVRSESAALS YP_0010399 LASELSNTFGAISSSISDILARLDTVEQDAQIDRLINGRLISLNAFVSQQLVRSETAARS *..:**..*****: :.**: *** :* :.******.*** :*:::*:***:*: * ALK02457.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI AAS10463.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI AAP13441.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI AAP13567.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI AGZ48806.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI QDF43825.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI AGZ48818.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI QHD43416.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAI AVP78031.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYIPSQEKNFTTAPAI ABD75323.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPSQEKNFTTAPAI QDF43835.1 
ANLAATKMSECVLGQSKRVDFCGRGYHLMSFPQAAPHGVVFLHVTYVPSQEKNFTTAPAI ABD75332.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI QDF43820.1 ANLAATKMSECVLGQSKRVDFCGRGYHLMSFPQAAPHGVVFLHVTYVPSQEKNFTTAPAI AAZ67052.1 ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI AFS88936.1 AQLAKDKVNECVKAQSKRSGFCGQGTHIVSFVVNAPNGLYFMHVGYYPSNHIEVVSAYGL YP_0010399 AQLASDKVNECVKSQSKRNGFCGSGTHIVSFVVNAPNGFYFFHVGYVPTNYTNVTAAYGL *:** *:.*** .**** .*** * *::** **:*. *:** * *:: :..:* .: ALK02457.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI AAS10463.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI AAP13441.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI AAP13567.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI AGZ48806.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI QDF43825.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI AGZ48818.1 CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI QHD43416.1 CHDGK---AHFPREGVFVSNGTH-------WFVTQRNFYEPQIITTDNT-FVSGNCDVVI AVP78031.1 CHEGK---AHFPREGVFVSNGTH-------WFVTQRNFYEPKIITTDNT-FVSGNCDVVI ABD75323.1 CHEGK---AYFPREGVFVSNGSS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI QDF43835.1 CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI ABD75332.1 CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGNCDVVI QDF43820.1 CHEGK---AYFPREGVFVSNGTF-------WFITQRNFYSPQIITTDNT-FVAGNCDVVI AAZ67052.1 CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI AFS88936.1 CDAANPTNCIAPVNGYFIKTNNT--RIVDEWSYTGSSFYAPEPITSLNTKYVA--PQVTY YP_0010399 CNNNNPPLCIAPIDGYFITNQTTTYSVDTEWYYTGSSFYKPEPITQANSRYVS--SDVKF * : . * :* *: . . 
* * .*: *: ** *: :*: :* ALK02457.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AAS10463.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQEEIDRLN AAP13441.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AAP13567.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AGZ48806.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN QDF43825.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AGZ48818.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEINRLN QHD43416.1 GIVNNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AVP78031.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDIDLGDISGINASVVNIQKEIDRLN ABD75323.1 GIINNTVYDPL---QPELDSFKQELDKYFKNHTSPDVDLGDISGINASVVDIQKEIDRLN QDF43835.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN ABD75332.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN QDF43820.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AAZ67052.1 GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN AFS88936.1 QNISTNLPPPLLGNSTGID-FQDELDEFFKNVSTSIPNFGSLTQINTTLLDLTYEMLSLQ YP_0010399 DKLENNLPPPLLENSTDVD-FKDELEEFFKNVTSHGPNFAEISKINTTLLDLSDEMAMLQ :...: ** .. 
:* *::**:::*** :: ::..:: **:::::: *: *: ALK02457.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA AAS10463.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA AAP13441.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA AAP13567.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA AGZ48806.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA QDF43825.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA AGZ48818.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA QHD43416.1 EVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGC AVP78031.1 EVARNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGC ABD75323.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLVGLFMAIILLCYFTSCCSCCKGM QDF43835.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMATILLCCMTSCCSCLKGA ABD75332.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA QDF43820.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMATILLCCMTSCCSCLKGA AAZ67052.1 EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA AFS88936.1 QVVKALNESYIDLKELGNYTYYNKWPWYIWLGFIAGLVALALCVFFILCCTGCGTNCMGK YP_0010399 EVVKQLNDSYIDLKELGNYTYYNKWPWYVWLGFIAGLVALLLCVFFLLCCTGCGTSCLGK :*.. **:* ***:***:* * *****:********:.: : ::: *.* : * ALK02457.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AAS10463.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AAP13441.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AAP13567.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AGZ48806.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT QDF43825.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AGZ48818.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT QHD43416.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AVP78031.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT ABD75323.1 CSCGSCC-RFDEDDSEPVLKGVKLHYT QDF43835.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT ABD75332.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT QDF43820.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AAZ67052.1 CSCGSCC-KFDEDDSEPVLKGVKLHYT AFS88936.1 LKCNRCCDRYEEYDLEP----HKVHVH YP_0010399 MKCKNCCDSYEEYDVE------KIHVH .* ** ::* * * *:*
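The conservation symbols summarize each column across all sequences, but it can also help to compare two rows of the alignment directly. The following is a minimal sketch in Python, not something produced by Phylogeny.fr; the function name percent_identity is hypothetical, and counting identity only over columns where neither row has a gap is one common convention assumed here. The rows are copied from the first 60-column block of the alignment above, so the numbers apply to that block only.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of identical residues between two aligned rows.

    Assumption for this sketch: columns where either row has a gap ('-')
    are skipped, so identity is measured over ungapped columns only.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Rows from the first block of the class alignment (60 columns each).
first_block = {
    "AGZ48806.1": "--------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR",
    "AGZ48818.1": "--------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR",
    "QHD43416.1": "---------MFVFLV------LLPLVSS----QCVN-------LTTRTQLPPAYTNSFTR",
}

reference = first_block["AGZ48806.1"]
for name, row in first_block.items():
    print(name, round(percent_identity(reference, row), 1))
```

Running the same comparison over every block of the alignment (with the rows for each sequence joined end to end) would give a whole-protein figure rather than one for the first 60 columns.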
- Next, I went back and clicked on the "6. Tree Rendering" tab and obtained a phylogenetic tree of the five sequences.
- On this tree, horizontal lines (branches) represent individual evolutionary lineages. By contrast, vertical lines (splits) represent divergence events, where one lineage splits into two; the vertical length of each split is drawn purely for visual clarity and has no biological meaning. The left-most split is called the root of the tree and represents a hypothesis about the most recent common ancestor (MRCA) of the sequences within the tree.
- My phylogenetic tree:
- Comparison of Personal tree to multiple sequence alignment:
- Both the tree that I generated and the multiple sequence alignment showed that the AGZ48806 spike protein and the AGZ48818 spike protein are highly similar. In my generated tree, these two spike proteins were grouped together with a branch support value of 92%, which indicates strong support for the hypothesis that they share a more recent common ancestor with each other than with the other sequences in the tree. In the multiple sequence alignment, the two sequences were likewise very similar.
- Comparison between class alignment and Figure 3 of the Wan et al. (2020) paper:
- The class alignment and the data in Figure 3 had many differences. The sequences shown in Figure 3 are much more conserved, with many more invariant positions, than those in our class alignment.
- Comparison between class alignment and Figure 2 of the Wan et al. (2020) paper:
- The class alignment and the data in Figure 2 had both differences and similarities. Both the class alignment and the Figure 2 data had the same primary branches. However, the two had different conservation patterns.
- Is enough information provided by the Wan et al. (2020) paper to reproduce the analysis:
- No. The paper does not describe its methods in enough detail for the analysis to be reproduced.
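- The similarity between AGZ48806 and AGZ48818 noted in the tree and alignment comparison above can also be checked numerically. The following is a minimal sketch, not the method used by Phylogeny.fr: it counts identical columns between two rows of one 60-column block copied from the class alignment. The function names `pairwise_identity` and `differing_columns` are hypothetical and were introduced only for this illustration.

```python
# Rows copied from one 60-column block of the class alignment.
# Only this block is compared, so the numbers describe the block,
# not the full-length spike proteins.
block = {
    "AGZ48806.1": "GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN",
    "AGZ48818.1": "GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEINRLN",
    "QHD43416.1": "GIVNNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN",
}


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Return (matches, compared) for two rows of the same alignment.

    Columns that are gaps in both rows are skipped (assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # nothing to compare in a gap-only column
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


def differing_columns(seq_a, seq_b):
    """Return (column, residue_a, residue_b) for each mismatch, 1-based."""
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


reference = "AGZ48806.1"
for name in ("AGZ48818.1", "QHD43416.1"):
    matches, compared = pairwise_identity(block[reference], block[name])
    diffs = differing_columns(block[reference], block[name])
    print(f"{reference} vs {name}: {matches}/{compared} identical, differences: {diffs}")
```

Because only a single block is compared, the result is a quick sanity check on what the conservation line shows rather than a whole-protein identity value.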
Data and Files
- Taylor Makela Generated Phylogenetic Tree
- AGZ48806: Bat SARS-like Coronavirus RsSHC014, Spike Protein
- QHD43416.1: Severe Acute Respiratory Syndrome Coronavirus 2 Isolate Wuhan-Hu-1, Spike Protein
Scientific Conclusion
- This week's assignment allowed me to get more comfortable using bioinformatics resources such as the GenBank database and the Phylogeny.fr analysis tool. After using GenBank to find my assigned nucleotide sequence, I used Phylogeny.fr to explore the relationship between my data and the class data. I then compared the phylogenetic tree generated with Phylogeny.fr to the data provided in the Wan et al. (2020) paper.
Acknowledgments
- I acknowledge my homework partner Nida Patel, whom I consulted for several hours regarding syntax, formatting, and content questions.
- I acknowledge that I copied and modified the protocol shown on the Week 4 assignment page for this course.
Except for what is noted above, this individual entry was completed by me and not copied from another source. Taylor Makela (talk) 20:59, 30 September 2020 (PDT)
References
- OpenWetWare. (2020). BIOL368/F20:Week 4. Accessed 30 September 2020, from https://openwetware.org/wiki/BIOL368/F20:Week_4
- OpenWetWare. (2020). Talk:BIOL368/F20:Week 4. Accessed 30 September 2020, from https://openwetware.org/wiki/Talk:BIOL368/F20:Week_4
- NCBI GenBank. (2020). Severe Acute Respiratory Syndrome Coronavirus 2 Isolate Wuhan-Hu-1, Complete Genome. Accessed 30 September 2020, from https://www.ncbi.nlm.nih.gov/nuccore/MN908947
- NCBI GenBank. (2020). Spike Protein [Bat SARS-like Coronavirus RsSHC014]. Accessed 30 September 2020, from https://www.ncbi.nlm.nih.gov/protein/556015117
- Phylogeny.fr: "One Click" Mode. (2020). Accessed 30 September 2020, from http://www.phylogeny.fr/simple_phylogeny.cgi?workflow_id=b9c0813cbbe9695d63cf7e31da5f026d&tab_index=1
- Wan, Y., Shang, J., Graham, R., Baric, R., & Li, F. (2020). Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus. Journal of Virology, 94(7). doi: 10.1128/jvi.00127-20