Taylor Makela Journal Week 4


Taylor Makela Week 4

Purpose

  • The purpose of this assignment was to practice using GenBank to find specific genetic sequences. After exploring and analyzing our assigned nucleotide sequence, we created a phylogenetic tree of the class data using Phylogeny.fr in order to compare our results with those reported in the Wan et al. (2020) paper.

Methods and Results

Part 1: GenBank

  1. First, I chose the following GenBank records:
  2. I then downloaded the complete nucleotide sequence in FASTA format to my computer's hard drive.
    • To download the sequence, I first clicked the "Send to:" drop-down menu toward the top right of the page.
    • I then set the destination to "File", set the format to "FASTA", and clicked "Create File".
  3. I then opened the file to confirm that it contained the sequence and that it was in FASTA format.
    • In FASTA format, each sequence is preceded by a label line that begins with the greater-than sign (>).
  4. I then searched through the GenBank records to find my assigned nucleotide sequence.
  5. Next, I located the spike protein within the record and clicked on the link for this protein sequence.
  6. I then downloaded the spike protein sequence in FASTA format to my computer's hard drive.
    • To download the sequence, I first clicked the "Send to:" drop-down menu toward the top right of the page.
    • I then set the destination to "File", set the format to "FASTA", and clicked "Create File".
  7. I then opened the file to confirm that it contained the sequence and that it was in FASTA format.
  8. Spike Protein Sequence:
>AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014]
MKLLVLVFATLVSSYTIEKCLDFDDRTPPANTQFLSSHRGVYYPDDIFRSNVLHLVQDHFLPFDSNVTRF
ITFGLNFDNPIIPFRDGIYFAATEKSNVIRGWVFGSTMNNKSQSVIIMNNSTNLVIRACNFELCDNPFFV
VLKSNNTQIPSYIFNNAFNCTFEYVSKDFNLDLGEKPGNFKDLREFVFRNKDGFLHVYSGYQPISAASGL
PTGFNALKPIFKLPLGINITNFRTLLTAFPPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCS
QNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPSVYAWERKRISNCV
ADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFLGC
VLAWNTNSKDSSTSGNYNYLYRWVRRSKLNPYERDLSNDIYSPGGQSCSAVGPNCYNPLRPYGFFTTAGV
GHQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQFGRDVSDFT
DSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAIHADQLTPSWRVYSTGNNVF
QTQAGCLIGAEHVDTSYECDIPIGAGICASYHTVSSLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTN
FSISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
MYKTPTLKDFGGFNFSQILPDPLKPTKRSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNG
LTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQIANQF
NKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLI
TGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTY
VPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGSCDVVIGIINNTV
YDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYE
QYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
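
The manual check in steps 3 and 7 (opening the file and confirming that it is in FASTA format) could also be done with a short script. The following is only a minimal sketch in Python, not part of the assignment protocol: the function name parse_fasta is hypothetical, and the sample input is just the label line and the first two sequence lines of the spike protein record above.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (label, sequence) pairs.

    Hypothetical helper for illustration; it is not a GenBank or NCBI tool.
    """
    records = []
    label = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new label line closes the previous record, if there was one.
            if label is not None:
                records.append((label, "".join(chunks)))
            label = line[1:]
            chunks = []
        elif label is None:
            # Sequence data before any label means the text is not valid FASTA.
            raise ValueError("sequence data found before any '>' label line")
        else:
            chunks.append(line)
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


# Sample input: the label and first two sequence lines of the record above.
example = """>AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014]
MKLLVLVFATLVSSYTIEKCLDFDDRTPPANTQFLSSHRGVYYPDDIFRSNVLHLVQDHFLPFDSNVTRF
ITFGLNFDNPIIPFRDGIYFAATEKSNVIRGWVFGSTMNNKSQSVIIMNNSTNLVIRACNFELCDNPFFV
"""

records = parse_fasta(example)
for label, sequence in records:
    accession = label.split()[0]  # first word of the label is the accession
    print(accession, len(sequence))
```

To check a downloaded file instead of the sample text, the file's contents could be read into a string and passed to the same function; a ValueError would indicate that the file does not start with a label line.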

Part 2: Creating a phylogenetic tree with Phylogeny.fr

  1. I first went to www.phylogeny.fr, scrolled down to the section labeled ‘Phylogeny analysis’, and clicked ‘One Click’.
  2. I then copied the list of sequences from the talk page, clicked in the large text field labeled ‘Upload your set of sequences in FASTA, EMBL, or NEXUS format’, pasted the sequences with command-V, and clicked the “Submit” button.
  3. Once the results finished rendering, I clicked on the tab labeled ‘3. Alignment’ at the top of the page.
  4. I then clicked on ‘Alignment in Clustal format’ under Outputs. This displayed my alignment in a text-only format in which the conservation of each position is indicated by a symbol underneath the alignment block (“*” for invariant, “:” for highly conserved, “.” for weakly conserved, and a space for not conserved).
  5. Alignment:
CLUSTAL FORMAT: MUSCLE (3.8) multiple sequence alignment
ALK02457.1      ---------MFIFLF------FLTLTSGSDLESCTT-------FDDVQAPNYPQHSSSRR
AAS10463.1      ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR
AAP13441.1      ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR
AAP13567.1      ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR
AGZ48806.1      --------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR
QDF43825.1      --------MKLLVLV------FATLVSSYTIEKCTD-------FDDRTPPSNTQFLSSHR
AGZ48818.1      --------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR
QHD43416.1      ---------MFVFLV------LLPLVSS----QCVN-------LTTRTQLPPAYTNSFTR
AVP78031.1      ----------MLFFL------FLQFALVN--SQCVN-------LTGRTPLNPNYTNSSQR
ABD75323.1      --------MKILIFA------FL-VTLVKAQEGCGV-------INLRTQPKLTQVSSSRR
QDF43835.1      --------MKVLIVL------LC-LGLVTAQDGCGH-------ISTKPQPLLDKFSSSRR
ABD75332.1      --------MKVLIFA------LL-FSLAKAQEGCGI-------ISRKPQPKMEKVSSSRR
QDF43820.1      --------MKILIFA------FL-VTLVEAQEGCGI-------ISRKPQPKMAQVSSSRR
AAZ67052.1      --------MKILILA------FL-ASLAKAQEGCGI-------ISRKPQPKMAQVSSSRR
AFS88936.1      ----MIHSVFLLMFLLTPTESYVDVGPDSVKSACIEVDIQQT-FFDKTWPRPIDVSKAD-
YP_0010399      MTLLMCLLMSLLIFVRGCDSQFVDMSPASNTSECLESQVDAAAFSKLMWPYPIDPSKVD-
                          .:..                   *         :            .    
ALK02457.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHR----------------FDNPVIP
AAS10463.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FDDPVIP
AAP13441.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FGNPVIP
AAP13567.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FDNPVIP
AGZ48806.1      GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP
QDF43825.1      GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP
AGZ48818.1      GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP
QHD43416.1      GVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGT---------KRFDNPVLP
AVP78031.1      GVYYPDTIYRSDTLVLSQGYFLPFYSNVSWYYSLTTN-NAAT---------KRTDNPILD
ABD75323.1      GVYYNDDIFRSDVLHLTQDYFLPFHSNLTQYFSLNIE-SDKI---------VYFDNPILK
QDF43835.1      GVYYNDDIFRSDVLHLTQDYFLPFDTNLTRYLSFNMD-SATK---------VYFDNPTLP
ABD75332.1      GVYYNDDIFRSDVLHLTQDYFLPFDSNLTQYFSLNID-SNKY---------TYFDNPILD
QDF43820.1      GVYYNDDIFRSDVLHLTQDYFLPFDSNLTQYFSLNVD-SDRY---------TYFDNPILD
AAZ67052.1      GVYYNDDIFRSNVLHLTQDYFLPFDSNLTQYFSLNVD-SDRF---------TYFDNPILD
AFS88936.1      GIIYPQGRTYSNITITYQGLF-PYQGDHGDMYVYSAGHATGT--TPQKLFVANYSQDVKQ
YP_0010399      GIIYPLGRTYSNITLAYTGLF-PLQGDLGSQYLYSVSHAVGHDGDPTKAYISNYSLLVND
                *: *      *.      . * *   :                           .     
ALK02457.1      FKDGVYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AAS10463.1      FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AAP13441.1      FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AAP13567.1      FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AGZ48806.1      FRDGIYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN
QDF43825.1      FRDGVYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN
AGZ48818.1      FKDGIYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN
QHD43416.1      FNDGVYF----ASTEKSNIIRG-------------WIFGTTLDSKTQS-LL--IVNNATN
AVP78031.1      FKDGIYF----AATEHSNIIRG-------------WIFGTTLDNTSQS-LL--IVNNATN
ABD75323.1      FGDGVYF----AATEKSNVIRG-------------WVFGSTFDNTTQS-AI--IVNNSTH
QDF43835.1      FGDGIYF----AATEKSNVVRG-------------WIFGSTMDNTTQS-AI--IVNNSTH
ABD75332.1      FGDGVYF----AATEKSNVIRG-------------WIFGSSFDNTTQS-AI--IVNNSTH
QDF43820.1      FGDGVYF----AATEKSNVIRG-------------WIFGSTFDNTTQS-AV--IVNNSTH
AAZ67052.1      FGDGVYF----AATEKSNVIRG-------------WIFGSTFDNTTQS-AV--IVNNSTH
AFS88936.1      FANGFVVRIGAAANSTGTVIISPSTSATIRKIYPAFMLGSSVGNFSDG-KMGRFFNHTLV
YP_0010399      FDNGFVVRIGAAANSTGTIVISPSVNTKIKKAYPAFILGSSLTNTSAGQPL--YANYSLT
                * :*. .    *:.. ..:: .             :::*::. . : .  :    * :  
ALK02457.1      VVIRACNFELCDNPFFAVSKP-TGTQTHTM----IFDNAFNCTFEYISDS----FSLDVA
AAS10463.1      VVIRACNFELCDNPFFVVSKP-MGTRTHTM----IFDNAFNCTFEYISDA----FSLDVS
AAP13441.1      VVIRACNFELCDNPFFAVSKP-MGTQTHTM----IFDNAFNCTFEYISDA----FSLDVS
AAP13567.1      VVIRACNFELCDNPFFAVSKP-MGTQTHTM----IFDNAFNCTFEYISDA----FSLDVS
AGZ48806.1      LVIRACNFELCDNPFFVVLKS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDLG
QDF43825.1      LVIRACNFELCDNPFFVVLRS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDIG
AGZ48818.1      LVIRACNFELCDNPFFVVLKS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDLG
QHD43416.1      VVIKVCEFQFCNDPFLGVYYH-KNNKSWMESEFRVYSSANNCTFEYVSQP----FLMDLE
AVP78031.1      VIIKVCNFDFCYDP-YLSGYY-HNNKTWSIREFAVYSSYANCTFEYVSKS----FMLNIS
ABD75323.1      IIIRVCYFNLCKDPMYTVSAG-TQKSSW------VYQSAFNCTYDRVEKS----FQLDTS
QDF43835.1      IIIRVCYFNLCKEPMYAISNE-QHYKSW------VYQNAYNCTYDRVEQS----FQLDTA
ABD75332.1      IIIRVCNFNLCKEPMYTVSKG-TQQSSW------VYQSAFNCTYDRVEKS----FQLDTA
QDF43820.1      IIIRVCNFNLCKEPMYTVSRG-TQQSSW------VYQSAFNCTYDRVERS----FQLDTA
AAZ67052.1      IIIRVCNFNLCKEPMYTVSRG-AQQSSW------VYQSAFNCTYDRVEKS----FQLDTA
AFS88936.1      LLPDGCGTLLR--AFYCILEPRSGNHCPAGNSYTSFATYHTPATDCSDGN----YNRNAS
YP_0010399      IIPDGCGTVLH--AFYCILKPRTVNRCPSGT---GYVSYF--IYETVHNDCQSTINRNAS
                ::   *   :   .                     : .      :            :  
ALK02457.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NILKPIFKL
AAS10463.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL
AAP13441.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL
AAP13567.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL
AGZ48806.1      EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL
QDF43825.1      EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL
AGZ48818.1      EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL
QHD43416.1      GKQ-GNFKNLREFVFKNIDG--------YFKIYSKHTPINLVRDLPQGF--SALEPLVDL
AVP78031.1      GNG-GLFNTLREFVFRNVDG--------HFKIYSKFTPVNLNRGLPTGL--SVLQPLVEL
ABD75323.1      PKT-GNFTDLREFVFKNRDG--------FFTAYQTYTPVNLLRGLPSGL--SVLKPILKL
QDF43835.1      PQT-GNFKDLREYVFKNKDG--------FLSVYNAYSPIDIPRGLPVGF--SVLKPILKL
ABD75332.1      PKT-GNFKDLREYVFKNKGG--------FLRVYQTYTAVNLPRGFPAGF--SVLRPILKL
QDF43820.1      PKT-GNFKDLREYVFKNRDG--------FLSVYQTYTAVNLPRGLPIGF--SVLRPILKL
AAZ67052.1      PKT-GNFKDLREYVFKNRDG--------FLSVYQTYTAVNLPRGLPIGF--SVLRPILKL
AFS88936.1      LNSFKEYFNLRNCTFMYTYNITEDEILEWFGITQTAQGVHLFSSRYVDLYGGNMFQFATL
YP_0010399      LNSFKSFFDLVNCTFFNSWDITADETKEWFGITQDTQGVHLYSSRKGDLYGGNMFRFATL
                 :    :  * : .*    .         :   .    :    .   .:  . :  :  *
ALK02457.1      PLGINITNFRAILTAF------LPAQDTWGTSAAAYFVGYLKPATFMLKYDENGTITDAV
AAS10463.1      PLGINITNFRAILTAF------SPAQDTWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AAP13441.1      PLGINITNFRAILTAF------SPAQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AAP13567.1      PLGINITNFRAILTAF------SPAQDTWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AGZ48806.1      PLGINITNFRTLLTAF------PPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
QDF43825.1      PLGINITNFRTLLTAF------PPNPGYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AGZ48818.1      PLGINITNFRTLLTAF------PPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
QHD43416.1      PIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAV
AVP78031.1      PVSINITKFRTLLTIHRGD---PMPNNGWTAFSAAYFVGYLKPRTFMLKYNENGTITDAV
ABD75323.1      PFGINITSFRVVMAMF------SKTTSNYVPESAAYYVGNLKQSTFMLSFNQNGTIVDAV
QDF43835.1      PIGINITSFKVVMSMF------SRTTSNFLPEVAAYFVGNLKYSTFMLNFNENGTITDAI
ABD75332.1      PFGINITSYRVVMTMF------SQFNSNFLPESAAYYVGNLKYTTFMLSFNENGTITDAV
QDF43820.1      PFGINITSYRVVMAMF------SQTTSNFLPESAAYYVGNLKYTTFMLRFNENGTITDAI
AAZ67052.1      PFGINITSYRVVMAMF------SQTTSNFLPESAAYYVGNLKYTTFMLSFNENGTITNAI
AFS88936.1      PVYDTIKYYSIIPHSIRSI---QSDRKAW----AAFYVYKLQPLTFLLDFSVDGYIRRAI
YP_0010399      PVYEGIKYYTVIPRSFRSK---ANKREAW----AAFYVYKLHQLTYLLDFSVDGYIRRAI
                *.   *. :  :                :    **::*  *:  *::* :. :* *  *:
ALK02457.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
AAS10463.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS
AAP13441.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS
AAP13567.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS
AGZ48806.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
QDF43825.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
AGZ48818.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
QHD43416.1      DCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFAS
AVP78031.1      DCALDPLSETKCTLKSLTVQKGIYQTSNFRVQPTQSVVRFPNITNVCPFHKVFNATRFPS
ABD75323.1      DCSQDPLAELKCTTKSFNVSKGIYQTSNFRVSPVTEVVRFPNITNLCPFDKVFNATRFPS
QDF43835.1      DCAQNPLSELKCTIKNFNVSKGIYQTSNFRVSPTHEVIRFPNITNRCPFDKVFNASRFPN
ABD75332.1      DCSQNPLAELKCTIKNFNVSKGIYQTSNFRVTPTQEVVRFPNITNRCPFDKVFNASRFPN
QDF43820.1      DCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVVRFPNITNRCPFDKVFNASRFPN
AAZ67052.1      DCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVIRFPNITNRCPFDKVFNATRFPN
AFS88936.1      DCGFNDLSQLHCSYESFDVESGVYSVSSFEAKPSGSVVEQAEGVE-CDFSPLLSGTP-PQ
YP_0010399      DCGHDDLSQLHCSYTSFEVDTGVYSVSSYEASATGTFIEQPNATE-CDFSPMLTGVA-PQ
                **. : *:: :*:  .: :..*:*..*.: . .   .:  .: .: * *  ::..   ..
ALK02457.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AAS10463.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AAP13441.1      VYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AAP13567.1      VYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AGZ48806.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
QDF43825.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AGZ48818.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
QHD43416.1      VYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQ
AVP78031.1      VYAWERTKISDCIADYTVFYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRFSEVRQ
ABD75323.1      VYAWERTKISDCVADYTVFYNSTSFSTFNCYGVSPSKLIDLCFTSVYADTFLIRFSEVRQ
QDF43835.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
ABD75332.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
QDF43820.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
AAZ67052.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
AFS88936.1      VYNFKRLVFTNCNYNLTKLLSLFSVNDFTCSQISPAAIASNCYSSLILDYFSYPLSMKSD
YP_0010399      VYNFKRLVFSNCNYNLTKLLSLFAVDEFSCNGISPDSIARGCYSTLTVDYFAYPLSMKSY
                ** ::*  :::*  : : : .   .. *.*  :*.  :   *::.:  * *    .    
ALK02457.1      IAPGQTGVIADYNYKLPDDFTGC-VLAWNTRNIDATQTGNYNYKYRSLRHGKLRPFER-D
AAS10463.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D
AAP13441.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D
AAP13567.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D
AGZ48806.1      IAPGQTGVIADYNYKLPDDFLGC-VLAWNTNSKDSSTSGNYNYLYRWVRRSKLNPYER-D
QDF43825.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRSLRHGKLRPFER-D
AGZ48818.1      IAPGQTGVIADYNYKLPDDFTGC-VLAWNTRNIDATQTGNYNYKYRSLRHGKLRPFER-D
QHD43416.1      IAPGQTGKIADYNYKLPDDFTGC-VIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFER-D
AVP78031.1      VAPGQTGVIADYNYKLPDDFTGC-VIAWNTAKQD---VGNYF--YRSHRSTKLKPFER-D
ABD75323.1      VAPGQTGVIADYNYKLPDDFTGC-VIAWNTAKQD---VGSYF--YRSHRSSKLKPFER-D
QDF43835.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---QGQYY--YRSSRKTKLKPFER-D
ABD75332.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAQQD---QGQYY--YRSYRKEKLKPFER-D
QDF43820.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---TGHYY--YRSHRKTKLKPFER-D
AAZ67052.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---QGQYY--YRSHRKTKLKPFER-D
AFS88936.1      LSVSSAGPISQFNYKQSFSNPTC-LILATVPHNLTTITKPLKYSYIN-KCSRLLSDDRTE
YP_0010399      IRPGSAGNIPLYNYKQSFANPTCRVMASVLANVTITKPHAYG--YIS-KCSRLTGANQ-D
                :  ..:* *. :*** .     * ::                  *   .  .*   :. :
ALK02457.1      ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFYITNGIGYQPYRVVVLS
AAS10463.1      ISNVPFSPDGK--PCTPP-APNCYW-----------PLNGYGFYTTSGIGYQPYRVVVLS
AAP13441.1      ISNVPFSPDGK--PCTPP-ALNCYW-----------PLNDYGFYTTTGIGYQPYRVVVLS
AAP13567.1      ISNVPFSPDGK--PCTPP-ALNCYW-----------PLNDYGFYTTTGIGYQPYRVVVLS
AGZ48806.1      LSNDIYSPGGQ--SCSAV-GPNCYN-----------PLRPYGFFTTAGVGHQPYRVVVLS
QDF43825.1      ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFFTTNGIGYQPYRVVVLS
AGZ48818.1      ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFYITNGIGYQPYRVVVLS
QHD43416.1      ISTEIYQAGST--PCNGVEGFNCYF-----------PLQSYGFQPTNGVGYQPYRVVVLS
AVP78031.1      LSS----------------DENGVR-----------TLSTYDFNPNVPLEYQATRVVVLS
ABD75323.1      LSS----------------EENGVR-----------TLSTYDFNQNVPLEYQATRVVVLS
QDF43835.1      LTS----------------DENGVR-----------TLSTYDFYPNVPIEYQATRVVVLS
ABD75332.1      LSS----------------DENGVY-----------TLSTYDFYPSIPVEYQATRVVVLS
QDF43820.1      LSSD---------------DGNGVY-----------TLSTYDFNPNVPVAYQATRVVVLS
AAZ67052.1      LSS----------------DENGVR-----------TLSTYDFYPSVPVAYQATRVVVLS
AFS88936.1      VPQLVNANQYS--PCVSI-VPSTVWEDGDYYRKQLSPLEGGGWLVASGSTVAMTEQLQMG
YP_0010399      VETPLYINPGEYSICRDF-SPGGFSEDGQVFKRTLTQFEGGGLLIGVGTRVPMTDNLQMS
                :                    .               :   .              : :.
ALK02457.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AAS10463.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AAP13441.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AAP13567.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AGZ48806.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
QDF43825.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AGZ48818.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
QHD43416.1      FELL----HAPATVC-----GPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQF
AVP78031.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTDSSKRFQSFQQF
ABD75323.1      FELL----NAPATVC-----GPKLSTSLVKNQCVNFNFNGFKGTGVLTDSSKTFQSFQQF
QDF43835.1      FELL----NAPATVC-----GPKLSTGLVKNQCVNFNFNGLRGTGVLTDSSKRFQSFQQF
ABD75332.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLRGTGVLTTSSKRFQSFQQF
QDF43820.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTDSSKRFQSFQQF
AAZ67052.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTESSKRFQSFQQF
AFS88936.1      FGITVQYGTDTNSVCPKLEFANDTKIASQLGNCVEYSLYGVSGRGVFQNCTAVGVRQQRF
YP_0010399      FIISVQYGTGTDSVCPMLDLGDSLTITNRLGKCVDYSLYGVTGRGVFQNCTAVGVKQQRF
                * :       . :**     . . .     .:**::.: *. * **:  ..      *.*
ALK02457.1      GRDVLD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
AAS10463.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTLI
AAP13441.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAI
AAP13567.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAI
AGZ48806.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
QDF43825.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
AGZ48818.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
QHD43416.1      GRDIAD-TTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAI
AVP78031.1      GKDASD-FIDSVRDPQTLEILDITPCSFGGVSVITPGTNTSLEVAVLYQDVNCTDVPTTI
ABD75323.1      GRDASD-FTDSVRDPQTLRILDISPCSFGGVSVITPGTNTSSAVAVLYQDVNCTDVPRTI
QDF43835.1      GRDTSD-FTDSVRDPQTLEILDITPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTAI
ABD75332.1      GRDTSD-FTDSVRDPQTLEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTSI
QDF43820.1      GRDTSD-FTDSVRDPQTLEILDITPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTAI
AAZ67052.1      GRDTSD-FTDSVRDPQTLEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPAAI
AFS88936.1      VYDAYQNLVGYYSDDGNYYCLR--ACVSVPVSVIY--DKETKTHATLFGSVACEHISSTM
YP_0010399      VYDSFDNLVGYYSDDGNYYCVR--PCVSVPVSVIY--DKSTNLHATLFGSVACEHVTTMM
                  *  :   .   *  .   :   .*    ****    : :   *.*: .* *  :.  :
ALK02457.1      -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
AAS10463.1      -HAEQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
AAP13441.1      -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSL
AAP13567.1      -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSL
AGZ48806.1      -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
QDF43825.1      -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
AGZ48818.1      -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
QHD43416.1      -HADQLTPT-WRVYSTGSNVFQTRAGCLIGAEHVNNSY---ECDIPIGAGICASYQTQTN
AVP78031.1      -HADQLTPA-WRIYATGTNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTASI
ABD75323.1      -QADQLAPS-WRVYTTGPYVFQTQAGCLIGAEHVNASY---QCDIPIGAGICASYHTASH
QDF43835.1      -RADQLTPA-WRVYSTGINVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST
ABD75332.1      -HADQLTPA-WRVYSTGVNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTASV
QDF43820.1      -RADQLTPA-WRVYSTGVNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST
AAZ67052.1      -HADQLTPA-WRVYSTGTNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST
AFS88936.1      SQYSRSTRSMLKRRDSTYGPLQTPVGCVLGL--VNSSLFVEDCKLPLGQSLCALPDTPST
YP_0010399      SQFSRLTQS-NLRRRDSNIPLQTAVGCVIGLS--NNSLVVSDCKLPLGQSLCAVPP-VST
                 . .. : :           :** .**::*    : *    :*.:*:* .:**     : 
ALK02457.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AAS10463.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AAP13441.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AAP13567.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AGZ48806.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
QDF43825.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AGZ48818.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
QHD43416.1      SPRRARSVA----SQSI--------IAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEIL
AVP78031.1      ----LRSTS----QKAI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
ABD75323.1      ----LRSTG----QKSI--------VAYTMSLGAENSVAYANNSIAIPTNFSISVTTEVM
QDF43835.1      ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
ABD75332.1      ----LRSTG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
QDF43820.1      ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
AAZ67052.1      ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
AFS88936.1      ----LTPRS----VRSVPGEMRLASIAFNHPIQVDQ-LNSSYFKLSIPTNFSFGVTQEYI
YP_0010399      ----FRSYSASQFQLAV--------LNYTSPI-VVTPINSSGFTAAIPTNFSFSVTQEYI
                      . .      ::        : :. .: .   :  :  . :*****::.:* * :
ALK02457.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
AAS10463.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCRQLNRALSGIAAEQDRNTREVFVQVKQ
AAP13441.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQ
AAP13567.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQ
AGZ48806.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
QDF43825.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
AGZ48818.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
QHD43416.1      PVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQ
AVP78031.1      PVSMAKTSVDCTMYICGDSIECSNLLLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQ
ABD75323.1      PVSMAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALSGIAVEQDKNTQEVFAQVKQ
QDF43835.1      PVSMSKTSVDCTMYICGDSQECSNLLLQYGSFCTQLNRALTGIAIEQDKNTQEVFAQVKQ
ABD75332.1      PVSIAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALTGIAIEQDKNTQEVFAQVKQ
QDF43820.1      PVSMAKTSVDCTMYICGDSQECSNLLLQYGSFCTQLNRALTGVALEQDKNTQEVFAQVKQ
AAZ67052.1      PVSMAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQ
AFS88936.1      QTTIQKVTVDCKQYVCNGFQKCEQLLREYGQFCSKINQALHGANLRQDDSVRNLFASVKS
YP_0010399      ETSIQKVTVDCKQYVCNGFTRCEKLLVEYGQFCSKINQALHGANLRQDESVYSLYSNIKT
                 .:: *.:***. *:*..   * :** :**.** ::*.** *    ** .. .:: .:* 
ALK02457.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAS10463.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAP13441.1      MYKTPTLKYFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAP13567.1      MYKTPTLKYFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AGZ48806.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
QDF43825.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AGZ48818.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
QHD43416.1      IYKTPPIKDFGG-FNFSQILPDPSKPSKRSF---IEDLLFNKVTLADAGFIKQYGDCL--
AVP78031.1      IYKTPPIKDFGG-FNFSQILPDPSKPSKRSF---IEDLLFNKVTLADAGFIKQYGDCL--
ABD75323.1      MYKTPTIRDFGG-FNFSQILPDPLKPTKRSF---IEDLLYNKVTLADAGFMKQYADCL--
QDF43835.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
ABD75332.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
QDF43820.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAZ67052.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AFS88936.1      SQSSPIIPGFGGDFNLTLLEPVSISTGSRSARSAIEDLLFDKVTIADPGYMQGYDDCMQQ
YP_0010399      TSTQTLEYGLNGDFNLTLLQVPQIGGSSSSYRSAIEDLLFDKVTIADPGYMQGYDDCMKQ
                  . .    :.* **:: :        . *    *****::***:**.*::: * :*:  
ALK02457.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAS10463.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAP13441.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAP13567.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AGZ48806.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
QDF43825.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AGZ48818.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
QHD43416.1      GDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQ
AVP78031.1      GGISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALISGTATAGWTFGAGAALQIPFAMQ
ABD75323.1      GGINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALISGTATAGWTFGAGAALQIPFAMQ
QDF43835.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
ABD75332.1      GDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQ
QDF43820.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAZ67052.1      GDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQ
AFS88936.1      GPASARDLICAQYVAGYKVLPPLMDVNMEAAYTSSLLGSIAGVGWTAGLSSFAAIPFAQS
YP_0010399      GPQSARDLICAQYVSGYKVLPPLYDPNMEAAYTSSLLGSIAGAGWTAGLSSFAAIPFAQS
                *   ******** . * .*****   :* * **::*:..    *** * .:   **** .
ALK02457.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAS10463.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAP13441.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAP13567.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AGZ48806.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
QDF43825.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AGZ48818.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
QHD43416.1      MAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNT
AVP78031.1      MAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQESLTSTASALGKLQDVVNQNAQALNT
ABD75323.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAITQIQESLTTTSTALGKLQDVVNQNAQALNT
QDF43835.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
ABD75332.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
QDF43820.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAZ67052.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AFS88936.1      IFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTTNEAFQKVQDAVNNNAQALSK
YP_0010399      MFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTSNLAFSKVQDAVNANAQALSK
                : **:**:*:**:** **** ***:**.*:  :* .::::  *: *:**.** *****..
ALK02457.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAS10463.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAP13441.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAP13567.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AGZ48806.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QDF43825.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AGZ48818.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QHD43416.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AVP78031.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
ABD75323.1      LVKQLSSNFGAISSALNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QDF43835.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
ABD75332.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QDF43820.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAZ67052.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AFS88936.1      LASELSNTFGAISASIGDIIQRLDVLEQDAQIDRLINGRLTTLNAFVAQQLVRSESAALS
YP_0010399      LASELSNTFGAISSSISDILARLDTVEQDAQIDRLINGRLISLNAFVSQQLVRSETAARS
                *..:**..*****: :.**: *** :* :.******.*** :*:::*:***:*:     *
ALK02457.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AAS10463.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AAP13441.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AAP13567.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AGZ48806.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
QDF43825.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AGZ48818.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
QHD43416.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAI
AVP78031.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYIPSQEKNFTTAPAI
ABD75323.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPSQEKNFTTAPAI
QDF43835.1      ANLAATKMSECVLGQSKRVDFCGRGYHLMSFPQAAPHGVVFLHVTYVPSQEKNFTTAPAI
ABD75332.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
QDF43820.1      ANLAATKMSECVLGQSKRVDFCGRGYHLMSFPQAAPHGVVFLHVTYVPSQEKNFTTAPAI
AAZ67052.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AFS88936.1      AQLAKDKVNECVKAQSKRSGFCGQGTHIVSFVVNAPNGLYFMHVGYYPSNHIEVVSAYGL
YP_0010399      AQLASDKVNECVKSQSKRNGFCGSGTHIVSFVVNAPNGFYFFHVGYVPTNYTNVTAAYGL
                *:**  *:.*** .**** .*** * *::**   **:*. *:** * *::  :..:* .:
ALK02457.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
AAS10463.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI
AAP13441.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI
AAP13567.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI
AGZ48806.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
QDF43825.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
AGZ48818.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
QHD43416.1      CHDGK---AHFPREGVFVSNGTH-------WFVTQRNFYEPQIITTDNT-FVSGNCDVVI
AVP78031.1      CHEGK---AHFPREGVFVSNGTH-------WFVTQRNFYEPKIITTDNT-FVSGNCDVVI
ABD75323.1      CHEGK---AYFPREGVFVSNGSS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI
QDF43835.1      CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI
ABD75332.1      CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGNCDVVI
QDF43820.1      CHEGK---AYFPREGVFVSNGTF-------WFITQRNFYSPQIITTDNT-FVAGNCDVVI
AAZ67052.1      CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI
AFS88936.1      CDAANPTNCIAPVNGYFIKTNNT--RIVDEWSYTGSSFYAPEPITSLNTKYVA--PQVTY
YP_0010399      CNNNNPPLCIAPIDGYFITNQTTTYSVDTEWYYTGSSFYKPEPITQANSRYVS--SDVKF
                *   :   .  * :* *: . .        *  *  .*: *: **  *: :*:   :*  
ALK02457.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AAS10463.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQEEIDRLN
AAP13441.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AAP13567.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AGZ48806.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
QDF43825.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AGZ48818.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEINRLN
QHD43416.1      GIVNNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AVP78031.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDIDLGDISGINASVVNIQKEIDRLN
ABD75323.1      GIINNTVYDPL---QPELDSFKQELDKYFKNHTSPDVDLGDISGINASVVDIQKEIDRLN
QDF43835.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
ABD75332.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
QDF43820.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AAZ67052.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AFS88936.1      QNISTNLPPPLLGNSTGID-FQDELDEFFKNVSTSIPNFGSLTQINTTLLDLTYEMLSLQ
YP_0010399      DKLENNLPPPLLENSTDVD-FKDELEEFFKNVTSHGPNFAEISKINTTLLDLSDEMAMLQ
                  :...:  **   .. :* *::**:::*** ::   ::..:: **::::::  *:  *:
ALK02457.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AAS10463.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AAP13441.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AAP13567.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AGZ48806.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
QDF43825.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AGZ48818.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
QHD43416.1      EVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGC
AVP78031.1      EVARNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGC
ABD75323.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLVGLFMAIILLCYFTSCCSCCKGM
QDF43835.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMATILLCCMTSCCSCLKGA
ABD75332.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
QDF43820.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMATILLCCMTSCCSCLKGA
AAZ67052.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AFS88936.1      QVVKALNESYIDLKELGNYTYYNKWPWYIWLGFIAGLVALALCVFFILCCTGCGTNCMGK
YP_0010399      EVVKQLNDSYIDLKELGNYTYYNKWPWYVWLGFIAGLVALLLCVFFLLCCTGCGTSCLGK
                :*.. **:* ***:***:*  * *****:********:.: :  :::   *.* :   *  
ALK02457.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAS10463.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAP13441.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAP13567.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AGZ48806.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
QDF43825.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AGZ48818.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
QHD43416.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AVP78031.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
ABD75323.1      CSCGSCC-RFDEDDSEPVLKGVKLHYT
QDF43835.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
ABD75332.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
QDF43820.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAZ67052.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AFS88936.1      LKCNRCCDRYEEYDLEP----HKVHVH
YP_0010399      MKCKNCCDSYEEYDVE------KIHVH
                 .*  **  ::* * *      *:*  
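The similarity visible in the alignment above can also be expressed as a number. The following is a minimal Python sketch and was not part of the Phylogeny.fr workflow. The function name `percent_identity` and the gap convention (columns where both rows have a gap are skipped, and a gap opposite a residue counts as a mismatch) are assumptions made for illustration. The three rows are one 60-column block copied from the alignment.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of aligned columns in which two rows carry the same residue.

    Assumed convention: columns where both rows are gaps ('-') are skipped,
    and a gap opposite a residue counts as a mismatch.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue  # shared gap: nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# One 60-column block copied from the alignment above.
agz48806 = "GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN"
agz48818 = "GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEINRLN"
afs88936 = "QNISTNLPPPLLGNSTGID-FQDELDEFFKNVSTSIPNFGSLTQINTTLLDLTYEMLSLQ"

print(f"AGZ48806.1 vs AGZ48818.1: {percent_identity(agz48806, agz48818):.1f}%")
print(f"AGZ48806.1 vs AFS88936.1: {percent_identity(agz48806, afs88936):.1f}%")
```

Because this uses a single block rather than the full-length alignment, the values describe only that region of the spike protein.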
  1. Next, I went back, clicked on the "6. Tree Rendering" tab, and obtained a phylogenetic tree of the five sequences.
    • On this tree, horizontal lines (branches) represent individual evolutionary lineages. By contrast, vertical lines (splits) represent divergence events, where one lineage splits into two, and the vertical length of each split is drawn purely for visual clarity with no biological meaning. The left-most split is called the root of the tree, and represents a hypothesis about the most recent common ancestor (MRCA) of the sequences within the tree.
  2. My phylogenetic tree:

  1. Comparison of Personal tree to multiple sequence alignment:
    • Both the tree that I generated and the multiple sequence alignment showed that the AGZ48806 spike protein and the AGZ48818 spike protein are highly conserved relative to each other. In my generated tree, these two spike proteins were grouped together with a branch support value of 92%, which indicates strong support for placing them as each other's closest relatives, descended from a common ancestor. In the multiple sequence alignment, the two sequences were likewise very similar.
  2. Comparison between class alignment and Figure 3 of the Wan et al. (2020) paper:
    • The class alignment and the data in Figure 3 had many differences. The data in Figure 3 show much more conservation, with many more invariant positions, than the class alignment.
  3. Comparison between class alignment and Figure 2 of the Wan et al. (2020) paper:
    • The class alignment and the data in Figure 2 had both similarities and differences. Both the class alignment and the Figure 2 data had the same primary branches. However, the two had different conservation patterns.
  4. Is enough information provided by the Wan et al. (2020) paper to reproduce the analysis?
    • No. The paper does not describe its methods in enough detail for the analysis to be reproduced.
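When comparing conservation between two alignments, it can help to count invariant columns directly, which correspond to the `*` symbols in the Clustal conservation line. The sketch below is an illustrative assumption rather than the method used by the class or by Wan et al. (2020). The function name `invariant_fraction` is hypothetical, and the three rows are the final 27-column block of the class alignment for AGZ48806.1, AFS88936.1, and the record labeled YP_0010399.

```python
def invariant_fraction(rows: list[str]) -> float:
    """Fraction of alignment columns where every row has the same residue.

    A column containing any gap ('-') is treated as not invariant.
    """
    if not rows:
        raise ValueError("need at least one aligned row")
    lengths = {len(row) for row in rows}
    if len(lengths) != 1:
        raise ValueError("aligned rows must have equal length")
    n_cols = lengths.pop()
    if n_cols == 0:
        raise ValueError("rows are empty")
    # zip(*rows) walks the alignment column by column.
    invariant = sum(
        1 for column in zip(*rows)
        if "-" not in column and len(set(column)) == 1
    )
    return invariant / n_cols


# Final block of the class alignment (three of the sixteen rows).
last_block = [
    "CSCGSCC-KFDEDDSEPVLKGVKLHYT",  # AGZ48806.1
    "LKCNRCCDRYEEYDLEP----HKVHVH",  # AFS88936.1
    "MKCKNCCDSYEEYDVE------KIHVH",  # YP_0010399 (label as shown in the alignment)
]

print(f"Invariant columns: {invariant_fraction(last_block):.2f} of the block")
```

Running the same function over the rows of a published alignment would give a like-for-like number, provided the same set of sequences and region are compared.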

Data and Files

  1. Taylor Makela Generated Phylogenetic Tree
  2. AGZ48806: Bat SARS-like Coronavirus RsSHC014, Spike Protein
  3. QHD43416.1: Severe Acute Respiratory Syndrome Coronavirus 2 Isolate Wuhan-Hu-1, Spike Protein

Scientific Conclusion

  • This week's assignment allowed me to become more comfortable using bioinformatics tools such as GenBank and Phylogeny.fr. After using the GenBank database to find my assigned nucleotide sequence, I was able to use Phylogeny.fr to explore the relationship between my data and the class data. The phylogenetic tree that was made using Phylogeny.fr was then compared with the data provided in the Wan et al. (2020) paper.

Acknowledgments

  • I acknowledge my homework partner Nida Patel, who I consulted for several hours regarding syntax, formatting, and content questions.
  • I acknowledge that I copied and modified the protocol shown on the Week 4 assignment page for this course.

Except for what is noted above, this individual entry was completed by me and not copied from another source. Taylor Makela (talk) 20:59, 30 September 2020 (PDT)

References

  1. OpenWetWare. (2020). BIOL368/F20:Week 4. Accessed 30 September 2020, from https://openwetware.org/wiki/BIOL368/F20:Week_4
  2. OpenWetWare. (2020). Talk:BIOL368/F20:Week 4. Accessed 30 September 2020, from https://openwetware.org/wiki/Talk:BIOL368/F20:Week_4
  3. NCBI GenBank. (2020). Severe Acute Respiratory Syndrome Coronavirus 2 Isolate Wuhan-Hu-1, Complete Genome. Retrieved 30 September 2020, from https://www.ncbi.nlm.nih.gov/nuccore/MN908947
  4. NCBI GenBank. (2020). Spike Protein [Bat SARS-like Coronavirus RsSHC014]. Accessed 30 September 2020, from https://www.ncbi.nlm.nih.gov/protein/556015117
  5. Phylogeny.fr: "One Click" Mode. (2020). Accessed 30 September 2020, from http://www.phylogeny.fr/simple_phylogeny.cgi?workflow_id=b9c0813cbbe9695d63cf7e31da5f026d&tab_index=1
  6. Wan, Y., Shang, J., Graham, R., Baric, R., & Li, F. (2020). Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus. Journal of Virology, 94(7). doi: 10.1128/jvi.00127-20
