Nida Patel Journal Week 4

From OpenWetWare

Purpose

The purpose of this assignment is to become familiar with genetic sequence data and phylogenetic trees in order to infer relationships between species. This skill will be used in future analyses of data and genetic relationships.

Methods/Results

GenBank

  • I chose MK211376: Coronavirus BtRs-BetaCoV to analyze
    • The accession number is MK211376; YN2018B is the strain designation
  • The page provides the complete genome of Coronavirus BtRs-BetaCoV, along with the source organism, the locus of the sequence, and the authors of the referenced publications.
  • The assigned sequence for the spike protein of Bat SARS-like coronavirus RsSHC014 was saved in a word-processing document. The spike protein sequence is as follows:
  • >AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014]
MKLLVLVFATLVSSYTIEKCLDFDDRTPPANTQFLSSHRGVYYPDDIFRSNVLHLVQDHFLPFDSNVTRF
ITFGLNFDNPIIPFRDGIYFAATEKSNVIRGWVFGSTMNNKSQSVIIMNNSTNLVIRACNFELCDNPFFV
VLKSNNTQIPSYIFNNAFNCTFEYVSKDFNLDLGEKPGNFKDLREFVFRNKDGFLHVYSGYQPISAASGL
PTGFNALKPIFKLPLGINITNFRTLLTAFPPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAVDCS
QNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPSVYAWERKRISNCV
ADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFLGC
VLAWNTNSKDSSTSGNYNYLYRWVRRSKLNPYERDLSNDIYSPGGQSCSAVGPNCYNPLRPYGFFTTAGV
GHQPYRVVVLSFELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQFGRDVSDFT
DSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAIHADQLTPSWRVYSTGNNVF
QTQAGCLIGAEHVDTSYECDIPIGAGICASYHTVSSLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTN
FSISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
MYKTPTLKDFGGFNFSQILPDPLKPTKRSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNG
LTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQIANQF
NKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLI
TGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTY
VPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGSCDVVIGIINNTV
YDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYE
QYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
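The saved record is in FASTA format: a header line beginning with ">" followed by the sequence split across several lines. As a minimal sketch (the function name parse_fasta is hypothetical and not part of the assignment), the following Python reads such text and rejoins the sequence lines, so the record can be checked before it is pasted into another tool. Only the first sequence line of the record above is included in the example, for brevity.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence lines belong to the last header seen
    return {name: "".join(parts) for name, parts in records.items()}


# Header and first sequence line copied from the record above (truncated on purpose).
example = """>AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014]
MKLLVLVFATLVSSYTIEKCLDFDDRTPPANTQFLSSHRGVYYPDDIFRSNVLHLVQDHFLPFDSNVTRF
"""

for name, seq in parse_fasta(example).items():
    print(name, len(seq))
```

Running the same function on the full saved file would report the length of the complete spike protein, which is a quick way to confirm that no lines were lost when copying.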

Creating a phylogenetic tree with Phylogeny.fr

  • I used www.phylogeny.fr to perform a phylogenetic analysis on the sequences from the talk page.
    • The steps were as follows:
        • Click Phylogeny Analysis
        • Click on One Click Mode
        • Paste the talk page spike sequences
        • Submit
        • Click on tab 3. Alignment and, under the output options, reformat the sequences into Alignment in Clustal Format:
ALK02457.1      ---------MFIFLF------FLTLTSGSDLESCTT-------FDDVQAPNYPQHSSSRR
AAS10463.1      ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR
AAP13441.1      ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR
AAP13567.1      ---------MFIFLL------FLTLTSGSDLDRCTT-------FDDVQAPNYTQHTSSMR
AGZ48806.1      --------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR
QDF43825.1      --------MKLLVLV------FATLVSSYTIEKCTD-------FDDRTPPSNTQFLSSHR
AGZ48818.1      --------MKLLVLV------FATLVSSYTIEKCLD-------FDDRTPPANTQFLSSHR
QHD43416.1      ---------MFVFLV------LLPLVSS----QCVN-------LTTRTQLPPAYTNSFTR
AVP78031.1      ----------MLFFL------FLQFALVN--SQCVN-------LTGRTPLNPNYTNSSQR 
ABD75323.1      --------MKILIFA------FL-VTLVKAQEGCGV-------INLRTQPKLTQVSSSRR
QDF43835.1      --------MKVLIVL------LC-LGLVTAQDGCGH-------ISTKPQPLLDKFSSSRR
ABD75332.1      --------MKVLIFA------LL-FSLAKAQEGCGI-------ISRKPQPKMEKVSSSRR
QDF43820.1      --------MKILIFA------FL-VTLVEAQEGCGI-------ISRKPQPKMAQVSSSRR
AAZ67052.1      --------MKILILA------FL-ASLAKAQEGCGI-------ISRKPQPKMAQVSSSRR
AFS88936.1      ----MIHSVFLLMFLLTPTESYVDVGPDSVKSACIEVDIQQT-FFDKTWPRPIDVSKAD-
YP_0010399      MTLLMCLLMSLLIFVRGCDSQFVDMSPASNTSECLESQVDAAAFSKLMWPYPIDPSKVD-
                          .:..                   *         :            .   
ALK02457.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHR----------------FDNPVIP
AAS10463.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FDDPVIP
AAP13441.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FGNPVIP
AAP13567.1      GVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHT----------------FDNPVIP
AGZ48806.1      GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP
QDF43825.1      GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP
AGZ48818.1      GVYYPDDIFRSNVLHLVQDHFLPFDSNVTRFITFGLN----------------FDNPIIP
QHD43416.1      GVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGT---------KRFDNPVLP
AVP78031.1      GVYYPDTIYRSDTLVLSQGYFLPFYSNVSWYYSLTTN-NAAT---------KRTDNPILD
ABD75323.1      GVYYNDDIFRSDVLHLTQDYFLPFHSNLTQYFSLNIE-SDKI---------VYFDNPILK
QDF43835.1      GVYYNDDIFRSDVLHLTQDYFLPFDTNLTRYLSFNMD-SATK---------VYFDNPTLP
ABD75332.1      GVYYNDDIFRSDVLHLTQDYFLPFDSNLTQYFSLNID-SNKY---------TYFDNPILD
QDF43820.1      GVYYNDDIFRSDVLHLTQDYFLPFDSNLTQYFSLNVD-SDRY---------TYFDNPILD
AAZ67052.1      GVYYNDDIFRSNVLHLTQDYFLPFDSNLTQYFSLNVD-SDRF---------TYFDNPILD
AFS88936.1      GIIYPQGRTYSNITITYQGLF-PYQGDHGDMYVYSAGHATGT--TPQKLFVANYSQDVKQ
YP_0010399      GIIYPLGRTYSNITLAYTGLF-PLQGDLGSQYLYSVSHAVGHDGDPTKAYISNYSLLVND
                *: *      *.      . * *   :                           .     
ALK02457.1      FKDGVYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AAS10463.1      FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AAP13441.1      FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AAP13567.1      FKDGIYF----AATEKSNVVRG-------------WVFGSTMNNKSQS-VI--IINNSTN
AGZ48806.1      FRDGIYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN
QDF43825.1      FRDGVYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN
AGZ48818.1      FKDGIYF----AATEKSNVIRG-------------WVFGSTMNNKSQS-VI--IMNNSTN
QHD43416.1      FNDGVYF----ASTEKSNIIRG-------------WIFGTTLDSKTQS-LL--IVNNATN
AVP78031.1      FKDGIYF----AATEHSNIIRG-------------WIFGTTLDNTSQS-LL--IVNNATN
ABD75323.1      FGDGVYF----AATEKSNVIRG-------------WVFGSTFDNTTQS-AI--IVNNSTH
QDF43835.1      FGDGIYF----AATEKSNVVRG-------------WIFGSTMDNTTQS-AI--IVNNSTH
ABD75332.1      FGDGVYF----AATEKSNVIRG-------------WIFGSSFDNTTQS-AI--IVNNSTH
QDF43820.1      FGDGVYF----AATEKSNVIRG-------------WIFGSTFDNTTQS-AV--IVNNSTH
AAZ67052.1      FGDGVYF----AATEKSNVIRG-------------WIFGSTFDNTTQS-AV--IVNNSTH
AFS88936.1      FANGFVVRIGAAANSTGTVIISPSTSATIRKIYPAFMLGSSVGNFSDG-KMGRFFNHTLV
YP_0010399      FDNGFVVRIGAAANSTGTIVISPSVNTKIKKAYPAFILGSSLTNTSAGQPL--YANYSLT
                * :*. .    *:.. ..:: .             :::*::. . : .  :    * :  
ALK02457.1      VVIRACNFELCDNPFFAVSKP-TGTQTHTM----IFDNAFNCTFEYISDS----FSLDVA
AAS10463.1      VVIRACNFELCDNPFFVVSKP-MGTRTHTM----IFDNAFNCTFEYISDA----FSLDVS
AAP13441.1      VVIRACNFELCDNPFFAVSKP-MGTQTHTM----IFDNAFNCTFEYISDA----FSLDVS
AAP13567.1      VVIRACNFELCDNPFFAVSKP-MGTQTHTM----IFDNAFNCTFEYISDA----FSLDVS
AGZ48806.1      LVIRACNFELCDNPFFVVLKS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDLG
QDF43825.1      LVIRACNFELCDNPFFVVLRS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDIG
AGZ48818.1      LVIRACNFELCDNPFFVVLKS-NNTQIPSY----IFNNAFNCTFEYVSKD----FNLDLG
QHD43416.1      VVIKVCEFQFCNDPFLGVYYH-KNNKSWMESEFRVYSSANNCTFEYVSQP----FLMDLE
AVP78031.1      VIIKVCNFDFCYDP-YLSGYY-HNNKTWSIREFAVYSSYANCTFEYVSKS----FMLNIS
ABD75323.1      IIIRVCYFNLCKDPMYTVSAG-TQKSSW------VYQSAFNCTYDRVEKS----FQLDTS
QDF43835.1      IIIRVCYFNLCKEPMYAISNE-QHYKSW------VYQNAYNCTYDRVEQS----FQLDTA
ABD75332.1      IIIRVCNFNLCKEPMYTVSKG-TQQSSW------VYQSAFNCTYDRVEKS----FQLDTA
QDF43820.1      IIIRVCNFNLCKEPMYTVSRG-TQQSSW------VYQSAFNCTYDRVERS----FQLDTA
AAZ67052.1      IIIRVCNFNLCKEPMYTVSRG-AQQSSW------VYQSAFNCTYDRVEKS----FQLDTA
AFS88936.1      LLPDGCGTLLR--AFYCILEPRSGNHCPAGNSYTSFATYHTPATDCSDGN----YNRNAS
YP_0010399      IIPDGCGTVLH--AFYCILKPRTVNRCPSGT---GYVSYF--IYETVHNDCQSTINRNAS
                ::   *   :   .                     : .      :            :  
ALK02457.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NILKPIFKL
AAS10463.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL
AAP13441.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL
AAP13567.1      EKS-GNFKHLREFVFKNKDG--------FLYVYKGYQPIDVVRDLPSGF--NTLKPIFKL
AGZ48806.1      EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL
QDF43825.1      EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL
AGZ48818.1      EKP-GNFKDLREFVFRNKDG--------FLHVYSGYQPISAASGLPTGF--NALKPIFKL
QHD43416.1      GKQ-GNFKNLREFVFKNIDG--------YFKIYSKHTPINLVRDLPQGF--SALEPLVDL
AVP78031.1      GNG-GLFNTLREFVFRNVDG--------HFKIYSKFTPVNLNRGLPTGL--SVLQPLVEL
ABD75323.1      PKT-GNFTDLREFVFKNRDG--------FFTAYQTYTPVNLLRGLPSGL--SVLKPILKL
QDF43835.1      PQT-GNFKDLREYVFKNKDG--------FLSVYNAYSPIDIPRGLPVGF--SVLKPILKL
ABD75332.1      PKT-GNFKDLREYVFKNKGG--------FLRVYQTYTAVNLPRGFPAGF--SVLRPILKL
QDF43820.1      PKT-GNFKDLREYVFKNRDG--------FLSVYQTYTAVNLPRGLPIGF--SVLRPILKL
AAZ67052.1      PKT-GNFKDLREYVFKNRDG--------FLSVYQTYTAVNLPRGLPIGF--SVLRPILKL
AFS88936.1      LNSFKEYFNLRNCTFMYTYNITEDEILEWFGITQTAQGVHLFSSRYVDLYGGNMFQFATL
YP_0010399      LNSFKSFFDLVNCTFFNSWDITADETKEWFGITQDTQGVHLYSSRKGDLYGGNMFRFATL
                 :    :  * : .*    .         :   .    :    .   .:  . :  :  *
ALK02457.1      PLGINITNFRAILTAF------LPAQDTWGTSAAAYFVGYLKPATFMLKYDENGTITDAV
AAS10463.1      PLGINITNFRAILTAF------SPAQDTWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AAP13441.1      PLGINITNFRAILTAF------SPAQDIWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AAP13567.1      PLGINITNFRAILTAF------SPAQDTWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AGZ48806.1      PLGINITNFRTLLTAF------PPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
QDF43825.1      PLGINITNFRTLLTAF------PPNPGYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
AGZ48818.1      PLGINITNFRTLLTAF------PPRPDYWGTSAAAYFVGYLKPTTFMLKYDENGTITDAV
QHD43416.1      PIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAV
AVP78031.1      PVSINITKFRTLLTIHRGD---PMPNNGWTAFSAAYFVGYLKPRTFMLKYNENGTITDAV
ABD75323.1      PFGINITSFRVVMAMF------SKTTSNYVPESAAYYVGNLKQSTFMLSFNQNGTIVDAV
QDF43835.1      PIGINITSFKVVMSMF------SRTTSNFLPEVAAYFVGNLKYSTFMLNFNENGTITDAI
ABD75332.1      PFGINITSYRVVMTMF------SQFNSNFLPESAAYYVGNLKYTTFMLSFNENGTITDAV
QDF43820.1      PFGINITSYRVVMAMF------SQTTSNFLPESAAYYVGNLKYTTFMLRFNENGTITDAI
AAZ67052.1      PFGINITSYRVVMAMF------SQTTSNFLPESAAYYVGNLKYTTFMLSFNENGTITNAI  
AFS88936.1      PVYDTIKYYSIIPHSIRSI---QSDRKAW----AAFYVYKLQPLTFLLDFSVDGYIRRAI
YP_0010399      PVYEGIKYYTVIPRSFRSK---ANKREAW----AAFYVYKLHQLTYLLDFSVDGYIRRAI
                *.   *. :  :                :    **::*  *:  *::* :. :* *  *:
ALK02457.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
AAS10463.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS
AAP13441.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS
AAP13567.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPS
AGZ48806.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
QDF43825.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
AGZ48818.1      DCSQNPLAELKCSVKSFEIDKGIYQTSNFRVAPSKEVVRFPNITNLCPFGEVFNATTFPS
QHD43416.1      DCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFAS
AVP78031.1      DCALDPLSETKCTLKSLTVQKGIYQTSNFRVQPTQSVVRFPNITNVCPFHKVFNATRFPS
ABD75323.1      DCSQDPLAELKCTTKSFNVSKGIYQTSNFRVSPVTEVVRFPNITNLCPFDKVFNATRFPS
QDF43835.1      DCAQNPLSELKCTIKNFNVSKGIYQTSNFRVSPTHEVIRFPNITNRCPFDKVFNASRFPN
ABD75332.1      DCSQNPLAELKCTIKNFNVSKGIYQTSNFRVTPTQEVVRFPNITNRCPFDKVFNASRFPN
QDF43820.1      DCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVVRFPNITNRCPFDKVFNASRFPN
AAZ67052.1      DCAQNPLAELKCTIKNFNVSKGIYQTSNFRVSPTQEVIRFPNITNRCPFDKVFNATRFPN
AFS88936.1      DCGFNDLSQLHCSYESFDVESGVYSVSSFEAKPSGSVVEQAEGVE-CDFSPLLSGTP-PQ
YP_0010399      DCGHDDLSQLHCSYTSFEVDTGVYSVSSYEASATGTFIEQPNATE-CDFSPMLTGVA-PQ
                **. : *:: :*:  .: :..*:*..*.: . .   .:  .: .: * *  ::..   ..
ALK02457.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AAS10463.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AAP13441.1      VYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AAP13567.1      VYAWERKKISNCVADYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AGZ48806.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
QDF43825.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
AGZ48818.1      VYAWERKRISNCVADYSVLYNSTSFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQ
QHD43416.1      VYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQ
AVP78031.1      VYAWERTKISDCIADYTVFYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRFSEVRQ
ABD75323.1      VYAWERTKISDCVADYTVFYNSTSFSTFNCYGVSPSKLIDLCFTSVYADTFLIRFSEVRQ
QDF43835.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
ABD75332.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
QDF43820.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
AAZ67052.1      VYAWERTKISDCVADYTVLYNSTSFSTFKCYGVSPSKLIDLCFTSVYADTFLIRSSEVRQ
AFS88936.1      VYNFKRLVFTNCNYNLTKLLSLFSVNDFTCSQISPAAIASNCYSSLILDYFSYPLSMKSD
YP_0010399      VYNFKRLVFSNCNYNLTKLLSLFAVDEFSCNGISPDSIARGCYSTLTVDYFAYPLSMKSY
                ** ::*  :::*  : : : .   .. *.*  :*.  :   *::.:  * *    .    
ALK02457.1      IAPGQTGVIADYNYKLPDDFTGC-VLAWNTRNIDATQTGNYNYKYRSLRHGKLRPFER-D
AAS10463.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D
AAP13441.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D
AAP13567.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRYLRHGKLRPFER-D
AGZ48806.1      IAPGQTGVIADYNYKLPDDFLGC-VLAWNTNSKDSSTSGNYNYLYRWVRRSKLNPYER-D
QDF43825.1      IAPGQTGVIADYNYKLPDDFMGC-VLAWNTRNIDATSTGNYNYKYRSLRHGKLRPFER-D
AGZ48818.1      IAPGQTGVIADYNYKLPDDFTGC-VLAWNTRNIDATQTGNYNYKYRSLRHGKLRPFER-D
QHD43416.1      IAPGQTGKIADYNYKLPDDFTGC-VIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFER-D
AVP78031.1      VAPGQTGVIADYNYKLPDDFTGC-VIAWNTAKQD---VGNYF--YRSHRSTKLKPFER-D
ABD75323.1      VAPGQTGVIADYNYKLPDDFTGC-VIAWNTAKQD---VGSYF--YRSHRSSKLKPFER-D
QDF43835.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---QGQYY--YRSSRKTKLKPFER-D
ABD75332.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAQQD---QGQYY--YRSYRKEKLKPFER-D
QDF43820.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---TGHYY--YRSHRKTKLKPFER-D
AAZ67052.1      VAPGETGVIADYNYKLPDDFTGC-VIAWNTAKQD---QGQYY--YRSHRKTKLKPFER-D
AFS88936.1      LSVSSAGPISQFNYKQSFSNPTC-LILATVPHNLTTITKPLKYSYIN-KCSRLLSDDRTE
YP_0010399      IRPGSAGNIPLYNYKQSFANPTCRVMASVLANVTITKPHAYG--YIS-KCSRLTGANQ-D
                :  ..:* *. :*** .     * ::                  *   .  .*   :. :
ALK02457.1      ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFYITNGIGYQPYRVVVLS
AAS10463.1      ISNVPFSPDGK--PCTPP-APNCYW-----------PLNGYGFYTTSGIGYQPYRVVVLS
AAP13441.1      ISNVPFSPDGK--PCTPP-ALNCYW-----------PLNDYGFYTTTGIGYQPYRVVVLS
AAP13567.1      ISNVPFSPDGK--PCTPP-ALNCYW-----------PLNDYGFYTTTGIGYQPYRVVVLS
AGZ48806.1      LSNDIYSPGGQ--SCSAV-GPNCYN-----------PLRPYGFFTTAGVGHQPYRVVVLS
QDF43825.1      ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFFTTNGIGYQPYRVVVLS
AGZ48818.1      ISNVPFSPDGK--PCTPP-AFNCYW-----------PLNDYGFYITNGIGYQPYRVVVLS
QHD43416.1      ISTEIYQAGST--PCNGVEGFNCYF-----------PLQSYGFQPTNGVGYQPYRVVVLS
AVP78031.1      LSS----------------DENGVR-----------TLSTYDFNPNVPLEYQATRVVVLS
ABD75323.1      LSS----------------EENGVR-----------TLSTYDFNQNVPLEYQATRVVVLS
QDF43835.1      LTS----------------DENGVR-----------TLSTYDFYPNVPIEYQATRVVVLS
ABD75332.1      LSS----------------DENGVY-----------TLSTYDFYPSIPVEYQATRVVVLS
QDF43820.1      LSSD---------------DGNGVY-----------TLSTYDFNPNVPVAYQATRVVVLS
AAZ67052.1      LSS----------------DENGVR-----------TLSTYDFYPSVPVAYQATRVVVLS
AFS88936.1      VPQLVNANQYS--PCVSI-VPSTVWEDGDYYRKQLSPLEGGGWLVASGSTVAMTEQLQMG
YP_0010399      VETPLYINPGEYSICRDF-SPGGFSEDGQVFKRTLTQFEGGGLLIGVGTRVPMTDNLQMS
                :                    .               :   .              : :.
ALK02457.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AAS10463.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AAP13441.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AAP13567.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AGZ48806.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
QDF43825.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
AGZ48818.1      FELL----NAPATVC-----GPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQF
QHD43416.1      FELL----HAPATVC-----GPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQF
AVP78031.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTDSSKRFQSFQQF
ABD75323.1      FELL----NAPATVC-----GPKLSTSLVKNQCVNFNFNGFKGTGVLTDSSKTFQSFQQF
QDF43835.1      FELL----NAPATVC-----GPKLSTGLVKNQCVNFNFNGLRGTGVLTDSSKRFQSFQQF
ABD75332.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLRGTGVLTTSSKRFQSFQQF
QDF43820.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTDSSKRFQSFQQF
AAZ67052.1      FELL----NAPATVC-----GPKLSTQLVKNQCVNFNFNGLKGTGVLTESSKRFQSFQQF
AFS88936.1      FGITVQYGTDTNSVCPKLEFANDTKIASQLGNCVEYSLYGVSGRGVFQNCTAVGVRQQRF
YP_0010399      FIISVQYGTGTDSVCPMLDLGDSLTITNRLGKCVDYSLYGVTGRGVFQNCTAVGVKQQRF
                * :       . :**     . . .     .:**::.: *. * **:  ..      *.*
ALK02457.1      GRDVLD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
AAS10463.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTLI
AAP13441.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAI
AAP13567.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVSTAI
AGZ48806.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
QDF43825.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
AGZ48818.1      GRDVSD-FTDSVRDPKTSEILDISPCSFGGVSVITPGTNTSSEVAVLYQDVNCTDVPVAI
QHD43416.1      GRDIAD-TTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQDVNCTEVPVAI
AVP78031.1      GKDASD-FIDSVRDPQTLEILDITPCSFGGVSVITPGTNTSLEVAVLYQDVNCTDVPTTI
ABD75323.1      GRDASD-FTDSVRDPQTLRILDISPCSFGGVSVITPGTNTSSAVAVLYQDVNCTDVPRTI
QDF43835.1      GRDTSD-FTDSVRDPQTLEILDITPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTAI
ABD75332.1      GRDTSD-FTDSVRDPQTLEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTSI
QDF43820.1      GRDTSD-FTDSVRDPQTLEILDITPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPTAI
AAZ67052.1      GRDTSD-FTDSVRDPQTLEILDISPCSFGGVSVITPGTNASSEVAVLYQDVNCTDVPAAI
AFS88936.1      VYDAYQNLVGYYSDDGNYYCLR--ACVSVPVSVIY--DKETKTHATLFGSVACEHISSTM
YP_0010399      VYDSFDNLVGYYSDDGNYYCVR--PCVSVPVSVIY--DKSTNLHATLFGSVACEHVTTMM
                  *  :   .   *  .   :   .*    ****    : :   *.*: .* *  :.  :
ALK02457.1      -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
AAS10463.1      -HAEQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
AAP13441.1      -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSL 
AAP13567.1      -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSL
AGZ48806.1      -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
QDF43825.1      -HADQLTPA-WRIYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
AGZ48818.1      -HADQLTPS-WRVYSTGNNVFQTQAGCLIGAEHVDTSY---ECDIPIGAGICASYHTVSS
QHD43416.1      -HADQLTPT-WRVYSTGSNVFQTRAGCLIGAEHVNNSY---ECDIPIGAGICASYQTQTN
AVP78031.1      -HADQLTPA-WRIYATGTNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTASI
ABD75323.1      -QADQLAPS-WRVYTTGPYVFQTQAGCLIGAEHVNASY---QCDIPIGAGICASYHTASH
QDF43835.1      -RADQLTPA-WRVYSTGINVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST
ABD75332.1      -HADQLTPA-WRVYSTGVNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTASV
QDF43820.1      -RADQLTPA-WRVYSTGVNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST
AAZ67052.1      -HADQLTPA-WRVYSTGTNVFQTQAGCLIGAEHVNASY---ECDIPIGAGICASYHTAST
AFS88936.1      SQYSRSTRSMLKRRDSTYGPLQTPVGCVLGL--VNSSLFVEDCKLPLGQSLCALPDTPST
YP_0010399      SQFSRLTQS-NLRRRDSNIPLQTAVGCVIGLS--NNSLVVSDCKLPLGQSLCAVPP-VST
                 . .. : :           :** .**::*    : *    :*.:*:* .:**     : 
ALK02457.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AAS10463.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AAP13441.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AAP13567.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AGZ48806.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
QDF43825.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
AGZ48818.1      ----LRSTS----QKSI--------VAYTMSLGADSSIAYSNNTIAIPTNFSISITTEVM
QHD43416.1      SPRRARSVA----SQSI--------IAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEIL
AVP78031.1      ----LRSTS----QKAI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
ABD75323.1      ----LRSTG----QKSI--------VAYTMSLGAENSVAYANNSIAIPTNFSISVTTEVM
QDF43835.1      ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
ABD75332.1      ----LRSTG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
QDF43820.1      ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
AAZ67052.1      ----LRSVG----QKSI--------VAYTMSLGAENSIAYANNSIAIPTNFSISVTTEVM
AFS88936.1      ----LTPRS----VRSVPGEMRLASIAFNHPIQVDQ-LNSSYFKLSIPTNFSFGVTQEYI
YP_0010399      ----FRSYSASQFQLAV--------LNYTSPI-VVTPINSSGFTAAIPTNFSFSVTQEYI
                      . .      ::        : :. .: .   :  :  . :*****::.:* * :
ALK02457.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
AAS10463.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCRQLNRALSGIAAEQDRNTREVFVQVKQ
AAP13441.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQ
AAP13567.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAEQDRNTREVFAQVKQ
AGZ48806.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
QDF43825.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
AGZ48818.1      PVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAVEQDRNTREVFAQVKQ
QHD43416.1      PVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQ
AVP78031.1      PVSMAKTSVDCTMYICGDSIECSNLLLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQ
ABD75323.1      PVSMAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALSGIAVEQDKNTQEVFAQVKQ
QDF43835.1      PVSMSKTSVDCTMYICGDSQECSNLLLQYGSFCTQLNRALTGIAIEQDKNTQEVFAQVKQ
ABD75332.1      PVSIAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALTGIAIEQDKNTQEVFAQVKQ
QDF43820.1      PVSMAKTSVDCTMYICGDSQECSNLLLQYGSFCTQLNRALTGVALEQDKNTQEVFAQVKQ
AAZ67052.1      PVSMAKTSVDCTMYICGDSLECSNLLLQYGSFCTQLNRALSGIAIEQDKNTQEVFAQVKQ
AFS88936.1      QTTIQKVTVDCKQYVCNGFQKCEQLLREYGQFCSKINQALHGANLRQDDSVRNLFASVKS
YP_0010399      ETSIQKVTVDCKQYVCNGFTRCEKLLVEYGQFCSKINQALHGANLRQDESVYSLYSNIKT
                 .:: *.:***. *:*..   * :** :**.** ::*.** *    ** .. .:: .:* 
ALK02457.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAS10463.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAP13441.1      MYKTPTLKYFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAP13567.1      MYKTPTLKYFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AGZ48806.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
QDF43825.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AGZ48818.1      MYKTPTLKDFGG-FNFSQILPDPLKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
QHD43416.1      IYKTPPIKDFGG-FNFSQILPDPSKPSKRSF---IEDLLFNKVTLADAGFIKQYGDCL--
AVP78031.1      IYKTPPIKDFGG-FNFSQILPDPSKPSKRSF---IEDLLFNKVTLADAGFIKQYGDCL--
ABD75323.1      MYKTPTIRDFGG-FNFSQILPDPLKPTKRSF---IEDLLYNKVTLADAGFMKQYADCL--
QDF43835.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
ABD75332.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
QDF43820.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AAZ67052.1      MYKTPAIKDFGG-FNFSQILPDPSKPTKRSF---IEDLLFNKVTLADAGFMKQYGECL--
AFS88936.1      SQSSPIIPGFGGDFNLTLLEPVSISTGSRSARSAIEDLLFDKVTIADPGYMQGYDDCMQQ
YP_0010399      TSTQTLEYGLNGDFNLTLLQVPQIGGSSSSYRSAIEDLLFDKVTIADPGYMQGYDDCMKQ
                  . .    :.* **:: :        . *    *****::***:**.*::: * :*:  
ALK02457.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAS10463.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAP13441.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAP13567.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AGZ48806.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
QDF43825.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AGZ48818.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
QHD43416.1      GDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQ
AVP78031.1      GGISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALISGTATAGWTFGAGAALQIPFAMQ
ABD75323.1      GGINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALISGTATAGWTFGAGAALQIPFAMQ
QDF43835.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
ABD75332.1      GDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQ
QDF43820.1      GDINARDLICAQKFNGLTVLPPLLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQ
AAZ67052.1      GDISARDLICAQKFNGLTVLPPLLTDEMIAAYTAALVSGTATAGWTFGAGSALQIPFAMQ
AFS88936.1      GPASARDLICAQYVAGYKVLPPLMDVNMEAAYTSSLLGSIAGVGWTAGLSSFAAIPFAQS
YP_0010399      GPQSARDLICAQYVSGYKVLPPLYDPNMEAAYTSSLLGSIAGAGWTAGLSSFAAIPFAQS
                *   ******** . * .*****   :* * **::*:..    *** * .:   **** .
ALK02457.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAS10463.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAP13441.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAP13567.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AGZ48806.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
QDF43825.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AGZ48818.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
QHD43416.1      MAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNT
AVP78031.1      MAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQESLTSTASALGKLQDVVNQNAQALNT
ABD75323.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAITQIQESLTTTSTALGKLQDVVNQNAQALNT
QDF43835.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
ABD75332.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
QDF43820.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AAZ67052.1      MAYRFNGIGVTQNVLYENQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNT
AFS88936.1      IFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTTNEAFQKVQDAVNNNAQALSK
YP_0010399      MFYRLNGVGITQQVLSENQKLIANKFNQALGAMQTGFTTSNLAFSKVQDAVNANAQALSK
                : **:**:*:**:** **** ***:**.*:  :* .::::  *: *:**.** *****..

ALK02457.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAS10463.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAP13441.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAP13567.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AGZ48806.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QDF43825.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AGZ48818.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QHD43416.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AVP78031.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
ABD75323.1      LVKQLSSNFGAISSALNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QDF43835.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
ABD75332.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
QDF43820.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AAZ67052.1      LVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRAS
AFS88936.1      LASELSNTFGAISASIGDIIQRLDVLEQDAQIDRLINGRLTTLNAFVAQQLVRSESAALS
YP_0010399      LASELSNTFGAISSSISDILARLDTVEQDAQIDRLINGRLISLNAFVSQQLVRSETAARS
                *..:**..*****: :.**: *** :* :.******.*** :*:::*:***:*:     *
ALK02457.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AAS10463.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AAP13441.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AAP13567.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AGZ48806.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
QDF43825.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AGZ48818.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
QHD43416.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAI
AVP78031.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYIPSQEKNFTTAPAI
ABD75323.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPSQEKNFTTAPAI 
QDF43835.1      ANLAATKMSECVLGQSKRVDFCGRGYHLMSFPQAAPHGVVFLHVTYVPSQEKNFTTAPAI
ABD75332.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
QDF43820.1      ANLAATKMSECVLGQSKRVDFCGRGYHLMSFPQAAPHGVVFLHVTYVPSQEKNFTTAPAI
AAZ67052.1      ANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYVPSQERNFTTAPAI
AFS88936.1      AQLAKDKVNECVKAQSKRSGFCGQGTHIVSFVVNAPNGLYFMHVGYYPSNHIEVVSAYGL
YP_0010399      AQLASDKVNECVKSQSKRNGFCGSGTHIVSFVVNAPNGFYFFHVGYVPTNYTNVTAAYGL
                *:**  *:.*** .**** .*** * *::**   **:*. *:** * *::  :..:* .:

ALK02457.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
AAS10463.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI
AAP13441.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI
AAP13567.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGNCDVVI
AGZ48806.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
QDF43825.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
AGZ48818.1      CHEGK---AYFPREGVFVFNGTS-------WFITQRNFFSPQIITTDNT-FVSGSCDVVI
QHD43416.1      CHDGK---AHFPREGVFVSNGTH-------WFVTQRNFYEPQIITTDNT-FVSGNCDVVI
AVP78031.1      CHEGK---AHFPREGVFVSNGTH-------WFVTQRNFYEPKIITTDNT-FVSGNCDVVI
ABD75323.1      CHEGK---AYFPREGVFVSNGSS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI
QDF43835.1      CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI
ABD75332.1      CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGNCDVVI
QDF43820.1      CHEGK---AYFPREGVFVSNGTF-------WFITQRNFYSPQIITTDNT-FVAGNCDVVI
AAZ67052.1      CHEGK---AYFPREGVFVSNGTS-------WFITQRNFYSPQIITTDNT-FVAGSCDVVI
AFS88936.1      CDAANPTNCIAPVNGYFIKTNNT--RIVDEWSYTGSSFYAPEPITSLNTKYVA--PQVTY
YP_0010399      CNNNNPPLCIAPIDGYFITNQTTTYSVDTEWYYTGSSFYKPEPITQANSRYVS--SDVKF
                *   :   .  * :* *: . .        *  *  .*: *: **  *: :*:   :*  
ALK02457.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AAS10463.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQEEIDRLN
AAP13441.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AAP13567.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AGZ48806.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
QDF43825.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AGZ48818.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEINRLN
QHD43416.1      GIVNNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AVP78031.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDIDLGDISGINASVVNIQKEIDRLN
ABD75323.1      GIINNTVYDPL---QPELDSFKQELDKYFKNHTSPDVDLGDISGINASVVDIQKEIDRLN
QDF43835.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
ABD75332.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
QDF43820.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AAZ67052.1      GIINNTVYDPL---QPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLN
AFS88936.1      QNISTNLPPPLLGNSTGID-FQDELDEFFKNVSTSIPNFGSLTQINTTLLDLTYEMLSLQ
YP_0010399      DKLENNLPPPLLENSTDVD-FKDELEEFFKNVTSHGPNFAEISKINTTLLDLSDEMAMLQ
                  :...:  **   .. :* *::**:::*** ::   ::..:: **::::::  *:  *:
ALK02457.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AAS10463.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AAP13441.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AAP13567.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AGZ48806.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
QDF43825.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AGZ48818.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
QHD43416.1      EVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGC
AVP78031.1      EVARNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGC
ABD75323.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLVGLFMAIILLCYFTSCCSCCKGM
QDF43835.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMATILLCCMTSCCSCLKGA
ABD75332.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
QDF43820.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMATILLCCMTSCCSCLKGA
AAZ67052.1      EVAKNLNESLIDLQELGKYEQYIKWPWYVWLGFIAGLIAIVMVTILLCCMTSCCSCLKGA
AFS88936.1      QVVKALNESYIDLKELGNYTYYNKWPWYIWLGFIAGLVALALCVFFILCCTGCGTNCMGK
YP_0010399      EVVKQLNDSYIDLKELGNYTYYNKWPWYVWLGFIAGLVALLLCVFFLLCCTGCGTSCLGK
                :*.. **:* ***:***:*  * *****:********:.: :  :::   *.* :   * 
ALK02457.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAS10463.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAP13441.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAP13567.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AGZ48806.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
QDF43825.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AGZ48818.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
QHD43416.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AVP78031.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
ABD75323.1      CSCGSCC-RFDEDDSEPVLKGVKLHYT
QDF43835.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
ABD75332.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
QDF43820.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AAZ67052.1      CSCGSCC-KFDEDDSEPVLKGVKLHYT
AFS88936.1      LKCNRCCDRYEEYDLEP----HKVHVH
YP_0010399      MKCKNCCDSYEEYDVE------KIHVH
                 .*  **  ::* * *      *:*  
    • The “*” marks invariant (fully conserved) positions, “:” marks highly conserved positions, “.” marks weakly conserved positions, and a space marks positions that are not conserved.
  • Click on tab 6. Tree Rendering so the data is reformatted as a phylogenetic tree

    • On the phylogenetic tree, horizontal lines (branches) represent individual evolutionary lineages and vertical lines (splits) represent divergence events, where one lineage separates into two
  • Comparing the Multiple Sequence Alignment to the Phylogenetic Tree:
    • The tree and the sequence alignment both show highly conserved regions across many of the groups. In particular, AGZ48806.1 spike protein [Bat SARS-like coronavirus RsSHC014] and AGZ48818.1 spike protein [Bat SARS-like coronavirus Rs3367] share several conserved regions in the alignment, and on the phylogenetic tree they are sister taxa within the same clade, with 94% confidence for that clade. This indicates that the two sequences are closely related and share a most recent common ancestor. Another observation concerns the outgroups AFS88936.1 S protein [Human betacoronavirus 2c EMC/2012] and YP_001039953.1 spike glycoprotein [Tylonycteris bat coronavirus HKU4], which are also sister taxa and show the most similar patterns of conservation and variation to each other. When the two pairs are compared, they differ substantially from one another in both their conserved and their variable regions, which explains the distance between them on the phylogenetic tree.
  • Compare the Multiple Sequence alignment to Figure 3 of Wan et al. (2020) paper:
    • The class's alignment was not very similar to the one shown in Figure 3 of the Wan et al. (2020) paper. The paper's figure shows more invariant positions than the class's sequence alignment does. Additionally, the paper's figure shows considerably more conservation overall, and its conserved areas do not line up completely with the conserved areas in the class's alignment.
  • Compare the Multiple Sequence alignment to Figure 2 of Wan et al. (2020) paper:
    • The sequences used by the class did not exactly match those in the figure of the Wan et al. (2020) paper. This resulted in more differences than would appear if the exact same sequences were compared. The class's tree had the same 2 primary branches, but within the clades it did not show similar patterns of outgroups and evolutionary branches.
  • Is there adequate information to reproduce the Wan et al. (2020) paper analysis?:
    • There is not enough information to completely replicate the results of the Wan et al. (2020) paper. The paper lists an outgroup, BtSCoV PDF2386, but its sequence was unavailable on GenBank. Without the exact sequences, it is not possible to replicate the paper's results and figures exactly.
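The conservation symbols and the sequence comparisons described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by the alignment tool: the function names `invariant_columns` and `percent_identity` are made up for this example, and the rows are copied from the last block of the alignment above. Only the “*” (invariant) symbol is computed, because the “:” and “.” symbols depend on the amino acid similarity groups that the alignment program uses, which are not stated on this page.

```python
def invariant_columns(rows):
    """Return '*' under each column where every row has the same residue
    and no row has a gap; every other column gets a space."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        first = column[0]
        same = first != "-" and all(res == first for res in column)
        marks.append("*" if same else " ")
    return "".join(marks)


def percent_identity(a, b):
    """Percent of columns with identical residues, counting only the
    columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Rows copied from the final alignment block above (IDs as printed there).
block = {
    "AGZ48806.1": "CSCGSCC-KFDEDDSEPVLKGVKLHYT",
    "AGZ48818.1": "CSCGSCC-KFDEDDSEPVLKGVKLHYT",
    "AFS88936.1": "LKCNRCCDRYEEYDLEP----HKVHVH",
    "YP_0010399": "MKCKNCCDSYEEYDVE------KIHVH",
}

stars = invariant_columns(list(block.values()))
print(stars)
print(percent_identity(block["AGZ48806.1"], block["AGZ48818.1"]))
print(percent_identity(block["AGZ48806.1"], block["AFS88936.1"]))
print(percent_identity(block["AFS88936.1"], block["YP_0010399"]))
```

Over this short block, the two sister taxa RsSHC014 and Rs3367 are identical, while the comparison with the outgroup sequence gives a much lower identity, which matches the pattern described for the tree. A full comparison would need to run over the whole alignment rather than one block.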

Conclusion

The lab compared the sequences of a variety of COVID-like and related spike proteins in order to discern the genetic similarities and differences between multiple species. The purpose of this experiment was to analyze data and build the skills necessary to find species with similarities to 2019-nCoV in order to better understand the genetic information of 2019-nCoV. Comparing our results to a similar published study allowed us to analyze the differences and similarities between our findings and theirs, and to apply what we learned to the information we gathered.

Acknowledgements

  • I acknowledge my homework partner Taylor Makela, with whom I spent several hours going over formatting and general questions about the assignment.
  • I acknowledge that I copied and modified the protocol shown on the Week 4 assignment page for this course.
  • I acknowledge Anna Horvath, whose format for Week 3 I referenced throughout this assignment.
  • Except for what is noted above, this individual entry was completed by me and not copied.
  • Nidapatel (talk) 00:06, 1 October 2020 (PDT)

References

  1. OpenWetWare. (2020). BIOL368/F20:Week 4. Accessed 30 September 2020, from https://openwetware.org/wiki/BIOL368/F20:Week_4
  2. OpenWetWare. (2020). Talk:BIOL368/F20:Week 4. Accessed 30 September 2020, from https://openwetware.org/wiki/Talk:BIOL368/F20:Week_4
  3. NCBI GenBank. (2020). Severe Acute Respiratory Syndrome Coronavirus 2 Isolate Wuhan-Hu-1, Complete Genome. Accessed 30 September 2020, from https://www.ncbi.nlm.nih.gov/nuccore/MN908947
  4. NCBI GenBank. (2020). Spike Protein [Bat SARS-like Coronavirus RsSHC014]. Accessed 30 September 2020, from https://www.ncbi.nlm.nih.gov/protein/556015117
  5. Phylogeny.fr: "One Click" Mode. (2020). Accessed 30 September 2020, from http://www.phylogeny.fr/simple_phylogeny.cgi?workflow_id=b9c0813cbbe9695d63cf7e31da5f026d&tab_index=1
  6. Wan, Y., Shang, J., Graham, R., Baric, R., & Li, F. (2020). Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus. Journal of Virology, 94(7). doi: 10.1128/jvi.00127-20
