Wikiomics:Bioinfo tutorial (exercises)

From OpenWetWare

Sequence formats

text only/raw sequence

DNA

CGGGAGGCGGCAGCGGCTGCAGCGTTGGTAGCATCAGCATCAGCATCAGCGGCAGCGGCAGCGGCCTCGG
GCGGGGCCGGCCGGACGGACAGGCGGACAGAAGGCGCCAGGGGCGCGCGTCCCGCCCGGGCCGGCCATGG
AGGGCGCCTCCTTCGGCGCGGGCCGCGCAGGGGCCGCCCTGGACCCCGTGAGCTTTGCGCGGCGGCCCCA
GACCCTGCTCCGGGTCGCGTCCTGGGTGTTCTCCATCGCCGTCTTCGGGCCCATCGTCAACGAGGGCTAC
GTGAACACCGACAGCGGCCCCGAGCTGCGCTGCGTGTTCAACGGGAACGCGGGCGCCTGCCGCTTCGGCG


Protein

MEGASFGAGRAGAALDPVSFARRPQTLLRVASWVFSIAVFGPIV
NEGYVNTDSGPELRCVFNGNAGACRFGVALGLGAFLACAAFLLLDVRFQQISSVRDRR
RAVLLDLGFSGLWSFLWFVGFCFLTNQWQRTAPGPATTQAGDAARAAIAFSFFSILSW
VALTVKALQRFRLGTDMSLFATEQLSTGASQAYPGYPVGSGVEGTETYQSPPFTETLD
TSPKGYQVPAY

FASTA

single FASTA

>gi|22091456|ref|NM_004209.4| Homo sapiens synaptogyrin 3 (SYNGR3), mRNA
CGGGAGGCGGCAGCGGCTGCAGCGTTGGTAGCATCAGCATCAGCATCAGCGGCAGCGGCAGCGGCCTCGG
GCGGGGCCGGCCGGACGGACAGGCGGACAGAAGGCGCCAGGGGCGCGCGTCCCGCCCGGGCCGGCCATGG
AGGGCGCCTCCTTCGGCGCGGGCCGCGCAGGGGCCGCCCTGGACCCCGTGAGCTTTGCGCGGCGGCCCCA
GACCCTGCTCCGGGTCGCGTCCTGGGTGTTCTCCATCGCCGTCTTCGGGCCCATCGTCAACGAGGGCTAC
GTGAACACCGACAGCGGCCCCGAGCTGCGCTGCGTGTTCAACGGGAACGCGGGCGCCTGCCGCTTCGGCG


multiple FASTA

>homoSYNGR1a_NP_004702.2
MEGGAYGAGKAGGAFDPYTLVRQPHTILRVVSWLFSIVVFGSIVNEGYLNSASEGEEFCIYNRNPNACSY
GVAVGVLAFLTCLLYLALDVYFPQISSVKDRKKAVLSDIGVSAFWAFLWFVGFCYLANQWQVSKPKDNPL
NEGTDAARAAIAFSFFSIFTWAGQAVLAFQRYQIGADSALFSQDYMDPSQDSSMPYAPYVEPTGPDPAGM
GGTYQQPANTFDTEPQGYQSQGY

>X.laev_syngr1_BAB79596.1
MEGGAYGAGKAGGAFDPQTFIRQPHTILRMVSWVFSIVVFGCIINEGYINSSTEEEEHCIFNRNPSACSY
GVTVGVLAFLTCLLYLAVDIYFPQISSVKDRKKTVISDIAVSALWAFFWFVGFCFLANQWQVSNPNDNPM
NEGADAARAAITFSFFSIFTWAGQAVLAYQQYRLGSDSALFSQDYMDPSQDQGPPYPPYASNEDLDPSAG
YQQPPTEAYDAGSHGYQTQDY

>X.trop_syngr1_NP_001016195.1
MEGGAYGAGKAGGAFDPQTFVRQPHTVLRMVSWVFSIVVFGCIINEGYINASTEAEEHCIFNRNSSACAY
GVTVGVLAFLTCLLYLAVDVYFPQISSVKDRKKTVISDIAVSGLWAFFWFVGFCFLANQWQVSNPNDNPM
NEGADAARAAIAFSFFSIFTWAGQAVLAYQRYRLGSDSALFSQDYMDPSQDQGPPYPPYASNEDLDPSAG
YQQPPSDAYDAGSQGYQTQDY
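
Reading single or multiple FASTA by hand comes down to two rules: a line starting with ">" opens a new record, and every following line up to the next ">" belongs to that record's sequence. The sketch below is one minimal way to do this in plain Python. The function name parse_fasta and the shortened toy records are assumptions made for illustration and are not part of the tutorial.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines between records are ignored
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy input (hypothetical, shortened records) in multiple FASTA layout.
example = """>seqA demo record
MEGGAYGAGK
AGGAFDP

>seqB
MEGASFGAGR
"""

records = parse_fasta(example)
for name, seq in records:
    print(name, len(seq))
```

Sequence lines are joined without separators, so the line width used in the file does not matter.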

GenBank format

LOCUS       NM_004209               2054 bp    mRNA    linear   PRI 24-SEP-2005
DEFINITION  Homo sapiens synaptogyrin 3 (SYNGR3), mRNA.
ACCESSION   NM_004209
VERSION     NM_004209.4  GI:22091456
KEYWORDS    .
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
            Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2054)
  AUTHORS   Kedra,D., Pan,H.Q., Seroussi,E., Fransson,I., Guilbaud,C.,
            Collins,J.E., Dunham,I., Blennow,E., Roe,B.A., Piehl,F. and
            Dumanski,J.P.
  TITLE     Characterization of the human synaptogyrin gene family
  JOURNAL   Hum. Genet. 103 (2), 131-141 (1998)
   PUBMED   9760194
COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
            reference sequence was derived from BC014087.2 and BG699055.1.
            On Aug 2, 2002 this sequence version replaced gi:21361084.
            
            Summary: This gene encodes an integral membrane protein. The gene
            belongs to the synaptogyrin gene family. Like other members of the
            family the protein contains four transmembrane regions. The exact
            function of this protein is unclear.
            COMPLETENESS: complete on the 3' end.
FEATURES             Location/Qualifiers
     source          1..2054
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /chromosome="16"
                     /map="16p13"
     gene            1..2054
                     /gene="SYNGR3"
                     /note="synonym: MGC:20003"
                     /db_xref="GeneID:9143"
                     /db_xref="HGNC:11501"
                     /db_xref="HPRD:06801"
                     /db_xref="MIM:603927"
     CDS             137..826
                     /gene="SYNGR3"
                     /go_component="integral to plasma membrane [pmid 9760194];
                     membrane"
                     /codon_start=1
                     /product="synaptogyrin 3"
                     /protein_id="NP_004200.2"
                     /db_xref="GI:6631112"
                     /db_xref="CCDS:CCDS10456.1"
                     /db_xref="GeneID:9143"
                     /db_xref="HGNC:11501"
                     /db_xref="HPRD:06801"
                     /db_xref="MIM:603927"
                     /translation="MEGASFGAGRAGAALDPVSFARRPQTLLRVASWVFSIAVFGPIV
                     NEGYVNTDSGPELRCVFNGNAGACRFGVALGLGAFLACAAFLLLDVRFQQISSVRDRR
                     RAVLLDLGFSGLWSFLWFVGFCFLTNQWQRTAPGPATTQAGDAARAAIAFSFFSILSW
                     VALTVKALQRFRLGTDMSLFATEQLSTGASQAYPGYPVGSGVEGTETYQSPPFTETLD
                     TSPKGYQVPAY"
     STS             1230..1360
                     /gene="SYNGR3"
                     /standard_name="D3S3542"
                     /db_xref="UniSTS:4004"
     STS             1396..2025
                     /gene="SYNGR3"
                     /standard_name="SYNGR3_8904"
                     /db_xref="UniSTS:467928"
     STS             1739..1859
                     /gene="SYNGR3"
                     /standard_name="RH66376"
                     /db_xref="UniSTS:7190"
     STS             1914..2013
                     /gene="SYNGR3"
                     /standard_name="SHGC-61093"
                     /db_xref="UniSTS:61380"
     polyA_signal    2001..2006
                     /gene="SYNGR3"
     polyA_site      2029
                     /gene="SYNGR3"
                     /experiment="experimental evidence, no additional details
                     recorded"
ORIGIN      
        1 cgggaggcgg cagcggctgc agcgttggta gcatcagcat cagcatcagc ggcagcggca
       61 gcggcctcgg gcggggccgg ccggacggac aggcggacag aaggcgccag gggcgcgcgt
      121 cccgcccggg ccggccatgg agggcgcctc cttcggcgcg ggccgcgcag gggccgccct
<snip>
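
The ORIGIN block holds the same bases as the raw and FASTA examples, but in lowercase, in blocks of ten, with a position number at the start of each line. A minimal sketch of converting ORIGIN lines back to a raw sequence follows. The helper name origin_to_raw is an assumption for illustration; the two input lines are copied from the record above.

```python
def origin_to_raw(origin_lines):
    """Drop position numbers and spaces from ORIGIN lines; return uppercase bases."""
    parts = []
    for line in origin_lines:
        fields = line.split()
        # Sequence lines start with a base position; skip anything else.
        if not fields or not fields[0].isdigit():
            continue
        parts.extend(fields[1:])
    return "".join(parts).upper()


origin = [
    "        1 cgggaggcgg cagcggctgc agcgttggta gcatcagcat cagcatcagc ggcagcggca",
    "       61 gcggcctcgg gcggggccgg ccggacggac aggcggacag aaggcgccag gggcgcgcgt",
]

raw = origin_to_raw(origin)
print(len(raw), raw[:20])
```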

GCG

from [1]

!!NA_SEQUENCE 1.0

H.sapiens fau mRNA

HSFAU  Length: 518  Type: N  Check: 2981 ..

   1 ttcctctttc tcgactccat cttcgcggta gctgggaccg ccgttcagtc

  51 gccaatatgc agctctttgt ccgcgcccag gagctacaca ccttcgaggt

 101 gaccggccag gaaacggtcg cccagatcaa ggctcatgta gcctcactgg

 151 agggcattgc cccggaagat caagtcgtgc tcctggcagg cgcgcccctg

 201 gaggatgagg ccactctggg ccagtgcggg gtggaggccc tgactaccct

 251 ggaagtagca ggccgcatgc ttggaggtaa agttcatggt tccctggccc

 301 gtgctggaaa agtgagaggt cagactccta aggtggccaa acaggagaag

 351 aagaagaaga agacaggtcg ggctaagcgg cggatgcagt acaaccggcg

 401 ctttgtcaac gttgtgccca cctttggcaa gaagaagggc cccaatgcca

 451 actcttaagt cttttgtaat tctggctttc tctaataaaa aagccactta

 501 gttcagtcaa aaaaaaaa

XML

here


EXP

ID   xb61c6.s1
EN   xb61c6.s1
LN   xb61c6.s1.ztr
LT   ZTR
QR   446
AQ   54.180000
SQ
     GCTTTACTGC CTTAGGGTCG ACTCTAGAGG ATCCCCTAAA TTATTATTTT AAATTGACAT
     TTTGAAAATT TCCCCCGTAA TTTTATTGCA ATTTTAATTG AAAGTTTATT AATTGTGAAA
     TGTGCTTTTT AAGATGTTGC AAACACCTAA TTACTATT-T CACTTTTGAG ATATGT-AAT
     T-CTAAA-AA CTTTT-T--A ATTTCCTAA- --GAATTT-C AAAAG-CAAA A-AACAAACG
     AATTT-ATAA AAG--AAAAG -GCAAACT-A CCTG-TAACT -ACA-T-CCG G-CCA-T-AT
     CAAAATTAAT TTTTCCGTTG G-ATTAACTT TTT-G-AT-T TTCCT-GGGG GG-CTTGACT
     TT-CCGC-AT TGGTTGGTT- GGGTTCGCGG AATTTTTTTA CCAGGGGTTC TTTGGG-GGA
     G-G-AAG-C- TT-A-AAC-C GGGTT
//
SF   /home6/jkb/work/course/t/m13mp18.vector
CF   /home6/jkb/work/course/t/lorist6.vector
TN   xb61c6
PR   1
SC   6249
SP   41
SI   1400..2000
CH   0
QR   159
QL   0
SL   36
SR   446

GFF

from [2]

##gff-version   3
##sequence-region   ctg123 1 1497228
ctg123 . gene            1000  9000  .  +  .  ID=gene00001;Name=EDEN
ctg123 . TF_binding_site 1000  1012  .  +  .  ID=tfbs00001;Parent=gene00001
ctg123 . mRNA            1050  9000  .  +  .  ID=mRNA00001;Parent=gene00001;Name=EDEN.1
ctg123 . five_prime_UTR  1050  1200  .  +  .  Parent=mRNA00001
ctg123 . CDS             1201  1500  .  +  0  Parent=mRNA00001
ctg123 . CDS             3000  3902  .  +  0  Parent=mRNA00001
ctg123 . CDS             5000  5500  .  +  0  Parent=mRNA00001
ctg123 . CDS             7000  7600  .  +  0  Parent=mRNA00001
ctg123 . three_prime_UTR 7601  9000  .  +  .  Parent=mRNA00001
ctg123 . cDNA_match 1050 1500  5.8e-42  +  . ID=match0001;Target=cdna0123+12+462
ctg123 . cDNA_match 5000 5500  8.1e-43  +  . ID=match0001;Target=cdna0123+463+963
ctg123 . cDNA_match 7000 9000  1.4e-40  +  . ID=match0001;Target=cdna0123+964+2964
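
Each GFF3 feature line has nine columns: sequence ID, source, type, start, end, score, strand, phase and attributes. The attributes column is a list of key=value pairs separated by semicolons. The sketch below splits one line into those parts. Real GFF3 files are tab-delimited, whereas the listing above is shown with spaces, so this example splits on any whitespace; that choice and the function name parse_gff3_line are assumptions made for illustration.

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict; attributes become a nested dict."""
    # Assumption: columns are separated by whitespace, as in the listing above.
    # For real tab-delimited files, line.rstrip("\n").split("\t") is the usual choice.
    cols = line.split(None, 8)
    if len(cols) != 9:
        raise ValueError("expected 9 columns, got %d" % len(cols))
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for pair in attrs.strip().split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "score": score,
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }


feature = parse_gff3_line(
    "ctg123 . mRNA            1050  9000  .  +  .  "
    "ID=mRNA00001;Parent=gene00001;Name=EDEN.1"
)
print(feature["type"], feature["start"], feature["end"], feature["attributes"]["Parent"])
```

The Parent attribute is what links a feature to the ID of its parent feature, here the mRNA to its gene.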

MSF

PileUp



   MSF:    265   Type: P    Check:   214852   ..

Name: homoSYNGR1a_NP_  Len:  265  Check: 17794  Weight:  1.000
Name: X_laev_syngr1_B  Len:  265  Check: 17613  Weight:  1.000
Name: X_trop_syngr1_N  Len:  265  Check: 17610  Weight:  1.000
Name: homo_SYNGR3_NP_  Len:  265  Check: 17422  Weight:  1.000
Name: homo_SYNGR2_AAH  Len:  265  Check: 17128  Weight:  1.000
Name: D_rerio_XP_6969  Len:  265  Check: 18054  Weight:  1.000
Name: T_nigr_CAF89717  Len:  265  Check: 18574  Weight:  1.000
Name: D_mela_NP_61090  Len:  265  Check: 18339  Weight:  1.000
Name: D_rerio_syngr1_  Len:  265  Check: 16771  Weight:  1.000
Name: E_eleg_NP_50923  Len:  265  Check: 18759  Weight:  1.000
Name: A_gamb_EAA05258  Len:  265  Check: 16202  Weight:  1.000
Name: chick_syngr1_XP  Len:  265  Check: 9780  Weight:  1.000
Name: bee_XP_624778_1  Len:  265  Check: 10806  Weight:  1.000

//

homoSYNGR1a_NP_  .........M E.GGAYGAGK AGGAFDPYTL VRQPHTILRV VSWLFSIVVF
X_laev_syngr1_B  .........M E.GGAYGAGK AGGAFDPQTF IRQPHTILRM VSWVFSIVVF
X_trop_syngr1_N  .........M E.GGAYGAGK AGGAFDPQTF VRQPHTVLRM VSWVFSIVVF
homo_SYNGR3_NP_  .........M E.GASFGAGR AGAALDPVSF ARRPQTLLRV ASWVFSIAVF
homo_SYNGR2_AAH  .........M E.SGAYGAAK AGGSFDLRRF LTQPQVVARA VCLVFALIVF
D_rerio_XP_6969  .......MES R.SVAYGASL AGAGFDLVKF IKQPQTVVRF LSWVFAIVVF
T_nigr_CAF89717  .........M DGVGSFGAGR TGSAVDPIAF AKQPQTILRV LSWIFSLVVF
D_mela_NP_61090  MDMLNQILSI NNGGAYGGGK AGGAFDPLTF AMKPQVVIRA LCWLFSVVVF
D_rerio_syngr1_  MDQ....... ....AYGAGK AGGTFDPITF FQQPQTILRI VSWIFSIVIF
E_eleg_NP_50923  .........M ENVRAYGAGL AGANFDKNTF FKKPTVLFRC AALLFGLILW
A_gamb_EAA05258  .NEANR.SKM DIGGAYGGGK AGGAFDPIAF VQRPTVILRA VCWLFAIIVF
chick_syngr1_XP  .......... .......... .......... .......... ...VFSIVVF
bee_XP_624778_1  .......... .......... .......... .......... ..........


homoSYNGR1a_NP_  GSIVNEGYLN SASEGEEFCI YNRNPNACSY GVAVGVLAFL TCLLYLALDV
X_laev_syngr1_B  GCIINEGYIN SSTEEEEHCI FNRNPSACSY GVTVGVLAFL TCLLYLAVDI
X_trop_syngr1_N  GCIINEGYIN ASTEAEEHCI FNRNSSACAY GVTVGVLAFL TCLLYLAVDV
homo_SYNGR3_NP_  GPIVNEGYVN TDSGPELRCV FNGNAGACRF GVALGLGAFL ACAAFLLLDV
homo_SYNGR2_AAH  SCIYGEGYSN AHESKQMYCV FNRNEDACRY GSAIGVLAFL ASAFFLVVDA
D_rerio_XP_6969  SSITAEGYVN STDEAEVRCV FNRNDGACHY GVGIGVIAFL ACVGFLLADA
T_nigr_CAF89717  ASIVNEGYVN IGSE.RLYCV FNKNADACNY GVFVGLVGLL ACSFFGLLDY
D_mela_NP_61090  GCISSEGWTE K.DG.KEYCL YNGDGMACKY GNMVGVFGFL ASMGFMGGEF
D_rerio_syngr1_  GCIANEGYVN RPDEVEEFCI FNRNQNACNY AVGMGALDFL CCAAFLALDI
E_eleg_NP_50923  YSVSKGGWHK PSDAIHPVCL YGRSSSTCSF ATAVGFFAVC GAIVLIVLDA
A_gamb_EAA05258  GCISSEGWRE EANG.KEYCI INRDGNACNY AVGIGVIAFL AAMGFIAGEY
chick_syngr1_XP  GSIVNEGYVN RLDESEEHCI FNRNRNACNY GITVGVLAFL SCLLYLALDA
bee_XP_624778_1  .......... .......... .......... .......... ..........


homoSYNGR1a_NP_  YFPQISSVKD RKKAVLSDIG VSAFWAFLWF VGFCYLANQ. ..WQVSKPKD
X_laev_syngr1_B  YFPQISSVKD RKKTVISDIA VSALWAFFWF VGFCFLANQ. ..WQVSNPND
X_trop_syngr1_N  YFPQISSVKD RKKTVISDIA VSGLWAFFWF VGFCFLANQ. ..WQVSNPND
homo_SYNGR3_NP_  RFQQISSVRD RRRAVLLDLG FSGLWSFLWF VGFCFLTNQ. ..WQRTAPGP
homo_SYNGR2_AAH  YFPQISNATD RKYLVIGDLL FSALWTFLWF VGFCFLTNQ. ..WAVTNPKD
D_rerio_XP_6969  ILPLISNAQE RKYIVMADLA FSGCWTFLWF VCFCLTADQ. ..WSKTSDRS
T_nigr_CAF89717  KFSSISSIKD RKKAVMLEIG FSGFWTFLYF VSFCFLANQ. ..WSRTTPDE
D_mela_NP_61090  LFERMSSVKS RKRYVMADMG FSALWTFMYF VAFLYLWSQ. ..WSSSAPPP
D_rerio_syngr1_  YFPQISSVKD RKKAVLADIG VSAFWSFVWF VGFCFLANQ. ..WQVANPED
E_eleg_NP_50923  KMDQISSVPT RRRAVLADLV VSAIFTAIFL IGFFTFWSKL SAFEVDEDDE
A_gamb_EAA05258  LFEQMSSVKT RKHYVLADLG FSAFWSFLFF IGFCYLTNQ. ..WGKADDPP
chick_syngr1_XP  YFPQISSVKD RKKAVLSDIG VSAFWAFLWF VGFCFLTNQ. ..WQASKEED
bee_XP_624778_1  ....MSSVKT RKHFVLLDLG FSGFWAFLYF VGFCYLTNA. ..WNKSETPK


homoSYNGR1a_NP_  NPLNEGTDAA RAAIAFSFFS IFTWAGQAVL AFQRYQIGAD SAL...FSQD
X_laev_syngr1_B  NPMNEGADAA RAAITFSFFS IFTWAGQAVL AYQQYRLGSD SAL...FSQD
X_trop_syngr1_N  NPMNEGADAA RAAIAFSFFS IFTWAGQAVL AYQRYRLGSD SAL...FSQD
homo_SYNGR3_NP_  ATTQ.AGDAA RAAIAFSFFS ILSWVALTVK ALQRFRLGTD M.....SLFA
homo_SYNGR2_AAH  VLV..GADSV RAAITFSFFS IFSWGVLASL AYQRYKAGVD D.....FIQN
D_rerio_XP_6969  GI...PTDAV HAVIAFSFFS IASWGALTYF AVVRFRQGVE E.....VTQS
T_nigr_CAF89717  VPLDQGADAA RAAIAFSFFS IITWAGLTVR AVQKYLLGTD MSL...FTTE
D_mela_NP_61090  LGI..GAGSM KTAIWFCLFS IVSWALCALM AYKRFLIGAG DEF...TSAF
D_rerio_syngr1_  NPLKEGADAA RAAITFAFFS IFTWAGQAFF GFQRYKLGSS SSL...FSQD
E_eleg_NP_50923  NPI..KTNNA KFGILSALLS FLAWGGAAFF AWRRYEEGNQ ATHEPNYDEH
A_gamb_EAA05258  NGE..GVNNV QASIVFSFFS IFTWAGCAYF AFLRFKAGVD PSF...SSTY
chick_syngr1_XP  NPMNEGGDAA RAAITFSFFS IFTWV..... .......... ..........
bee_XP_624778_1  DNY..GVNNV QSAIAFSFFS IFTWAACAWF AFQRFKQGTD AAF...APSY


homoSYNGR1a_NP_  YMDPSQDSSM PYAP..YVEP TGPDPAGMGG TYQQPANTF. DTEP..QGYQ
X_laev_syngr1_B  YMDPSQDQGP PYPP..YASN EDLDPS...A GYQQPPTEAY DAGS..HGYQ
X_trop_syngr1_N  YMDPSQDQGP PYPP..YASN EDLDPS...A GYQQPPSDAY DAGS..QGYQ
homo_SYNGR3_NP_  TEQLSTGASQ AYPG..YPVG SGVEGT...E TYQSPPFTET LDTSP.KGYQ
homo_SYNGR2_AAH  YVDPTPDPNT AYAS..YPGA S....V...D NYQQPPFTQN AETT..EGYQ
D_rerio_XP_6969  YTDPPPDLSS PYPSTYTPPT YPSFQNTGAD IYQQPPFTPN PDPSGQTSFQ
T_nigr_CAF89717  HLDGAAPTQR LPLQLTCRRH RRDHGGLPEP PLHREQRSTH LPGSHLLDQG
D_mela_NP_61090  ETDPANV... VHQQAYGYSM DNDNDQYSAS PFGQPQQGGM EQQQSGMEYQ
D_rerio_syngr1_  YTDPSQDPAA APSE...... .....GTEYT GYNADMEANY DGS...GGYQ
E_eleg_NP_50923  FGQVSTDVQD GYGY...... .GGDSTGIGH VGAPPPQSSY QSGAAPQTMQ
A_gamb_EAA05258  ESDPSAAQQY AAYP...... ....ASNEND QFQEAPF... ..........
chick_syngr1_XP  .......... .......... .......... .......... ..........
bee_XP_624778_1  EADPVGGTGY TSYP...... .....DATDT GYQEPPFGQQ QQQQQ.QQQQ


homoSYNGR1a_NP_  .SQGY..... .....
X_laev_syngr1_B  .TQDY..... .....
X_trop_syngr1_N  .TQDY..... .....
homo_SYNGR3_NP_  .VPAY..... .....
homo_SYNGR2_AAH  PPPVY..... .....
D_rerio_XP_6969  .PPVY..... .....
T_nigr_CAF89717  .SPRRARSSA K....
D_mela_NP_61090  .QPTY..... .....
D_rerio_syngr1_  .NQDY..... .....
E_eleg_NP_50923  QPPSNPYTQS EGYGY
A_gamb_EAA05258  .......... .....
chick_syngr1_XP  .......... .....
bee_XP_624778_1  QQQQQRMPDF RAPAY


ALN

CLUSTAL FORMAT for T-COFFEE Version_3.93, CPU=12.81 sec, SCORE=52, Nseq=13, Len=276 

A.gamb_EAA05258.2             -NEANRS-KMDIG-GAYGGGKAGGAFDPIAFVQRPTVILRAVCWLFAIIVFGCISSEGWR
bee_XP_624778.1               ------------------------------------------------------------
D.mela_NP_610908.1            MDMLNQILSINNG-GAYGGGKAGGAFDPLTFAMKPQVVIRALCWLFSVVVFGCISSEGWT
E.eleg_NP_509239.1            ---------MENV-RAYGAGLAGANFDKNTFFKKPTVLFRCAALLFGLILWYSVSKGGWH
D.rerio_XP_696934.1           --------MESRS-VAYGASLAGAGFDLVKFIKQPQTVVRFLSWVFAIVVFSSITAEGYV
homo_SYNGR2_AAH00407.1        ----------MES-GAYGAAKAGGSFDLRRFLTQPQVVARAVCLVFALIVFSCIYGEGYS
D.rerio_syngr1_AAH83255.1     ----------M-D-QAYGAGKAGGTFDPITFFQQPQTILRIVSWIFSIVIFGCIANEGYV
X.laev_syngr1_BAB79596.1      ----------MEG-GAYGAGKAGGAFDPQTFIRQPHTILRMVSWVFSIVVFGCIINEGYI
X.trop_syngr1_NP_001016195.1  ----------MEG-GAYGAGKAGGAFDPQTFVRQPHTVLRMVSWVFSIVVFGCIINEGYI
chick_syngr1_XP_423200.1      --------------------------------------------VFSIVVFGSIVNEGYV
homoSYNGR1a_NP_004702.2       ----------MEG-GAYGAGKAGGAFDPYTLVRQPHTILRVVSWLFSIVVFGSIVNEGYL
T.nigr_CAF89717.1             ----------MDGVGSFGAGRTGSAVDPIAFAKQPQTILRVLSWIFSLVVFASIVNEGYV
homo_SYNGR3_NP_004200.2       ----------MEG-ASFGAGRAGAALDPVSFARRPQTLLRVASWVFSIAVFGPIVNEGYV
                                                                                          

A.gamb_EAA05258.2             EEANG-KEYCIINRDGNACNYAVGIGVIAFLAAMGFIAGEYLFEQMSSVKTRKHYVLADL
bee_XP_624778.1               ---------------------------------------------MSSVKTRKHFVLLDL
D.mela_NP_610908.1            EKD-G-KEYCLYNGDGMACKYGNMVGVFGFLASMGFMGGEFLFERMSSVKSRKRYVMADM
E.eleg_NP_509239.1            KPSDAIHPVCLYGRSSSTCSFATAVGFFAVCGAIVLIVLDAKMDQISSVPTRRRAVLADL
D.rerio_XP_696934.1           NSTDEAEVRCVFNRNDGACHYGVGIGVIAFLACVGFLLADAILPLISNAQERKYIVMADL
homo_SYNGR2_AAH00407.1        NAHESKQMYCVFNRNEDACRYGSAIGVLAFLASAFFLVVDAYFPQISNATDRKYLVIGDL
D.rerio_syngr1_AAH83255.1     NRPDEVEEFCIFNRNQNACNYAVGMGALDFLCCAAFLALDIYFPQISSVKDRKKAVLADI
X.laev_syngr1_BAB79596.1      NSSTEEEEHCIFNRNPSACSYGVTVGVLAFLTCLLYLAVDIYFPQISSVKDRKKTVISDI
X.trop_syngr1_NP_001016195.1  NASTEAEEHCIFNRNSSACAYGVTVGVLAFLTCLLYLAVDVYFPQISSVKDRKKTVISDI
chick_syngr1_XP_423200.1      NRLDESEEHCIFNRNRNACNYGITVGVLAFLSCLLYLALDAYFPQISSVKDRKKAVLSDI
homoSYNGR1a_NP_004702.2       NSASEGEEFCIYNRNPNACSYGVAVGVLAFLTCLLYLALDVYFPQISSVKDRKKAVLSDI
T.nigr_CAF89717.1             NIGSE-RLYCVFNKNADACNYGVFVGLVGLLACSFFGLLDYKFSSISSIKDRKKAVMLEI
homo_SYNGR3_NP_004200.2       NTDSGPELRCVFNGNAGACRFGVALGLGAFLACAAFLLLDVRFQQISSVRDRRRAVLLDL
                                                                           :*.   *:  *: ::

A.gamb_EAA05258.2             GFSAFWSFLFFIGFCYLTNQWGK---AD--DPPNGEGVNNVQASIVFSFFSIFTWAGCAY
bee_XP_624778.1               GFSGFWAFLYFVGFCYLTNAWNK---SE--TPKDNYGVNNVQSAIAFSFFSIFTWAACAW
D.mela_NP_610908.1            GFSALWTFMYFVAFLYLWSQWSS---SA--PPPLGIGAGSMKTAIWFCLFSIVSWALCAL
E.eleg_NP_509239.1            VVSAIFTAIFLIGFFTFWSKLSAFEVDE--DDENPIKTNNAKFGILSALLSFLAWGGAAF
D.rerio_XP_696934.1           AFSGCWTFLWFVCFCLTADQWSK---TSDRS---GIPTDAVHAVIAFSFFSIASWGALTY
homo_SYNGR2_AAH00407.1        LFSALWTFLWFVGFCFLTNQWAV---TNPKD--VLVGADSVRAAITFSFFSIFSWGVLAS
D.rerio_syngr1_AAH83255.1     GVSAFWSFVWFVGFCFLANQWQV---ANPEDNPLKEGADAARAAITFAFFSIFTWAGQAF
X.laev_syngr1_BAB79596.1      AVSALWAFFWFVGFCFLANQWQV---SNPNDNPMNEGADAARAAITFSFFSIFTWAGQAV
X.trop_syngr1_NP_001016195.1  AVSGLWAFFWFVGFCFLANQWQV---SNPNDNPMNEGADAARAAIAFSFFSIFTWAGQAV
chick_syngr1_XP_423200.1      GVSAFWAFLWFVGFCFLTNQWQA---SKEEDNPMNEGGDAARAAITFSFFSIFTWV----
homoSYNGR1a_NP_004702.2       GVSAFWAFLWFVGFCYLANQWQV---SKPKDNPLNEGTDAARAAIAFSFFSIFTWAGQAV
T.nigr_CAF89717.1             GFSGFWTFLYFVSFCFLANQWSR---TTPDEVPLDQGADAARAAIAFSFFSIITWAGLTV
homo_SYNGR3_NP_004200.2       GFSGLWSFLWFVGFCFLTNQWQR---TAPGPATTQAG-DAARAAIAFSFFSILSWVALTV
                               .*. :: .::: *    .                   .  :  *  .::*: :*     

A.gamb_EAA05258.2             FAFLRFKAGVDPSFSSTYESDPSAAQQ--------------YAAYP-ASNE-----ND-Q
bee_XP_624778.1               FAFQRFKQGTDAAFAPSYEADPVGGTG--------------YTSYP-DATD-----TG--
D.mela_NP_610908.1            MAYKRFLIGAGDEFTSAFETDPANVVH--QQ----------AYGYS-MDND-----ND-Q
E.eleg_NP_509239.1            FAWRRYEEGNQATHEPNYDEHFGQVSTDVQD----------GYGYGGDSTG-----IG-H
D.rerio_XP_696934.1           FAVVRFRQGVEEV--TQSYTDPPPDLSSPYPS--TYTP----PTYP-SFQNT-GAD---I
homo_SYNGR2_AAH00407.1        LAYQRYKAGVDDF--IQNYVDPTPDPNTAYAS----------------YPGA-SVD---N
D.rerio_syngr1_AAH83255.1     FGFQRYKLGSSSSLFSQDYTDPSQDPAAA------------PSEGT-EYT---------G
X.laev_syngr1_BAB79596.1      LAYQQYRLGSDSALFSQDYMDPSQDQGPP------------YPPYA-SNEDLDPSA---G
X.trop_syngr1_NP_001016195.1  LAYQRYRLGSDSALFSQDYMDPSQDQGPP------------YPPYA-SNEDLDPSA---G
chick_syngr1_XP_423200.1      ------------------------------------------------------------
homoSYNGR1a_NP_004702.2       LAFQRYQIGADSALFSQDYMDPSQDSSMP------------YAPYV-EPTGPDPAGMGGT
T.nigr_CAF89717.1             RAVQKYLLGTDMSLFTTEHLDGAAPTQRLPLQLTCRRHRRDHGGLP-EPPLHREQR---S
homo_SYNGR3_NP_004200.2       KALQRFRLGTDMSLFATEQLSTGASQA--------------YPGYP-VGSGVEGTE---T
                                                                                          

A.gamb_EAA05258.2             FQEAPF------------------------------
bee_XP_624778.1               YQEPPFGQQQQQQQQQQQQQQQQRMPDFRAPA-Y--
D.mela_NP_610908.1            YSASPFGQPQQGGMEQQQSGM-----EYQQPT-Y--
E.eleg_NP_509239.1            VGAPPPQSSYQSGAAPQTMQQPPSNPYTQSEG-YGY
D.rerio_XP_696934.1           YQQPPFTPNPDPSGQ----------TSFQPPV-Y--
homo_SYNGR2_AAH00407.1        YQQPPFTQNAETT------------EGYQPPPVY--
D.rerio_syngr1_AAH83255.1     YNADM-EANYD-GS-----------GGYQNQD-Y--
X.laev_syngr1_BAB79596.1      YQQPP-TEAYDAGS-----------HGYQTQD-Y--
X.trop_syngr1_NP_001016195.1  YQQPP-SDAYDAGS-----------QGYQTQD-Y--
chick_syngr1_XP_423200.1      ------------------------------------
homoSYNGR1a_NP_004702.2       YQQP--ANTFDTEP-----------QGYQSQG-Y--
T.nigr_CAF89717.1             THLPGSHLLDQGSP-----------RRARSSA-K--
homo_SYNGR3_NP_004200.2       YQSPPFTETLDTSP-----------KGYQVPA-Y--
                                                                  

Promoter analysis

  1. go to [3]

  • humNSG2 exon1 + 2000bp 5'
>chromosome:NCBI36:5:173403330:173468885:1
CTCCTGCCTCAGCCTCCCAAGTAGTTAGGATTACAGGCATGCGCCACCATGCCTGGCTAA
TTTTTGTATTTTTAGTAGAGACGGGATTTCACCATGTTGGTCAGGCTGGTCTCGAACTCC
TGACCTTGTGATCCACCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGCCAC
TGCTCCCGGCTGCCAATTTTTAAATCAGATTTTTGTTTTTTGGCTGTTGTTTGAGTTTCT
TATATATTCTGGGTATTAACCCCTGGTCAGATGCATAGTTTGCAGATATTTTCTTCCATT
TTGTAGGTTGTCTCTTTAGTCTGTTAATTATTTCCTTTGCTGTGCAGAAGCTTGGCTTGA
AATTATTAAAAGTGTGATTTCTATGTGTGAATCAATCACAAGACATGTCAAAAGGAATTT
CTTTTTCCTCTGGAGGAAAAAGATTAAACACAAGGGTTCACAATTGACTCCACGTGGTTA
TCCACATTGTGTTGTCACAATGATTGCACCTGAAAACTCATCTAGGTCTTTGGAAAGAAG
AGAATGGAGGACTAACATTTACTGGGACACTCCCATATTATGTATCTGCCTACTTCTATT
AATATAGCCCTGTGCAGCTCAGATGTGCTAGCTTTTGTTGTCTCAGAAAATGTCTCTCTC
TCTGTCTCTCCACACACGTGTCCCCCCCCCCCACACGTTTCATTACTGAACAATATATAT
AATACTTACGTTTTCTCCTGACCTTTTAGCTAGCAGAGTGTATGCTCAGCACATGAGCAC
TAACTAGGGTGACCTAACAATTTATCACATAAACTAGAATACTTTTGAAAGCAAAAGGGG
ACCCTATGACAAATGCCAAGACAGCTGGGCCTCATGAAGTGCATCATAAGCAGCACTGCA
GCTGGGAATGAACAACCATCAGTGACTTGAACCACAGCTTGGTGAAGACAGTGGCAACAG
CCTTGACCTCAGAGGATTTGAAGAGAATCTGTGAGCAATCCCCGGGGATGAAATAATGCA
GCCATTCAGAACCTTGTCCCTGAGCAGTAAGGGGTTTCCTTGAGCACAATCTCTTCTCAT
ATTTACAATCGCTCTCCTGAGCAATAAGGGGTTTCCTTGAGCACAATCTCTTCTCATATT
TACAATCGCTCTGCAAGAGAGATATTATTATTCTTACGCCAGAGAAAAGGAAATGGAGGT
TTAAAGAATAAATGCAAAATGTCCAAAATCCATACAGCCAGAATTTGGTAGAGCAGACAC
TAGGACTTTGCCCTGTCTGATTCCAGGCACCAACTTGGAAGCCATGCAATGTCCTGGGCT
GGATACAAATTTTGGTGGCCACCGATCAGGGTTCAAATTCCAGCTCTGGTTACCGTGGCT
GCCCTGTATGATCACAGTGCAATGTTGAGTGTAGTAGCTCATGTGAATACAGTGCTTGCC
ATAGTCCCTGCTCTGTACTACATGCATTCCCTATGCTATCCCATTTCATGCTTTGTACTA
ACCCCATCATGTGTGTACTATTAGAAACACCATTTTGCAAATGAGGGAGATAGTCTGAAA
GAGGCTGGAAAAGTGCCCACTGCATGAAAGATACACAATAAATATATTTCCCTTCTGTCC
TGACCACAGGGTCAGAGCACAGTAGAGGTCTTGAACCATTCTGTGTTTTATGATATTGTT
CTGGAAGGTCTTAGGAGGCAATGCCCTCTCCTAATGAATGTGCCTGCTTTTCCAGTTTTT
CCAAAGGCTCAAGGATTACTTGTTATATGGTGCATATTAATTATTTTTTTGGTGTGGCCA
AAATCCTTCCCACTTTCTTGAAACCATTTTGTCCCGTGCAGTGGTTGTGGCCATGCATAA
GAAACAGCAGGTGGTGCCAGATGACAGATGGTGTGGGAGCGCGTTCCCCGGTAGGAGGGG
GCGCGAGCGAGCAAGCAGGCAGGCAGCTGCCAGGAGCTCTTCCCTGCTCGCTCACGCCTG
CTCTCAGAAGCTCCGATCCAGACACACGCGAGGCGCTGTCCTTTCAGCACCACAAGCTCG
GGCTGAGGAGGGAGGACTCCTGGCCGTCCTCCTCCTCTTCAAATTGGCTTGAATCTGCTC
TGACCCCCCACGAGTGCAGCACA
  • humSYNGR3 exon1 + 2000bp 5'
>chromosome:NCBI36:16:1977969:1984376:1
GTGTTGGCCAGGCTGGTCTTGAACTCCTGACCTCATGAACCACCCACCTCAGCTTCCCAA
GGTGCTGAGATTACAAGCGTGAGCCACTGCGCCAGGCCAGAGTCTGTTTTTGAAGGCATC
CAGGCCAGTGGAACTCTAGTGCAAGGAAAAGTTCTGGCTTGAGCTGGTGTTCCAGAGCTG
TTGACACGCAGACCAAAGGAGTTAGCACAGAGGGAGAGACCTCCCCAGGATCGCGCCCTG
GGCTCCTAAGGCTCACAGGTCACAGAGGAAGGACCCATGAGGGAGGAGGACCTCCAGGAG
GGGCATAGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAAACCAAGGTGGGCAGATCAC
TTGAAGTCAGGAGTTCGAGACCAGCCTGGTCAACATGGCGAAACCCCGTCTCTACTAAAA
ATACAAAAAAAAAAAAAAAAAAAATTAGCTGGGTGCGGTGGCACACCTGGAATCCCAGCT
ACTCCAGTGGCTGAGGTAGGAGAATCACTTGAGCCCCTGAGGCAAAGCTTGCAGTGAGCC
AAGATTGCGCCACTGCACTCCAGCCTGGGTAACAGAGTGCGACTGTAACTCAAAAAATAA
ATAAATAACTAAAGAAATAAATGAGTCGACAGCAGCATCAGACTTGCCCTTGGGATTGGC
ACGGAGAGGTCAGTGGCAACCTTGAGTTTAGGTGGCGGTGGGAGCTGATTCTGCCGGGTC
AGAGGGAGAGAGAGGAGGGTGTTGCAGGCAGAGTTTGCTGCTAAAGGAAGCAGTAACGTA
AGGCAGCTGGAGGGGAGGTGGGGTCAACAGTGTTTGGCTTTTAGGAGAAAGTGCAGCATG
TCTGGGTGCTGATGTTAGTGATGAAGTAGAGAATGAATGACACAGAGGGGAGGTCTGTGG
CCAACAGGTGAGAGGGGTTGTGATCGGACGCACAGTGTGATGGACTTGGAACACAGTGAG
TAGCGCCTCTGGCAACACCTGCTCTGCCCACCTGTGCCCAGCAAGGCTGAAGAATGAGCT
CCAGGGGGGCTGGGCCACGCAGCACCATCTGTGCACCCGGCTCGTGTAGCAGGGACCTGG
GTTGCTTATTATCTGAAGTACAGGGTGGTAACAGTCTCCGCCACTACACAGGTCAAGCGC
TTGGCGCTGTAAATGTCAGTGCAGACATCTCAGTGCTCTAGACAGAAACCTAGGAGTCAT
CTGAACTTCCAGCCTGCTAGGAACATGAGGAGAGGGACCTTGTTGCACCCAAGGCTGGGG
TCTGTGGAGGGGACTCAGCGGGACACAGGGCAGCCAGGCACCACCCCCACCCGACTCCTA
GTCTCTGATGCCGCTCCCTGCCCTCCATTCCAGACTAGGCGCCGCAAGTACGCTGGGGAG
ACCCAGGAGTAGGGAGGGCATTGGGAGCACCACCACCTGGCCACGGGCAAGGAGCGGAGA
CACCGAAACCAACACTCCCAGCGGCGCTGGCCACGGTGGCTCTGTCCCCTCCCTCAACAG
GTGCTCCTGGGGCCAACGCCTTTCTTCCCACCAGATCCTCCCCGCCAGAGGCTAGAAGCT
GTGATAGCAGCTAGAGCACAGTGGGGGGCCATGAGAAAGACCCCAGTTATCCATCTGGTC
TTCTTGGGACCAGGGCCAGGGAGCCGGTCCCCTCTCCCGTGGGTTGGGGGAAGATGCTGC
AGCCCGTGGACCTCCTACCCCTTTACTCCCTCACCCAAGTGCCCTTCCCAGAGGAGCGGA
CTCCTCCTGTCTGTCCTCCCGGCTCTAGCAAAGTCTGCGCCCAGCACCCGAGCCCCACCC
TGCCCCCGGGGACCTGGCTGGTGGGTTCCTGAGGATGGTCTCCATCTCGGGACCGGGGCA
GGCAGGTGAGGGTGGGGGATGGGAGGTGGGCGCGGCGGAGGGAGAGGAGGGACCCGGCCC
CGCGCGCATGGACCCAGTGGGGGGCGCGGGCGCGGCCCCGCCCCGTCCCGCGCGTCCCCG
CCGCGGCCGGCGCGCGCTCCCGGGAGGCGGCAGCGGCTGCAGCGTTGGTAGCATCAGCAT
CAGCATCAGCGGCAGCGGCAGCGGCCTCGGGCGGGGCCGGCCGGACGGACAGGCGGACAG
AAGGCGCCAGGGGCGCGCGTCCCGCCCGGGCCGGCCATGGAGGGCGCCTCCTTCGGCGCG
GGCCGCGCAGGGGCCGCCCTGGACCCCGTGAGCTTTGCGCGGCGGCCCCAGACCCTGCTC
CGGGTCGCGTCCTGG


  • mus Syngr3 exon1 + 2000bp 5'
>chromosome:NCBIM35:17:22866087:22874033:-1
CTGTGGAAAGCTGGTTTTCACTCTTTCAGCAATTCCTAATCGCGGGCACCTTCTGTGATT
CCTGGCCAGGAGCAACCCTGTCTCAAAAAATCAAAACAAAATAAACAAACAAACAAAAAA
ACCCGACAATAACTTTATTCACTTACTTTTGGTTTTTTGAGACACAGTTTCTCTGCCTAC
ATCTGCCTCCAAAGTGCTGGGATTAAAGGCATGCGTTACCACTGCCCAGCCAATAAATTT
ATTTTTAGAAAGTAAATCAAGCCAGGCGTGGTGACACATGCCTATAATCCCAGCACTTGG
GAGGCAGAAGCCAGTGAATTTCTGAGTTAGAGGCCAGCCTAGTCTACAGAGTTAGTTCCA
GGACAGCCACGGCTACACAGAGAAGAGAAACCCTGTCTTGAAAACCCAAAAAAAAAAAAA
AAAAAAAAAAAAGCTTCATTTTGTCCCTGTTCACAGAGCATCTGGGTCAAGACAAGCAGA
ACTTGTGTGAAGGAAGGTTATGGTTTGAGCTGGCATTTGAGAGCGGTCAGCACAGACAGA
GGGATAAGTACTCAGCAATGAAAAAAGGTAAACAGCAGAAGTGTAAGCTCATGGGTGAAC
CGGAAGCAAACATTTCCAAAATGCAGGGTAGGGGCTTGGTTATACTGGGTTTTGAGGGAA
TGGAGGAAGAGGCACTAGAAATAGATGGGCAACAGCTTGAGTTTGCTGCTAAGGGAGACA
GAAGTATGAGCACGAACTGTTTGTTTTGGTGATAATTGGGAATTTAAAACCCCATGCATG
CTACTAGCTAAGCAGGCATTCTACCACTGAGCTCAATACTCAGCCCAGGAATGGTGTTTT
TCAAAGAATATGCTGCATTGTTTGGGTGTTGATGAAAATGACGGAGGAGAAGGACAAAAT
GGGTGATCACACAGATCTGTTAATAGCTAAAGGAGATGGGATGGGATGCTCACCTGCATC
CAACTAGACTGAAACAAGACAGACCAGAGCCAGTGGTTCCTCCAGCTACACTTTCCTCTG
AAGAATGAACAACAGATCACACAAAGTAAAGTGTGTAGTCTCTATATGTGGGAGCTTTCC
TTATACCCAGCTCTGGTCCCAGGAAGACAGAGATAGGTAACTGGGGCCTTACCCATGATC
AACTACACCCTAGCTTCTGTGACCATGAAGGTGAAGCCATTCAGTCTCTTACCAGCCTGG
CACTGAAAGAGCATCAAGTGGCTCTGTTAGTCAGGCTGTGGGGCAGAGGAATCCCAGCAG
AGGTCAGGATAATTAGGCAACAGCTGGGACAAGATCTTATCTCTGGTAATTCTCCCCTCC
CAGAACTCAGGATTAATTAGCTACAGATCAACCCAGACCAGAGATGGAAAGGGCTTTTGG
AAGAAATGACTTGGCCTGAGGAGAACAGGCATGCTCAACCTTAACTTCTGCTGGCCACAG
TGGACTCTGCTTCCCAGACAGTACCCCTGACAGTGACAGAACTGCCACTCTCCCCACCTG
ACCCTGTTAGGAAGGTACAACCTATGAAGAAAAAGCCAGAATACAGGGGACATGTGAGCC
ACAGACAACACAAGTGTGCACAACACCTCTGAGCTGAGCTTTTCTTGATTCAAGGGCTAG
TGAGAACGCCCCGCCAGAGATTTACCTCTGGTCTTCTGAGGTTGAGGGCTCGTTCTCTCT
TCCTGAATGTAAAGGTCAAGATGCTGGGCCTCAGTTTCCTCTTACATACTCACCAAAAGG
CTCTCCTGATCAGAGAAGCAGGATGCTGCACTTGTCCTCCTGTCGATGCTCTTGGCTATG
ACAAAATCTGAGCTTACCTTCTCTTGCCCACCTCTAAACCCCATAAGGGCTTCGTTCTGT
GTCTCTTGAGAATGTCCCTATCTCCAACTCTGTCATACGGGGGAGAGCGAGTGGGAAGGA
TCCAGGGCAGGGCTCAGACCCCGGCGCATGGACCTAGTCGGGGGCGCTGGCTCAGCCCCG
CCCCGCGCGCCCCCGTCGCAGCCGACGCGCGCTCCCGGGAGGCGGCGGCAGAGGCAGCAT
CCACAGCATCAGCAGCCTCAGCTTCATCCCCGGGCGGTCTCCGGCGGGGAAGGCCGGTGG
GACAAACGGACAGAAGGCAAAGTGCCCGCAATGGAGGGAGCATCCTTTGGCGCGGGCCGT
GCGGGAGCTGCCTTTGATCCCGTGAGCTTTGCGCGGCGGCCCCAGACCCTGTTGCGGGTC
GTGTCCTGG
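
The header lines of the three promoter sequences above pack a genomic location into colon-separated fields. The sketch below splits such a header, assuming the fields are coordinate system, assembly, sequence name, start, end and strand, in that order. That reading and the names Region and parse_region_header are assumptions made for illustration and are not stated on this page.

```python
from collections import namedtuple

# Assumed field order: coord_system:assembly:name:start:end:strand
Region = namedtuple("Region", "coord_system assembly name start end strand")


def parse_region_header(header):
    """Split a colon-separated location header into a Region tuple."""
    fields = header.lstrip(">").strip().split(":")
    if len(fields) != 6:
        raise ValueError("expected 6 colon-separated fields, got %d" % len(fields))
    coord_system, assembly, name, start, end, strand = fields
    return Region(coord_system, assembly, name, int(start), int(end), int(strand))


region = parse_region_header(">chromosome:NCBIM35:17:22866087:22874033:-1")
print(region.name, region.start, region.end, region.strand)
```

Under this reading a strand of -1 would mark the reverse strand, which is worth checking before comparing the mouse sequence with the two human ones.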

Batch Entrez accession numbers

AAA40104
AAH05595
AAH12710
AAH14854
AAH24644
AAH28901
AAH31452
AAH46616
AAH49971
AAH50927