Difference between revisions of "Wikiomics:Ensembl tutorials"
Revision as of 21:00, 19 November 2007
- Ensembl worked example (PDF): 28 steps covering viewing contigs, finding information about a gene and its homologues, performing a BLAST search, etc.
- Workshop in Canada: a bit dated (2004?) in the part describing EnsMart, but otherwise a good tutorial.
Getting promoter sequences using BioMart for a list of IDs
- go to BioMart
- click on the Central Server link
- from the pull-down menu 'CHOOSE DATABASE' select ENSEMBL 45 GENES
- from the new pull-down menu 'CHOOSE DATASET' select the organism of interest, e.g. Rattus norvegicus genes
- a frame appears on the left side. Click on Filters
- in the central panel click on + next to Gene
- tick the box next to ID list limit
- from the pull-down menu to the right of ID list limit select the type of your IDs, e.g. Unigene ID(s)
- either cut and paste your IDs (try the four UniGene entries below) into the form, or select a text file with your IDs, one ID per line.
Rn.141972
Rn.34912
Rn.6497
Rn.88235
- click on Attributes in the left panel
- at the top of the central panel select the Sequences radio button (round)
- in the central panel click on + next to SEQUENCES
- depending on what you want to analyse, select the radio button (round) next to Flank (Transcript) (makes sense if there are multiple starting exons) or Flank (Gene) (one flank per gene)
- tick the box next to Upstream flank and enter the desired flank size, e.g. 1000
- you will get:
>1|ENSRNOG00000021093|protein_coding
GTCACTAGGCTCTTAGACCACGGATGGGCGGTACCTGTATCAGGAGGCGGAGCAGCTGCT
<snip>
>10|ENSRNOG00000012620|protein_coding
ATGATCTCTAACTACACCCTAGCTTCTGTGACCACGAAGATGGAGCCATTCAGTCTCCAA
<snip>
>7|ENSRNOG00000017108|protein_coding
GTTTAATAGCCGCCTCTAAGATTTCTCATGGGTATGGTAACACAGGCCTGAAACTCCATT
<snip>
- as you may have noticed, only three sequences are returned for the four input IDs: one of the input IDs has no corresponding entry recognised by Ensembl.
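Once the sequences are saved, it can be handy to check programmatically how many of your input IDs produced a result. The following is only a minimal sketch in Python, not part of BioMart itself. It assumes the FASTA headers have exactly the three fields shown above (chromosome, Ensembl gene ID, gene biotype, separated by |), which depends on the attributes selected for the header. The function name parse_biomart_fasta and the dictionary keys are made up for illustration, and the sequences in the example are just the truncated first lines from the output above.

```python
def parse_biomart_fasta(text):
    """Parse FASTA text with headers like '>1|ENSRNOG00000021093|protein_coding'.

    Assumption: each header has exactly three '|'-separated fields
    (chromosome, Ensembl gene ID, biotype). Other header layouts will
    raise ValueError.
    """
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append(_make_record(header, chunks))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence line belonging to the current header
    if header is not None:
        records.append(_make_record(header, chunks))
    return records


def _make_record(header, chunks):
    # Unpacking fails loudly if the header does not have three fields.
    chromosome, gene_id, biotype = header.split("|")
    return {
        "chromosome": chromosome,
        "gene_id": gene_id,
        "biotype": biotype,
        "sequence": "".join(chunks),
    }


if __name__ == "__main__":
    # The four UniGene IDs used as input in the tutorial.
    input_ids = ["Rn.141972", "Rn.34912", "Rn.6497", "Rn.88235"]

    # Truncated example output (first sequence line of each record only).
    fasta_text = """\
>1|ENSRNOG00000021093|protein_coding
GTCACTAGGCTCTTAGACCACGGATGGGCGGTACCTGTATCAGGAGGCGGAGCAGCTGCT
>10|ENSRNOG00000012620|protein_coding
ATGATCTCTAACTACACCCTAGCTTCTGTGACCACGAAGATGGAGCCATTCAGTCTCCAA
>7|ENSRNOG00000017108|protein_coding
GTTTAATAGCCGCCTCTAAGATTTCTCATGGGTATGGTAACACAGGCCTGAAACTCCATT
"""
    records = parse_biomart_fasta(fasta_text)
    for rec in records:
        print(rec["gene_id"], "chromosome", rec["chromosome"], len(rec["sequence"]), "bp")
    print(len(input_ids) - len(records), "input ID(s) returned no sequence")
```

Note that the headers carry Ensembl gene IDs rather than the UniGene IDs you submitted, so this check tells you how many inputs were not matched, not which ones. To find out which, you would need the UniGene ID in the output as well.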