User:R. Eric Collins/GenomicsTutorial/Genomics/Mutant
From OpenWetWare
Overview
The purpose of this exercise is to become familiar with the following:
Concepts
- Taxonomy
- Shotgun sequencing
Techniques
- Short read mapping
- Visualization of reads
Software/Databases
- NCBI Taxonomy (taxonomy database)
- Galaxy (web server for genomics and metagenomics analysis)
- Lastz (software for mapping small reads to a reference sequence)
Protocol
Get Genome Sequence
- go to the NCBI Taxonomy Browser
- select your Domain of interest
- click "Display" after filtering for "has genome sequences" (for example, the filtered listing for Bacteria: http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=2&lvl=3&p=mapview&p=has_linkout&p=blast_url&p=genome_blast&lin=f&keep=1&srchmode=1&unlock&filter=genome_filter)
- search or browse for a genome of interest
- click the name of the organism
- click on genome sequences on the right
- click on the RefSeq ID for the genome of interest
- click again on the ID
- click on FASTA
- click "Send", choose Destination --> File and Format --> FASTA, then click "Create File"
- file will be downloaded as sequence.fasta
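Before uploading, it can be worth checking that the download really is a single FASTA record of the expected length. The following is a minimal sketch in plain Python, not part of the original protocol; the function name read_fasta and the example record are made up for illustration.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} dictionary."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token after '>' as the ID
            parts = line[1:].split()
            current_id = parts[0] if parts else "unnamed"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines are appended to the most recent header
            records[current_id] += line.upper()
    return records


# Hypothetical record used only to demonstrate the parser
example = [">NC_000000.0 made-up organism", "ACGTAC", "ggta"]
genome = read_fasta(example)
for name, seq in genome.items():
    print(name, len(seq))
```

For the real file, the same function can be given an open file handle, for example `with open("sequence.fasta") as handle: genome = read_fasta(handle)`.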
Mutate Sequence
- go to Galaxy and log in if you have an account
- go to Get Data --> Upload File
- choose file 'sequence.fasta' that you just downloaded from NCBI
- below under 'genome' select the genome you downloaded (this will tag the file for future reference)
- click 'Execute' and wait for the upload to complete
- click the pencil next to the uploaded file and change the name to something sensible
- go to EMBOSS --> msbar and mutate at will
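To get a feel for what the mutation step does to the sequence, the idea can be sketched offline. This is an illustration only: it substitutes single bases at random positions and does not reproduce msbar's own algorithm or options. The function name mutate_points and the seed parameter are assumptions introduced here.

```python
import random
from typing import List, Tuple


def mutate_points(seq: str, count: int, seed: int = 0) -> Tuple[str, List[int]]:
    """Substitute `count` randomly chosen bases; return the new sequence
    and the sorted 0-based positions that were changed."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    bases = list(seq)
    # sample() picks distinct positions, so each one is mutated exactly once
    positions = sorted(rng.sample(range(len(bases)), count))
    for pos in positions:
        # Choose a replacement that differs from the current base
        alternatives = [b for b in "ACGT" if b != bases[pos]]
        bases[pos] = rng.choice(alternatives)
    return "".join(bases), positions


reference = "ACGT" * 25  # hypothetical 100-base sequence
mutant, changed = mutate_points(reference, count=5)
print(changed)
```

Keeping the list of changed positions is handy later: these are the sites that the polymorphism output of the mapping step should recover.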
Simulate Short Read Data
- go to EMBOSS --> splitter
- split the msbar output into fragments shorter than 100 bases, with short overlaps between adjacent fragments
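The splitting step can be pictured as a sliding window that advances by the fragment size minus the overlap. The sketch below assumes that simple scheme; it is not splitter's implementation, and the name split_with_overlap and the chosen size and overlap values are illustrative.

```python
from typing import List, Tuple


def split_with_overlap(seq: str, size: int, overlap: int) -> List[Tuple[int, str]]:
    """Cut `seq` into fragments of at most `size` bases, each sharing
    `overlap` bases with the next. Returns (0-based start, fragment) pairs."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    fragments = []
    for start in range(0, len(seq), step):
        fragments.append((start, seq[start:start + size]))
        if start + size >= len(seq):
            break  # this fragment already reaches the end of the sequence
    return fragments


sequence = "ACGTTGCA" * 25  # hypothetical 200-base sequence
reads = split_with_overlap(sequence, size=90, overlap=10)
for start, fragment in reads:
    print(start, len(fragment))
```

With these values the windows start at 0, 80 and 160, and the last fragment is shorter because it runs into the end of the sequence.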
Align Short Reads to Reference Genome
- go to NGS: Mapping --> Lastz
- Align sequencing reads in: <splitter output>
- Against reference sequences that are: <in your history>
- Select a reference dataset: <your genome fasta file>
- Select output format: <polymorphisms>
- click Execute
- wait to finish
- click pencil icon next to output dataset
- change Data Type --> bed
- change Database/Build --> the genome you chose in NCBI
- save the Attributes
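Setting the data type to bed tells Galaxy to read the first three columns as a sequence name, a start and an end. BED coordinates are 0-based and half-open, so a single base at 1-based position 100 is written as start 99, end 100. The helper below is a small sketch of that convention; the name to_bed_line is hypothetical, and it makes no assumption about the exact columns Lastz writes.

```python
def to_bed_line(chrom: str, pos_1based: int, length: int = 1, name: str = ".") -> str:
    """Format a feature given by its 1-based start position as a BED line."""
    start = pos_1based - 1  # BED start is 0-based
    end = start + length    # BED end is exclusive
    return f"{chrom}\t{start}\t{end}\t{name}"


# A single-base polymorphism at 1-based position 100 on a contig named "chr"
print(to_bed_line("chr", 100))
```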
Visualize Polymorphisms
- click the graph icon ("Visualize in Trackster") in the output dataset
- choose Insert into New Browser
- for Reference Genome Build use the same as your genome
- change "Select Chrom/Contig" to "chr"