User:R. Eric Collins/GenomicsTutorial/Genomics/Mutant

Overview
The purpose of this exercise is to become familiar with the following:

Concepts

 * Taxonomy
 * Shotgun sequencing

Techniques

 * Short read mapping
 * Visualization of reads

Software/Databases

 * NCBI Taxonomy (taxonomy database)
 * Galaxy (web server for genomics and metagenomics analysis)
 * Lastz (software for mapping small reads to a reference sequence)

Get Genome Sequence

 * go to the NCBI Taxonomy Browser
 * select your Domain of interest
 * filter for "has genome sequences", then click "Display"
 * example link (Bacteria, already filtered for genome sequences): http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Undef&id=2&lvl=3&p=mapview&p=has_linkout&p=blast_url&p=genome_blast&lin=f&keep=1&srchmode=1&unlock&filter=genome_filter
 * search or browse for a genome of interest
 * click the name of the organism
 * click on genome sequences on the right
 * click on the RefSeq ID for the genome of interest
 * click again on the ID
 * click on FASTA
 * click "Send", choose Destination --> File and Format --> FASTA, then click "Create File"
 * the file will be downloaded as sequence.fasta
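
Before uploading the file anywhere, it can be worth checking that the download really is a FASTA file with the expected sequence length. The following is a minimal sketch in Python, not part of the original exercise; the function name read_fasta is made up for illustration, and it assumes the file is small enough to hold in memory.

```python
def read_fasta(path):
    """Return a dict mapping each header line (without '>') to its sequence."""
    records = {}
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new record starts: store the previous one, if any.
                if header is not None:
                    records[header] = "".join(chunks)
                header = line[1:]
                chunks = []
            else:
                if header is None:
                    raise ValueError("sequence data found before any '>' header")
                chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records

# Example use, assuming sequence.fasta is in the current directory:
#   for name, seq in read_fasta("sequence.fasta").items():
#       print(name, len(seq), "bp")
```

The reported length should match the genome size shown on the NCBI record page.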

Mutate Sequence

 * go to Galaxy and log in if you have an account
 * go to Get Data --> Upload File
 * choose file 'sequence.fasta' that you just downloaded from NCBI
 * below under 'genome' select the genome you downloaded (this will tag the file for future reference)
 * click 'Execute' and wait for the upload to complete
 * click the pencil next to the uploaded file and change the name to something sensible
 * go to EMBOSS --> msbar and mutate at will
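
To get a feel for what the mutation step does, here is a rough stand-in written in Python. It is not msbar and does not reproduce its options: it only makes single-base substitutions at random positions, and the function name mutate_points and the seed parameter are assumptions made for this sketch. It also returns the list of changes, which is handy later for checking what the alignment should reveal.

```python
import random

def mutate_points(seq, n_mutations, seed=0):
    """Substitute n_mutations randomly chosen bases; return (mutant, changes).

    changes is a list of (position, old_base, new_base) with 0-based positions.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    bases = list(seq)
    n = min(n_mutations, len(bases))
    positions = sorted(rng.sample(range(len(bases)), n))
    changes = []
    for pos in positions:
        old = bases[pos]
        # Pick any base that differs from the current one.
        new = rng.choice([b for b in "ACGT" if b != old.upper()])
        bases[pos] = new
        changes.append((pos, old, new))
    return "".join(bases), changes

# Example use:
#   mutant, changes = mutate_points("ACGTACGTACGT", 2)
#   print(mutant)
#   print(changes)
```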

Simulate short read data

 * go to EMBOSS --> splitter
 * split the msbar output into fragments shorter than 100 bases, with short overlaps between neighboring fragments
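
The splitting step can be pictured as a sliding window over the mutated sequence. The sketch below illustrates the idea in Python; it is not the splitter program itself, and the function name split_reads and the default size and overlap values are chosen only for illustration.

```python
def split_reads(seq, size=50, overlap=10):
    """Cut seq into fragments of at most `size` bases that overlap by `overlap`.

    Returns a list of (start, fragment) tuples with 0-based start positions.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be at least 0 and smaller than size")
    step = size - overlap
    reads = []
    for start in range(0, len(seq), step):
        reads.append((start, seq[start:start + size]))
        if start + size >= len(seq):
            break  # the last fragment already reaches the end of the sequence
    return reads

# Example use:
#   for start, fragment in split_reads("ACGT" * 30, size=50, overlap=10):
#       print(start, len(fragment))
```

Real shotgun reads start at random positions and contain sequencing errors, so evenly spaced fragments like these are a simplification.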

Align short reads to Reference genome

 * go to NGS: Mapping --> Lastz
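 * Align sequencing reads in: the splitter output (your simulated short reads)
 * Against reference sequences that are: in your history
 * Select a reference dataset: the original, unmutated genome you uploaded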
 * Select output format:
 * click Execute
 * wait for the job to finish
 * click pencil icon next to output dataset
 * change Data Type --> bed
 * change Database/Build --> the genome you chose in NCBI
 * save the Attributes
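
For orientation, a BED file describes intervals on a sequence, one per line, as tab-separated columns: sequence name, start, end, and optionally a feature name. Starts are 0-based and ends are exclusive. The Python sketch below builds such a line; the function name bed_line and the example values are hypothetical and are not output from the Lastz run.

```python
def bed_line(chrom, start, end, name):
    """Format one BED record: 0-based start, end not included in the interval."""
    if start < 0 or end <= start:
        raise ValueError("need 0 <= start < end")
    return "\t".join([chrom, str(start), str(end), name])

# Example use: a 50-base read aligned at the very start of a sequence named "chr"
#   print(bed_line("chr", 0, 50, "read_1"))
```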

Visualize Polymorphisms

 * click the graph icon ("Visualize in Trackster") on the output dataset
 * choose Insert into New Browser
 * for Reference Genome Build use the same as your genome
 * change "Select Chrom/Contig" to "chr"