User talk:Darek Kedra/sandbox 28

From OpenWetWare


Revision as of 04:10, 11 October 2013 by Darek Kedra (talk | contribs) (init page)


Winterschool program

Introduction to Linux and the command line

why Linux?
logging in, connecting to other servers with ssh / sftp
copy, rename/move files, create directories, symbolic links
view files (more/less, head, tail), count (wc)
search for strings / replace strings (grep & sed)
compressing / uncompressing files (gzip, bzip2, tar)
pipelines and redirection
awk in 5 minutes
where to go from there (clusters, python)

FASTQ

Illumina file formats (quality encodings)
paired / unpaired reads
quality checking (fastqc)
trimming & filtering (TagDust)
sources of published FASTQ data: Sequence Read Archive (SRA) vs ENA
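
As a rough illustration of the quality encodings item above, the following Python sketch reads four-line FASTQ records and converts quality characters to Phred scores. The function names (`read_fastq`, `guess_offset`, `phred_scores`) are made up for this example, and the offset guess is only a heuristic: characters below ASCII 59 can only occur with the Phred+33 encoding, while anything else is assumed here to be the older Phred+64 encoding, which may be wrong for files that contain only high-quality reads.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or an empty line) stops the reader
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed record: {header!r}")
        yield header[1:], seq, qual


def guess_offset(quality_strings):
    """Heuristic: characters below ASCII 59 only occur with Phred+33."""
    lowest = min(ord(c) for q in quality_strings for c in q)
    return 33 if lowest < 59 else 64


def phred_scores(qual, offset=33):
    """Convert a quality string to a list of integer Phred scores."""
    return [ord(c) - offset for c in qual]


# Tiny in-memory example; a real file would be opened with open() or gzip.open().
example = "@read1\nACGT\n+\nII5!\n"
records = list(read_fastq(io.StringIO(example)))
offset = guess_offset(q for _, _, q in records)
scores = phred_scores(records[0][2], offset)
print(offset, scores)
```

For real data a dedicated tool such as fastqc remains the better choice; the sketch is only meant to show what the quality line encodes.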

Genomic fasta and gtf/gff gene annotation

resources at Ensembl
basic checks and reformatting
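
One possible form of the basic checks mentioned above is sketched below in Python: it collects sequence lengths from a FASTA file and then verifies that every GTF line names a sequence present in the FASTA and has coordinates inside it. The function names and the particular checks are assumptions chosen for illustration; a common real-world failure this catches is a mismatch in sequence naming (for example `1` in one file and `chr1` in the other).

```python
def fasta_lengths(lines):
    """Return {sequence_name: length} for FASTA text given as an iterable of lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the ID before the first space
            if name in lengths:
                raise ValueError(f"duplicate sequence name: {name}")
            lengths[name] = 0
        elif name is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            lengths[name] += len(line)
    return lengths


def gtf_problems(lines, lengths):
    """Return a list of (line_number, message) for suspicious GTF lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            problems.append((number, "expected 9 tab-separated columns"))
            continue
        seqname, start, end = fields[0], int(fields[3]), int(fields[4])
        if seqname not in lengths:
            problems.append((number, f"unknown sequence {seqname}"))
        elif not 1 <= start <= end <= lengths[seqname]:
            problems.append((number, "coordinates outside sequence"))
    return problems


# Made-up miniature inputs, not real annotation data.
fasta = ">1 dna:chromosome\nACGTACGT\nACGT\n>MT\nACGT\n".splitlines()
gtf = [
    '1\tensembl\tgene\t2\t10\t.\t+\t.\tgene_id "g1";',
    'chr1\tensembl\tgene\t2\t10\t.\t+\t.\tgene_id "g2";',
    'MT\tensembl\tgene\t3\t9\t.\t-\t.\tgene_id "g3";',
]
lengths = fasta_lengths(fasta)
problems = gtf_problems(gtf, lengths)
print(lengths, problems)
```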

Mapping genomic reads

overview of mappers
1. GEM
2. bwa +/- stampy
3. last / bowtie
mapping steps (for each mapper)
genome indexing
mapping
+/- postprocessing

SAM and BAM file formats

Analyzing BAM files
sorting / indexing
viewing the mappings in IGV

tools for processing BAM files

samtools
picard
bamtools

getting mapping stats

extracting reads mapping to regions
getting coverage info for selected regions
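
To make the mapping stats and region extraction items more concrete, here is a minimal Python sketch that works on SAM text lines directly. In practice samtools, picard or bamtools would be used on BAM files; the helper names below (`mapping_stats`, `reads_in_region`, `reference_end`, `sam_line`) are hypothetical. It relies on the SAM FLAG bits 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary), and on the CIGAR operations M, D, N, = and X consuming reference bases.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_end(pos, cigar):
    """1-based inclusive end of an alignment starting at pos on the reference."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")
    return pos + ref_len - 1


def mapping_stats(sam_lines):
    """Count primary mapped/unmapped records and secondary/supplementary ones."""
    stats = {"primary": 0, "mapped": 0, "unmapped": 0,
             "secondary_or_supplementary": 0}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        flag = int(line.split("\t")[1])
        if flag & (0x100 | 0x800):
            stats["secondary_or_supplementary"] += 1
            continue
        stats["primary"] += 1
        stats["unmapped" if flag & 0x4 else "mapped"] += 1
    return stats


def reads_in_region(sam_lines, chrom, start, end):
    """Names of mapped alignments overlapping chrom:start-end (1-based, inclusive)."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):
            continue
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or rname != chrom:
            continue
        if pos <= end and reference_end(pos, cigar) >= start:
            names.append(f[0])
    return names


def sam_line(name, flag, rname, pos, cigar):
    """Build a minimal 11-column SAM record for the example below."""
    return "\t".join([name, str(flag), rname, str(pos), "60", cigar,
                      "*", "0", "0", "*", "*"])


example = [
    "@HD\tVN:1.6\tSO:coordinate",
    sam_line("r1", 0, "chr1", 100, "10M"),
    sam_line("r2", 4, "*", 0, "*"),
    sam_line("r3", 16, "chr1", 105, "5M2D5M"),
    sam_line("r4", 256, "chr1", 300, "10M"),
]
stats = mapping_stats(example)
hits = reads_in_region(example, "chr1", 108, 112)
print(stats, hits)
```

Counting the reads returned for a window gives a crude coverage figure for that window; per-base depth would additionally require walking each CIGAR string.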

Detecting SNPs

general procedure
GATK pipeline
other SNP calling programs [tba]

Working with VCF files

VCF file format
viewing VCFs in IGV
filtering SNPs by quality
set operations on VCF files (common SNPs, unique SNPs)
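
The quality filtering and set operations listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a replacement for dedicated VCF tools: variants are identified by the tuple (CHROM, POS, REF, ALT), multi-allelic ALT fields are compared as whole strings, and records with a missing QUAL (`.`) are dropped. The function name `read_variants` and the threshold of 30 are arbitrary choices for the example.

```python
def read_variants(lines, min_qual=0.0):
    """Return the set of (chrom, pos, ref, alt) with QUAL >= min_qual."""
    variants = set()
    for line in lines:
        if line.startswith("#"):  # meta-information and column header lines
            continue
        chrom, pos, _id, ref, alt, qual = line.rstrip("\n").split("\t")[:6]
        # Assumption: records without a numeric QUAL are treated as failing.
        if qual == "." or float(qual) < min_qual:
            continue
        variants.add((chrom, int(pos), ref, alt))
    return variants


def vcf_line(chrom, pos, ref, alt, qual):
    """Build a minimal 8-column VCF record for the example below."""
    return "\t".join([chrom, str(pos), ".", ref, alt, str(qual), ".", "."])


# Made-up records for two hypothetical samples.
sample_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    vcf_line("chr1", 100, "A", "G", 50),
    vcf_line("chr1", 200, "C", "T", 10),
    vcf_line("chr2", 300, "G", "A", 99),
]
sample_b = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    vcf_line("chr1", 100, "A", "G", 45),
    vcf_line("chr2", 400, "T", "C", 80),
]

a = read_variants(sample_a, min_qual=30)
b = read_variants(sample_b, min_qual=30)
common = a & b   # SNPs present in both samples
only_a = a - b   # SNPs unique to the first sample
print(sorted(common), sorted(only_a))
```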

RNASeq

caveats (ribosomal RNA contamination)
mapping RNASeq
tophat
GRAPE
creating gene models from RNASeq (cufflinks)
