User talk:Darek Kedra/sandbox 28
From OpenWetWare
Revision as of 06:20, 24 October 2013
Winterschool program
Software list
Basics
- Linux: Ubuntu 12.04.3 vs Debian 7.1 (think about 32- vs 64-bit versions)
- java http://www.java.com/en/download/linux_manual.jsp?locale=en
Specific tools 1
- TagDust: http://genome.gsc.riken.jp/osc/english/software/src/tagdust.tgz
- fastareformat from exonerate-2.2.0 http://www.ebi.ac.uk/~guy/exonerate/
- fixing fasta headers (gff fields) with python? small script
- GEM http://algorithms.cnag.cat/wiki/The_GEM_library
- CAVEAT: (problem with cores on different laptops...)
http://sourceforge.net/projects/gemlibrary/files/gem-library/Binary%20pre-release%203/
- BWA http://sourceforge.net/projects/bio-bwa/files/
- Stampy http://www.well.ox.ac.uk/~gerton/software/Stampy/stampy-1.0.22r1848.tgz
- last http://last.cbrc.jp/ (version 362 has split and splice-mapping options)
- bowtie http://bowtie-bio.sourceforge.net/bowtie2/index.shtml (bowtie2)
- samtools http://sourceforge.net/projects/samtools/files/
- picard http://sourceforge.net/projects/picard/files/
- IGV / IGVtools http://www.broadinstitute.org/software/igv/download
- bamtools https://github.com/pezmaster31/bamtools
- requires cmake: http://www.cmake.org/files/v2.8/cmake-2.8.12.tar.gz (or install it with apt-get)
- bedtools http://code.google.com/p/bedtools/downloads/list
- GATK http://www.broadinstitute.org/gatk/auth?package=GATK (download yourself: license!)
- vcftools http://sourceforge.net/projects/vcftools/files/
Specific tools 2/RNA-Seq
- tophat http://tophat.cbcb.umd.edu/
- cufflinks http://cufflinks.cbcb.umd.edu/ (may require Boost libs!)
- GEMtools https://github.com/gemtools/gemtools
Introduction to Linux and the command line
- why Linux?
- logging in, connecting to other servers with ssh / sftp
- copy, rename/move files, create directories, symbolic links
- view files (more/less, head, tail), count (wc)
- search for strings / replace strings (grep & sed)
- compressing / uncompressing files (gzip, bzip2, tar)
- pipelines and redirection
- awk in 5 minutes
- where to go from there (clusters, python)
FASTQ
- Illumina file formats (quality encodings)
- paired / unpaired reads
- quality checking (fastqc)
- trimming & filtering (TagDust)
- sources of published FASTQ data: Sequence Read Archive (SRA) vs ENA
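The quality-encoding item can be illustrated with a small Python sketch that guesses whether a FASTQ file uses the Phred+33 or Phred+64 offset. It is a heuristic only, not part of the course material: the function name `guess_phred_offset` and the `max_reads` parameter are made up for this example, it assumes plain four-line FASTQ records (no wrapped sequence lines), and a file containing only high-quality Phred+33 reads would be misclassified as Phred+64.

```python
import io


def guess_phred_offset(handle, max_reads=1000):
    """Guess the quality offset (33 or 64) from the first reads of a FASTQ stream.

    Assumes each record is exactly four lines; only the first
    `max_reads` records are inspected.
    """
    lowest = None
    for i, line in enumerate(handle):
        if i // 4 >= max_reads:
            break
        if i % 4 == 3:  # the 4th line of each record is the quality string
            qual = line.rstrip("\n")
            if qual:
                smallest = min(ord(c) for c in qual)
                lowest = smallest if lowest is None else min(lowest, smallest)
    if lowest is None:
        raise ValueError("no quality lines found")
    # Characters below ';' (ASCII 59) do not occur in the +64 encodings,
    # so seeing one means the file is Phred+33.
    return 33 if lowest < 59 else 64


if __name__ == "__main__":
    demo = io.StringIO("@read1\nACGT\n+\n!!II\n")
    print(guess_phred_offset(demo))
```

For a real file the same function can be called on an open file handle, for example `guess_phred_offset(open("reads.fastq"))`, where `reads.fastq` is a placeholder name.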
Genomic fasta and gtf/gff gene annotation
- resources at ENSEMBL
- basic checks and reformatting
- grepping fasta headers
- fastareformat from exonerate?
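The "small script" for fixing FASTA headers could look roughly like the Python sketch below. It trims every header to its first whitespace-delimited word, so that sequence names can match the first column of a GFF/GTF file, and stops on duplicate names. The function name `simplify_fasta_headers` is hypothetical, and keeping only the first word is an assumption about what "fixing" means here; other annotation sources may need a different rule.

```python
def simplify_fasta_headers(lines):
    """Yield FASTA lines with each header reduced to its first word.

    Raises ValueError on an empty header or a repeated sequence ID,
    since both tend to break indexing and mapping later on.
    """
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("empty FASTA header")
            seq_id = fields[0]
            if seq_id in seen:
                raise ValueError("duplicate sequence ID: " + seq_id)
            seen.add(seq_id)
            yield ">" + seq_id
        else:
            # Sequence lines are passed through unchanged.
            yield line


if __name__ == "__main__":
    demo = [">1 dna:chromosome some description\n", "ACGTACGT\n"]
    for out_line in simplify_fasta_headers(demo):
        print(out_line)
```

The generator works on any iterable of lines, so it can be fed an open file handle and its output written to a new file without loading the whole genome into memory.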
Mapping genomic reads
- overview of mappers
- GEM
- bwa +/- stampy
- last / bowtie
- mapping steps (for each mapper)
- genome indexing
- mapping
- +/- postprocessing
SAM and BAM file formats
- Analyzing BAM files
- sorting / indexing
- viewing the mappings in IGV
Tools for processing BAM files
- samtools
- picard
- bamtools
Getting mapping stats
- extracting reads mapping to regions
- getting coverage info for selected regions
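As a rough illustration of what mapping stats are computed from, the Python sketch below counts mapped and unmapped reads in SAM text (for example, the output of `samtools view -h` on a BAM file) by testing bits of the FLAG column. The function name `mapping_stats` is made up for this example; in practice samtools, picard or bamtools would do this job, and the sketch only shows the idea.

```python
def mapping_stats(sam_lines):
    """Count primary alignments in SAM text as mapped or unmapped.

    Header lines (starting with '@') are skipped. Secondary (0x100) and
    supplementary (0x800) records are ignored so each read end is
    counted once.
    """
    stats = {"total": 0, "mapped": 0, "unmapped": 0}
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory columns
            continue
        flag = int(fields[1])
        if flag & 0x900:
            continue
        stats["total"] += 1
        if flag & 0x4:  # bit 0x4 set: the read is unmapped
            stats["unmapped"] += 1
        else:
            stats["mapped"] += 1
    return stats


if __name__ == "__main__":
    demo = [
        "@SQ\tSN:chr1\tLN:1000\n",
        "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
    ]
    print(mapping_stats(demo))
```

Extracting reads from a region follows the same pattern: compare column 3 (reference name) and column 4 (1-based start position) against the region of interest.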
Detecting SNPs
- general procedure
- GATK pipeline
- other SNP calling programs [tba]
Working with VCF files
- VCF file format
- viewing VCFs in IGV
- filtering SNPs by quality
- set operations on VCF files (common SNPs, unique SNPs)
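Quality filtering and set operations on VCF files can be sketched in plain Python as below. Each variant is reduced to a `(CHROM, POS, REF, ALT)` tuple, so common and unique SNPs become ordinary set intersection and difference. The function name `read_variants` and the `min_qual` parameter are hypothetical, records with a missing QUAL (`.`) are dropped as an arbitrary choice, and multi-allelic ALT fields are compared as whole strings; vcftools handles these cases more carefully.

```python
def read_variants(vcf_lines, min_qual=0.0):
    """Return the set of (chrom, pos, ref, alt) with QUAL >= min_qual.

    Header lines (starting with '#') are skipped; records whose QUAL
    is '.' are dropped.
    """
    variants = set()
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        chrom, pos, _id, ref, alt, qual = fields[:6]
        if qual == "." or float(qual) < min_qual:
            continue
        variants.add((chrom, int(pos), ref, alt))
    return variants


if __name__ == "__main__":
    sample_a = ["1\t100\t.\tA\tG\t50\tPASS\t.\n", "1\t200\t.\tC\tT\t10\tPASS\t.\n"]
    sample_b = ["1\t100\t.\tA\tG\t60\tPASS\t.\n", "2\t300\t.\tG\tA\t99\tPASS\t.\n"]
    a = read_variants(sample_a)
    b = read_variants(sample_b)
    print("common:", sorted(a & b))
    print("only in A:", sorted(a - b))
```

Keeping the variants in sets holds everything in memory, which is fine for course-sized examples but not for whole-genome call sets.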
RNASeq
- caveats (ribosomal RNA contamination)
- mapping RNASeq
- tophat
- GRAPE
- creating gene models from RNASeq (cufflinks)