User talk:Darek Kedra/sandbox 29
From OpenWetWare
				
				
EMBO Tunis 2014
From sequencing data to knowledge
00 Programs used
sequence pre-processing
- SRA_toolkit ver current
- FastQC ver 0.11.2
- Trimmomatic ver 0.32
- TagDust ver 2.13
- Coral ver 1.4
general tools
- fastx_toolkit ver 0.0.13
- Samtools classic ver 0.1.19
- samtools/HTSlib ver 1.0
- Picard ver 1.119
mappers
Spliced leader mappings
- fqgrep Github version plus
- TRE_library ver 0.80
viewers
quantification
SNP discovery
- GATK ver 3.2-2
01 Data files used
FASTQ files
L.amazonensis RNA-Seq
L.mexicana genomic DNA
(extra set) L.enriettii genomic DNA
Stuff to read / compare
File formats
- http://biobits.org/samtools_primer.html (file formats)
VCF
- http://vcftools.sourceforge.net/ (VCFTools)
BED
- http://genome.ucsc.edu/FAQ/FAQformat.html#format1
- http://www.broadinstitute.org/igv/BED
- http://www.ensembl.org/info/website/upload/bed.html
- http://bedtools.readthedocs.org/en/latest/ BEDTOOLS
GFF / GTF
Genomes and annotations
- L.mexicana
- L.amazonensis
- L.enriettii
- L.major
Extra material
(optional) Stampy
Stampy is a fairly slow mapper that is at times more accurate than BWA, so it can be used to improve on plain BWA mappings. The basic usage is as follows:
#creating two special index files
stampy.py --species=Lmex --assembly=Lmex_toyasembly -G Lmex_toygenome Lmex_genome.nfix.fa
#Result: Lmex_toygenome.stidx
stampy.py -g Lmex_toygenome -H Lmex_toyasembly
#Result: Lmex_toyasembly.sthash

#remapping reads already mapped with BWA (preferred option)
stampy.py -g Lmex_toygenome -h Lmex_toyasembly -t2 --bamkeepgoodreads -M LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.bam > LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.stampy.sam

#convert SAM to BAM, sort and index BAM file:
java -jar ~/soft/picard_1.119/SortSam.jar \
I=LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.stampy.sam \
O=LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.stampy.bam \
SO=coordinate VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true
#Result: LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.stampy.bam LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.stampy.bai
With this data set, the Stampy mapping looks worse than the BWA mapping.
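One way to put a number on how two mappings compare is to count mapped and confidently mapped reads in each SAM file. The sketch below is only an illustration and not part of the course material: the function name sam_mapping_summary and the default MAPQ cut-off of 20 are assumptions, and it relies only on the standard SAM columns (FLAG in column 2, MAPQ in column 5).

```shell
#!/bin/sh
# Hypothetical helper: summarise primary alignments in a SAM file.
# Usage: sam_mapping_summary FILE.sam [MIN_MAPQ]
sam_mapping_summary() {
    sam_file=$1
    min_mapq=${2:-20}   # 20 is an arbitrary illustrative cut-off
    awk -F '\t' -v min_mapq="$min_mapq" '
        /^@/ { next }                        # skip SAM header lines
        {
            flag = $2
            # skip secondary (0x100) and supplementary (0x800) records
            if (int(flag / 256) % 2 == 1 || int(flag / 2048) % 2 == 1) next
            total++
            if (int(flag / 4) % 2 == 0) {    # bit 0x4 unset: read is mapped
                mapped++
                if (($5 + 0) >= (min_mapq + 0)) confident++
            }
        }
        END {
            printf "total=%d mapped=%d mapq_ge_%d=%d\n", total, mapped, min_mapq, confident
        }
    ' "$sam_file"
}
```

Running sam_mapping_summary on LmxM.01_ERR307343_12.Lmex.bwa_mem.Lmex.stampy.sam and on a SAM export of the BWA BAM file (for example via samtools view -h) gives two summary lines that can be compared side by side. Note that MAPQ scales differ between mappers, so the cut-off is only a rough guide.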
Quantifications of mapped reads
- Gene quantifications (DNA & RNA levels)