User:Lindenb/Notebook/UMR915/20101210
From OpenWetWare

Belgium
/usr/local/package/mosaik-aligner/bin/MosaikSort -in align2.mka -out align2.sorted.mka
MosaikText -in align2.sorted.mka -bam sample2.bam
ReadGroupCovariate with GATK:
create a subset of dbsnp_129.rod for my ranges
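A minimal sketch of how such a subset could be made with awk. The function name `subset_rod` is hypothetical, and the layout is an assumption: intervals written as `chr:start-end` (1-based, inclusive), and a tab-delimited ROD with chrom, chromStart (0-based) and chromEnd in columns 2 to 4, as in a UCSC snp129 table dump. Adjust the column numbers if the file differs.

```shell
# Hypothetical helper: print the dbSNP records that overlap any interval
# listed in a ranges file.
# Assumptions: intervals look like "chr1:100-200" (1-based, inclusive);
# the ROD is tab-delimited with chrom, chromStart (0-based), chromEnd
# in columns 2, 3 and 4.
subset_rod() {
  awk -F'\t' '
    NR == FNR {                       # first file: load the intervals
      split($0, a, /[:-]/)
      n++
      chrom[n] = a[1]
      start[n] = a[2] + 0             # force numeric comparison later
      end[n]   = a[3] + 0
      next
    }
    {                                 # second file: the ROD records
      for (i = 1; i <= n; i++)
        if ($2 == chrom[i] && ($4 + 0) >= start[i] && ($3 + 0) < end[i]) {
          print
          break                       # print each record only once
        }
    }
  ' "$1" "$2"
}

# Usage (file names as in this notebook):
#   subset_rod ranges.list dbsnp_129.rod > dbsnp_129_chrXXXX.rod
```

The scan is linear in the number of intervals for every record, which should be fine for a short list of ranges but slow for thousands of them.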
java -jar GenomeAnalysisTK.jar -I sample1.bam -R chrXXXXX.fa -T CountCovariates -l INFO -recalFile recal_data1.csv -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate --DBSNP dbsnp_129_chrXXXX.rod -U
sample1:
INFO 15:31:10,995 TraversalEngine - [PROGRESS] Traversed 5,625,164 sites in 73.96 secs (13.15 secs per 1M sites)
INFO 15:31:10,996 TraversalEngine - Total runtime 73.96 secs, 1.23 min, 0.02 hours
INFO 15:31:10,998 TraversalEngine - 89 reads were filtered out during traversal out of 187197 total (0.05%)
INFO 15:31:10,998 TraversalEngine - -> 89 reads (0.05% of total) failing ZeroMappingQualityReadFilter
sample2:
(...)
What does recal_data1.csv look like?
# Counted Sites 4619191
# Counted Bases 80003987
# Skipped Sites 18142
# Fraction Skipped 1 / 255 bp
ReadGroup,QualityScore,Cycle,Dinuc,nObservations,nMismatches,Qempirical
ZDID8XTBKGO,7,106,AA,2,0,40
ZDID8XTBKGO,7,107,AT,2,0,40
ZDID8XTBKGO,7,107,NN,1,0,40
ZDID8XTBKGO,7,107,TT,5,0,40
ZDID8XTBKGO,7,108,AA,1,0,40
ZDID8XTBKGO,7,108,CT,2,0,40
ZDID8XTBKGO,7,108,GA,1,0,40
ZDID8XTBKGO,7,108,GT,1,0,40
ZDID8XTBKGO,7,108,TA,1,1,1
ZDID8XTBKGO,7,108,TT,1,0,40
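To read the last column: Qempirical looks like a Phred-scaled mismatch rate computed from nObservations and nMismatches. The sketch below recomputes it with awk under an assumption that is only inferred from the rows above, not taken from the GATK source: Q = -10*log10(nMismatches/nObservations), rounded and clamped to the range 1 to 40. The function name `empirical_q` is hypothetical.

```shell
# Hypothetical helper: recompute an empirical quality for each row of a
# CountCovariates-style CSV.
# Assumption: Q = -10*log10(nMismatches / nObservations), rounded to the
# nearest integer and clamped to [1, 40]. This matches the rows shown in
# this notebook but is not checked against the GATK implementation.
empirical_q() {
  awk -F',' '
    /^#/ || $1 == "ReadGroup" || NF < 6 { next }   # skip comments and header
    {
      obs = $5 + 0
      mis = $6 + 0
      if (mis == 0) {
        q = 40                        # no mismatch observed: use the cap
      } else {
        q = int(-10 * log(mis / obs) / log(10) + 0.5)
        if (q > 40) q = 40
        if (q < 1)  q = 1
      }
      print $1 "," $2 "," $3 "," $4 "," q
    }
  ' "$1"
}

# Usage:
#   empirical_q recal_data1.csv | head
```

With so few observations per bin (1 to 5 in the rows above), most values are either the cap or the floor, so they say little until the bins are aggregated.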
TableRecalibration
java -jar GenomeAnalysisTK.jar -I sample1.bam -R chrXXXX.fa -T TableRecalibration -l INFO -recalFile recal_data1.csv -o sample1.recal.bam -U
java -jar GenomeAnalysisTK.jar -I sample2.bam -R chrXXXX.fa -T TableRecalibration -l INFO -recalFile recal_data2.csv -o sample2.recal.bam -U
calling with GATK
with a list of regions
java -jar GenomeAnalysisTK.jar -I sample1.recal.bam -R chrXXXXXXXXXXXXX.fa -T UnifiedGenotyper -o sample1.vcf -U -S SILENT -L ranges.list
java -jar GenomeAnalysisTK.jar -I sample2.recal.bam -R chrXXXXXXXXXXXXX.fa -T UnifiedGenotyper -o sample2.vcf -U -S SILENT -L ranges.list
java -jar GenomeAnalysisTK.jar -I sample1.recal.bam -R chrXXXX.fa -T IndelGenotyperV2 -o sample1.indels.vcf -U -S SILENT -bed jeter.out.bed -verbose jeter.verbose.txt --refseq chrXXXX.refGene.rod
java -jar GenomeAnalysisTK.jar -I sample2.recal.bam -R chrXXXX.fa -T IndelGenotyperV2 -o sample2.indels.vcf -U -S SILENT -bed jeter.out.bed -verbose jeter.verbose.txt --refseq chrXXXX.refGene.rod
NO result.
Asked at GATK support: http://gsfn.us/t/1ytli
GATK IndelGenotyperV2 does not support 454 data :-/