User:Lindenb/Notebook/UMR915/20101210

=Belgium=
 /usr/local/package/mosaik-aligner/bin/MosaikSort -in align2.mka -out align2.sorted.mka
 MosaikText -in align2.sorted.mka -bam sample2.bam

ReadGroupCovariate with GATK:
Create a subset of dbsnp_129.rod for my ranges, then run CountCovariates:
 java -jar GenomeAnalysisTK.jar -I sample1.bam -R chrXXXXX.fa -T CountCovariates -l INFO -recalFile recal_data1.csv -cov ReadGroupCovariate -cov QualityScoreCovariate -cov CycleCovariate -cov DinucCovariate --DBSNP dbsnp_129_chrXXXX.rod -U
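The subsetting of dbsnp_129.rod is not shown in these notes. A minimal sketch of one way to do it follows. It assumes the rod file is a tab-delimited UCSC-style dump (columns: bin, chrom, chromStart, chromEnd, name, ..., with 0-based half-open coordinates) and that the ranges are written as <code>chrom:start-end</code> (1-based, inclusive). Both layouts, and the function names <code>parse_interval</code> and <code>subset_rod</code>, are assumptions for illustration and should be checked against the actual files.

```python
def parse_interval(text):
    """Parse 'chrom:start-end' (1-based, inclusive) into a 0-based half-open tuple."""
    chrom, span = text.strip().split(":")
    start, end = span.replace(",", "").split("-")
    return chrom, int(start) - 1, int(end)


def subset_rod(rod_lines, intervals):
    """Yield the rod lines overlapping at least one interval.

    Assumed column layout: bin, chrom, chromStart, chromEnd, name, ...
    """
    for line in rod_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[1], int(fields[2]), int(fields[3])
        # Zero-length records (insertions) are treated as one base wide.
        end = max(end, start + 1)
        if any(c == chrom and start < e and end > s for c, s, e in intervals):
            yield line


if __name__ == "__main__":
    # Hypothetical records and range, for illustration only.
    intervals = [parse_interval("chr2:100-200")]
    rod = [
        "585\tchr2\t99\t100\trs1\n",
        "585\tchr2\t200\t201\trs2\n",
        "585\tchr3\t150\t151\trs3\n",
    ]
    for kept in subset_rod(rod, intervals):
        print(kept, end="")
```

For a whole-genome rod file and many ranges, a linear scan per record becomes slow; sorting the ranges per chromosome or using an interval index would be the usual next step.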

sample1:
 INFO 15:31:10,995 TraversalEngine - [PROGRESS] Traversed 5,625,164 sites in 73.96 secs (13.15 secs per 1M sites)
 INFO 15:31:10,996 TraversalEngine - Total runtime 73.96 secs, 1.23 min, 0.02 hours
 INFO 15:31:10,998 TraversalEngine - 89 reads were filtered out during traversal out of 187197 total (0.05%)
 INFO 15:31:10,998 TraversalEngine -   -> 89 reads (0.05% of total) failing ZeroMappingQualityReadFilter
sample2:
 (...)

What does recal_data1.csv look like?
 # Counted Sites    4619191
 # Counted Bases    80003987
 # Skipped Sites    18142
 # Fraction Skipped 1 / 255 bp
 ReadGroup,QualityScore,Cycle,Dinuc,nObservations,nMismatches,Qempirical
 ZDID8XTBKGO,7,106,AA,2,0,40
 ZDID8XTBKGO,7,107,AT,2,0,40
 ZDID8XTBKGO,7,107,NN,1,0,40
 ZDID8XTBKGO,7,107,TT,5,0,40
 ZDID8XTBKGO,7,108,AA,1,0,40
 ZDID8XTBKGO,7,108,CT,2,0,40
 ZDID8XTBKGO,7,108,GA,1,0,40
 ZDID8XTBKGO,7,108,GT,1,0,40
 ZDID8XTBKGO,7,108,TA,1,1,1
 ZDID8XTBKGO,7,108,TT,1,0,40
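To make the last column easier to read, here is a small sketch that parses such a file and recomputes an empirical quality from nObservations and nMismatches. It assumes Qempirical is the Phred-scaled mismatch rate, rounded and clamped to the range 1 to 40. That assumption reproduces the rows shown (0 mismatches gives 40, 1 mismatch in 1 observation gives 1), but it is not taken from the GATK source. The function names are hypothetical.

```python
import csv
import io
import math

MAX_Q = 40  # assumed upper bound, consistent with the rows shown
MIN_Q = 1   # assumed lower bound, consistent with the rows shown


def empirical_quality(n_observations, n_mismatches):
    """Phred-scaled mismatch rate, clamped to [MIN_Q, MAX_Q] (assumed formula)."""
    if n_mismatches == 0:
        return MAX_Q
    q = -10.0 * math.log10(n_mismatches / n_observations)
    return int(max(MIN_Q, min(MAX_Q, round(q))))


def read_recal_rows(handle):
    """Yield one dict per data row, skipping the '#' summary lines."""
    data_lines = (line for line in handle if not line.startswith("#"))
    yield from csv.DictReader(data_lines)


if __name__ == "__main__":
    sample = (
        "# Counted Sites    4619191\n"
        "ReadGroup,QualityScore,Cycle,Dinuc,nObservations,nMismatches,Qempirical\n"
        "ZDID8XTBKGO,7,106,AA,2,0,40\n"
        "ZDID8XTBKGO,7,108,TA,1,1,1\n"
    )
    for row in read_recal_rows(io.StringIO(sample)):
        q = empirical_quality(int(row["nObservations"]), int(row["nMismatches"]))
        # Print the reported value next to the recomputed one.
        print(row["Cycle"], row["Dinuc"], row["Qempirical"], q)
```

With so few observations per row (1 to 5 here), the empirical values mostly hit the two bounds.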

TableRecalibration
 java -jar GenomeAnalysisTK.jar -I sample1.bam -R chrXXXX.fa -T TableRecalibration -l INFO -recalFile recal_data1.csv -o sample1.recal.bam -U
 java -jar GenomeAnalysisTK.jar -I sample2.bam -R chrXXXX.fa -T TableRecalibration -l INFO -recalFile recal_data2.csv -o sample2.recal.bam -U
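The per-sample file names differ only by the sample number, so the command line can be generated instead of retyped. The sketch below uses only the flags that appear in these notes. The helper name <code>table_recalibration_cmd</code> is hypothetical, and <code>chrXXXX.fa</code> is the placeholder reference name used on this page.

```python
import shlex


def table_recalibration_cmd(n, reference="chrXXXX.fa"):
    """Build the TableRecalibration argument list for sample number n."""
    return [
        "java", "-jar", "GenomeAnalysisTK.jar",
        "-I", f"sample{n}.bam",
        "-R", reference,
        "-T", "TableRecalibration",
        "-l", "INFO",
        "-recalFile", f"recal_data{n}.csv",
        "-o", f"sample{n}.recal.bam",
        "-U",
    ]


if __name__ == "__main__":
    # Dry run: print the commands rather than executing them.
    for n in (1, 2):
        print(shlex.join(table_recalibration_cmd(n)))
```

To actually launch a run, the list can be passed to <code>subprocess.run(cmd, check=True)</code>.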

calling with GATK
With a list of regions:
 java -jar GenomeAnalysisTK.jar -I sample1.recal.bam -R chrXXXXXXXXXXXXX.fa -T UnifiedGenotyper -o sample1.vcf -U -S SILENT -L ranges.list
 java -jar GenomeAnalysisTK.jar -I sample2.recal.bam -R chrXXXXXXXXXXXXX.fa -T UnifiedGenotyper -o sample2.vcf -U -S SILENT -L ranges.list

 java -jar GenomeAnalysisTK.jar -I sample1.recal.bam -R chrXXXX.fa -T IndelGenotyperV2 -o sample1.indels.vcf -U -S SILENT -bed jeter.out.bed -verbose jeter.verbose.txt --refseq chrXXXX.refGene.rod
 java -jar GenomeAnalysisTK.jar -I sample2.recal.bam -R chrXXXX.fa -T IndelGenotyperV2 -o sample2.indels.vcf -U -S SILENT -bed jeter.out.bed -verbose jeter.verbose.txt --refseq chrXXXX.refGene.rod

No result. Asked GATK support: http://gsfn.us/t/1ytli

GATK IndelGenotyperV2 does not support 454 data :-/