User:Lindenb/Notebook/UMR915/20101124

Cover
waiting for http://www.cell.com/immunity/home ...
Sysadmin
meeting with SC : we need space
Dindel
Installing dindel from source (http://www.sanger.ac.uk/resources/software/dindel): edit the Makefile, set the path to the samtools source directory, then run 'make'.
g++ -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 -DNDEBUG -D_IOLIB=2 -DMINREADS=2 -DDINDEL -c -o DInDel.o DInDel.cpp
Library.hpp: In member function ‘void Library::calcProb(const std::vector<double, std::allocator<double> >&)’:
Library.hpp:89: warning: converting to ‘int’ from ‘const double’
g++ -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 -DNDEBUG -D_IOLIB=2 -DMINREADS=2 -DDINDEL -c -o HapBlock.o HapBlock.cpp
g++ -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 -DNDEBUG -D_IOLIB=2 -DMINREADS=2 -DDINDEL -c -o HaplotypeDistribution.o HaplotypeDistribution.cpp
g++ -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 -DNDEBUG -D_IOLIB=2 -DMINREADS=2 -DDINDEL -c -o ObservationModelFB.o ObservationModelFB.cpp
Library.hpp: In member function ‘void Library::calcProb(const std::vector<double, std::allocator<double> >&)’:
Library.hpp:89: warning: converting to ‘int’ from ‘const double’
g++ -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 -DNDEBUG -D_IOLIB=2 -DMINREADS=2 -DDINDEL -c -o GetCandidates.o GetCandidates.cpp
Library.hpp: In member function ‘void Library::calcProb(const std::vector<double, std::allocator<double> >&)’:
Library.hpp:89: warning: converting to ‘int’ from ‘const double’
g++ -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 -DNDEBUG -D_IOLIB=2 -DMINREADS=2 -DDINDEL -c -o Faster.o Faster.cpp
Library.hpp: In member function ‘void Library::calcProb(const std::vector<double, std::allocator<double> >&)’:
Library.hpp:89: warning: converting to ‘int’ from ‘const double’
g++ -o dindel -I/usr/local/package/samtools-0.1.10 -Iseqan_library/ -I./ -Wno-deprecated -O3 DInDel.o HapBlock.o HaplotypeDistribution.o ObservationModelFB.o GetCandidates.o Faster.o -L/usr/local/package/samtools-0.1.10 -lbam -lz -lboost_program_options -static
/usr/local/package/samtools-0.1.10/libbam.a(knetfile.o): In function `socket_connect':
/home/lindenb/samtools-0.1.10/knetfile.c:99: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
The build succeeds, but running the new binary fails:
[lindenb@srv-clc-02]$ /usr/local/package/dindel-1.01-src/dindel --analysis getCIGARindels --bamFile sampe.sorted.bam --outputFile dindel_output --ref mygene.fa
Error parsing input options. Usage:
(...)
Fixed the bug by changing "bamFiles" to "bamList" in /usr/local/package/dindel-1.01-src/DInDel.cpp (this seems to be a bug in Boost, see http://lists.boost.org/Archives/boost/2006/01/98811.php). The same command now works:
/usr/local/package/dindel-1.01-src/dindel --analysis getCIGARindels --bamFile sampe.sorted.bam --outputFile dindel_output --ref mygene.fa
Reading BAM file: sampe.sorted.bam
Parsing indels from CIGAR strings...
Library: dindel_default mean: 207.5 stddev: 40.9283
Wrote indels in CIGARS for target XXXXXXX to file dindel_output
Wrote library insert sizes to dindel_output.libraries.txt
done!

head dindel_output.libraries.txt
#LIB dindel_default
0 1
1 1
2 1
3 1
4 1
5 1
6 1
7 1
8 1
(...)
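The "mean" and "stddev" reported for the library can be cross-checked against the libraries file. The following is a minimal sketch, not part of Dindel: it assumes each line after a "#LIB name" header holds an insert size followed by the number of read pairs with that size (my reading of the file, not confirmed by the manual), and the function name library_stats is hypothetical. The example input is only the head of the file shown above, so its statistics differ from the values dindel printed for the whole file.

```python
import math

def library_stats(lines):
    """Return {library: (n_pairs, mean, stddev)} from a Dindel libraries file.

    Assumption: after each '#LIB <name>' header, every line is
    '<insert_size> <count>' with integer values.
    """
    histograms = {}
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "#LIB":
            current = fields[1]
            histograms[current] = []
        elif current is not None:
            histograms[current].append((int(fields[0]), int(fields[1])))

    stats = {}
    for name, pairs in histograms.items():
        n = sum(count for _, count in pairs)
        if n == 0:
            continue
        mean = sum(size * count for size, count in pairs) / n
        variance = sum(count * (size - mean) ** 2 for size, count in pairs) / n
        stats[name] = (n, mean, math.sqrt(variance))
    return stats

# Only the first lines of dindel_output.libraries.txt, as printed by 'head'.
example = ["#LIB dindel_default", "0 1", "1 1", "2 1", "3 1", "4 1",
           "5 1", "6 1", "7 1", "8 1"]
stats = library_stats(example)
print(stats["dindel_default"])
```

On the real file one would pass an open file handle, e.g. library_stats(open("dindel_output.libraries.txt")).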
However, makeWindows.py, which is described in the manual, is not included in the source distribution. Switching to the binary distribution...
OK, restart:
python /usr/local/package/dindel-1.01/dindel-1.01-python/makeWindows.py --inputVarFile dindel_output.variants.txt --windowFilePrefix sample.realign_windows --numWindowsPerFile 1000
(...)
Number of candidates: 604
Number of windows: 494
Maximum window size: 161
Mean window size: 121
Chromosome: chr7_random
Total lines: 173 at minimum distance 20
Number of candidates: 173
Number of windows: 131
Maximum window size: 141
Mean window size: 121
Chromosome: chrX_random
Total lines: 200 at minimum distance 20
Number of candidates: 200
Number of windows: 163
Maximum window size: 141
Mean window size: 121
Chromosome: chr9
Total lines: 41545 at minimum distance 20
Number of candidates: 41545
Number of windows: 27168
Maximum window size: 848
Mean window size: 122
Chromosome: chr8
Total lines: 38569 at minimum distance 20
Number of candidates: 38569
Number of windows: 25891
Maximum window size: 201
Mean window size: 122
Chromosome: chr16_random
Total lines: 12 at minimum distance 20
Number of candidates: 12
Number of windows: 10
Maximum window size: 123
Mean window size: 119
Chromosome: chr10
Total lines: 46560 at minimum distance 20
Number of candidates: 46560
Number of windows: 30113
Maximum window size: 321
Mean window size: 122
Chromosome: chr17_random
Total lines: 763 at minimum distance 20
Number of candidates: 763
Number of windows: 587
Maximum window size: 161
Mean window size: 121
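To estimate how many window files to expect (1000 windows per file here), the per-chromosome summary can be totalled. This is a rough sketch that assumes the makeWindows.py output was saved as text with the labels shown above; summarize_windows is a hypothetical helper, not part of Dindel, and the example input is two of the blocks from the log.

```python
import re

def summarize_windows(log_text):
    """Return {chromosome: {"candidates": int, "windows": int}}.

    Assumes every block starts with 'Chromosome: <name>' and contains the
    'Number of candidates:' and 'Number of windows:' labels.
    """
    summary = {}
    # Text before the first 'Chromosome:' label is ignored.
    for block in re.split(r"Chromosome:\s*", log_text)[1:]:
        fields = block.split()
        if not fields:
            continue
        name = fields[0]
        candidates = re.search(r"Number of candidates:\s*(\d+)", block)
        windows = re.search(r"Number of windows:\s*(\d+)", block)
        if candidates and windows:
            summary[name] = {
                "candidates": int(candidates.group(1)),
                "windows": int(windows.group(1)),
            }
    return summary

# Two blocks copied from the makeWindows.py output.
example = """Chromosome: chr9
Total lines: 41545 at minimum distance 20
Number of candidates: 41545
Number of windows: 27168
Maximum window size: 848
Mean window size: 122
Chromosome: chr8
Total lines: 38569 at minimum distance 20
Number of candidates: 38569
Number of windows: 25891
Maximum window size: 201
Mean window size: 122
"""
stats = summarize_windows(example)
total_windows = sum(entry["windows"] for entry in stats.values())
print(total_windows)
```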
Many window files are generated. Then, for each window file, one should run the indel analysis:
/usr/local/package/dindel-1.01/binaries/dindel-1.01-linux-64bit --analysis indels --doDiploid --bamFile recal_bwa_rmdup_XXXX.bam --outputFile sample.dindels.1 --ref /GENOTYPAGE/data/pubdb/ucsc/hg18/chromosomes/hg18.fa --varFile sample.realign_windows.1.txt --libFile dindel_output.libraries.txt

head sample.dindels.1.glf.txt
msg index analysis_type tid lpos rpos center_position realigned_position was_candidate_in_window ref_all nref_all num_reads post_prob_variant qual est_freq logZ hapfreqs indidx msq numOffAll num_indel num_cover_forward num_cover_reverse num_unmapped_realigned var_coverage_forward var_coverage_reverse nBQT nmmBQT mLogBQ nMMLeft nMMRight glf
error_too_few_reads 1 NA chr6_random 417 536 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
error_too_few_reads 2 NA chr6_random 635 755 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
error_too_few_reads 3 NA chr6_random 2043 2162 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
ok 4 dip.map chr6_random 5183 5302 5242 5243 1 NA +T 3 NA 0.0518175 NA NA NA 0 60 NA NA 1 0 0 1 0 NA NA NA NA NA 1/1:0.0518175
ok 4 dip chr6_random 5183 5302 5242 5243 1 NA +T,+TT 3 NA NA NA -21.4465 NA 0 60 0 0 1 0 0 1,0 0,0 129 1 -3.22403 0 0 0/0:-26.2281,0/1:-22.1231,0/2:-25.6365,1/1:-21.4403,1/2:-22.1099,2/2:-25.2702
ok 4 dip chr6_random 5183 5302 5242 5280 0 NA R=>A 3 NA NA NA -24.8196 NA 0 60 0 1 1 0 0 0 0 128 0 -3.22188 0 0 0/0:-26.2159,0/1:-25.4643,1/1:-24.8013
ok 5 dip.map chr6_random 21407 21538 21473 21467 1 NA +C 3 NA 0.000108848 NA NA NA 0 60 NA NA 1 1 0 0 0 NA NA NA NA NA 0/1:0.000108848
ok 5 dip chr6_random 21407 21538 21473 21467 1 NA +C 3 NA NA NA -17.8254 NA 0 53.4447 0 0 0 0 0 0 0 162 0 -3.27222 0 0 0/0:-17.8254,0/1:-19.2092,1/1:-34.9632
ok 5 dip chr6_random 21407 21538 21473 21468 1 NA +T 3 NA NA NA -17.8254 NA 0 53.4447 0 0 0 0 0 0 0 162 0 -3.27222 0 0 0/0:-17.8254,0/1:-19.2114,1/1:-35.423
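Several windows fail with error_too_few_reads, so it helps to count the "msg" values before going further. A minimal sketch, assuming the .glf.txt file is whitespace-delimited with the header on its first line (as the output above suggests); read_glf is a hypothetical helper, and the example rows are truncated to the first six columns for brevity.

```python
from collections import Counter

def read_glf(lines):
    """Yield one dict per data row, keyed by the header names.

    Assumption: whitespace-delimited columns, header on the first
    non-empty line.
    """
    header = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if header is None:
            header = fields
            continue
        yield dict(zip(header, fields))

# First six columns of rows taken from sample.dindels.1.glf.txt.
example = [
    "msg index analysis_type tid lpos rpos",
    "error_too_few_reads 1 NA chr6_random 417 536",
    "ok 4 dip.map chr6_random 5183 5302",
    "ok 4 dip chr6_random 5183 5302",
]
rows = list(read_glf(example))
status_counts = Counter(row["msg"] for row in rows)
print(status_counts)
```

For the real file, pass an open handle instead of the list: rows = list(read_glf(open("sample.dindels.1.glf.txt"))).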
Save the names of the output files, one per line:
echo "sample.dindels.1.glf.txt" > sample.dindel_stage2_outputfiles.txt
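With many window files, the stage-2 commands and the list of GLF outputs can be generated rather than typed by hand. The sketch below only builds the command lines and prints them; it reuses the paths and options from the command above, assumes window files are named sample.realign_windows.N.txt, and the second window file in the example is hypothetical. plan_stage2 and window_index are illustrative names, not Dindel tools.

```python
import re

DINDEL = "/usr/local/package/dindel-1.01/binaries/dindel-1.01-linux-64bit"
REF = "/GENOTYPAGE/data/pubdb/ucsc/hg18/chromosomes/hg18.fa"
LIB_FILE = "dindel_output.libraries.txt"

def window_index(window_file):
    """Extract N from a name ending in '.N.txt' (assumed naming scheme)."""
    match = re.search(r"\.(\d+)\.txt$", window_file)
    if match is None:
        raise ValueError("unexpected window file name: " + window_file)
    return int(match.group(1))

def plan_stage2(window_files, bam_file):
    """Return (commands, glf_files) for the given window files, in numeric order."""
    commands, glf_files = [], []
    for window_file in sorted(window_files, key=window_index):
        prefix = "sample.dindels.%d" % window_index(window_file)
        commands.append([
            DINDEL, "--analysis", "indels", "--doDiploid",
            "--bamFile", bam_file,
            "--outputFile", prefix,
            "--ref", REF,
            "--varFile", window_file,
            "--libFile", LIB_FILE,
        ])
        glf_files.append(prefix + ".glf.txt")
    return commands, glf_files

# sample.realign_windows.2.txt is a hypothetical second window file.
commands, glf_files = plan_stage2(
    ["sample.realign_windows.2.txt", "sample.realign_windows.1.txt"],
    "recal_bwa_rmdup_XXXX.bam",
)
for command in commands:
    print(" ".join(command))
print("\n".join(glf_files))
```

To actually run it, one could collect the inputs with glob.glob("sample.realign_windows.*.txt"), execute each command with subprocess.run(command, check=True), and write glf_files, one name per line, to sample.dindel_stage2_outputfiles.txt.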
and then merge the results into a VCF file:
python /usr/local/package/dindel-1.01/dindel-1.01-python/mergeOutputDiploid.py --inputFiles sample.dindel_stage2_outputfiles.txt --outputFile dindel.variants.vcf --ref /GENOTYPAGE/data/pubdb/ucsc/hg18/chromosomes/hg18.fa
Number of non-empty GLF files: 1
Calling variants from GLF file sample.dindels.1.glf.txt

more dindel.variants.vcf
##fileformat=VCFv4.0
##source=Dindel
##reference=/GENOTYPAGE/data/pubdb/ucsc/hg18/chromosomes/hg18.fa
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total number of reads in haplotype window">
##INFO=<ID=HP,Number=1,Type=Integer,Description="Reference homopolymer tract length">
##INFO=<ID=NF,Number=1,Type=Integer,Description="Number of reads covering non-ref variant on forward strand">
##INFO=<ID=NR,Number=1,Type=Integer,Description="Number of reads covering non-ref variant on reverse strand">
##INFO=<ID=NFS,Number=1,Type=Integer,Description="Number of reads covering non-ref variant site on forward strand">
##INFO=<ID=NRS,Number=1,Type=Integer,Description="Number of reads covering non-ref variant site on reverse strand">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype quality">
##ALT=<ID=DEL,Description="Deletion">
##FILTER=<ID=q20,Description="Quality below 20">
##FILTER=<ID=hp10,Description="Reference homopolymer length was longer than 10">
##FILTER=<ID=fr0,Description="Non-ref allele is not covered by at least one read on both strands">
##FILTER=<ID=wv,Description="Other indel in window had higher likelihood">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
chrX_random 46862 . CTT C 42 PASS DP=2;NF=1;NR=1;NRS=1;NFS=1;HP=2 GT:GQ 1/1:6
chrX_random 49403 . g gC 3 q20 DP=2;NF=1;NR=0;NRS=1;NFS=1;HP=1 GT:GQ 0/1:3
chrX_random 61565 . ga g 49 PASS DP=5;NF=1;NR=1;NRS=1;NFS=1;HP=2 GT:GQ 1/1:6
chrX_random 84517 . G GA 9 q20 DP=2;NF=1;NR=1;NRS=1;NFS=1;HP=9 GT:GQ 1/1:6
chrX_random 149111 . c cTG 11 q20 DP=4;NF=0;NR=1;NRS=0;NFS=1;HP=1 GT:GQ 1/1:4
chrX_random 1462787 . atg a 59 PASS DP=2;NF=1;NR=1;NRS=1;NFS=1;HP=1 GT:GQ 1/1:6
chrX_random 1468928 . A AAG 11 q20 DP=3;NF=1;NR=1;NRS=2;NFS=1;HP=2 GT:GQ 1/1:6
chrX_random 1472124 . C CGT 4 q20 DP=3;NF=0;NR=1;NRS=0;NFS=1;HP=1 GT:GQ 1/1:4
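Most calls above carry the q20 filter, so a quick way to keep only the PASS records and label them as insertions or deletions is sketched below. It is a minimal reader, not a full VCF parser: it assumes no field contains whitespace (true for the records shown) and takes column names from the #CHROM line; read_vcf and indel_type are hypothetical helpers, and the example uses three records from the file.

```python
def read_vcf(lines):
    """Yield one dict per data line, keyed by the #CHROM header columns."""
    columns = None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#"):
            columns = line[1:].split()
            continue
        if columns is None:
            raise ValueError("data line found before the #CHROM header")
        yield dict(zip(columns, line.split()))

def indel_type(record):
    """Classify a record by comparing REF and ALT lengths."""
    if len(record["ALT"]) > len(record["REF"]):
        return "insertion"
    if len(record["ALT"]) < len(record["REF"]):
        return "deletion"
    return "other"

# Three records copied from dindel.variants.vcf.
example = [
    "##fileformat=VCFv4.0",
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE",
    "chrX_random 46862 . CTT C 42 PASS DP=2;NF=1;NR=1;NRS=1;NFS=1;HP=2 GT:GQ 1/1:6",
    "chrX_random 49403 . g gC 3 q20 DP=2;NF=1;NR=0;NRS=1;NFS=1;HP=1 GT:GQ 0/1:3",
    "chrX_random 61565 . ga g 49 PASS DP=5;NF=1;NR=1;NRS=1;NFS=1;HP=2 GT:GQ 1/1:6",
]
passed = [record for record in read_vcf(example) if record["FILTER"] == "PASS"]
for record in passed:
    print(record["CHROM"], record["POS"], indel_type(record))
```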
Cover for Immunity
http://www.cell.com/immunity/abstract/S1074-7613%2810%2900401-2
Cell-Cell Propagation of NF-κB Transcription Factor and MAP Kinase Activation Amplifies Innate Immunity against Bacterial Infection. Immunity, Volume 33, Issue 5, 804-816, 18 November 2010. Christoph Alexander Kasper, Isabel Sorg, Christoph Schmutz, Therese Tschon, Harry Wischnewski, Man Lyang Kim, Cécile Arrieumerlou
