User:Lindenb/Notebook/UMR915/20100628

From OpenWetWare



How many variations are in the previous pileups?

for F in recal_bwa_rmdup_Brs*gz; do echo -n "$F "; gunzip -c $F  | wc -l ; done
recal_bwa_rmdup_Brs10.pileup.gz 118113
recal_bwa_rmdup_Brs1.pileup.gz 92292
recal_bwa_rmdup_Brs2.pileup.gz 92586
recal_bwa_rmdup_Brs3.pileup.gz 94320
recal_bwa_rmdup_Brs4.pileup.gz 91001
recal_bwa_rmdup_Brs5.pileup.gz 97320
recal_bwa_rmdup_Brs6.pileup.gz 90225
recal_bwa_rmdup_Brs7.pileup.gz 97952
recal_bwa_rmdup_Brs8.pileup.gz 96436
recal_bwa_rmdup_Brs9.pileup.gz 103912
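The same count can be wrapped in a small function that also prints a grand total. This is only a sketch: `count_pileup_lines` is a hypothetical helper name, and it assumes, like the loop above, that each line of the pileup file is one variation.

```shell
#!/bin/sh
# Hypothetical helper: print "<file> <line count>" for every gzipped
# pileup given as argument, then a final "TOTAL <sum>" line.
# Assumption: one line per variation, as in the loop above.
count_pileup_lines() {
    total=0
    for f in "$@"; do
        # tr strips the padding that some wc implementations add
        n=$(gunzip -c "$f" | wc -l | tr -d ' ')
        echo "$f $n"
        total=$((total + n))
    done
    echo "TOTAL $total"
}

# Usage, with the file names listed above:
# count_pileup_lines recal_bwa_rmdup_Brs*.pileup.gz
```

Quoting `"$f"` keeps the loop safe if a file name ever contains a space.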

insert variations in DB

 ~/bin/insertvariants.sh -s XX -C  -d "bwa recal rmdup for XX" -type pileup recal_bwa_rmdupXX.pileup.gz

Hmm, some variations in the pileup have alt=ref. Remove those variations:

 mysql -u root -D umr915 -N -e  'select id from variation where ref=alt ' |\
 awk '{printf("delete from vcf_call where variation_id=%s;\ndelete from variation where id=%s;\n",$1,$1);}' > jeter.sql
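To check what the awk step produces without touching the database, the generator can be run on a few hand-written ids. A minimal sketch: `gen_delete_sql` is a hypothetical name, the ids are made up, and the column is assumed to be `variation_id` in `vcf_call`. The numeric guard is an addition that skips anything that is not a plain integer id.

```shell
#!/bin/sh
# Hypothetical helper: read one variation id per line on stdin and
# emit the two delete statements for it.
# Assumption: the vcf_call column is named variation_id.
gen_delete_sql() {
    awk '/^[0-9]+$/ {
        printf("delete from vcf_call where variation_id=%s;\n", $1);
        printf("delete from variation where id=%s;\n", $1);
    }'
}

# Dry run on made-up ids; nothing is sent to MySQL here.
printf '12\n34\n' | gen_delete_sql
```

The generated file still has to be fed to the server afterwards, for example with `mysql -u root -D umr915 < jeter.sql`.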

Generating input for PolyPhen-2 (PPH2)

 mysql -u root -D umr915 -N -e 'select concat(chrom,":",position+1),concat(ref,"/",alt) from variation where ref in ("A","T","G","C") and alt in ("A","T","G","C")' > jeter.txt
 wc -l jeter.txt 
 271938 jeter.txt

Wrote a tool for reading SIFT output:

http://code.google.com/p/code915/source/browse/trunk/tools/src/java/fr/inserm/umr915/tools/SiftToSQL.java

Generating input for SIFT

 mysql -h 172.18.241.112 -u anonymous -D umr915 -N -e 'select chrom,position+1,1,concat(ref,"/",alt) from variation where ref in ("A","T","G","C") and alt in ("A","T","G","C")' |\
 sed 's/^chr//' | tr "\t" "," > jeter.txt
 # SIFT only accepts files < 1 MB
 split -C 900kB jeter.txt sift_
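The reformatting step can be tried on its own with a couple of fake rows. This is a sketch under assumptions: `to_sift_csv` is a hypothetical name, the rows are invented, and the input is taken to be tab-separated as `mysql -N` prints it in a pipe.

```shell
#!/bin/sh
# Hypothetical helper: turn tab-separated "chrom pos 1 ref/alt" rows
# into the comma-separated lines written to jeter.txt above:
# strip the leading "chr", then replace tabs with commas.
to_sift_csv() {
    sed 's/^chr//' | tr '\t' ','
}

# Made-up rows, only to show the transformation.
printf 'chr1\t1000\t1\tA/G\nchrX\t2000\t1\tC/T\n' | to_sift_csv
# prints:
# 1,1000,1,A/G
# X,2000,1,C/T
```

After `split`, something like `wc -c sift_*` gives a quick check that every chunk stays under the size limit.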

MySQL table for the SIFT results:

 create table sift(
 id int unsigned primary key auto_increment,
 variation_id  int unsigned not null ,
 index(variation_id),
 foreign key(variation_id) references variation(id) on delete cascade,
 meta text,
 creation datetime,
 modified timestamp,
 prediction enum("DAMAGING" , "DAMAGING *Warning! Low confidence." , "Not scored" , "TOLERATED"),
 score float ,
 median_info float
 )  engine=InnoDB;