User:Lindenb/Notebook/UMR915/20101110
From OpenWetWare
Cedric's data: we are trying to get the coverage for the exons of interest. The input is a set of GERALD files. Nice input from Biostar:
Anatomy of an 'export' file:
>>1
$1 ? : HWUSI-EAS454
$2 ? : 14
$3 ? : 1
$4 ? : 1
$5 ? : 2390
$6 ? : 1116
$7 ? : 0
$8 ? : 1
$9 ? : GATTACACCAGATGCAACGATGTCAATGTAAAACTCAGGAAANNNNNNNNGGCAAGGAAATATGANNNNNNNNNAG
$10 ? : caccccccc_ccc_[cc[cBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
$11 ? : c18.fa
$12 ? :
$13 ? : 35086853
$14 ? : F
$15 ? : 42TATAAAAA15TACCTGCAA2
$16 ? : 71
$17 ? : 0
$18 ? :
$19 ? :
$20 ? : 0
$21 ? : N
$22 ? : Y
<<1
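A listing like the one above can be produced with a short awk one-liner. This is only a minimal sketch: the function name export_anatomy is hypothetical, it assumes the export file is tab-delimited, and it prints a simplified "$N : value" line without the "?" column-name placeholder.

```shell
# Hypothetical helper: print each tab-separated field of the first export
# record together with its awk field number, one field per line.
# Assumption: the export file is tab-delimited (empty fields are kept).
export_anatomy() {
  awk -F '\t' 'NR == 1 { for (i = 1; i <= NF; i++) printf("$%d : %s\n", i, $i) }' "$@"
}
```

Possible usage: cat ${GERALD}/s_${I}_*_export.txt | export_anatomy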
Trying to get the coverage per exon:
GERALD=1.GERALD
BEDBIN=BEDTools-Version-2.10.1/bin
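for I in XXXXXXXXXXXXXXXXXXX
do
# keep the reads passing the filter (last column is 'Y') on the chromosome of interest,
# drop the reads outside the region, and write the remaining ones as BED intervals
cat ${GERALD}/s_${I}_*_export.txt |\
  egrep 'Y$' |\
  grep -w "cXXXXXXXXXXXXXXXXXXXXXXXXX\.fa" |\
  awk -F '\t' '{S=int($13); E=S+length($9); if(E < XXXXXXXXXXXXXXXXX || S > XXXXXXXXXXXXXXX ) next; printf("chrXXXXXXXXXXXXXXX\t%d\t%d\n",S-1,E-1); }' > jeter.bed
# depth at each position of each exon
${BEDBIN}/coverageBed -d -a jeter.bed -b ${HOME}/exons.bed > coverage${I}.txt
done
rm jeter.bed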
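The coverage${I}.txt files hold one line per exon position, so a per-exon summary needs one more step. A minimal sketch, not part of the original pipeline: the function name mean_depth_per_exon is hypothetical, and it assumes that the first three columns of each line are the exon (chrom, start, end) and that the last column is the depth at one position.

```shell
# Hypothetical helper: mean depth per exon from `coverageBed -d` output.
# Assumptions: tab-separated input, columns 1-3 identify the exon,
# and the last column is the depth at a single position of that exon.
mean_depth_per_exon() {
  awk -F '\t' '
    { key = $1 "\t" $2 "\t" $3; sum[key] += $NF; n[key]++ }
    END { for (k in sum) printf("%s\t%.2f\n", k, sum[k] / n[k]) }
  ' "$@" | sort -k1,1 -k2,2n
}
```

Possible usage: mean_depth_per_exon coverage${I}.txt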