RRedon:Protocols/Variation pipeline/Reference genome

=Download=
Download the hg18 (build 36) human reference genome from UCSC: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes

 ## set the proxy if needed
 export http_proxy=${PXYHOST}:${PXYPORT}
 wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip"
 md5sum chromFa.zip
 7fc7f751134f3800f646118e39f9991d  chromFa.zip
 ## OK, same as http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/md5sum.txt
 unzip chromFa.zip
 ## concatenate all chromosomes except the alternate haplotype (_hap) assemblies
 ls chr*.fa | grep -v _hap | xargs cat > hg18.fa
 rm -f chr*.fa
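The manual checksum comparison above could be scripted. The following is a minimal sketch, assuming GNU `md5sum` is available; `verify_md5` is a hypothetical helper name, not part of the original protocol.

```shell
# verify_md5 FILE EXPECTED_MD5   (hypothetical helper)
# Returns 0 when the MD5 digest of FILE equals EXPECTED_MD5, 1 otherwise.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got '$actual', expected '$expected')" >&2
        return 1
    fi
}

# Example call, using the digest shown in the md5sum output above:
# verify_md5 chromFa.zip 7fc7f751134f3800f646118e39f9991d && unzip chromFa.zip
```

Chaining the call with `&&` stops the pipeline before `unzip` if the download is corrupted.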

=Indexing=

==MAQ==
(main article for MAQ).

Convert the reference to MAQ's binary FASTA (.bfa) format:
 maq fasta2bfa hg18.fa hg18.bfa

 (...)
 -- 45 sequences have been converted
 ls -lah
 -rw-r--r-- 1 root root 1,5G jun 2 17:13 hg18.bfa
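The sequence count reported by MAQ can be cross-checked against the concatenated FASTA file. This is a rough sketch; `count_fasta_records` is a hypothetical helper that simply counts header lines, and it assumes every record starts with a `>` line.

```shell
# count_fasta_records FILE   (hypothetical helper)
# Prints the number of FASTA header lines (lines starting with '>') in FILE.
count_fasta_records() {
    # grep -c exits with status 1 when the count is 0, so mask that status.
    grep -c '^>' "$1" || true
}

# Example check against the number reported by maq fasta2bfa above:
# [ "$(count_fasta_records hg18.fa)" -eq 45 ] && echo "record count matches"
```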

==BWA==
(main article for BWA).

Index the reference genome:
 bwa index -a bwtsw hg18.fa

 (....)
 [bwt_gen] Finished constructing BWT in 311 iterations.
 [bwa_index] 2229.02 seconds elapse.
 [bwa_index] Update BWT... 15.79 sec
 [bwa_index] Update reverse BWT... 15.97 sec
 [bwa_index] Construct SA from BWT and Occ... 1001.58 sec
 [bwa_index] Construct SA from reverse BWT and Occ... 987.96 sec

 ls -la
 -rw-r--r-- 1 root root 3,0G jun 2 14:47 hg18.fa
 -rw-r--r-- 1 root root 123K jun 2 15:15 hg18.fa.amb
 -rw-r--r-- 1 root root 1,8K jun 2 15:15 hg18.fa.ann
 -rw-r--r-- 1 root root 1,1G jun 2 16:30 hg18.fa.bwt
 -rw-r--r-- 1 root root 739M jun 2 15:15 hg18.fa.pac
 -rw-r--r-- 1 root root 1,1G jun 2 16:30 hg18.fa.rbwt
 -rw-r--r-- 1 root root 739M jun 2 15:15 hg18.fa.rpac
 -rw-r--r-- 1 root root 370M jun 2 17:03 hg18.fa.rsa
 -rw-r--r-- 1 root root 370M jun 2 16:47 hg18.fa.sa
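Since indexing takes a long time, it may be worth confirming that all index files were written before starting an alignment. The sketch below is one possible check; `check_bwa_index` is a hypothetical helper, and the extension list is taken from the listing above (other BWA versions may produce a different set of files).

```shell
# check_bwa_index PREFIX   (hypothetical helper)
# Returns 0 if every index file from the listing above exists and is non-empty.
check_bwa_index() {
    prefix=$1
    status=0
    # Extensions copied from the ls output above; an assumption for other BWA versions.
    for ext in amb ann bwt pac rbwt rpac rsa sa; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            status=1
        fi
    done
    return $status
}

# Example call:
# check_bwa_index hg18.fa && echo "BWA index looks complete"
```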

==Samtools==
 samtools faidx hg18.fa

This creates the index file:

 hg18.fa.fai
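The `.fai` index is a tab-separated text file whose first two columns are the sequence name and its length, so it can be used for a quick summary of the reference. A minimal sketch, assuming a standard `awk`; `fai_summary` is a hypothetical helper name.

```shell
# fai_summary FAI_FILE   (hypothetical helper)
# Prints "<number of sequences> <total bases>" computed from a .fai index.
fai_summary() {
    # Column 1 is the sequence name, column 2 its length in bases.
    # %.0f avoids 32-bit integer overflow in some awk implementations.
    awk -F '\t' '{ n++; total += $2 } END { printf "%d %.0f\n", n, total }' "$1"
}

# Example call:
# fai_summary hg18.fa.fai
```

The first number printed should agree with the sequence count reported during the MAQ conversion if both tools read the same `hg18.fa`.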