User:Timothee Flutre/Notebook/Postdoc/2012/02/01
From OpenWetWare
Find SNPs in cis of genes
Download the Ensembl gene annotations (hg19) from UCSC:
wget -O Ensembl_hg19_UCSC_20111019.txt.gz ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ensGene.txt.gz
To view the description of the format, go to the UCSC Table Browser and choose the relevant parameters, in our case: clade="Mammal", genome="Human", assembly="hg19", group="Genes and Gene Prediction Tracks", track="Ensembl Genes" and table="ensGene". Finally, click on "describe table schema".
zcat Ensembl_hg19_UCSC_20111019.txt.gz | awk '{print $3"\t"$5"\t"$6"\t"$13"|"$2}' | gzip > Ensembl_transcripts.bed.gz
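To make the column choice in the awk command above concrete, here is a minimal sketch that runs the same awk program on a single toy row. The row follows the ensGene column order given by the table schema; every value in it (including the names TOY_TRANSCRIPT and TOY_GENE) is made up for illustration.

```shell
# One toy ensGene-style row (tab-separated, 16 columns); all values are made up.
# Columns: bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount
#          exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
toy_row=$(printf '585\tTOY_TRANSCRIPT\tchr1\t+\t1000\t5000\t1200\t4800\t2\t1000,3000,\t2000,5000,\t0\tTOY_GENE\tcmpl\tcmpl\t0,1,')

# Same awk program as above: chrom, txStart, txEnd, then "gene|transcript".
bed_row=$(printf '%s\n' "$toy_row" | awk '{print $3"\t"$5"\t"$6"\t"$13"|"$2}')
printf '%s\n' "$bed_row"
```

The printed line has four tab-separated fields: chromosome, transcript start, transcript end, and the gene identifier (name2) joined to the transcript identifier (name) with a pipe.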
transcripts2genes.py Ensembl_hg19_UCSC_20111019.txt.gz Ensembl_genes.bed.gz
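The script transcripts2genes.py is not shown in this entry. As a rough illustration only, the sketch below shows one plausible way to collapse transcripts into gene-level intervals: for each gene, keep the smallest transcript start and the largest transcript end. The five-column output (chromosome, start, end, gene, strand) is an assumption, as are the toy rows; the real script may behave differently.

```shell
# Hypothetical stand-in for transcripts2genes.py (the real script is not shown).
# Two toy transcripts of one gene, in ensGene column order; unused columns are ".".
toy_transcripts=$(printf '1\tTX_A\tchr1\t+\t1000\t5000\t.\t.\t.\t.\t.\t.\tTOY_GENE\n1\tTX_B\tchr1\t+\t800\t4000\t.\t.\t.\t.\t.\t.\tTOY_GENE\n')

genes_bed=$(printf '%s\n' "$toy_transcripts" | awk -F'\t' '
  {
    g = $13                                   # gene identifier (name2)
    if (!(g in lo)) { chrom[g] = $3; strand[g] = $4; lo[g] = $5; hi[g] = $6 }
    if ($5 + 0 < lo[g] + 0) lo[g] = $5        # keep the smallest txStart
    if ($6 + 0 > hi[g] + 0) hi[g] = $6        # keep the largest txEnd
  }
  END { for (g in lo) print chrom[g] "\t" lo[g] "\t" hi[g] "\t" g "\t" strand[g] }')
printf '%s\n' "$genes_bed"
```

With these two toy transcripts, the gene interval spans from the start of TX_B to the end of TX_A.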
for i in {1..22}; do
  echo "chr${i}..."
  # SNP position (column 3) -> 1-bp BED interval; windowBed pairs each gene with SNPs within 500 kb
  awk -v i=${i} -F" " '{print "chr"i"\t"($3-1)"\t"$3"\t"$2}' /path/to/chr${i}.impute | \
  windowBed -w 500000 -a Ensembl_genes.bed.gz -b stdin | \
  awk '{print $4"\t"$9"|"$8}' | \
  gzip > chr${i}_genes_cisSNPs.txt.gz
done
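The first awk call in the loop above turns each SNP row into a one-base BED interval before it is passed to windowBed. The sketch below runs that same awk program on one toy row. The row layout (SNP identifier in column 2, 1-based position in column 3) is what the field indexes in the loop assume; the values, including rsTOY1, are made up.

```shell
# One toy row in the layout the loop assumes:
# column 2 = SNP identifier, column 3 = 1-based position (values are made up).
toy_snp='--- rsTOY1 12345 A G 0.1 0.8 0.1'
i=7   # chromosome number, as set by the for loop

# Same awk program as in the loop: chrom, position-1 (BED start), position (BED end), SNP id.
snp_bed=$(printf '%s\n' "$toy_snp" | awk -v i=${i} -F" " '{print "chr"i"\t"$3-1"\t"$3"\t"$2}')
printf '%s\n' "$snp_bed"
```

Subtracting 1 from the position gives the 0-based start that BED expects, so the interval covers exactly the SNP's base.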



