User:Timothee Flutre/Notebook/Postdoc/2012/05/25: Difference between revisions
From OpenWetWare
(→One-liners with GNU tools: add "sort file with header")
(→One-liners with GNU tools: add tuto cmd-line)
(One intermediate revision by the same user not shown)
==One-liners with GNU tools==
* '''Toolbox''': often available by default on many computers running GNU/Linux
** [https://en.wikipedia.org/wiki/Bash_%28Unix_shell%29 Bash]
** [https://en.wikipedia.org/wiki/AWK AWK]
** [https://en.wikipedia.org/wiki/Grep grep]
** [https://en.wikipedia.org/wiki/Sed sed]
** [https://en.wikipedia.org/wiki/GNU_Core_Utilities GNU coreutils] (head, tail, cut, uniq, sort, tr, ...)
* '''Tutorials''':
** [http://en.flossmanuals.net/command-line/index/ Introduction to the command-line]
** [http://www.ibm.com/developerworks/aix/library/au-unixtext/index.html Introduction to text manipulation on UNIX-based systems] by Brad Yoes (IBM)
 $ (echo -e "x\ty"; for i in {1..10}; do echo -e $i"\t"$RANDOM; done) | (read -r; printf "%s\n" "$REPLY"; sort -k2,2n)
* '''Get rows from a big file which are also in a small file''': an example of using awk with two input files. The keys from the small file are first loaded into an array in memory; the big file is then parsed line by line, and each line whose key is found in the array is printed.
 $ echo -e "gene\tsnp\tpvalue\ngene1\tsnp1\t0.002\ngene2\tsnp2\t0.8\ngene2\tsnp3\t0.1" > file_all.txt
 $ echo -e "gene1\tsnp1" > file_subset.txt
 $ awk 'NR==FNR{a[$1$2]++;next;}{x=$1$2;if(x in a)print $0}' file_subset.txt <(sed 1d file_all.txt)
Revision as of 23:00, 3 November 2013
One-liners with GNU tools
$ for i in {1..10}; do echo $i; done | sed 3,6d
$ for i in {1..20}; do echo $i; done | sed -n 3,5p
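The two `sed` commands above delete lines 3 to 6 of their input and print only lines 3 to 5, respectively. As a minimal sketch, the same selections can be written with `awk` and its `NR` record counter. The function names `drop_lines` and `keep_lines` are hypothetical, and `seq` stands in for the `for` loops.

```shell
# Delete lines 3 to 6 (same effect as: sed 3,6d)
drop_lines() { awk 'NR < 3 || NR > 6'; }

# Print only lines 3 to 5 (same effect as: sed -n 3,5p)
keep_lines() { awk 'NR >= 3 && NR <= 5'; }

seq 1 10 | drop_lines
seq 1 20 | keep_lines
```

The `awk` form is convenient when the line selection has to be combined with a test on the fields of each line.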
$ for i in {-5..5}; do echo $i; done | awk 'function abs(x){return (((x < 0.0) ? -x : x) + 0.0)} {print abs($1)}'
$ echo -e "gene\tsnp\tpvalue\ng1\ts1\t0.3\ng1\ts2\t0.002\ng2\ts2\t0.7\ng2\ts3\t0.05" > dat.txt
$ cat dat.txt
gene snp pvalue
g1 s1 0.3
g1 s2 0.002
g2 s2 0.7
g2 s3 0.05
$ cat dat.txt | sed 1d | sort -k1,1 -k3,3g | awk '{print $3"\t"$2"\t"$1}' | uniq -f2 | awk '{print $3"\t"$2"\t"$1}'
g1 s2 0.002
g2 s3 0.05
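The pipeline above keeps, for each gene, the row with the smallest p-value by sorting and then using `uniq` on the gene column. A possible alternative, sketched here under the assumption that the input is tab-separated with one header line, does the same in a single `awk` pass. The function name `best_per_gene` is hypothetical.

```shell
# Read a tab-separated table (gene, snp, pvalue) with a header on stdin and
# print, for each gene, the row with the smallest p-value.
best_per_gene() {
  awk -F'\t' '
    NR > 1 {
      # Adding 0 forces a numeric comparison, so 1e-5 is handled correctly.
      if (!($1 in best) || $3 + 0 < best[$1] + 0) {
        best[$1] = $3
        row[$1] = $0
      }
    }
    END { for (g in row) print row[g] }
  ' | sort -k1,1   # awk array order is unspecified, so sort by gene
}

printf 'gene\tsnp\tpvalue\ng1\ts1\t0.3\ng1\ts2\t0.002\ng2\ts2\t0.7\ng2\ts3\t0.05\n' | best_per_gene
```

This variant keeps one row per gene in memory, so it does not sort the whole input; only the reduced result goes through `sort`.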
$ subgroups=("s1" "s2" "s3" "s4"); for i in {0..2}; do let a=$i+1; for j in $(seq $a 3); do s1=${subgroups[$i]}; s2=${subgroups[$j]}; echo $s1 $s2; done; done
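The loop above prints every unordered pair of subgroups, with the index bounds (`{0..2}` and `3`) hard-coded for four elements. A minimal sketch of a variant that works for any number of items is shown here; it relies on the shell's positional parameters rather than array indices, and the function name `all_pairs` is hypothetical.

```shell
# Print every unordered pair of the arguments, one pair per line.
all_pairs() {
  for a in "$@"; do
    shift                    # drop the current item; "$@" now holds the items after it
    for b in "$@"; do
      printf '%s %s\n' "$a" "$b"
    done
  done
}

all_pairs s1 s2 s3 s4
```

With a Bash array such as `subgroups`, the call would be `all_pairs "${subgroups[@]}"`.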
$ awk 'BEGIN{RS=">"} {if(NF==0)next; split($0,a,"\n"); printf "@"a[1]"\n"a[2]"\n+\n"; \
  for(i=1;i<=length(a[2]);i++)printf "}"; printf"\n"}' probes.fa > probes.fq
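The command above converts a FASTA file to FASTQ by writing a dummy quality string of `}` characters, and it reads the sequence from the first line after each header only. The sketch below is one way to also handle sequences wrapped over several lines. The function name `fasta_to_fastq` is hypothetical, and the `}` placeholder is simply carried over from the one-liner.

```shell
# Convert FASTA on stdin to FASTQ on stdout with a dummy quality string.
fasta_to_fastq() {
  awk '
    # Print the record collected so far, if any.
    function emit(   i, qual) {
      if (name == "") return
      qual = ""
      for (i = 1; i <= length(seq); i++) qual = qual "}"
      printf "@%s\n%s\n+\n%s\n", name, seq, qual
    }
    /^>/ { emit(); name = substr($0, 2); seq = ""; next }  # header line starts a new record
    { seq = seq $0 }                                       # sequence line: append
    END { emit() }
  '
}

printf '>p1 desc\nACGT\nAC\n>p2\nGG\n' | fasta_to_fastq
```

Passing the header and sequence as `printf` arguments, rather than as part of the format string, avoids problems if a header contains a `%` character.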
$ (echo -e "x\ty"; for i in {1..10}; do echo -e $i"\t"$RANDOM; done) | (read -r; printf "%s\n" "$REPLY"; sort -k2,2n)
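The idea in the command above is to read the header line with `read`, print it unchanged, and let `sort` process only the remaining lines. A minimal sketch of the same idea wrapped in a reusable function follows; the name `sort_keep_header` is hypothetical, and fixed values replace `$RANDOM` so that the result is reproducible.

```shell
# Print the first line of stdin as is, then sort the remaining lines.
# All arguments are passed to sort (for example -k2,2n).
sort_keep_header() {
  IFS= read -r header        # consumes exactly one line from the pipe
  printf '%s\n' "$header"
  sort "$@"
}

printf 'x\ty\n1\t30\n2\t10\n3\t20\n' | sort_keep_header -k2,2n
```

Because `read` takes only the first line, `sort` sees the rest of the stream untouched.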
$ echo -e "gene\tsnp\tpvalue\ngene1\tsnp1\t0.002\ngene2\tsnp2\t0.8\ngene2\tsnp3\t0.1" > file_all.txt
$ echo -e "gene1\tsnp1" > file_subset.txt
$ awk 'NR==FNR{a[$1$2]++;next;}{x=$1$2;if(x in a)print $0}' file_subset.txt <(sed 1d file_all.txt)
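In the `awk` command above, `NR==FNR` is true only while the first file is being read, so the first block stores the keys of the small file and the second block filters the big file. The key `$1$2` joins the two fields without a separator, so distinct pairs such as ("ab", "c") and ("a", "bc") would give the same key. The sketch below is one possible variant that uses awk's comma subscripts, which are joined with `SUBSEP`, and skips the header inside `awk`. The function name `filter_rows` and the temporary directory are assumptions made for the example.

```shell
# $1: small file with the keys (gene, snp); $2: big file with a header line.
filter_rows() {
  awk -F'\t' '
    NR == FNR { keep[$1, $2] = 1; next }   # first file: remember each key
    FNR == 1  { next }                     # second file: skip the header
    ($1, $2) in keep                       # print rows whose key was stored
  ' "$1" "$2"
}

tmpdir=$(mktemp -d)
printf 'gene\tsnp\tpvalue\ngene1\tsnp1\t0.002\ngene2\tsnp2\t0.8\ngene2\tsnp3\t0.1\n' > "$tmpdir/file_all.txt"
printf 'gene1\tsnp1\n' > "$tmpdir/file_subset.txt"
filter_rows "$tmpdir/file_subset.txt" "$tmpdir/file_all.txt"
rm -r "$tmpdir"
```

As with the original command, the `NR==FNR` test assumes that the first file is not empty.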