User talk:Darek Kedra/sandbox 26

=GFF Comparison=

GFF, in particular GFF3, is a fairly common standard for storing sequence feature annotation in tab-delimited text files. For a description see: http://www.sequenceontology.org/gff3.shtml

When annotating a genome with multiple tools, one often needs to compare their outputs, e.g. gene predictions or EST/protein mappings. Given two GFF files (A and B) with gene models, one can compare them on various levels, such as:

 * nucleotide level: how many nucleotides annotated as features (e.g. nucleotides in exons) are in both sets
 * splice junction level
 * exon level: how many exons on the same strand match exactly
 * gene level: how many genes are identical
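The nucleotide and exon levels can be illustrated with a minimal Python sketch. It is not taken from any of the tools listed on this page; the function names and the sample records are made up for illustration. It assumes that column 3 is exactly "exon" in both files and that the nucleotide level is counted per strand. Splice junction and gene level comparisons need exons grouped into transcripts and are left out.

```python
def read_exons(lines, feature_type="exon"):
    """Collect (seqid, strand, start, end) for rows whose column 3 equals feature_type."""
    exons = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[2] != feature_type:
            continue
        exons.add((cols[0], cols[6], int(cols[3]), int(cols[4])))
    return exons


def covered_positions(exons):
    """Expand exons into (seqid, strand, position) tuples; GFF coordinates are 1-based, inclusive."""
    positions = set()
    for seqid, strand, start, end in exons:
        positions.update((seqid, strand, pos) for pos in range(start, end + 1))
    return positions


def compare_exons(exons_a, exons_b):
    """Return nucleotide-level and exon-level agreement between two exon sets."""
    nt_a = covered_positions(exons_a)
    nt_b = covered_positions(exons_b)
    return {
        "nucleotides_a": len(nt_a),
        "nucleotides_b": len(nt_b),
        "shared_nucleotides": len(nt_a & nt_b),
        "identical_exons": len(exons_a & exons_b),
    }


# Made-up sample records, for illustration only.
GFF_A = """##gff-version 3
chr1\tsrcA\texon\t100\t200\t.\t+\t.\tID=a1
chr1\tsrcA\texon\t300\t400\t.\t+\t.\tID=a2
"""
GFF_B = """##gff-version 3
chr1\tsrcB\texon\t100\t200\t.\t+\t.\tID=b1
chr1\tsrcB\texon\t350\t450\t.\t+\t.\tID=b2
"""

result = compare_exons(read_exons(GFF_A.splitlines()), read_exons(GFF_B.splitlines()))
print(result)
```

Expanding every exon into single positions is simple but memory-hungry; for whole genomes an interval-based approach would be a better fit.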

For more information read the master's thesis of Evan Keibler (author of eval): http://mblab.wustl.edu/software/download/eval-documentation.pdf

CAVEAT: the tools listed below are often fairly simple. Some do not take the "type" field (column 3) into account, so one can end up comparing exons from one file with a combined set of genes, exons and introns from another. Some programs smuggle extra information about first/last exons into the "type" field, so all exons from one file will be compared with only a subset of exons from the other. Always check that the GFF data is compatible.
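One quick way to check compatibility is to tabulate the values of column 3 in each file before comparing them. The sketch below is only an illustration: the function names are hypothetical, and the type labels in the sample data (such as first_exon) are invented to mimic programs that encode exon position in the type field.

```python
from collections import Counter


def type_counts(lines):
    """Count how often each value occurs in column 3 ("type") of a GFF stream."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts


def types_in_one_file_only(counts_a, counts_b):
    """Return the type labels that occur in exactly one of the two files."""
    return sorted(set(counts_a) ^ set(counts_b))


# Made-up sample records, for illustration only.
TYPES_A = """chr1\tsrcA\tgene\t100\t600\t.\t+\t.\tID=g1
chr1\tsrcA\texon\t100\t200\t.\t+\t.\tParent=g1
chr1\tsrcA\texon\t300\t400\t.\t+\t.\tParent=g1
chr1\tsrcA\texon\t500\t600\t.\t+\t.\tParent=g1
"""
TYPES_B = """chr1\tsrcB\tfirst_exon\t100\t200\t.\t+\t.\tParent=g1
chr1\tsrcB\texon\t300\t400\t.\t+\t.\tParent=g1
chr1\tsrcB\tlast_exon\t500\t600\t.\t+\t.\tParent=g1
"""

counts_a = type_counts(TYPES_A.splitlines())
counts_b = type_counts(TYPES_B.splitlines())
mismatched = types_in_one_file_only(counts_a, counts_b)
print(counts_a, counts_b, mismatched)
```

Here file A has three rows of type exon while file B has only one, so a comparison restricted to "exon" would silently miss two of B's exons.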

=Perl scripts collection=

link: http://biowiki.org/GffTools/

Tested: gffsort.pl (sorts GFF streams by sequence name and startpoint)
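For comparison, the same kind of ordering can be sketched in a few lines of Python. This is not a port of gffsort.pl and its behaviour may differ in details; in particular, keeping comment and directive lines at the top is an assumption made here, and sort_gff is a hypothetical name.

```python
def sort_gff(lines):
    """Sort GFF records by sequence name, then by numeric start position.

    Comment and directive lines (starting with '#') are kept at the top
    in their original order; blank lines are dropped.
    """
    lines = list(lines)  # allow any iterable, including a file object
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if line.strip() and not line.startswith("#")]

    def sort_key(line):
        cols = line.split("\t")
        return (cols[0], int(cols[3]))

    return header + sorted(records, key=sort_key)


# Made-up sample records, for illustration only.
unsorted_lines = [
    "##gff-version 3\n",
    "chr2\tsrc\texon\t50\t90\t.\t+\t.\tID=e3\n",
    "chr1\tsrc\texon\t300\t400\t.\t+\t.\tID=e2\n",
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=e1\n",
]
sorted_lines = sort_gff(unsorted_lines)
print("".join(sorted_lines), end="")
```

Sequence names are compared as plain strings, so chr10 sorts before chr2; a natural-sort key would be needed if that matters.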

=Python efforts=

 * Brad Chapman's GFF parser: https://github.com/chapmanb/bcbb/tree/master/gff
 * GFFutils by Ryan Dale: https://github.com/daler/GFFutils

 * Pygr

main link: http://code.google.com/p/pygr/

discussion about GFF/annotation parsing: http://www.mail-archive.com/pygr-dev@googlegroups.com/msg01551.html

 * bpbio: http://code.google.com/p/bpbio/

 * bx-python: http://bitbucket.org/james_taylor/bx-python/overview

=Ruby=

 * BioRuby library: http://www.bioruby.org/rdoc/classes/Bio/GFF/GFF3.html

=Java=

 * BioJava module: http://www.biojava.org/docs/api/org/biojava/bio/program/gff/GFFTools.html

=Stand-alone programs=

 * Eval

link: http://mblab.wustl.edu/software/eval/

version: 2.2.8

Perl program with GUI.

 * GFPE

GFPE: gene-finding program evaluation. Bioinformatics (2003) 19 (13): 1712-1713. doi: 10.1093/bioinformatics/btg216

link: ftp://anonymous@iubio.bio.indiana.edu/molbio/genefind/

Program in Java.

 * overlap

link: http://big.crg.cat/services/overlap

author: Sarah Djebali