Wikiomics:Genome aligners
Latest revision as of 03:50, 17 March 2011
List of programs used for large-scale DNA alignment. At the moment, the statements are mostly taken from the web sites of the programs in question.
For an introduction, read: Lyons and Freeling, "How to usefully compare homologous plant genes and chromosomes as DNA sequences", 2008.
Also: "Parameters for accurate genome alignment" by Frith et al., BMC Bioinformatics 2010, 11:80 http://www.biomedcentral.com/1471-2105/11/80
Aligners
MUMmer
web: http://mummer.sourceforge.net/
version: MUMmer3.22.tar.gz from 2009-09-21
LAGAN Toolkit
http://lagan.stanford.edu/lagan_web/index.shtml
ver 2.0 from 2006
- LAGAN
- M-LAGAN
- Shuffle-LAGAN
Vmatch
http://www.vmatch.de/
Free of charge non-commercial license (requires faxing).
lastz (successor of blastz)
web site: http://www.bx.psu.edu/~rsharris/lastz/
latest stable release: 2010-Jan-12
documentation: http://www.bx.psu.edu/miller_lab/dist/README.lastz-1.02.00/README.lastz-1.02.00a.html
New releases: http://www.bx.psu.edu/~rsharris/lastz/newer/
last
http://last.cbrc.jp/
last release: last-159.zip (14-Feb-2011)
- can compare two vertebrate genomes (human vs. mouse: 1 day on 1 CPU)
- copes more efficiently with repeat-rich sequences
- can align a large number of sequences (e.g. next-generation sequencing data to a genome)
Use softmasked input sequences.
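A quick way to check whether an input FASTA file really is softmasked is to measure the fraction of lowercase bases. The sketch below is only an illustration: the function name `softmasked_fraction` is hypothetical, and it assumes that masking is encoded as lowercase letters in the sequence lines.

```shell
# Hypothetical helper (not part of last): print the fraction of lowercase,
# i.e. softmasked, bases in a FASTA file, with four decimal places.
softmasked_fraction() {
    LC_ALL=C awk '
        /^>/ { next }                      # skip FASTA header lines
        {
            total  += length($0)           # count all bases on this line
            masked += gsub(/[a-z]/, "")    # gsub returns the number of lowercase letters
        }
        END {
            if (total == 0) { print "0.0000"; exit }
            printf "%.4f\n", masked / total
        }
    ' "$1"
}
```

Usage: `softmasked_fraction genome1_sequence.fa`. A value of 0.0000 suggests the file is either unmasked or hardmasked (repeats replaced by N).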
#create database from one of the genomes (larger?) on a machine with > 20GB free RAM to speed up the process
lastdb -c -s20G -v genome1_db genome1_sequence.fa
#align the genomes with maf output
lastal -o genome2_vs_genome1.maf -v genome1_db genome2_sequence.fa
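Once the MAF file has been written, a rough sanity check is to count the alignment blocks and the aligned bases. The following is a minimal sketch, not a feature of last: the function name `maf_summary` is hypothetical, and it assumes standard MAF layout, where each block starts with an `a` line and each `s` line holds name, start, aligned size, strand, sequence length and text.

```shell
# Hypothetical helper: count MAF alignment blocks and sum the aligned size
# (4th field) of the first "s" line in each block.
maf_summary() {
    awk '
        $1 == "a"          { blocks++; first = 1; next }   # new alignment block
        $1 == "s" && first { bases += $4; first = 0 }      # first sequence of the block only
        END { printf "blocks=%d aligned_bases=%d\n", blocks, bases }
    ' "$1"
}
```

Usage: `maf_summary genome2_vs_genome1.maf`. Overlapping blocks are counted more than once, so the total is an upper bound on the covered bases rather than an exact coverage figure.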
YASS
web: http://bioinfo.lifl.fr/yass/
last release: pre-release v1.14 build Apr 15, 2010
paper: doi:10.1093/nar/gki478
Uses spaced seeds; see also the links to the hedera & iedera programs on the YASS page.
Cgaln
web: http://www.genome.ist.i.kyoto-u.ac.jp/~aln_user/cgaln/
last release: Cgaln-1.0.0.tar.gz
Two-step aligner (first at the block level, then at the nucleotide level). According to the authors, fast and memory efficient. Suitable for bacterial genomes and mammalian chromosomes on a desktop computer (untested, dk).
FEAST
web: http://monod.uwaterloo.ca/feast/
last release: feast-105-bin.tar.gz
More sensitive but slower than lastz; a new tool, not widely tested.
MAUVE
multiple genome alignment
http://asap.ahabs.wisc.edu/mauve/
last release: 2.3.1, from November 11th 2009.
Java application with GUI. Simple to use, producing colorful graphics. Output gets too cluttered with too many / too divergent sequences.
Spines
software collection from Broad
http://www.broadinstitute.org/science/programs/genome-biology/spines
latest release: spines-1.11.tar.gz from 2010-10-28
- Satsuma "highly parallelized program for high-sensitivity, genome-wide synteny"
- Papaya "an all-purpose alignment tool for less diverged sequences"
- SLAP "context-sensitive local aligner for diverged sequences with large gaps"
Mercator
Multiple Whole-Genome Orthology Map Construction
http://www.biostat.wisc.edu/~cdewey/mercator/
latest release: cndsrc-2010.10.11.tar.gz
Enredo-Pecan-Ortheus pipeline
Several programs used for aligning eukaryotic genomes at ENSEMBL.
- Enredo: http://www.ebi.ac.uk/~jherrero/downloads/enredo/
- Pecan: http://www.ebi.ac.uk/~bjp/pecan/
- Ortheus: http://www.ebi.ac.uk/~bjp/ortheus/
FSA
http://orangutan.math.berkeley.edu/fsa/
latest version: fsa-1.15.5.tar.gz (10.1 MB)
paper: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000392
Mugsy
http://mugsy.sourceforge.net/
(bacterial genomes)
AuberGene
http://www.ibi.vu.nl/programs/aubergenewww/
Probably most suitable for aligning a particular gene locus.
Alignment visualisation
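- GeLo: http://synteny.cnr.berkeley.edu/CoGe/GEvo.pl
- Gmaj: http://globin.bx.psu.edu/dist/gmaj/
Novel:
- Strudel: http://bioinf.scri.ac.uk/strudel/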
Supporting tools
- DAGchainer: Computing Chains of Syntenic Genes in Complete Genomes (Perl)
http://dagchainer.sourceforge.net/
Useful links
- how to create a synteny map between two genomes: http://synteny.cnr.berkeley.edu/wiki/index.php/SynMap
Conservation scores
phastCons
http://compgen.bscb.cornell.edu/phast/
GERP
http://mendel.stanford.edu/SidowLab/downloads/gerp/index.html
Scone
http://ika.bwh.harvard.edu/scone/
Varia
SiPhy
web: http://www.broadinstitute.org/genome_bio/siphy/
article: http://bioinformatics.oxfordjournals.org/content/25/12/i54.short