Wikiomics:Repeat finding

From OpenWetWare
Revision as of 04:11, 22 March 2010 by Darek Kedra
To keep things simple, this page covers repeat finding in eukaryotic genomic DNA only.

Repeat finding can be divided into two tasks, depending on whether a repeat library is available:

A) A library already exists for the given species (or possibly for a closely related one).

B) You construct such a library de novo.

Task A is usually a prerequisite step for genome annotation and even for BLAST searches. For newly sequenced genomes one should start with B (constructing a species-specific repeat library).

Detecting known repeats

De novo repeat library construction

For a review see: Saha et al., Empirical comparison of ab initio repeat finding programs (2008).


RepeatScout

command line only, requires compilation


current version (2010-03): 1.05


Simplest run:

build_lmer_table -sequence input_sequence.fas -freq output_lmer.frequency
RepeatScout -sequence input_sequence.fas -output output_repeats -freq output_lmer.frequency
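
The two commands above always go together, since the second one reads the frequency table written by the first. A minimal sketch of a wrapper is shown here, using only the options from the simplest run. The function names (run, repeatscout_simple), the DRY_RUN variable and the output naming scheme are assumptions made for illustration; they are not part of RepeatScout.

```shell
#!/bin/sh
# Sketch only: wraps the two-step "simplest run" for one FASTA file.
# run, repeatscout_simple, DRY_RUN and the output file names are
# hypothetical helpers invented for this example.

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

repeatscout_simple() {
    seq="$1"                          # input FASTA file
    base="${seq%.*}"                  # drop the extension to name the outputs
    freq="${base}.lmer.frequency"     # l-mer frequency table (step 1 output)
    out="${base}.repeats.fas"         # repeat library (step 2 output)

    # Step 1: tabulate l-mer frequencies in the input sequence.
    run build_lmer_table -sequence "$seq" -freq "$freq" || return 1
    # Step 2: build the repeat library from the same sequence and that table.
    run RepeatScout -sequence "$seq" -output "$out" -freq "$freq" || return 1
}
```

With DRY_RUN set to 1, calling repeatscout_simple input_sequence.fas only prints the two commands, which is a cheap way to check the file names before starting a long run on a whole genome.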