Wikiomics:Repeat finding

From OpenWetWare

For simplicity, this page covers repeat finding in eukaryotic genomic DNA only.

Repeat finding can be divided into two tasks, depending on whether a repeat library is available:

A) A library exists for the given (or a closely related) species

or

B) You construct such a library de novo.


Task A is usually a prerequisite step for genome annotation and even BLAST searches. For newly sequenced genomes one should start with B (constructing a species-specific repeat library).


Detecting known repeats

De novo repeat library construction

For a review, see: Saha et al. Empirical comparison of ab initio repeat finding programs (2008)

RepeatScout

Command line only; requires compilation.

Site: http://bix.ucsd.edu/repeatscout/

Current version (2010-03): 1.05

Documentation:

Simplest run:

build_lmer_table -sequence input_sequence.fas -freq output_lmer.frequency
RepeatScout -sequence input_sequence.fas -output output_repeats -freq output_lmer.frequency
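
The two commands above have to run in order, since RepeatScout reads the frequency table written by build_lmer_table. A minimal sketch of a wrapper that chains them is given below. It uses only the options shown above; the function name run_repeatscout, the output file prefix and the DRY_RUN switch are assumptions made for illustration and are not part of RepeatScout.

```shell
#!/bin/sh
# Sketch: chain the two RepeatScout steps shown above.
# Hypothetical helpers (not part of RepeatScout): the function name,
# the output prefix argument and the DRY_RUN switch.

run_repeatscout() {
    input=${1:-}                  # genome sequence in FASTA format
    prefix=${2:-output}           # hypothetical prefix for output files
    freq="${prefix}_lmer.frequency"
    repeats="${prefix}_repeats"

    if [ -z "$input" ]; then
        echo "usage: run_repeatscout input.fas [prefix]" >&2
        return 2
    fi

    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run=echo
    else
        run=
        [ -r "$input" ] || { echo "cannot read $input" >&2; return 1; }
    fi

    # Step 1: build the l-mer frequency table.
    # Step 2: build the repeat library; runs only if step 1 succeeded.
    $run build_lmer_table -sequence "$input" -freq "$freq" &&
    $run RepeatScout -sequence "$input" -output "$repeats" -freq "$freq"
}
```

Setting DRY_RUN=1 before calling run_repeatscout input_sequence.fas prints the two commands without running them, which is a cheap way to check the file names before starting a long run on a whole genome.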