Wikiomics:Repeat finding

From OpenWetWare

Revision as of 07:11, 22 March 2010

To simplify, this page assumes repeat finding in eukaryotic genomic DNA.

Repeat finding can be divided into two tasks, depending on whether a repeat library is available:

A) A library exists for the given species (or possibly for a closely related one)

or

B) You construct such a library de novo.


Task A is usually a prerequisite step for genome annotation and even BLAST searches. For newly sequenced genomes, one should start with B (constructing a species-specific repeat library).


Detecting known repeats

De novo repeat library construction

For a review, see: Saha et al., Empirical comparison of ab initio repeat finding programs (2008), http://nar.oxfordjournals.org/cgi/content/full/gkn064v1

RepeatScout

command line only, requires compilation

Site: http://bix.ucsd.edu/repeatscout/

current version (2010-03): 1.0.5

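Documentation:

* http://bix.ucsd.edu/repeatscout/readme.1.0.5.txt
* PPT presentation of the algorithm: http://bix.ucsd.edu/repeatscout/repeatscout-ismb.ppt
* Publication (PDF): De novo identification of repeat families in large genomes (2005), http://bioinformatics.oxfordjournals.org/cgi/reprint/21/suppl_1/i351.pdf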

Simplest run:

build_lmer_table -sequence input_sequence.fas -freq output_lmer.frequency
RepeatScout -sequence input_sequence.fas -output output_repeats -freq output_lmer.frequency
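
The two commands above share the input file and the frequency table, so they can be wrapped in a small shell function. What follows is only a sketch under assumptions: the function names (run, repeatscout_pipeline), the DRY_RUN variable and the output naming scheme (prefix_lmer.frequency, prefix_repeats.fas) are hypothetical conveniences, not part of RepeatScout. Only the -sequence, -freq and -output options shown above are used.

```shell
#!/bin/sh
# Sketch: run the two RepeatScout steps for one FASTA file.
# DRY_RUN, run and repeatscout_pipeline are hypothetical names for this example.

# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

repeatscout_pipeline() {
    input=$1
    prefix=${input%.*}                  # strip the last extension, e.g. ".fas"
    freq="${prefix}_lmer.frequency"     # assumed naming for the l-mer table
    repeats="${prefix}_repeats.fas"     # assumed naming for the repeat library

    # Step 1: tabulate l-mer frequencies; step 2 runs only if step 1 succeeds.
    run build_lmer_table -sequence "$input" -freq "$freq" &&
    run RepeatScout -sequence "$input" -output "$repeats" -freq "$freq"
}
```

Setting DRY_RUN=1 before calling repeatscout_pipeline input_sequence.fas prints both commands without executing them, which is a cheap way to check the derived file names before starting a long run on a large genome.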