Tregwiki:Collected regulatory elements

From OpenWetWare
Revision as of 15:28, 9 December 2007 by Bill Flanagan (talk | contribs) (1 revision(s))

Plants, Yeast, Bacteria

  • RegTransBase: a massive database with almost 6000 curated articles and more than 7000 genes. It looks like a comprehensive resource, if you happen to work on bacteria...
  • [1] various databases, plus scanning for binding sites, for plants
  • [Atprobe] for plants
  • [Place] also only plants
  • SGDlite offers a genome browser with most of the yeast binding sites discovered in the groundbreaking Harbison et al. 2004 (pubmed) paper in Nature. TODO: There is a newer version with updated predictions.

Other Organisms

  • Transfac public 4.0: the last version on the internet that is still in the public domain. It dates from 1999. (Content: 8000 sites, 300 matrices)
  • Transfac public 7.0: You cannot download the database, but you can scan your sequences for free. (Content: ~13000 sites, 620 matrices)
  • FlyReg by Daniel Pollard: Matrices extracted using MEME from Flyreg by Daniel Pollard. Some manual selection was used: Daniel kept only the matrices that seemed "good" to him.
  • Jaspar: Completely free data, only matrices. (Content: ~ 110 matrices)
  • MPromDb: Yet another database of vertebrate promoters and binding sites. No new data, everything is copied together from somewhere else. (Content: 6500 sites of human/mouse/rat, but taken from an older Transfac). Whole-genome ChIP-chip data for the factors E2F1, ERα, Myc, NF-Y and E2F4 are also mapped to the sequences.
  • Database on Tunicate Gene Regulation Only 29 sites, but a nice interface. Only interesting for people working with C. intestinalis (not many... :-)
  • TFD and its successor (?) ooTFD are collections of binding sites, curated from the literature, unfortunately with very little additional information (Content: 7000 sites, no matrices)
  • RedFly is a database of regulatory elements of any kind: promoters, enhancers and their binding factors. It's different from the data in flybase/flyreg. Included in ORegAnno.
  • FLYREG2: Collection for Drosophila, from 120 publications, completely free, all proven by DNase experiments (Content: 1200 sites; included in UCSC (Track: Regulation and Expression / Flyreg in the D. melanogaster browser) and newer versions of Transfac). It's different from RedFly in that it specializes in single TF binding sites. Included in (and eventually to be replaced by) ORegAnno.
  • ABS: Yet another database of sites and promoters (Content: 650 sites, 211 promoters). It tries to appeal to the user by offering to benchmark site prediction programs (it accepts GFF).
  • ORegAnno: currently the best public database of all kinds of regulatory elements. All sites are mapped to the genome and downloadable in various formats (XML, GFF, MySQL) (Content: 2268 sites, 1871 regulatory regions (promoters, enhancers, etc.), 180 regulatory polymorphisms, no matrices). Includes MTIR, Flyreg, RedFly, Stanford promoters, Vista enhancers, and several other datasets. Included in UCSC browser (Track: Regulation and Expression / ORegAnno).
  • DBSD: Michael Zhang's Drosophila binding site database
  • Pazar is a VERY complex database but filled with nice datasets. The idea is to have several public databases under one interface; that's where the name comes from.
  • Muscle specific regulatory elements, now part of ORegAnno, is the primary data set for benchmarking all kinds of regulatory element prediction (should also be part of Pazar)
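
Several entries above list "matrices" next to "sites" and offer to scan sequences with them. As a rough illustration of what such a scan does, here is a minimal Python sketch that turns a count matrix into log-odds scores and slides it along a sequence. The count matrix, the pseudocount, the uniform background and the threshold are all invented for the example and come from none of the databases listed here. The function names are hypothetical, and only the forward strand is scanned.

```python
import math

# Hypothetical count matrix for a 4-bp motif: one dict of base counts per position.
# These numbers are invented for illustration and are not taken from any database.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, window, score) for every forward-strand window scoring >= threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    pwm = to_log_odds(COUNTS)
    for start, window, score in scan("TTACGTTT", pwm, threshold=4.0):
        print(start, window, round(score, 2))
```

A real scan would also check the reverse complement and use a background model and threshold suited to the genome in question. The point here is only the difference between a list of individual sites and a matrix that can score any sequence.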

Commercial Databases

  • Transfac professional 9.2: You have to pay 500$/year per academic group to download/scan data. (Content: 17000 sites, 660 matrices)
  • Genomatix Matrix Library: No download of data, only scanning, cost: about 500$ per year (Content: 540 matrices, no sites). Based on a licensed older version of Transfac; Genomatix tries to add to and refine the data with its own annotation group.