Tregwiki:Recent bibliography


 * a 3' UTR region contacts a 5' UTR region and influences transcription
 * motif discovery, microarray, muscle, human
 * whole-genome scanning again: retina specific factors and their sites
 * microfluidics to the rescue: using a microfluidics platform, binding sites were determined and the genome scanned for them.
 * another whole-genome promoter-scanning search: pan-neuronal genes in general
 * yet another whole-genome promoter-scanning example: C. elegans, ciliary/ciliopathy neurons
 * SMARTs are also being systematically analyzed: Interesting is the short list of smart-predictors that are benchmarked against the data set. I have exported the data from the paper into UCSC but haven't seen a lot of conservation in these SMART-regions.
 * trying to learn transcription factor concentrations from chip and network data
 * hedgehog signalling regulates at least 220 different genes (suppression of HH signalling combined with microarray analysis); the Gli motif is overrepresented upstream of these genes
 * Motif discovery with a twist: searching for the most UNDERrepresented motifs gives "nullomers", 15bp-long motifs that are completely avoided by nature and cannot be found a single time in GenBank. Cool idea.
 * Yet another putative link between repeats and regulatory sequences: regulatory elements that are repeated are prone to recombination and quickly create new combinations of existing regulatory elements
 * ever wondered why only certain alleles are expressed? Zac1 seems to be a factor that restricts expression to certain alleles
 * complex - the sonic hedgehog signaling database: Reviews, papers, diagrams, gene lists, diseases, etc.
 * now even in plants: motif discovery and alignment combined, phylogenetic footprinting (Van de Peer)
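
The "nullomer" entry in the list above boils down to enumerating all k-mers and keeping those that never occur. Below is a minimal sketch of that idea, not the method of the cited paper: the function name `absent_kmers` and the toy input are assumptions made for illustration, reverse complements are ignored, and the brute-force enumeration is only practical for small k (4^15 candidates would need a smarter data structure).

```python
from itertools import product


def absent_kmers(sequences, k):
    """Return the sorted k-mers over ACGT that occur in none of the sequences.

    Hypothetical helper for illustration; windows containing other
    characters (e.g. N) are skipped.
    """
    seen = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                seen.add(kmer)
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted(kmer for kmer in all_kmers if kmer not in seen)


if __name__ == "__main__":
    # Toy example: dinucleotides missing from one short sequence.
    print(absent_kmers(["ACGTACGT"], 2))
```

On real data the interesting question is whether an absent k-mer is missing by chance or by selection, which this sketch does not address.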

Application of motif search but simple ERE-steroid inverted repeat model
 * a "boosted" classifier (motif discovery) is trained to find decision trees that describe the relation between motif substitution patterns and the outcome of the chip-chip experiment
 * the yeast regulatory network contains certain types of motifs (feed-forward, single link/convergence + mutual regulation) and not other types (divergence, cascade, divergence), is organized into "organizers", "Regulatory connections inside organizers are dense, while inter-organizer connections are sparse.", interactions as "waves" (Davidson has called these batteries for 20 years) that are triggered by few signals
 * regulation of OTP in zebrafish: HH, FGF, nodal
 * nice review about chromatin structure and inf jumps into neuronal genes, can act on methylation patterns
 * Coevolution of binding sites and transcription factors: new hybrid-based experiments to prove it
 * Review of computational versus biological methods in uncovering binding sites in flies. Conclusion: back to the bench :-)
 * Silencing useless binding sites
 * Spt6 and FACT silence superfluous binding sites while Pol II is elongating
 * Protein Microarrays
 * algorithm to design protein microarrays
 * Success Stories (scanning & discovery)
 * a real-life example how to apply promoter prediction and binding site scanning (NNPP and alibaba)
 * [Application of motif discovery]
 * http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16289377&itool=pubmed_DocSum
 * Motif discovery finds a 19bp motif, software used was ClustalX (!!), model: bacteria
 * http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16412085&itool=pubmed_DocSum
 * MatInspector to discover and prove one binding site. Important: This binding site is NOT conserved at all among most mammals!
 * Vasarhelyi search for overrepresented binding sites in microarray data
 * SNPs and TF binding sites in two mouse genes: SNPs change expression
 * Application of MODEL to upstream sequences of microarray-chosen genes


 * Whole genome searches for sites:
 * starting from only a single site, by looking at the conservation pattern, you can build a complete network by conservation and PWM matches.
 * Coding theory and regulation
 * the number of transcription factors is limited by genome size and information content of the binding domain


 * Relation between cis-regulatory sequences and trans-acting factors
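 * Ohler Cornell 2000, pmid 10679430: Vnd/nkx, ind/gsh, and msh/msx: conserved regulators of dorsoventral neural patterning? ... while recent loss-of-function studies in flies and mice indicate that these three genes may have a conserved role in regional specification, there is no obvious conservation of the particular cell fates deriving from corresponding domains. The three-column expression pattern may thus represent a developmental mechanism that is more resistant to evolutionary changes than genetic events upstream or downstream of it.
 * ncRNA-search benchmark: WU-BLAST is better (faster, better options) than NCBI BLAST, more specific algorithms are better than generic aligners (surprise, surprise)
 * indels in the human lineage follow the trend set by HERs (see UCSC papers, Siepel etc) and cluster around transcription/regulation genes.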
 * histones and gene regulation: NFAT, YY, CCAAT, GATA3 recruit HDAC4 to the IL5-promoter which in turn regulates expression
 * background models for dna motifs: adding gc-context seems to improve them
 * long-range synteny amphioxus-vertebrates: Why doesn't anyone try to add sea-urchin + ciona + lancelet?? (pubmed)
 * in yeast, promoters are REALLY much better characterized than in other eukaryotes... (schneider)
 * Once again, the even-skipped enhancer: Mathematical modelling of expression using TF concentrations and binding site data (Halfon)
 * conserved sequences have certain properties that make it possible to distinguish between exons, CNS and UTRs: Exons/CNS have longer regions. In exons, mutations show the 3n+2 pattern. Long CNS can perhaps be explained by overlapping TF sites.
 * and finally in Science: CNS evolution is faster close to neuronal genes (Prabhakar, Rubin)
 * a great review on methods in miRNA prediction/validation from the Nature Genetics special issue on miRNA: if only we had this for binding site predictions...
 * a point mutation in distal SHH enhancers is responsible for a genetic disease: another good reason why it is a good idea to test all these conserved sequences...
 * pRNA is competing for the polymerase and this regulates gene expression. I wonder if there is something similar with enhancers competing for TFs...
 * transposable elements are linked to gene expression, old review that I didn't know before, strange examples, strange conclusions, no single example where the transposable element really is proven to be the minimal enhancer


 * Co-occurrence of binding sites
 * combination in yeast of matrix matches at preferred distances to find protein interactions
 * same for human factors using tissue-based sorting


 * PSSM and extensions:
 * graph based approach for motif recognition
 * Transcription factor binding domains:
 * dna elasticity rather than direct base-touching is used to recognize binding sites
 * Ultraconserved and Co:
 * The strange story of Evf-2 which is at the same time enhancer AND ncRNA binding to a homeodomain
 * Nature paper describing how regulation of yeast has changed yet phenotype is almost identical between S. cerevisiae and C. albicans


 * ultraconserved elements:
 * ultras avoid segmental duplications


 * RNA and regulation:
 * RNA as cofactors in the transcriptional complex for RNA Poly II

quote from abstract: "Finally, we conducted DNAse I-footprinting assays in nuclear extracts for the 184bp region and detected two protected sequences. Data bank search indicates that these sites contain consensus binding sites for transcription factors." Wonder if they ever pasted a random sequence into transfac match...
 * the "billboard" model:
 * on cancer
 * accelerated regions between human/monkey are non-coding and located around neural proteins (rubin)
 * yet another try to predict binding sites from TF protein structure
 * review: locating TF binding sites (jones)
 * conserved fugu regions correspond to most conserved (>350bp, 77%) human-mouse regions (ovcharenko)
 * human-zebrafish comparison and enhancer test (ovcharenko?)
 * the sea urchin transcriptional regulatory network (davidson)
 * mulan tools from dcode.org (ovcharenko)
 * The advantages of transposons, review in nature
 * properties of binding sites in transfac analyzed: GC-content and palindromic structure, depends on species and promoter/enhancer localization
 * CREAD in action: fax2/wnt target genes analysed and new motifs found, new factor found to be expressed in the lung and verified with rtPCR
 * a second type of binding sites for CRP exist that exchange a T for a C in the core site
 * functional evolution and variation of non-coding dna
 * The predicted NF-kappaB-binding sites in the HCMV gene are not functional. Surprise, surprise!
 * genes with intermediate complex expression have the longest introns (genome design theory)
 * miRNAs in Arabidopsis have been copied around together with their promoters
 * survey of how to locate binding sites in genomes
 * enhancer prediction with 83% accuracy on the VISTA browser; see also biology paper: [http://www.ncbi.nlm.nih.
 * Contribution of transposable elements to expression: TEs can influence expression, as silencers, LINE-1 preferentially
 * a third party opinion about chip-chip data can confirm most findings but warns that conclusions from some studies might not be generalizable to all transcription factors
 * the correct phylogenetic tree for drosophila depends on the gene you're looking at
 * bidirectional promoters are ubiquitous in the human genome and 90% are functional
 * the protein interaction community has a "minimum information document" that is needed for publications. Why don't we have this for protein-DNA interactions?
 * an update for an absolutely fascinating technology to capture enhancer interactions: chromatin conformation capture (3C) called 4C
 * nature article showing that dna structure can be altered and this can be used to block a given enhancer
 * fish genomics review: the 2R hypothesis, differences between assemblies, a phylogenetic tree, etc.
 * retrotransposons are transcribed and have their own promoters/enhancers
 * How NOT to use binding site databases:
 * transcription factors are related by OMIM, example pitx2, pax6 and foxc
 * In ascidians, electroporation of random 1kb fragments has an enhancer-hit-rate of around 8% (levine)
 * Great news for electronic freaks: all AND/OR combinations are possible for the E. coli lac operon, SNPs can change binding sites and change the corresponding logic
 * application of RP on thyroid enhancer set


 * gene CEL is located within irx2
 * many genome sizes...
 * old paper


 * Combinations (Disc, Scan, Expression, chip2chip, deletion):
 * TFBSScan combines alignments, expression, Chip-Chip to find new motifs in yeast.
 * combining binding site data with protein interactions predicts phenotypic result of deletions
 * deletion of transcription factor leu3 does identify targets but not the direct binding ones identified by chip-chip
 * NFkappaB example (PMID 16847329) showing how the distal enhancers make the promoter accessible via chromatin modifications and thus start transcription
 * promoter that can be dissected into 6 states


 * Randomization techniques when scanning for motifs:
 * Scanning human/mouse/rat, rating conservation and rating motif matches
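
One common form of randomization when scanning is to compare the observed number of motif matches against the counts obtained on shuffled versions of the same sequence. The sketch below is an illustration under simplifying assumptions, not the procedure of the paper listed above: it uses exact string matches instead of matrix scores, a mononucleotide shuffle (which destroys dinucleotide structure), and hypothetical names `count_matches` and `shuffle_pvalue`.

```python
import random


def count_matches(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    width = len(motif)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == motif)


def shuffle_pvalue(seq, motif, n_shuffles=1000, seed=0):
    """Return (observed count, empirical p-value) from mononucleotide shuffles.

    The p-value is the fraction of shuffled sequences with at least as many
    matches as the real one, with +1 in numerator and denominator so that it
    is never exactly zero.
    """
    rng = random.Random(seed)
    observed = count_matches(seq, motif)
    letters = list(seq)
    at_least_as_many = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if count_matches("".join(letters), motif) >= observed:
            at_least_as_many += 1
    return observed, (at_least_as_many + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    sequence = "TTGACGTCATTGACGTCAAGGCTGACGTCA"
    print(shuffle_pvalue(sequence, "TGACGTCA", n_shuffles=200))
```

A shuffle that preserves dinucleotide frequencies, or sampling real non-target promoters, would be a stricter control than the one sketched here.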


 * Matrices are bad:
 * by re-building ERE matrices, < 15% error rate in prediction was obtained
 * alignment
 * tf-maps is an alignment without sequence similarity


 * Bio papers about chosen transcription factors:
 * Regulation of the transcription factor FOXM1c by Cyclin E/CDK2 PMID 16504183


 * Mathematics, statistics, background models:
 * Various Markov Models to improve the PWM scheme.
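
To make the Markov-model idea concrete: a PWM score is usually a log-odds against a background, and the background can be upgraded from independent base frequencies to a first-order Markov chain. The following is a rough sketch under stated assumptions and not the scheme of the cited paper: `train_markov1` and `log_odds` are hypothetical names, the PWM is a list of per-position probability dicts without zeros, and the first base of a site is scored against a uniform 0.25 background.

```python
import math

ALPHABET = "ACGT"


def train_markov1(sequences, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | previous)."""
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in ALPHABET}
        for a in ALPHABET
    }


def log_odds(site, pwm, transitions, start_prob=0.25):
    """log2 of P(site | PWM) / P(site | first-order Markov background)."""
    score = 0.0
    for i, base in enumerate(site):
        background = start_prob if i == 0 else transitions[site[i - 1]][base]
        score += math.log2(pwm[i][base] / background)
    return score


if __name__ == "__main__":
    # Toy background and a toy 2-column PWM, both invented for illustration.
    transitions = train_markov1(["ACGCGCGTATATATGCGC"])
    pwm = [
        {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    ]
    print(log_odds("AG", pwm, transitions))
```

With this background, a site made of dinucleotides that are common in the genome scores lower than the same site would against a uniform background, which is the intended effect.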


 * combinatorial detection
 * CodeFinder tries to find overrepresented combinations


 * cis-reg evolution is important, few bps enough to create dots on wings
 * yellow dots in drosophila


 * motif disc reviews
 * motif discovery review
 * tompa review proposing new formula

Application of motif discovery in plants
 * Gibbs Sampler applied to find a CCAAT-box
 * Clustering of genes according to a) co-expression and b) co-occurrence of hexamers and transfac motifs, programmed in R
 * Developmental regulation
 * ncRNA has 2700 ncRNAs that might be involved in development
 * transposon deserts are associated with developmental regulators


 * Motif Scanning
 * Yeast Dyad scanning distances conserved, direction as well. TF active in multiple conditions -> variable distances


 * Tools:
 * Bio++ C++ library specialized in phylogenetics can at least do some sequence reading/writing
 * Biology
 * Strange article by the infamous Katoh/Katoh team about spray being identified with double TCF/LEF binding sites conserved
 * How to validate binding sites to be REALLY REALLY sure that they are valid?
 * How to validate binding sites?


 * Structure of the DNA
 * modification of histone to unwind chromatin structure
 * DHS search overlaid with UCSC conservation
 * review nucleosome structure


 * Chip-Chip or similar, large sequence biology data:
 * E2F1 Chip2Chip suggests that 20% of all promoters are bound by E2F1, usually proximal, sites do not correspond to matrix (!)
 * Yeast one hybrid assay to determine matrices for given transcription factor, WITHOUT selex and without Chip-Chip


 * Link between SNPs and binding sites:
 * problems faced at the moment
 * Rice example published in Science: SNP in 5' causes phenotypic difference
 * regulatory SNP causes disease SCIENCE 05/06
 * genome-wide survey of linkage between genes that are differently expressed and their regulatory SNPs


 * Networks and binding sites:
 * yeast data suggests that sequences (knockout-data) can predict phenotypes based on network (manual) and binding site (predicted and chip-chip)
 * Pitx stuff:
 * regulates muscle and eye development


 * Applications of conservation to elucidate regulation
 * MCEP2 dissection of regulatory region


 * Relation between phenotype and cis-regulatory variation
 * Carp acclimated to 10/30 deg. is analyzed in respect to cis-reg variation


 * Redundancy in Regulation:
 * Demonstration that you can delete functional binding sites without changing expression (note: at a certain tissue/development stage/pathway status!!)
 * Another example that binding sites cluster together
 * Drosophila colocalization hotspots bind many different factors: "we predict that many more factors will show strong colocalization and that hotspots may recruit up to a few hundred different proteins. This observation contrasts with results in yeast where overlap between transcription factors was relatively rare and hotspots were not apparent (7, 8)."