PLoS Computational Biology Volume 3 | Issue 1 | JANUARY 2007
- Automated Querying of Genome Databases
Web-based, interactive querying of the genome databases has enabled the analysis of genomes in an integrated and visual manner that previously was difficult or impossible. However, many important biological questions cannot practically be answered using simple interactive methods that query only a single genomic location at a time. Addressing these questions requires batch and programmatic database querying. Although these approaches involve an initial, one-time cost of learning how to use the associated API and establishing an access method for programmed querying, the capabilities they provide for addressing significant and otherwise relatively complex genomic questions often make this effort well worthwhile. Moreover, as Web-based interfaces like Galaxy (which provide batch data acquisition and post-processing, but do not require programming) evolve, the powerful tools of genome-wide data analysis should become accessible to an ever wider range of biologists.
PLoS Computational Biology Volume 2 | Issue 12 | DECEMBER 2006
- Computational Reconstruction of Iron- and Manganese-Responsive Transcriptional Networks in α-Proteobacteria
The availability of hundreds of complete genomes allows one to use comparative genomics to describe key metabolic processes and regulatory gene networks. Genome context analyses and comparisons of transcription factor binding sites between genomes offer a powerful approach for functional gene annotation. Reconstruction of transcriptional regulatory networks allows for better understanding of cellular processes, which can be substantiated by direct experimentation. Iron homeostasis in bacteria is conferred by the regulation of various iron uptake transporters, iron storage ferritins, and iron-containing enzymes. In high concentrations, iron is poisonous for the cell, so strict control of iron homeostasis is maintained, mostly at the level of transcription by iron-responsive regulators. Despite their general importance, iron regulatory networks in most bacterial species are not well understood. In this study, Rodionov and colleagues applied comparative genomic approaches to describe the regulatory network formed by genes involved in iron homeostasis in the alpha subclass of proteobacteria, which have extremely versatile lifestyles. These networks are mediated by a set of DNA motifs (or regulatory signals) that occur in 5′ gene regions and involve at least six different metal-responsive regulators. This study once again shows the power of comparative genomics in the analysis of complex regulatory networks and their evolution.
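One common way to search 5′ gene regions for candidate regulatory signals is to score each window against a position weight matrix built from known sites. The sketch below illustrates that general idea only; it is not the procedure used by Rodionov and colleagues. The function names `build_pwm` and `best_hit`, the pseudocount, the uniform background, and the toy sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length sites."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(width):
        column = [s[pos] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_hit(pwm, sequence):
    """Return (start, score) of the best-scoring window, or None."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(b not in BASES for b in window):
            continue  # skip windows containing ambiguous characters
        score = sum(pwm[i][b] for i, b in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best

# Toy sites and upstream region, invented for illustration only.
toy_sites = ["TGACT", "TGACT", "TGACA"]
toy_upstream = "CCCCTGACTCCCC"
hit = best_hit(build_pwm(toy_sites), toy_upstream)
```

Running the same scan over orthologous upstream regions in several genomes, and keeping only hits that recur, is the kind of cross-genome comparison the summary alludes to.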
BMC Bioinformatics Volume 7 | OCTOBER 2006
- Using the nucleotide substitution rate matrix to detect horizontal gene transfer
Existing sequence-based methods for detecting horizontal gene transfer (HGT) focus on changes in nucleotide composition or on differences between gene and genome phylogenies; these methods have high error rates. Micah Hamady et al. introduce a new class of methods for detecting HGT based on the changes in nucleotide substitution rates that occur when a gene is transferred to a new organism. The new methods discriminate simulated HGT events with an error rate up to 10 times lower than does GC content. Use of models that are not time-reversible is crucial for detecting HGT. Using combinations of multiple predictors of HGT offers substantial improvements over using any single predictor, yielding as much as a factor of 18 improvement in performance (a maximum reduction in error rate from 38% to about 3%). Multiple predictors were combined by using the random forests machine learning algorithm to identify optimal classifiers that separate HGT from non-HGT trees.
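To make the role of directionality concrete, the sketch below counts directed substitutions (ancestral base to descendant base) between two aligned sequences and measures how asymmetric the resulting counts are. A time-reversible model cannot represent that asymmetry. This is a count-based illustration, not the rate-matrix estimator of Hamady et al.; the function names are hypothetical, and the ancestral sequence is assumed to be known, whereas in practice it would have to be inferred.

```python
from collections import Counter

BASES = "ACGT"

def substitution_counts(ancestor, descendant):
    """Return a 4x4 matrix N where N[i][j] counts base i -> base j."""
    if len(ancestor) != len(descendant):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for a, d in zip(ancestor, descendant):
        if a in BASES and d in BASES:  # skip gaps and ambiguous characters
            counts[(a, d)] += 1
    return [[counts[(i, j)] for j in BASES] for i in BASES]

def asymmetry(matrix):
    """Sum of |N[i][j] - N[j][i]| over i < j, divided by all off-diagonal counts.

    0.0 means the directed counts are perfectly symmetric;
    1.0 means every substitution runs in one direction only.
    """
    n = len(matrix)
    off_diagonal = sum(
        matrix[i][j] for i in range(n) for j in range(n) if i != j
    )
    if off_diagonal == 0:
        return 0.0
    difference = sum(
        abs(matrix[i][j] - matrix[j][i])
        for i in range(n) for j in range(i + 1, n)
    )
    return difference / off_diagonal

# Toy aligned pair, invented for illustration only.
counts = substitution_counts("AAAACCGT", "AGGACCGT")
score = asymmetry(counts)
```

A statistic of this kind could serve as one input among several to a classifier such as a random forest, which is how the summary describes the predictors being combined.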
PLoS Computational Biology Volume 2 | Issue 8 | AUGUST 2006
- An Integrative Method for Accurate Comparative Genome Mapping
Comparative genomics is an important discipline with applications in evolutionary, genetic, and genome rearrangement studies. When comparing genomes, one is usually interested in investigating the relation between the genomic segments to establish their evolutionary origin: are the segments orthologous, and hence inherited from their most recent common ancestor? Are they paralogs, and hence duplicated from an ancestral segment? Did the segments undergo reordering? Were the segments deleted or inserted and, if so, how (insertion sequence, prophage, horizontal gene transfer)?
In this paper, Swidan et al. present MAGIC, a new approach for comparative genome mapping. The main novelty of this approach is the biologically intuitive clustering step, which aims both to calculate reorder-free segments and to identify orthologous segments. The authors demonstrate MAGIC's robustness with respect to both its initial input and its parameter values. MAGIC's scalability is demonstrated by running it on distantly related organisms and on large genomes. In addition, Swidan et al. provide a detailed analysis of the differences between MAGIC and other comparative mapping methods.
Applying MAGIC to several prokaryotic pairs enabled the authors to address the aforementioned questions and to quantitatively study the different evolutionary forces shaping the prokaryotic genome as well as to investigate their breakpoint distribution.
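The notions of a reorder-free segment and a breakpoint can be illustrated with a deliberately simple sketch. It is not MAGIC's clustering step. It assumes one-to-one anchors (for example, matched genes) listed in genome A order, each labeled with its rank in genome B, and it splits them into maximal runs whose B ranks are consecutive in a single direction. The function name `reorder_free_runs` and the toy input are hypothetical.

```python
def reorder_free_runs(b_ranks):
    """Split anchors (in genome-A order) into maximal collinear runs.

    A run continues while the genome-B rank changes by +1 (same
    orientation) or by -1 (inverted), without switching direction.
    """
    runs = []
    current = []
    direction = 0  # 0 = not yet set, +1 = forward, -1 = inverted
    for rank in b_ranks:
        if current:
            step = rank - current[-1]
            if step in (1, -1) and direction in (0, step):
                direction = step
                current.append(rank)
                continue
            runs.append(current)  # collinearity broken: close the run
        current = [rank]
        direction = 0
    if current:
        runs.append(current)
    return runs

# Toy anchor ranks, invented for illustration: an inversion of ranks 3..6.
runs = reorder_free_runs([0, 1, 2, 6, 5, 4, 3, 7, 8])
breakpoints = len(runs) - 1
```

Here the boundaries between consecutive runs play the role of breakpoints. A real mapping method must also handle paralogs, insertions, deletions, and noisy anchors, which this sketch ignores.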