BMC Bioinformatics Volume 7 | OCTOBER 2006
- Using the nucleotide substitution rate matrix to detect horizontal gene transfer
Existing sequence-based methods for detecting horizontal gene transfer (HGT) focus on changes in nucleotide composition or on differences between gene and genome phylogenies; these methods have high error rates. Micah Hamady et al. introduce a new class of methods for detecting HGT based on the changes in nucleotide substitution rates that occur when a gene is transferred to a new organism. The new methods discriminate simulated HGT events with an error rate up to 10 times lower than that of GC content. Use of models that are not time-reversible is crucial for detecting HGT. Combining multiple predictors of HGT offers substantial improvements over using any single predictor, yielding as much as a factor of 18 improvement in performance (a maximum reduction in error rate from 38% to about 3%). The predictors were combined using the random forests machine learning algorithm to identify optimal classifiers that separate HGT from non-HGT trees.
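To make the time-reversibility point concrete: a substitution rate matrix Q with stationary base frequencies π is time-reversible when π_i Q_ij = π_j Q_ji for every pair of nucleotides. The sketch below is only an illustration of that condition, not the authors' method; the two rate matrices and the function names are made up for the example and do not come from the paper.

```python
# Illustrative check of time-reversibility (detailed balance) for a
# 4x4 nucleotide substitution rate matrix. All rates below are invented.

NUCLEOTIDES = "ACGT"


def make_rate_matrix(off_diagonal):
    """Copy the off-diagonal rates and set each diagonal so rows sum to zero."""
    Q = [row[:] for row in off_diagonal]
    for i in range(4):
        Q[i][i] = -sum(Q[i][j] for j in range(4) if j != i)
    return Q


def is_stationary(Q, pi, tol=1e-9):
    """True if pi * Q = 0, i.e. pi is a stationary distribution of Q."""
    return all(
        abs(sum(pi[i] * Q[i][j] for i in range(4))) <= tol for j in range(4)
    )


def is_time_reversible(Q, pi, tol=1e-9):
    """True if pi_i * Q_ij == pi_j * Q_ji for every pair i < j."""
    return all(
        abs(pi[i] * Q[i][j] - pi[j] * Q[j][i]) <= tol
        for i in range(4)
        for j in range(i + 1, 4)
    )


uniform = [0.25, 0.25, 0.25, 0.25]

# Jukes-Cantor-style matrix: every substitution has the same rate.
equal_rates = make_rate_matrix([[0.1] * 4 for _ in range(4)])

# Hypothetical directional matrix: A->C, C->G, G->T, T->A are three times
# faster than the other substitutions.
directional = make_rate_matrix([
    [0.0, 0.3, 0.1, 0.1],
    [0.1, 0.0, 0.3, 0.1],
    [0.1, 0.1, 0.0, 0.3],
    [0.3, 0.1, 0.1, 0.0],
])

for name, Q in [("equal rates", equal_rates), ("directional", directional)]:
    print(name, is_stationary(Q, uniform), is_time_reversible(Q, uniform))
```

Both example matrices share the same uniform stationary base frequencies, yet only the first satisfies detailed balance. A reversible model forces each pair of opposite substitutions into this balance, so it cannot represent the directional pattern in the second matrix.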
PLoS Computational Biology Volume 2 | Issue 8 | AUGUST 2006
- An Integrative Method for Accurate Comparative Genome Mapping
Comparative genomics is an important discipline with applications in evolutionary, genetic, and genome rearrangement studies. When comparing genomes, one is usually interested in investigating the relation between the genomic segments to establish their evolutionary origin: are the segments orthologous, and hence inherited from their most recent common ancestor? Are they paralogs, and hence duplicated from an ancestral segment? Did the segments undergo reordering? Were the segments deleted or inserted and, if so, how (insertion sequence, prophage, horizontal gene transfer)?
In this paper, Swidan et al. present MAGIC, a new approach for comparative genome mapping. The main novelty of this approach is the biologically intuitive clustering step, which aims both to calculate reorder-free segments and to identify orthologous segments. The authors demonstrate MAGIC's robustness with respect to both its initial input and its parameter values. MAGIC's scalability is demonstrated by running it on distantly related organisms and on large genomes. In addition, Swidan et al. provide a detailed analysis of the differences between MAGIC and other comparative mapping methods.
Applying MAGIC to several prokaryotic pairs enabled the authors to address the aforementioned questions, to quantitatively study the different evolutionary forces shaping the prokaryotic genome, and to investigate the breakpoint distribution of these genomes.