User:R. Eric Collins/MBL/ML

From OpenWetWare

Maximum Likelihood

Paul Lewis

Derrick Zwickl

Large ML inference

  • branch-length optimization makes up the majority of ML inference time, because changing one branch length shifts the optimal lengths of the other branches, which then have to be re-optimized.
    • this becomes untenable for large trees (e.g. >50 taxa)
  • how accurate does a tree likelihood need to be?
  • evolutionary (not technically genetic) algorithm
    • start with a large number of individuals (parameter sets)
    • compete them against each other; unfit ones die out (as do some fit ones, but never the best-fit one)
    • you can supply a starting tree that you believe, even if it has polytomies
  • difficulty of parameter estimation increases in the order nucleotide < amino acid < codon (because transition probabilities have to be estimated over a larger state space)
  • when to use codons
    1. mixture of divergent and closely related sequences
       • not always better, plus it takes longer
    2. when the models get better
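
The evolutionary-algorithm idea in the notes above (a population of parameter sets that compete, with the best individual never discarded) can be illustrated with a toy sketch. This is not the implementation discussed in the lecture: here each "individual" is just a single branch length between two sequences under the Jukes-Cantor (JC69) model, whereas a real ML search also evolves topologies and model parameters. The function names, population size, mutation scale, and tournament selection scheme are all assumptions made for illustration.

```python
import math
import random


def jc69_log_likelihood(t, n_sites, n_diff):
    """Log-likelihood of branch length t for two aligned sequences under JC69."""
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))  # P(a site differs)
    return (n_diff * math.log(p_diff / 3.0)
            + (n_sites - n_diff) * math.log(1.0 - p_diff)
            + n_sites * math.log(0.25))


def evolve(n_sites, n_diff, pop_size=20, generations=200, seed=1):
    """Toy evolutionary search for the ML branch length (hypothetical helper)."""
    rng = random.Random(seed)

    def fitness(t):
        return jc69_log_likelihood(t, n_sites, n_diff)

    # Start with a population of individuals (here: candidate branch lengths).
    population = [rng.uniform(0.001, 2.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation: every individual produces one perturbed offspring.
        offspring = [max(1e-6, t * math.exp(rng.gauss(0.0, 0.1)))
                     for t in population]
        pool = sorted(population + offspring, key=fitness, reverse=True)
        best, rest = pool[0], pool[1:]
        # The best individual always survives. The others compete in random
        # pairwise tournaments, so some fit individuals may still die out.
        survivors = [best]
        while len(survivors) < pop_size:
            a, b = rng.sample(rest, 2)
            survivors.append(a if fitness(a) > fitness(b) else b)
        population = survivors
    return max(population, key=fitness)
```

For this two-sequence case the ML branch length has a closed form, `-0.75 * log(1 - (4/3) * n_diff / n_sites)`, so the result of `evolve(1000, 100)` can be checked against it. On a real tree no such closed form exists, which is why a heuristic search is used at all.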

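One way to see the nucleotide < amino acid < codon ordering noted above is to count the entries in the transition probability matrix P(t), which has one row and one column per character state and has to be recomputed for every branch length that is tried. The short sketch below only does that arithmetic. It assumes the standard genetic code (61 sense codons), and the names `STATES` and `matrix_entries` are made up for illustration.

```python
# Number of character states for each data type.
# 61 = 64 codons minus the 3 stop codons of the standard genetic code.
STATES = {"nucleotide": 4, "amino acid": 20, "codon": 61}


def matrix_entries(n_states):
    """Number of entries in the n x n transition probability matrix P(t)."""
    return n_states * n_states


for name, n in STATES.items():
    print(f"{name}: {n} states, {matrix_entries(n)} entries in P(t)")
```

The matrix grows from 16 entries for nucleotides to 400 for amino acids and 3721 for codons, which is consistent with the note that codon analyses take longer.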

  1. which branches take longest to calculate? leaves? deep branches?
  2. setting constraints... are there any algorithms that 'fill in' the valleys (or level the mountains)?
  3. when do you need a tree this good? use cases for depth-of-tree'ing would be useful
  4. ancestral state reconstruction... indel/gap models