User:R. Eric Collins/MBL/ML
Large ML inference
- branch length optimization makes up the majority of ML inference time, because changing one branch length changes the optimal lengths of every other branch.
- this problem gets untenable with large trees (e.g. >50 taxa)
- how accurate does a tree likelihood need to be?
- evolutionary (not technically genetic) algorithm
- start with a large number of individuals (parameter sets)
- compete them against each other; unfit ones die out (as do some fit ones, but never the best-fit individual)
- can supply a starting tree that you believe, even if it has polytomies
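- difficulty due to parameter estimation: nucleotide < amino acid < codon (because of transition probability estimation: the state space grows from 4 nucleotides to 20 amino acids to 61 sense codons under the standard genetic code)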
- when to use codons
- mixture of divergent and closely related sequences
- not always better, plus it takes longer
- when the models get better
- which branches take longest to calculate? leaves? deep branches?
- setting constraints... are there any algorithms that 'fill in' the valleys (or level the mountains)?
- when do you need a tree this good? use cases for depth-of-tree'ing would be useful
- ancestral state reconstruction... indel/gap models
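
The evolutionary-algorithm notes above (a population of parameter sets, competition, unfit individuals dying out, the best never dying) can be sketched roughly as follows. This is only a toy under stated assumptions, not how any particular ML program does it: `toy_log_likelihood` is a hypothetical stand-in for a real tree likelihood, every name and default value is invented for illustration, and new individuals come from mutation alone with no recombination, which is one possible reading of "evolutionary (not technically genetic)".

```python
import random


def toy_log_likelihood(params):
    # Hypothetical stand-in for a tree log-likelihood:
    # highest (0.0) when every parameter equals 0.5.
    return -sum((p - 0.5) ** 2 for p in params)


def evolve(fitness, n_params=5, pop_size=50, generations=200,
           survival=0.5, random_death=0.1, sigma=0.05, seed=0):
    """Mutation-only evolutionary search that always keeps the best individual."""
    rng = random.Random(seed)
    # Start with a large number of individuals (random parameter sets).
    population = [[rng.random() for _ in range(n_params)]
                  for _ in range(pop_size)]
    best_history = []

    for _ in range(generations):
        # Compete: rank individuals from most fit to least fit.
        population.sort(key=fitness, reverse=True)
        best = population[0]
        best_history.append(fitness(best))

        # Unfit individuals die out: only the top fraction is considered.
        n_keep = max(1, int(pop_size * survival))
        survivors = population[1:n_keep]
        # Some fit individuals die at random as well...
        survivors = [ind for ind in survivors if rng.random() > random_death]
        # ...but never the best one.
        survivors.insert(0, best)

        # Refill the population with mutated copies of survivors.
        children = []
        while len(survivors) + len(children) < pop_size:
            parent = rng.choice(survivors)
            children.append([p + rng.gauss(0.0, sigma) for p in parent])
        population = survivors + children

    population.sort(key=fitness, reverse=True)
    return population[0], best_history


best, history = evolve(toy_log_likelihood)
print(toy_log_likelihood(best))
```

Because the best individual is carried over unchanged each generation, the best score in `history` can never decrease. In a real tree search an individual would also carry a topology and branch lengths, and a starting tree (polytomies included) could seed the initial population instead of random values.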