User:Carl Boettiger/Notebook/Comparative Phylogenetics/2010/04/18

{| width="800"
 * style="background-color: #EEE"|[[Image:owwnotebook_icon.png|128px]] Comparative Phylogenetics
 * style="background-color: #F2F2F2" align="center"|  |Main project page


 * colspan="2"|

MATICCE

 * Testing out Andrew Hipp's maticce package, doi:10.1093/bioinformatics/btp625.
 * The goal of the package is to model-average over trees and regimes. It takes a Brownie-style approach to regimes, identifying nodes at which a transition may have occurred.

The Approach
Takes a brute-force approach to avoid user-defined paintings in the ouch framework. Users identify nodes of interest, and the software tries the combinations of possible shifts at those nodes.

runBatchHansen
What models/regimes are tested?
 * User identifies a set of candidate nodes of interest and, optionally, a maximum number of transitions (the default allows a transition at every candidate node).
 * For instance, with four candidate nodes and a maximum of 2 changes (three peaks), the batch evaluates 12 models: the 4 choose 2 = 6 three-peak models, the 4 choose 1 = 4 two-peak models, a single-peak model, and a Brownian motion model. Without a maximum, it would also allow the 4 four-peak models (4 choose 3) and the 1 five-peak model (4 choose 4), for a total of 17 models.
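The model counts above can be checked with a short enumeration. This is only a sketch in Python of the combinatorics, not maticce code: the function name `candidate_models` and the node labels are hypothetical, and each model is represented simply as the tuple of nodes at which a shift occurs.

```python
from itertools import combinations


def candidate_models(nodes, max_shifts=None):
    """Enumerate OU regime models as tuples of shift nodes.

    The empty tuple () stands for the single-peak OU model.
    Brownian motion is not included here and is counted separately.
    """
    if max_shifts is None:
        # Assumed default: a shift is allowed at every candidate node.
        max_shifts = len(nodes)
    models = []
    for k in range(max_shifts + 1):
        models.extend(combinations(nodes, k))
    return models


nodes = ["n1", "n2", "n3", "n4"]  # hypothetical labels for four candidate nodes

capped = candidate_models(nodes, max_shifts=2)
full = candidate_models(nodes)

# Add 1 in each case for the Brownian motion model.
print(len(capped) + 1)  # 12
print(len(full) + 1)    # 17
```

With k candidate nodes and no maximum, the count is 2^k OU models plus Brownian motion, which is why the batch grows quickly as nodes are added.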


 * Has a summary function. Model averaging is a weighted average, with weights based on BIC.


 * characterStates -- the trait data. Currently must be one-dimensional; it seems it can be in either ape or ouch format.


 * Nodes of interest have to be specified by listing all of their descendant species, which is somewhat cumbersome.


 * Takes an ouchtree or list of ouchtrees.
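The BIC-weighted averaging mentioned for the summary function can be sketched as follows. This uses the standard BIC-weight formula (weights proportional to exp(-ΔBIC/2)); that maticce computes its weights exactly this way is an assumption, and the log-likelihoods, parameter counts, and alpha values below are made-up numbers for illustration only.

```python
import math


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion for one fitted model."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def bic_weights(bics):
    """Normalized weights proportional to exp(-0.5 * (BIC - min BIC))."""
    best = min(bics)
    rel = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(estimates, weights):
    """Weighted average of a parameter estimate across models."""
    return sum(e * w for e, w in zip(estimates, weights))


# Hypothetical fits of three models to 30 tips (illustrative values only).
logliks = [-20.0, -18.5, -18.0]
n_params = [3, 4, 5]
alphas = [0.5, 1.2, 2.0]
n_obs = 30

bics = [bic(l, k, n_obs) for l, k in zip(logliks, n_params)]
weights = bic_weights(bics)
avg_alpha = model_average(alphas, weights)

print(weights)
print(avg_alpha)
```

Subtracting the minimum BIC before exponentiating leaves the weights unchanged but avoids underflow when the BIC values are large.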

Output

See the loglik, sigma, and alpha values of the models tested (Brownian motion is not included by default).

The four nodes are shown as separate columns, and their numbers correspond to their indices as specified in ovales.nodes.

multiModel
For a focal node, there are 5 hypotheses we want to distinguish:
 * 1) BM
 * 2) OU
 * 3) Two OU
 * 4) Two BM, censored as separate subtrees (data of non-descendants is not considered)
 * 5) Two OU, censored.

Note that the first three of these cases (BM, OU, and two OU) are among the cases covered in a runBatchHansen over the node.

Other functions

 * paintBranches turns a maticce specification of a focal node into an OUCH regimes list, identifying the regime of each branch.
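To illustrate what turning focal nodes into a per-branch regime list means, here is a toy sketch in Python. It is not the paintBranches implementation: the tree encoding (a child-to-parent dictionary), the function name `paint_branches`, and the convention that the branch leading to a shift node already carries the new regime are all assumptions made for the example.

```python
def paint_branches(parent, shift_nodes, root_regime="ancestral"):
    """Assign a regime to the branch leading to each node.

    parent maps each node to its parent (the root maps to None).
    A branch takes the regime of the nearest shift node at or above it,
    or root_regime if there is none.
    """
    shift_nodes = set(shift_nodes)
    regimes = {}

    def regime_of(node):
        if node in regimes:
            return regimes[node]
        if node in shift_nodes:
            regime = node
        elif parent[node] is None:
            regime = root_regime
        else:
            regime = regime_of(parent[node])
        regimes[node] = regime
        return regime

    for node in parent:
        regime_of(node)
    return regimes


# Toy tree: root -> (A, B); A -> (t1, t2); B -> (t3, t4).
parent = {
    "root": None,
    "A": "root", "B": "root",
    "t1": "A", "t2": "A",
    "t3": "B", "t4": "B",
}

painting = paint_branches(parent, shift_nodes=["A"])
for node, regime in sorted(painting.items()):
    print(node, regime)
```

Each combination of shift nodes in a batch run corresponds to one such painting, so the painting is determined entirely by the chosen nodes.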

Evaluation

 * A useful package for batch runs of hansen hypotheses, reasonably polished. Two chief drawbacks:
 * 1) The method of specifying nodes of interest is cumbersome. It should require only two tips, whose most recent common ancestor defines the node, and should offer an option to identify the node by id number alone.
 * 2) Having to specify nodes doesn't really get around the painting problem.
 * The parameter space of paintings is too big! A new approach to thinking about this is needed, one that does not consider all possible outcomes as different models. (But of course I'd say that.)
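The two-tip specification suggested in the first drawback could work roughly like this. It is a sketch on a toy tree stored as a child-to-parent dictionary, not anything maticce provides; the helper names `ancestors`, `mrca`, and `descendant_tips` are hypothetical.

```python
def ancestors(parent, node):
    """Path from node up to the root, including node itself."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


def mrca(parent, tip_a, tip_b):
    """Most recent common ancestor of two tips."""
    seen = set(ancestors(parent, tip_a))
    for node in ancestors(parent, tip_b):
        if node in seen:
            return node
    raise ValueError("tips do not share an ancestor")


def descendant_tips(parent, node):
    """All tips descended from node, i.e. the full list a user would otherwise type."""
    internal = {p for p in parent.values() if p is not None}
    tips = [n for n in parent if n not in internal]
    return sorted(t for t in tips if node in ancestors(parent, t))


# Toy tree: root -> (A, t5); A -> (B, t3); B -> (t1, t2).
parent = {
    "root": None,
    "A": "root", "t5": "root",
    "B": "A", "t3": "A",
    "t1": "B", "t2": "B",
}

node = mrca(parent, "t1", "t3")
print(node)                           # A
print(descendant_tips(parent, node))  # ['t1', 't2', 't3']
```

Two tips are enough to pin down the node, and the full descendant list the package currently asks for can then be generated rather than typed by hand.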

Notes / Reading

 * Linked Data: a new approach to semantic web ideas.
 * Newsweek on why scientists are losing the PR battle.
 * Recently started Wikipedia page on Science 2.0.
 * Lots of sites modeled on Stack Overflow. Came across Semantic Overflow (thanks, Rod Page).


 * }