Group call 18 March 09

(Sam taking notes)

Steve, James, Jess B., Josh, and Sam were the only iSEEM folks around for the call.

  • Steve: Preliminary results
    • At a workshop last weekend, he talked about his preliminary results, some of which are on a CollectiveX Discussion page. The talk was well received; people are interested.
    • What he has done: he ran AMPHORA on the DeLong data set, got phylogenies for the 31 marker genes, and calculated PD (phylogenetic diversity) for each of the phylogenies.
    • The results: all over the place. There was no discernible pattern across the genes, although there were some patterns along salinity or depth gradients.
    • Major issue: For each protein family, AMPHORA finds very few reads. How should we deal with this problem? And is it because AMPHORA is missing genes that it should identify, or because the data is not there and we just need more protein families?
    • Steve suggests there are two ways to think about what we are doing:
      • (1) each gene is a random sample of the "true" underlying relationships among the taxa.
      • (2) "Bag of genes" -- each gene is under different pressure in the different environments, and the phylogenies reflect this.
    • One way to test these theories is to get more gene families in AMPHORA to see if we can see a correlation of gene function and PD (or some other aspect of the phylogeny).
    • Another way to test this is via the simulations: maybe not just by using gene families but by fragmenting whole genomes for a subset of taxa for which the phylogeny is well understood. This would mimic the sequencing better, and it may better test AMPHORA's ability to find the genes.
  • Sam: Simulations
    • Pipeline is almost finished; some assembly of scripts still required.
    • Simulated fragments are now being run through blastx against the database of reference sequences in AMPHORA to figure out which frame to translate them in, and then they are translated into peptide sequences. For now, she'll use HMMER to align the translated simulated fragments to the AMPHORA profile HMM for the gene family, and that will be the output.
  • Joining pipelines:
    • Steve and Sam will put their pipelines in the directory Srijak was supposed to have had created, assuming it exists. They'll e-mail Srijak to check. Otherwise, they'll figure out some alternative until he returns from India.
  • Josh and James:
    • Scheduled a separate Skype call to discuss their work.
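
Illustration (not from the call): a minimal Python sketch of two steps discussed above, namely fragmenting a whole genome into simulated reads and translating a read once a frame has been chosen. This is not Sam's pipeline. The function names, the fixed read length, and the uniform random sampling of positions and strands are all assumptions made for the example, and the frame is simply passed in rather than obtained from a blastx hit. The codon table is the standard genetic code.

```python
import random

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def fragment_genome(genome, n_reads, read_len, rng):
    """Hypothetical read simulator: uniform random start and strand, fixed length."""
    if read_len > len(genome):
        raise ValueError("read_len exceeds genome length")
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        seq = genome[start:start + read_len]
        strand = rng.choice("+-")
        if strand == "-":
            seq = reverse_complement(seq)
        reads.append((start, strand, seq))
    return reads


def translate(seq, frame):
    """Translate seq in a blastx-style frame: 1, 2, 3 (forward) or -1, -2, -3 (reverse)."""
    if frame not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    if frame < 0:
        seq = reverse_complement(seq)
    offset = abs(frame) - 1
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    # Codons containing ambiguous bases (e.g. N) are written as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    rng = random.Random(0)
    # Random stand-in genome; a real test would use a sequenced genome.
    genome = "".join(rng.choice("ACGT") for _ in range(1000))
    for start, strand, seq in fragment_genome(genome, 3, 60, rng):
        print(start, strand, translate(seq, 1))
```

In a real run the frame for each read would come from its best blastx hit, and the translated peptides would then be handed to HMMER for alignment against the profile HMM.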