= Outline =

Data Files

Technical Questions for Thought/Discussion

  • How will Bellerophon handle incomplete sequences? Does the type of incompleteness matter, i.e. a hole in the middle vs. only one half of the clone?


  • Here is a graph of the number of unclassified clones in all of our laboratory samples. The overall question is what we do with these clones, and our lab samples are a good place to start. It is hard for me to imagine that this level of "unclassified" organisms arises because they are actually novel or absent from greengenes. It is more likely a technical problem. As an example, DO and DM are both D. melanogaster on lab media, yet the discrepancy between them is far from trivial.

Overall, what to do with libraries like this, or with individual unclassified clones, is a question that I think needs to be resolved before we can get any credible information from the "classified" clones.

Scientific Questions for Thought/Discussion

Media:79_9Categories.pdf Here is a graph of all of our libraries (minus 3A, 3B, Ns and Wolbachia) at the genus level. The interesting thing to note is that, taking all of our samples as a whole (lab/wild/interior/exterior/everything...), a full 79% of the clones fall into just 9 categories. Although there is hardly a "core microbiome", since no genus is present in all the samples, it is cool to see that the same few things make up such a large percentage of our data.
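A minimal sketch of how a figure like "N categories cover X% of the clones" can be computed from per-library genus counts. The function name, the data layout (library name mapped to genus counts), and the toy numbers are all assumptions for illustration; they are not our real data or the script used to make the graph.

```python
from collections import Counter


def top_category_coverage(libraries, n_top):
    """Pool clone counts over all libraries and return the n_top most
    abundant categories plus the fraction of all clones they cover."""
    totals = Counter()
    for genus_counts in libraries.values():
        totals.update(genus_counts)  # adds counts genus by genus

    grand_total = sum(totals.values())
    top = totals.most_common(n_top)
    covered = sum(count for _, count in top)
    return [name for name, _ in top], covered / grand_total


# Hypothetical toy counts, NOT our sequencing results.
toy_libraries = {
    "libA": {"Acetobacter": 40, "Lactobacillus": 30, "Unclassified": 30},
    "libB": {"Providencia": 50, "Acetobacter": 10, "Shigella": 40},
}

names, fraction = top_category_coverage(toy_libraries, 2)
print(names, f"{fraction:.0%}")
```

Note that this pools raw clone counts, so large libraries weigh more than small ones; averaging per-library proportions instead would be a different (and equally defensible) choice.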

This also brings up several technical questions. When we want to look at our data as a whole and draw broad conclusions, which libraries should be included? There are the obvious distinctions of lab/wild and interior/exterior/guts, but what about the different sequencing methods (Sanger/PhyloChip) and the different DNA extraction methods? To answer the bigger question of just what a typical fly community is composed of, I want to use as much data as possible. But many of our libraries (not to mention the Corby-Harris and Cox-Gilmore data) are slightly different from each other.

Media:74_7Genera.pdf Here is the same as above, except that I have removed ALL the unclassified clones from the analysis. Note that some of the libraries (Dml_CANS, DmW_Turr1/2, IMH) are now mainly composed of genera not found in the other libraries. We need to resolve whether these differences are due to actual biology or just technical difficulties. But it is still good to see that only a few genera make up a substantial portion of the dataset.
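Removing the unclassified clones changes every remaining proportion, because the denominator shrinks. A small sketch of that renormalization, assuming counts are stored per genus and that unclassified clones carry the label "Unclassified" (both are assumptions for illustration, as are the numbers):

```python
def relative_abundance(genus_counts, drop=("Unclassified",)):
    """Return genus -> proportion after discarding the labels in `drop`
    and renormalizing over the clones that remain."""
    kept = {g: n for g, n in genus_counts.items() if g not in drop}
    total = sum(kept.values())
    if total == 0:
        # A library made up entirely of dropped clones has no proportions.
        return {}
    return {g: n / total for g, n in kept.items()}


# Hypothetical library, NOT real data: 30 of 100 clones are unclassified.
toy_library = {"Acetobacter": 40, "Lactobacillus": 30, "Unclassified": 30}

with_unclassified = relative_abundance(toy_library, drop=())
without_unclassified = relative_abundance(toy_library)
print(with_unclassified["Acetobacter"], without_unclassified["Acetobacter"])
```

In the toy library Acetobacter goes from 40 of 100 clones to 40 of 70, so a library with many unclassified clones can look very different after filtering even though nothing about its classified clones has changed.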

Lab Flies only: Internal vs External, unclassifieds still included-Media:intVSext+unC.pdf

Lab Flies Only: Internal vs External, gammaproteo and enterobacter unclassifieds removed-Media:intVSext-unC.pdf

Only D. melanogaster is represented in the above two files. The "grand total" column is the average of all the internal samples. The "external" column is the average of 4 external and media samples taken from both the Kimbrell and the Kopp labs, and hopefully is a good representation of what the flies are exposed to in the lab. Things to note: the only genus that seems to be enriched inside the flies is Providencia. In fact, Acetobacter and Lactobacillus, the two genera that we have been focusing on most heavily, seem to make up a relatively small proportion of the internal community. It is also interesting that the lines from the Kimbrell lab (all those that begin with DmL) are mostly Shigella, Microbacterium and Variovorax, while the Kopp lab samples (all the others) do not have those three genera at all.
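One way to make the "enriched inside the flies" comparison explicit is to average each genus's proportion over the internal samples and over the external samples, then list the genera whose internal average is higher. The sketch below assumes every sample has at least one clone; the function names and the counts are hypothetical, and a simple "greater than" comparison is not a statistical test.

```python
def mean_proportions(samples):
    """Average each genus's within-sample proportion across samples.
    A genus absent from a sample counts as proportion 0 there."""
    genera = set()
    for sample in samples:
        genera.update(sample)

    per_sample = []
    for sample in samples:
        total = sum(sample.values())  # assumed > 0
        per_sample.append({g: sample.get(g, 0) / total for g in genera})

    return {g: sum(p[g] for p in per_sample) / len(per_sample) for g in genera}


def enriched_inside(internal, external):
    """Genera whose mean internal proportion exceeds the external mean."""
    inside = mean_proportions(internal)
    outside = mean_proportions(external)
    return sorted(g for g, p in inside.items() if p > outside.get(g, 0.0))


# Hypothetical counts, NOT our data.
toy_internal = [
    {"Providencia": 60, "Acetobacter": 40},
    {"Providencia": 80, "Acetobacter": 20},
]
toy_external = [{"Providencia": 10, "Acetobacter": 90}]

print(enriched_inside(toy_internal, toy_external))
```

With only a handful of samples per group, any genus flagged this way should be treated as a lead to follow up, not a result.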

Table of all Samples Sequenced


Pretty self-explanatory, but here is a table of all the libraries for which we have sequence data. The first column is the identifier that I will be using in all of my graphs and communications.

Example of Replication





Materials and Methods



Current Status


