User:Morgan G. I. Langille/Notebook/Unknown Genes/2010/09/29

From OpenWetWare
Unknown Genes Main project page

Calculating ecological diversity using R

  • Initially I was using the R library ecodist:
    • library(ecodist)
    • bray_curtis_via_ecodist<-as.matrix(bcdist(mat,rmzero=TRUE));
    • sorenson_via_ecodist<-as.matrix(distance(mat,method="sorensen"));
  • R library vegan seems to have more options (and is preferred by Steve Kembel??)
    • bray_curtis_via_vegan<-vegdist(mat,method="bray")
  • Wrote col_betadiversity.R, which calculates beta diversity using several dissimilarity measures and outputs the data in several formats
    • Results in this directory contain different beta diversity measurements between GOS (and other) samples using PFAM counts.
    • This command generates the output files: ./col_betadiversity.R camera_proteins_vs_pfam.txt beta_diversity
    • The output files are labeled "beta_diversity_" followed by the name of the dissimilarity measurement.
      • "norm" means that the matrix was normalized by dividing each count by the total count of its column.
    • Each diversity metric is output in three files:
      • "matrix" is a simple dissimilarity matrix (the upper triangle is redundant).
      • "pairwise" outputs each dissimilarity score on its own line, with the names of the two samples in the first two columns.
      • "hclust.pdf" is a PDF showing the results of hierarchical clustering using the distances.
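
The normalization and the three output formats described above can be sketched in base R. This is only an illustration, not the contents of col_betadiversity.R: the toy count matrix, the sample names (S1 to S3), the family names (PF1 to PF3), the hand-written bray_curtis helper, and the default "complete" linkage are all assumptions, and samples are assumed to be in columns because the normalization is per column.

```r
# Toy count matrix (hypothetical data): rows are PFAM families, columns are samples.
counts <- matrix(
  c(10, 0, 5,
     0, 8, 5,
     5, 2, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("PF1", "PF2", "PF3"), c("S1", "S2", "S3"))
)

# "norm": divide each count by the total count of its column (sample).
norm <- sweep(counts, 2, colSums(counts), "/")

# Bray-Curtis dissimilarity between two abundance vectors (illustrative helper).
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# "matrix": full symmetric dissimilarity matrix between samples (columns).
samples <- colnames(norm)
n <- length(samples)
bc <- matrix(0, n, n, dimnames = list(samples, samples))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bc[i, j] <- bray_curtis(norm[, i], norm[, j])
  }
}

# "pairwise": one row per sample pair, taken from the lower triangle only,
# with the two sample names in the first two columns.
idx <- which(lower.tri(bc), arr.ind = TRUE)
pairwise <- data.frame(
  sample1 = samples[idx[, "col"]],
  sample2 = samples[idx[, "row"]],
  dissimilarity = bc[idx],
  stringsAsFactors = FALSE
)

# "hclust.pdf": hierarchical clustering on the distances, plotted to a PDF.
# The linkage method is an assumption; hclust defaults to "complete".
hc <- hclust(as.dist(bc))
pdf_path <- tempfile(fileext = ".pdf")  # placeholder output path
pdf(pdf_path)
plot(hc)
dev.off()

print(round(bc, 3))
print(pairwise)
```

Because vegdist computes dissimilarities between rows, the equivalent vegan call for a samples-in-columns matrix would presumably be vegdist(t(norm), method="bray").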