# User:Timothee Flutre/Notebook/Postdoc/2011/11/16

(Difference between revisions)
Revision as of 16:43, 16 November 2011 (try pkg snpStats)
Revision as of 22:34, 27 November 2013 (→Entry title: change page into "about statistical modeling")

Line 6:

  | colspan="2"|
- ==Entry title==
+ ==About statistical modeling==
-
- * try the R/Bioconductor package [http://www.bioconductor.org/packages/devel/bioc/html/snpStats.html snpStats]:
-
-  library(snpStats)
-  tmp <- matrix(c(1,3,2,1,3,0,1,3,0,1), ncol=2, dimnames=list(paste("snp", 1:5, sep=""), paste("ind", 1:2, sep="")))
-  tmp
-  tmp2 <- new("SnpMatrix", t(tmp))
-  tmp2
-  summary(tmp2)
-  print(as(t(tmp2), 'character'))
-  print(as(t(tmp2), 'numeric'))
-
- Unfortunately, it doesn't seem possible to convert a matrix of characters into SnpMatrix, assuming 1=AA, 2=AB, 3=BB and 0=NC:
-
-  tmp <- matrix(c("A/A","B/B","A/B","A/A","B/B","","A/A","B/B","","A/A"), ncol=2, dimnames=list(paste("snp", 1:5, sep=""), paste("ind", 1:2, sep="")))
-  tmp
-  tmp2 <- new("SnpMatrix", t(tmp))
-
- Thus, in the case where one has a matrix of genotypes obtained by Illumina (whether we have AA or A/A), we need to convert it first to the 1/2/3/0 encoding:
-
-  tmp <- gsub("A/A", 1, tmp)
-  tmp <- gsub("A/B", 2, tmp)
-  tmp <- gsub("B/B", 3, tmp)
-  tmp <- gsub("^$", 0, tmp)
-  tmp <- matrix(as.numeric(tmp), ncol=ncol(tmp), dimnames=list(rownames(tmp), colnames(tmp)))
-  tmp
-  tmp2 <- new("SnpMatrix", t(tmp))
-  tmp2
-  summary(tmp2)
-
- Then, one can easily look at summary statistics, eg. the histogram of minor allele frequencies, of z-score for HWE, etc, and filter data accordingly:
-
-  hist(col.summary(tmp2)$MAF)
-  hist(col.summary(tmp2)$z.HWE)
+ * visualizing, plotting:
+ ** "Visualizing uncertainty about the future" by Spiegelhalter, et al. (Science 2011, [http://dx.doi.org/10.1126/science.1191181 DOI])
+ ** "Let's practice what we preach: turning tables into graphs": Gelman et al (The American Statistician 2002, [http://dx.doi.org/10.1198/000313002317572790 DOI])
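
Note on the removed 2011 revision: its recoding of Illumina-style calls into the 1/2/3/0 encoding relies on chained `gsub` calls. The same step could perhaps be written with a named lookup vector, which handles both the `A/A` and `AA` spellings in one pass. This is only a minimal base-R sketch: `recode_genotypes` is a hypothetical helper (not part of snpStats), and treating every unrecognized call, including the empty string, as 0 (no call) is an assumption.

```r
# Hypothetical helper (not part of snpStats): recode genotype calls into
# the encoding used in the entry: 1 = AA, 2 = AB, 3 = BB, 0 = no call.
recode_genotypes <- function(calls) {
  codes <- c("A/A" = 1, "AA" = 1,
             "A/B" = 2, "AB" = 2,
             "B/B" = 3, "BB" = 3)
  # Look up each call by name; calls absent from `codes` (e.g. "") give NA.
  out <- unname(codes[as.vector(calls)])
  # Assumption: anything unrecognized is treated as a missing call (0).
  out[is.na(out)] <- 0
  matrix(out, nrow = nrow(calls), ncol = ncol(calls),
         dimnames = dimnames(calls))
}

# Same toy data as in the entry: 5 SNPs (rows) by 2 individuals (columns).
calls <- matrix(c("A/A", "B/B", "A/B", "A/A", "B/B", "", "A/A", "B/B", "", "A/A"),
                ncol = 2,
                dimnames = list(paste("snp", 1:5, sep = ""),
                                paste("ind", 1:2, sep = "")))
geno <- recode_genotypes(calls)
geno
```

The resulting numeric matrix has the same layout as the first `tmp` of the removed revision, so it should be usable in the same way, i.e. passed as `t(geno)` to `new("SnpMatrix", ...)` as the entry does.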

## Revision as of 22:34, 27 November 2013
