User:Timothee Flutre/Notebook/Postdoc/2011/11/16

From OpenWetWare
Revision as of 13:43, 16 November 2011 by Timothee Flutre (talk | contribs) (try pkg snpStats)
Jump to navigationJump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.
Project name <html><img src="/images/9/94/Report.png" border="0" /></html> Main project page
<html><img src="/images/c/c3/Resultset_previous.png" border="0" /></html>Previous entry<html>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</html>Next entry<html><img src="/images/5/5c/Resultset_next.png" border="0" /></html>

Entry title

  • try the R/Bioconductor package snpStats:
library(snpStats)
tmp <- matrix(c(1,3,2,1,3,0,1,3,0,1), ncol=2, dimnames=list(paste("snp", 1:5, sep=""), paste("ind", 1:2, sep="")))
tmp
tmp2 <- new("SnpMatrix", t(tmp))
tmp2
summary(tmp2)
print(as(t(tmp2), 'character'))
print(as(t(tmp2), 'numeric'))

Unfortunately, it doesn't seem possible to convert a matrix of characters into SnpMatrix, assuming 1=AA, 2=AB, 3=BB and 0=NC:

tmp <- matrix(c("A/A","B/B","A/B","A/A","B/B","","A/A","B/B","","A/A"), ncol=2, dimnames=list(paste("snp", 1:5, sep=""), paste("ind", 1:2, sep="")))
tmp
tmp2 <- new("SnpMatrix", t(tmp))

Thus, in the case where one has a matrix of genotypes obtained by Illumina (whether we have AA or A/A), we need to convert it first to the 1/2/3/0 encoding:

tmp <- gsub("A/A", 1, tmp)
tmp <- gsub("A/B", 2, tmp)
tmp <- gsub("B/B", 3, tmp)
tmp <- gsub("^$", 0, tmp)
tmp <- matrix(as.numeric(tmp), ncol=ncol(tmp), dimnames=list(rownames(tmp), colnames(tmp)))
tmp
tmp2 <- new("SnpMatrix", t(tmp))
tmp2
summary(tmp2)

Then, one can easily look at summary statistics, eg. the histogram of minor allele frequencies, of z-score for HWE, etc, and filter data accordingly:

hist(col.summary(tmp2)$MAF)
hist(col.summary(tmp2)$z.HWE)