Talk:DataONE:Notebook/Data Citation and Sharing Policy/2010/07/12

From OpenWetWare

R start

  • Heather A Piwowar 00:28, 13 July 2010 (EDT): Nic, once you install R, here are a few commands to get you going. Note: these graphs aren't pretty, they aren't very relevant, and the code is ugly... I just wanted to show you how to run some things on your own spreadsheets to get the feel of R. I used your EvoBio_Journals sheet.

There are two ways to get the data in. Either make your Google Fusion pages public with Share->Visibility->Public and then set filename to the table's URL.

Or export the table and then use (substituting in the proper path)

  • filename = "~/Downloads/EvoBio_Journals.csv"

After one of those lines, you can run the following, one line at a time at the R prompt:

# Read the spreadsheet; keep text columns as character rather than factors
dat.raw = read.csv(filename, stringsAsFactors=FALSE)
# Check the size, column names, and column types
dim(dat.raw)
names(dat.raw)
str(dat.raw)
# Plot the counts for each category
plot(table(dat.raw$Publisher))
plot(table(dat.raw$Peer.Reviewed))
plot(table(dat.raw$Policy.Has.Instructions.how.to.share.data))
# as.numeric() turns any entry that is not a number into NA
hist(as.numeric(dat.raw$Impact.Factor))
plot(as.numeric(dat.raw$Impact.Factor), as.numeric(dat.raw$Cited.Half.life))
plot(as.numeric(dat.raw$Impact.Factor), as.numeric(dat.raw$X5.Year.Impact.Factor))
# Fit a straight line and draw it on the last scatterplot
abline(lm(as.numeric(dat.raw$X5.Year.Impact.Factor) ~ as.numeric(dat.raw$Impact.Factor)))
lm(as.numeric(dat.raw$X5.Year.Impact.Factor) ~ as.numeric(dat.raw$Impact.Factor))
# Correlation over the rows where both values are present
cor(as.numeric(dat.raw$X5.Year.Impact.Factor), as.numeric(dat.raw$Impact.Factor), use="complete.obs")
# cor.test() drops incomplete pairs on its own, so it needs no use= argument
cor.test(as.numeric(dat.raw$X5.Year.Impact.Factor), as.numeric(dat.raw$Impact.Factor))
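
If the repeated as.numeric() calls get tiresome, one option is to convert each column once and then refer to the converted columns by name. The sketch below is only an illustration: dat.demo is a hypothetical stand-in with made-up numbers (not values from the EvoBio_Journals sheet), and the short names IF and IF5 are my own invention. On the real data you would use dat.raw in place of dat.demo.

```r
# Hypothetical stand-in for the spreadsheet: made-up values under the real column names.
dat.demo = data.frame(
  Impact.Factor = c("2.1", "3.4", "n/a", "5.0", "1.2"),
  X5.Year.Impact.Factor = c("2.5", "3.9", "4.4", "", "1.5"),
  stringsAsFactors = FALSE
)

# Convert each column once; entries that are not numbers become NA.
dat.demo$IF = suppressWarnings(as.numeric(dat.demo$Impact.Factor))
dat.demo$IF5 = suppressWarnings(as.numeric(dat.demo$X5.Year.Impact.Factor))

# lm() drops rows containing NA by default, so only complete pairs are fitted.
fit = lm(IF5 ~ IF, data = dat.demo)
n.used = nrow(model.frame(fit))

# Pearson correlation over the same complete pairs.
r = cor(dat.demo$IF5, dat.demo$IF, use = "complete.obs")

print(coef(fit))
print(n.used)
print(r)
```

Checking n.used against nrow(dat.demo) is a quick way to see how many journals were dropped because a value was missing or not numeric.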

Let me know how it goes! Then we can start figuring out what stats you really want to calculate and how to do that.