DataONE:GEO reuse study/Phase 1

From OpenWetWare
Revision as of 09:15, 18 June 2010 by Heather A Piwowar (talk | contribs) (initial content)

Research Plan

  • Query PubMed Central (PMC) for GEO accession number patterns
  • Only look at one year of PMC, because the deposit rate (and possibly the spectrum of deposits) is not constant over time
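A minimal sketch of how the one-year restriction might be expressed as an automated query, assuming the NCBI E-utilities `esearch` endpoint is used to search PMC. The search term, the year 2009, and the helper name `build_pmc_query_url` are placeholders for illustration and are not specified in this plan; the code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities search endpoint (assumed to be the query mechanism).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pmc_query_url(term, year, retmax=100):
    """Build an esearch URL for PMC limited to one publication year.

    `term` and `year` are illustrative inputs; the plan does not fix them.
    """
    params = {
        "db": "pmc",            # search PubMed Central
        "term": term,           # free-text search term
        "datetype": "pdat",     # filter on publication date
        "mindate": str(year),   # same year for both bounds
        "maxdate": str(year),
        "retmax": str(retmax),  # maximum number of IDs returned
    }
    return ESEARCH + "?" + urlencode(params)


# Hypothetical usage: the term and year are placeholders.
url = build_pmc_query_url('"gene expression omnibus"', 2009)
print(url)
```

Using the same value for `mindate` and `maxdate` keeps the sample to a single year, so changes in the deposit rate over time do not affect the count.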

Open Questions

  • Also look at HighWire Press, Google Scholar, or other full-text sources?
    • More difficult because queries can't be processed automatically
  • Look for accession number patterns for datasets and data series?
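As a rough illustration of the accession number patterns in question, the sketch below scans text for GEO series (`GSE`) and dataset (`GDS`) accessions and groups them by type. The regular expression, the function name `find_geo_accessions`, and the sample sentence with its accession numbers are assumptions made up for this example; a real study would probably need to handle more variants, such as ranges or line-wrapped identifiers.

```python
import re
from collections import defaultdict

# GSE = GEO series, GDS = GEO curated dataset.
# An optional single whitespace character is allowed between prefix and digits.
GEO_ACCESSION = re.compile(r"\b(GSE|GDS)\s?(\d+)\b")


def find_geo_accessions(text):
    """Return {prefix: sorted unique accessions} found in `text`."""
    found = defaultdict(set)
    for prefix, digits in GEO_ACCESSION.findall(text):
        # Normalize "GSE 2034" and "GSE2034" to the same identifier.
        found[prefix].add(prefix + digits)
    return {prefix: sorted(ids) for prefix, ids in found.items()}


# Hypothetical sentence; the accession numbers are placeholders.
sample = "Raw data were obtained from GEO (GSE2034 and GDS1234); see also GSE 2034."
result = find_geo_accessions(sample)
print(result)
```

Grouping by prefix makes it possible to report series and dataset mentions separately, which bears on the open question of whether to look for both.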

Limitations

Important for argument

This is a conservative estimate because:

  • Many papers are not in PMC (source for percentages?)
  • Many data citations are not attributed using accession numbers (source for percentages?)

Less important for argument

  • Does not capture reuse outside the peer-reviewed literature (for example, reuse during training)
  • Deposits into PMC are not stable over time, and their distribution may also change