DataONE/Summer 2010/Research questions

From OpenWetWare

This DataONE OpenWetWare site contains informal notes for several research projects funded through DataONE. DataONE is a collaboration among many partner organizations, and is funded by the US National Science Foundation (NSF) under a Cooperative Agreement.



Research Questions and Research Plans

Let's start brainstorming formal research questions, then you can flesh out the scope and add your research plans for a June 30th mini-deliverable.

Data citation practice inventory within journals

Owner: Sarah.

  1. What are various practices for data citation within academic papers? How prevalent is each variety?
  2. How do these practices vary across discipline, journal, data type, data source?
  3. How have these practices varied across time?

Scope and Plan

  • which journals?
    • We have some survey results on scientist attitudes and behaviours that might sync up nicely with these results if we choose journals that reflect the scientists' fields. When asked "Which of the following best describes your primary field of concentration within evolutionary biology?" the top results were:
      • Behavior/Neurobiology 23%
      • Development/Morphology 21%
      • Ecology 17%
      • Genetics/Genomics 14%
      • Molecular evolution 8%
      • Paleontology 8%
    • I don't know which journals best sync up with these fields.
  • which time periods?
  • what data will you extract?
  • how many datapoints do you expect?
  • what stats will you run? what is your statistical power?
  • what do you plan to have complete by June 30th?
  • plans for integration with other intern work?

Data sharing and citation policies

Owner: Nic

  1. What are the data sharing and citation policies applicable to authors, from funders, journals, institutions, and repositories?
  2. How are the collages of applicable policies different by discipline, journal, data type, data source?
  3. How have the collages of applicable policies changed across time?
  4. How do the applicable policies correlate with data sharing behaviour?

Comment: this may need narrowing down...

Scope and Plan

  • where to focus the research. Specific issues of specific journals? Same as Sarah's?

I think Sarah and I should coordinate our research efforts, insofar as the journals she is mining for data reuse and citations should also be the journals where I am collecting metadata and broader policies on data sharing and citation.

Our work should also overlap in that I can look at authors' funding sources and institutional affiliations for further policies. A potential obstacle is that this isn’t necessarily going to give us clear boundaries for discipline-specific data (other than place of publication). Data types might help to parse this out a bit, but not reliably.

  • what data will you extract?

(would love recommendations in each of these categories)

Metadata elements of Journals

  1. Publisher
  2. Date of Publication
  3. Format of Pub (e-only or available in print)
  4. Society Affiliation
  5. Data Repository
  6. Open Access / Subscription
  7. Impact Factor
  8. Peer Reviewed
  9. Where Indexed / Abstracted

Policy of Sharing and Citations (For Institutions, Repositories, Journals and Funding Sources)

Metadata:

  1. Entity Name
  2. Physical Location / Affiliation (if any)
  3. Domain Affiliation / Included Disciplines (if any)
  4. ???

Broader Data

  1. What are the elements of their Data Policies
    a. Institutions and Funding sources: Requirements of Data Management Plans for researchers
    b. Repositories and Journals: Requirements for deposit / publication
  2. Specific language for sharing data and/or citing data
  3. Suggestions on how to cite data
  4. ???
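
The extraction fields listed above could be captured as a simple record structure, so that journal metadata and policy data end up in a consistent shape for later comparison. This is only a minimal sketch in Python: the class names (JournalRecord, PolicyRecord), the field names, the journal_name field, and the allowed entity types are all assumptions made for illustration, not part of the plan. The items marked ??? are left out because they are still open.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional

# Hypothetical set of entity types, taken from the four kinds of
# policy sources named in the list (an assumption, not a fixed vocabulary).
ENTITY_TYPES = {"institution", "repository", "journal", "funder"}


@dataclass
class JournalRecord:
    """One record per journal, mirroring 'Metadata elements of Journals'."""
    journal_name: str                      # assumed identifier, not in the list
    publisher: str
    publication_date: str                  # free text; date format is undecided
    pub_format: str                        # e.g. "e-only" or "print"
    society_affiliation: Optional[str] = None
    data_repository: Optional[str] = None
    open_access: Optional[bool] = None     # False would mean subscription
    impact_factor: Optional[float] = None
    peer_reviewed: Optional[bool] = None
    indexed_in: List[str] = field(default_factory=list)


@dataclass
class PolicyRecord:
    """One record per policy source (institution, repository, journal, funder)."""
    entity_name: str
    entity_type: str
    location: Optional[str] = None
    disciplines: List[str] = field(default_factory=list)
    requirements: List[str] = field(default_factory=list)   # DMP or deposit rules
    sharing_language: Optional[str] = None                   # quoted policy text
    citation_suggestions: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject entity types outside the assumed vocabulary.
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity_type: {self.entity_type!r}")


# Placeholder values only; these are not real data.
example = PolicyRecord(entity_name="Example Funder", entity_type="funder")
print(asdict(example))
```

Keeping unknown values as None (rather than empty strings) makes it easier to tell "not yet collected" apart from "collected and empty" when counting datapoints later.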

  • how many datapoints do you expect?

Strongly depends on how broad the sample is… (in short, I don’t know yet)

  • what stats will you run? what is your statistical power?
  • what do you plan to have complete by June 30th?
  • plans for integration with other intern work?

I think / hope that Sarah and I can coordinate our data gathering. Hopefully this will allow us to find more correlations in our data, so we can begin to see broader patterns of data citation with respect to impact, what effect these have on Question 2 of my research (the collage of applicable policies), and vice versa.

Data citation practice inventory for repositories

Owner: Valerie

  1. What are all the ways that data housed in given repositories are cited or attributed?
  2. How do these practices vary across discipline, journal, data type, data source?
  3. How have these practices varied across time?

(Very similar to Sarah's project, above.)

I have some ideas on repository inventory that I haven't been able to explore yet. We should talk about ideas and approaches. I'll post more later; email me if I don't by June 14 or so! - Sarah

Scope and Plan

  • which repositories? TreeBASE, Pangaea, the ORNL DAAC archive?
  • how will you bound the problem? a subset of repository entries? a subset of journals for citation and attribution links?
  • what methods will be used to search for citation and attributions? using which search resources?
  • what is the estimated coverage of these methods? Could come from Sarah's project results.
  • how many datapoints do you expect?
  • what stats will you run? what is your statistical power?
  • what do you plan to have complete by June 30th?
  • plans for integration with other intern work?
  • plans for integration/parallel analysis with Heather's NCBI GEO work?
    • I'll flesh out some background info on this and provide links... feel free to ask in the meantime