DataONE:Notebook/Reuse of repository data/2010/06/15

{| width="800"
 * style="background-color: #EEE"|[[Image:owwnotebook_icon.png|128px]] Reuse of Repository Data
 * style="background-color: #F2F2F2" align="center"|  |Main project page


 * colspan="2"|

Notes for June 15, 2010

 * See also: Data Citation Spreadsheet (file renamed to reflect the other databases investigated, with a tab for each database)
 * For citations of resources found, please refer to this CiteULike page.
 * Will start searching for articles citing data stored in Pangaea, but will return to TreeBASE as more search-string ideas come up.

Resources searched with search terms and hit count for TreeBASE

 * 1) Resource: Google Scholar Search term(s): upload, OR selected "from TreeBASE" -download, -available Limits: Published between two dates: 2008-2010 (month not available) Search only articles in the following subject areas: Biology, Life Sciences, and Environmental Science and Medicine, Pharmacology, and Veterinary Science All articles excluding patents. Results: no results
 * 2) Resource: Google Scholar Search term(s): "from TreeBASE" Limits: Published between two dates: 2008-2010 (month not available) Search only articles in the following subject areas: Biology, Life Sciences, and Environmental Science and Medicine, Pharmacology, and Veterinary Science All articles excluding patents. Results: 88 results
 * 3) Resource: Google Scholar Search term(s): treebase download "study accession" -deposit, -submit Limits: Published between two dates: 2008-2010 (month not available) Search only articles in the following subject areas: Biology, Life Sciences, and Environmental Science and Medicine, Pharmacology, and Veterinary Science All articles excluding patents. Results: 38 results
 * 4) Resource: Google Scholar Search term(s): treebase download "study accession" -deposited -submitted Limits: Published between two dates: 2008-2010 (month not available) Search only articles in the following subject areas: Biology, Life Sciences, and Environmental Science and Medicine, Pharmacology, and Veterinary Science All articles excluding patents. Results: 9 results
 * 5) Resource: ISI Web of Science Search term(s): Cited Author=(Yoo) AND Cited Work=(BMC PLANT BIOL) AND Cited Year=(2006) Limits: Timespan=2008-2010. Databases=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH Results: 6 results
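Searches 3 and 4 above differ only in which word forms are excluded, which suggests the "-" operator matches exact forms rather than stems. The following is a minimal sketch of assembling such a query string with every excluded form listed explicitly. The helper name `build_query` and the hand-written list of word forms are assumptions for illustration, not part of the recorded search procedure, and the search engine's actual handling of the string is not verified here.

```python
def build_query(terms, phrases=(), excluded=()):
    """Assemble a search string: plain terms, quoted phrases, then -excluded words."""
    parts = list(terms)
    parts += ['"{}"'.format(p) for p in phrases]   # exact-phrase terms in quotes
    parts += ["-" + w for w in excluded]            # "-" acts as NOT
    return " ".join(parts)


# Hypothetical, hand-listed word forms: the engine is assumed not to
# stem excluded words, so each inflection is spelled out.
DEPOSIT_FORMS = ["deposit", "deposited", "submit", "submitted"]

query = build_query(["treebase", "download"], ["study accession"], DEPOSIT_FORMS)
print(query)
```

Listing the inflected forms in one place makes it easier to rerun the same exclusion set against another resource without retyping it.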

Resources searched with search terms and hit count for Pangaea

 * 1) Resource: ISI Web of Science Search term(s): Cited author(Pangaea) Limits: Timespan=2008-2010 (month field not available in advanced search) Language: English Results: 3
 * 2) Resource: ISI Web of Science Search term(s): Cited Work(Pangaea) Limits: Timespan=2008-2010 (month field not available in advanced search) Language: English Results: 23
 * 3) Resource: Scirus Search term(s): (exact phrase) "data from Pangaea"  (in the complete document) Only show results published between: 2008 and 2010 (month field not available in advanced search) Results: 3
 * 4) Resource: Scirus Search term(s): (all of the words) data set, Pangaea database (in the complete document) Limits: Only show results published between: 2008 and 2010 (month field not available in advanced search) Only show results that are Abstracts, Articles Results: 107
 * 5) Resource: Scirus Search term(s): (any of the words) doi:10.1594/PANGAEA* (in the complete document) Limits: Only show results published between: 2008 and 2010 (month field not available in advanced search) Only show results that are Abstracts, Articles Results: 0
 * 6) Resource: Scirus Search term(s): (exact phrase) doi:10.1594/PANGAEA* (in the complete document) Limits: Only show results published between: 2008 and 2010 (month field not available in advanced search) Only show results that are Abstracts, Articles Results: 12
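Searches 5 and 6 above look for the Pangaea DOI prefix with a wildcard. Where the full text of a hit is available locally, the same idea can be sketched as a regular expression. This is an illustration only: the pattern assumes dataset DOIs take the form `10.1594/PANGAEA.` followed by digits, the function name `find_pangaea_dois` is hypothetical, and the DOI in the sample text is a made-up placeholder, not a real dataset.

```python
import re

# Assumed shape of a Pangaea dataset DOI: the 10.1594 prefix,
# "PANGAEA.", then a run of digits. Case is ignored.
PANGAEA_DOI = re.compile(r"10\.1594/PANGAEA\.\d+", re.IGNORECASE)


def find_pangaea_dois(text):
    """Return the distinct Pangaea-style DOIs in text, in order of first appearance."""
    found = []
    for match in PANGAEA_DOI.findall(text):
        doi = match.upper()          # normalize case so duplicates collapse
        if doi not in found:
            found.append(doi)
    return found


# Placeholder text; the DOI below is invented for the example.
sample = (
    "Data were taken from doi:10.1594/PANGAEA.000001 "
    "(see also 10.1594/pangaea.000001). "
    "The breakup of the supercontinent Pangaea is discussed elsewhere."
)
print(find_pangaea_dois(sample))
```

Because the pattern requires the DOI prefix, mentions of the supercontinent Pangaea do not match. A match still does not distinguish authors citing reused data from authors citing their own deposit, so each hit needs to be read in context.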

Observations

 * Valerie Enriquez 10:40, 15 June 2010 (EDT): Initial search may have been too narrow. Will try again, starting broad and then narrowing as far as Google Scholar allows.
 * Valerie Enriquez 10:55, 15 June 2010 (EDT): Results found in second search include an article about a new (as of 2009) wrapper for TreeBASE accessible at TBase Wrapper. Article can be found here: doi:10.1186/1471-2148-9-93
 * Valerie Enriquez 11:39, 15 June 2010 (EDT): Broadening the search resulted in a lot of false drops that include articles where the authors deposited their data into TreeBASE instead of downloading data from TreeBASE.
 * Valerie Enriquez 12:03, 15 June 2010 (EDT): Overlapping results with Syst. Biol search from 6/14/2010 entry: doi:10.1080/10635150801886156 and doi:10.1093/sysbio/syp019
 * Valerie Enriquez 12:19, 15 June 2010 (EDT): Third search overlapped somewhat with the second. However, despite adding the "-" (NOT) operator to the words "deposit" and "submit," the words "deposited" and "submitted" still appear in the articles, in the context of data from the study being deposited into TreeBASE.
 * Valerie Enriquez 12:19, 15 June 2010 (EDT): Fourth search still pulls articles with word "available" in the context of data from study being deposited into TreeBASE and available for download.
 * Valerie Enriquez 13:08, 15 June 2010 (EDT): First search in ISI Web of Science for citations of data from Pangaea did not yield many hits. Once again, the results include articles whose authors deposited their own data, such as DOI: 10.1017/S0025315407058249; these were not included in the spreadsheet data.
 * Valerie Enriquez 14:01, 15 June 2010 (EDT): Second search in ISI Web of Science for articles citing Pangaea yielded more results. More author names are included, sometimes DOIs for datasets given in citation as per Pangaea's recommendations found here.
 * Valerie Enriquez 14:19, 15 June 2010 (EDT): Third search for articles citing Pangaea data resulted only in links to this website: Alfred Wegener Institute for Polar and Marine Research (AWI) Methods that included this presentation: Using Pangaea: How to extract data from Pangaea and convert it into formats that are useful for further applications. Will broaden search further.
 * Valerie Enriquez 14:33, 15 June 2010 (EDT): Added restriction to show only articles and abstracts to eliminate website hits or books as they are not relevant to this part of the study. Once again, false drops including articles that have deposited their data in Pangaea are included. Also, some articles about the supercontinent Pangaea are included.
 * Valerie Enriquez 14:44, 15 June 2010 (EDT): Attempted search using DOI prefix and wildcard (*) in Scirus for articles citing data stored in Pangaea. No results found.
 * Valerie Enriquez 14:51, 15 June 2010 (EDT): Altered search to "exact phrase" instead of "any of the words," resulting in 12 hits. However, at least three of the results include the phrasing "available in Pangaea," in the context of the investigators depositing their research in Pangaea as opposed to extracting the work of others from Pangaea.
 * Valerie Enriquez 15:46, 15 June 2010 (EDT): After a suggestion at the meeting, I am in phase II of searching for reused data from TreeBASE: searching by article citation. For example, the first article I found in phase I cited legacy study ID S1459 (S1515), which has the following citation: Yoo M., Albert V., Soltis P., & Soltis D. 2006. Phylogenetic diversification of glycogen synthase kinase 3/SHAGGY-like kinase genes in plants. BMC Plant Biology. doi:10.1186/1471-2229-6-3 This is probably the best search method: it finds articles that cite the data through the study citation rather than through accession numbers. However, I am a bit concerned that some articles may not cite from TreeBASE but rather from the journal in which the study was originally published; in this case, data cited from BMC Plant Biology as opposed to directly from TreeBASE. What I may do is compare the findings from phase I and phase II for each of the studies to identify gaps in how reused data can be found.
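The comparison of phase I and phase II findings proposed in the last note could be sketched as a set comparison keyed on article DOI. This is only an illustration under stated assumptions: the function names are hypothetical, the DOIs in the example are invented placeholders rather than results from the spreadsheet, and matching on DOI assumes every hit has one recorded.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip a leading 'doi:' label so equal DOIs compare equal."""
    doi = doi.strip().lower()
    if doi.startswith("doi:"):
        doi = doi[4:].strip()
    return doi


def compare_phases(phase1, phase2):
    """Split two lists of article DOIs into shared hits and hits unique to each phase."""
    a = {normalize_doi(d) for d in phase1}
    b = {normalize_doi(d) for d in phase2}
    return {
        "both": sorted(a & b),          # found by both search methods
        "phase1_only": sorted(a - b),   # found only by the phase I search
        "phase2_only": sorted(b - a),   # found only by the citation search
    }


# Placeholder DOIs, invented for the example.
phase1_hits = ["doi:10.1000/example.a", "10.1000/EXAMPLE.B"]
phase2_hits = ["10.1000/example.b", "doi: 10.1000/example.c"]

gaps = compare_phases(phase1_hits, phase2_hits)
print(gaps)
```

Articles that appear in only one phase are the gaps of interest: they show which kind of reuse each search method misses.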


|}