OpenDocProject:DC CollectionPolicy/2010/09/07

{| width="800"
 * style="background-color: #EEE"|[[Image:owwnotebook_icon.png|128px]] Project name
 * style="background-color: #F2F2F2" align="center"|  |Main project page


 * colspan="2"|

Beginnings
Today was my first opportunity to present some of my findings to the DC Data Practices working group. I am focusing on gathering the elements of an effective Collections Policy statement for the Data Conservancy. I got a late start getting acclimated to this project and more or less jumped straight in, so I'll present what I've done thus far in order to carry on with an organized (HA!) notebook from this point forward:

I began by surveying repositories from a list in an article by Brad Hemminger, "Scientific Data Repositories on the Web". The list is from Figure 6:



This list didn't necessarily fit the model that DC is looking to build (a large science repository with varied data types that is neither domain specific nor tied to an institution), but then again there aren't many resources out there that do fit this description. In total, the list yielded only about 6 (!) repositories with collection policies. So I expanded my search and noted some of my rough findings in a Google Docs spreadsheet. As you can see, a number of elements are either left unaddressed by the repository or not attached to a particular field in my mappings.

Purely anecdotal at this point: what I've found really interesting (and separate from the work I did this summer) is that a number of repositories aren't explicit about what materials they will collect, what types they will collect, what material is ingested automatically, and what is solicited as one-time deposits... but almost all make a statement, even if a very general one, about the preservation actions they perform. This says to me, and maybe I'm misinterpreting, that these repositories want FOREMOST to guarantee that the material they hold is safe and trusted for future access.

Now, there are a number of portals, such as the DAACs and the ADS of NERC data centers, that promote their services as facilitating reuse and discovery, but on the whole I would say, from this small set of observations, that this is not the norm. It's something I brought up in our meeting this afternoon, and it is one of the questions I think will fuel future discussions: are we promoting reuse, or trusted stewardship? (Likely the answer is both, but which is of foremost importance? And "C: All of the above" is not an option.)

Not so surprising: very few set out how policy is developed. The social science archives, like ICPSR and the UK Data Archive, seem to be better at articulating privacy issues (mostly due to the sensitive nature of their holdings), but they also seem quite mature in disseminating information on how their policies are evaluated. ICPSR in particular has a semiannual review by its Acquisition Board. This review also informs its "Current Areas of Interest" for expanding the holdings of the archive.

Work to Finish
I'm going to begin annotating a list of elements that I feel are necessary for us going forward, and linking to relevant examples of repositories with these specific features.

I need to begin formulating questions for the Birds of a Feather working group: service level agreements, mission statements, terms of use, preservation elements to include, data citation standards...


 * }