User:Steven J. Koch/Wiki Ideas/Public file system

Near term strategy
I don't think it makes sense for OWW to implement something like this now or even in the near-term future. But I think it does make sense for OWW to think ahead to some day when OWW offers something significantly more than just the MediaWiki stuff. (NOTE: I also don't want OWW to consider abandoning MW!) My lab is going to pursue posting stuff on our file server as necessary, and also work towards real-time dumping of data to our public file server. We'll link to things via URLs as I describe below. If this ends up being useful, it will provide a clearer picture of what benefits could be provided by the file server being on OWW instead of in my own lab. Maybe in 2-3 years.

Specifically, we are beginning to prepare a paper that will involve about 3000 DNA sequences (text files), about 3000 data sets (text files, spreadsheet style), and hundreds of LabVIEW VIs (*.vi files). We'd like to post all of this on OWW, and of course we're not going to upload each of these things by hand. So, we'll plan on creating top-level OWW pages that link to the top-level file hierarchy, with specific links as needed. I think this will work really well. The only disadvantage is that the files could disappear if something goes terribly wrong with our server.
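As a rough sketch of how such link pages might be generated rather than typed by hand, the following Python walks a local copy of the shared folder and emits one MediaWiki external link per file. The function name `wiki_links` and its arguments are hypothetical, and it assumes the server exposes the folder under a base URL with the same directory layout.

```python
import os
from urllib.parse import quote


def wiki_links(root_dir, base_url):
    """Return MediaWiki bullet lines linking to every file under root_dir.

    Assumes (hypothetically) that root_dir is served at base_url with the
    same relative layout, as in the shared-folder setup described above.
    """
    lines = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        dirnames.sort()  # walk subfolders in a stable order
        for name in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, name), root_dir)
            parts = rel.split(os.sep)
            # Percent-encode each path component (spaces become %20).
            url_path = "/".join(quote(part) for part in parts)
            label = "/".join(parts)
            # MediaWiki external link syntax: [URL label]
            lines.append(f"* [{base_url.rstrip('/')}/{url_path} {label}]")
    return lines
```

Calling `"\n".join(wiki_links(folder, base_url))` gives text that could be pasted into a top-level wiki page. File names containing a `]` would need extra handling that this sketch leaves out.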

Problems that Steve doesn't know how to deal with

 * Security
 * Technical support for the inevitable array of new bugs and problems

Things that Steve thinks seem like problems but aren't

 * Encouraging use of non-universal formats (e.g. *.doc, *.xls, *.whatever)
 * I don't think this is an issue, because the "upload file" feature already permits uploading these formats. GoogleDocs, OpenOffice, and Microsoft bashing won't solve this problem, because scientists will continue to use applications that create specialized binary data files. (E.g., the LabVIEW *.vi files we want to share.)

Initial thoughts (maybe incomprehensible)
Steve Koch 03:50, 1 September 2008 (EDT): A few months ago, Caleb (an ECE major who is amazing with Windows networking and other stuff in our lab) showed me how to share files from our Windows server on the internet via IIS (which I think means Internet Information Services). The first (and pretty much only) thing I did was to put my entire hierarchy of data from graduate school in the shared folder on our server. (You can see the hierarchy here: http://kochlab.org/files/data/Koch_Data )

Recently I've been looking at some of this data and making links to it (in my private wiki, unfortunately). I have been delighted by how easy it is to link to non-wiki files by just putting in the URL. For example, here are some links I put in my notebook today:
 * file 0069 in EcoRI Close Site Popping (Word file)
 * http://kochlab.org/files/data/Koch_Data/Popping%20Paper%20I/EcoRI%20Close%20Site%20Popping/020127/Converted/0068%20Seg0003-465nm-020325%2017mer%20.dat (Data file)
 * http://kochlab.org/files/data/Koch_Data/Popping_VC_BsoBI/021112/Converted/0110%20Seg0008-021112%20BsoBI%2010000%20Linear.dat (Data file)

Notice there is one link to a Word file and two to data files. I linked the Word file via a renamed link (because it can open easily in Word), whereas for the data files I just copied the whole URL (because more than likely I would be cutting and pasting the address from the wiki page). The cool thing to me is that linking to any file type is easy and possible.
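The `%20` sequences in those URLs are just percent-encoded spaces in the folder and file names. A minimal sketch of going from a path in the shared folder to a URL and back, using Python's standard `urllib.parse`, is shown here. The helper names are hypothetical; the base URL and example path are taken from the links above.

```python
from urllib.parse import quote, unquote

BASE_URL = "http://kochlab.org/files/data/Koch_Data"


def path_to_url(base_url, relative_path):
    """Percent-encode a path inside the shared folder and append it to base_url."""
    relative_path = relative_path.replace("\\", "/")  # accept Windows separators
    return base_url.rstrip("/") + "/" + quote(relative_path)


def url_to_path(base_url, url):
    """Recover the readable relative path from a URL under base_url."""
    prefix = base_url.rstrip("/") + "/"
    if not url.startswith(prefix):
        raise ValueError("URL is not under the given base URL")
    return unquote(url[len(prefix):])


def wiki_link(url, label=None):
    """Bare URL, or a renamed MediaWiki external link when a label is given."""
    return f"[{url} {label}]" if label else url


path = "Popping_VC_BsoBI/021112/Converted/0110 Seg0008-021112 BsoBI 10000 Linear.dat"
url = path_to_url(BASE_URL, path)
print(wiki_link(url))               # bare URL, as used for the data files
print(wiki_link(url, "Data file"))  # renamed link, as used for the Word file
```

The decoding direction is what matters when cutting and pasting an address from the wiki page back into a file browser.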

What does this mean? It means that if you know where a file is on a public file system, you can easily link to it in a MediaWiki lab notebook via just the URL. What am I hoping? I'll try to explain:
 * For all of these, it would be cool if OWW could provide a networked folder for directly saving from Windows (and other) filesystems
 * Images
 * Images are commonly desired in lab notebooks. The process of saving them to the local hard drive and uploading to OWW, while straightforward, is time consuming and a significant barrier. It would save a lot of time if users could directly save to OWW via a networked folder and then link via a URL.
 * Data
 * At least in our lab, lots of data sets are generated as individual files. Uploading these individually would be too difficult. We intend to directly write them to our public server (described above). But if there were some tie-in with OWW to ensure longevity, that would be even better. The disk space for us is not a problem, but the ability to promise longevity is an issue.
 * Many instruments in labs are set up to dump data (gel images, spec readings, etc.) to a directory. Why not an OWW mapped folder?
 * It would be nice for publications to be able to upload an entire set of raw data, converted data, LabVIEW VIs, etc. that were used for the publication.
 * Archived information
 * My grad school data above is just one subset of the information I have buried on my hard drive. I also have a whole hierarchy of Word documents and other files that I could put in public easily via a file server like the one Caleb set up. We've discussed on OWW the possibility of mass converting Word documents etc. to wiki format, but is this really necessary for information that is useful but 8 years old?

I don't know the technical details or feasibility. What I am imagining is OWW users being able to map a network folder (e.g. an "O:\" drive) using their openwetware username (or related) and password. Users would be able to create but not change or delete data.
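To make the "create but not change or delete" idea concrete, here is a small client-side sketch in Python. It is only an illustration of the intended behavior: a real service would have to enforce this on the server through permissions, and the function name `create_only` is hypothetical.

```python
import os


def create_only(path, data):
    """Write data (bytes) to a new file at path; refuse to touch an existing one.

    Illustrative only: a real shared folder would enforce this rule on the
    server side, not in client code.
    """
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    # Mode "x" is exclusive creation: open() raises FileExistsError
    # if the file is already there, so existing data is never overwritten.
    with open(path, "xb") as handle:
        handle.write(data)
```

An instrument script that saves each run under a new file name would work unchanged with this rule, while an accidental overwrite of earlier data would fail with an error instead of silently replacing it.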