Difference between revisions of "PGP and Tranche"
Revision as of 14:24, 17 December 2008
- User:Andrea Loehr (PGP)
- User:Alexander Wait Zaranek (PGP)
- User:James A. Hill (Tranche)
- User:Bryan E. Smith (Tranche)
To increase the utility of project data and make more of it available to the public, the Personal Genome Project (PGP) has launched PersonalGenomes@Home. This effort uses Tranche for persistent storage. The Tranche Project is a free, open-source file-sharing tool that lets collections of computers share and cite scientific data sets. It was designed and built with scientists and researchers in mind, and aims to make data sharing secure and scalable.
Tranche User Account
To apply for a user account, fill out the form at Tranche User Account Application. Pending account applications are reviewed weekly on Mondays.
Tranche requires Java Runtime Environment 5.0 or later; see System Requirements.
Tranche User Guide and Instructions for Up- and Downloads
A detailed user guide is available: see the Tranche User Guide.
There are three ways to add data to, or get data from, the network:
- GUI: Go to the Tranche homepage and click "Launch Tranche". (Requires Java 5+ with Web Start)
- Command-line tools: See below
- Java API: For custom tools development
The most popular of the three is the GUI, as it is the easiest to use. The command-line tools are useful for automating tasks or working in headless environments, and the API is useful when integrating Tranche into a software project or creating a custom tool.
wget http://tranche.proteomecommons.org/files/CommandLineAddFileTool.zip
wget http://tranche.proteomecommons.org/files/CommandLineGetFileTool.zip
In order to use these tools you also need a certificate, which you can get instantly at Tranche Autocert. It comes in the form of USER.zip.encrypted.
Download each tool, unzip the file, change into the unzipped directory, and type java -jar NAME.jar --help to obtain usage information. (If java is not in your system path, add it to your path or type the full path: /path/to/java -jar NAME.jar --help.)
For usage information: java -jar Tranche-Downloader.jar --help
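The path check described above can be scripted. The following is a minimal sketch, not part of the Tranche distribution: the function names find_java and show_tool_help are made up for illustration, and the fallback to JAVA_HOME assumes that variable is set, which is a common convention but not guaranteed.

```shell
# Locate a Java launcher: prefer the one on PATH, otherwise fall back to
# $JAVA_HOME/bin/java (assumption: JAVA_HOME points at a JRE install).
find_java() {
    if command -v java >/dev/null 2>&1; then
        command -v java
    elif [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        printf '%s\n' "$JAVA_HOME/bin/java"
    else
        echo "java not found; install JRE 5.0 or later" >&2
        return 1
    fi
}

# Print usage information for an unzipped tool.
# Hypothetical helper; call it as: show_tool_help NAME.jar
show_tool_help() {
    java_bin=$(find_java) || return 1
    "$java_bin" -jar "$1" --help
}
```

Run show_tool_help from inside the unzipped tool directory, passing the name of the jar file found there.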
Download a project with a certain hash: java -jar Tranche-Downloader.jar HASH
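When several projects need to be fetched, the single-hash command can be wrapped in a loop. This is a sketch under stated assumptions: the function download_all, the hash-list file format (one hash per line, with blank lines and lines starting with # ignored) and the JAVA override variable are all hypothetical conveniences, not Tranche features. It assumes Tranche-Downloader.jar is in the current directory.

```shell
# Download every project listed in a file of Tranche hashes, one per line.
# JAVA is a hypothetical override for this sketch (defaults to java on PATH).
download_all() {
    hash_file=$1
    failed=0
    while IFS= read -r tranche_hash || [ -n "$tranche_hash" ]; do
        case $tranche_hash in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # </dev/null keeps the tool from consuming the rest of the hash list.
        if ! "${JAVA:-java}" -jar Tranche-Downloader.jar "$tranche_hash" </dev/null; then
            echo "download failed: $tranche_hash" >&2
            failed=$((failed + 1))
        fi
    done < "$hash_file"
    # Succeed only if every download succeeded.
    [ "$failed" -eq 0 ]
}
```

The function keeps going after a failed download and reports the failures on standard error, which tends to suit unattended runs in headless environments.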
For usage information: java -jar Tranche-Uploader.jar --help
Upload a file:
java -Xmx512m -jar Tranche-Uploader.jar -u USER.zip.encrypted -p PASSWORD -c true -t "MY TITLE" -d "MY DESCRIPTION" /home/DataForUpload
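For scripted uploads it can help to keep the certificate password out of the script and out of the shell history. The sketch below uses only the flags shown in the upload command above; the function upload_dataset and the TRANCHE_PASSWORD and JAVA variables are hypothetical names introduced for this example. The password is still passed to the tool with -p, so it may remain visible in the process list while the upload runs.

```shell
# Upload a file or directory with the flags from the upload example.
# TRANCHE_PASSWORD and JAVA are hypothetical variables used by this sketch
# only; they are not part of Tranche.
upload_dataset() {
    cert=$1
    title=$2
    description=$3
    data_path=$4
    password=${TRANCHE_PASSWORD:-}
    if [ -z "$password" ]; then
        # Prompt without echoing the typed password to the terminal.
        printf 'Certificate password: ' >&2
        stty -echo
        IFS= read -r password
        stty echo
        printf '\n' >&2
    fi
    # -Xmx512m sets the JVM heap limit; adjust as needed.
    "${JAVA:-java}" -Xmx512m -jar Tranche-Uploader.jar \
        -u "$cert" -p "$password" -c true \
        -t "$title" -d "$description" "$data_path"
}
```

A call would look like: upload_dataset USER.zip.encrypted "MY TITLE" "MY DESCRIPTION" /home/DataForUpload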
There is the option to download/upload encrypted data:
java -jar Tranche-Downloader.jar -e supersecret HASH
java -jar Tranche-Uploader.jar -u FILE.zip.encrypted -p supersecret /home/DataForUpload
To be notified about changes and upgrades, join the automated tool group for the command-line tools and the API.
Transferring Data onto Tranche
For the initial data transfer, we could ship (two?) USB drives to the BPF:
Attn: Andrew Gagne
Biopolymers Facility
77 Ave. Louis Pasteur
Room 0088
Boston, MA 02115
- PGP2 - FC37_2 - http://genomerator.freelogy.org/~awz/pgp2-FC_00037_L002/ Note: Other data sets could appear on the hard disk(s) with this directory structure. On arrival, the data could be loaded into Tranche as 100 data "bundles" per data set (i.e., per Illumina lane).
- PGP1 - FC37_3
- PGP3 - FC35_3
- PGP5 - FC44_2
- PGP7 - FC44_4
- PGP8 - FC37_1,FC51_2,FC51_6
- PGP9 - FC43_3,FC51_3,FC51_7
- PGP10 - FC41_3
Also, could use:
- CONTROL - FC35, FC37, FC41, FC43, FC44, FC51.
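For all of the above there is a top-level directory (e.g., pgp2-FC_00037_L002) and exactly 36 directories below it. Each of those directories contains 4x100 files, so a data set holds 36 x 400 = 14,400 files. For this release, it would be ideal if the data were organized in Tranche as 18x100 "randomly addressable" bundles (presumably the 12 PGP lanes plus the 6 control flow cells listed above, with 100 bundles each) that a volunteer computer could request as desired. Each addressable "bundle" of data would then contain 4x36 = 144 files, and 100 such bundles account for all 14,400 files in a data set.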
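The bundle layout described above can be sketched as a small shell function that lists the files belonging to one bundle. The mapping from files to bundles is an assumption made for illustration: the sketch supposes that, within each of the 36 directories, bundle N (0 to 99) consists of the files at sorted positions 4N+1 through 4N+4. The actual file naming scheme is not stated here and may call for a different rule. The function name bundle_files is hypothetical.

```shell
# List the files that make up one bundle of a data-set (lane) directory.
# ASSUMPTION: within each subdirectory, the 4 files of bundle N are those at
# sorted positions 4N+1..4N+4; the real naming scheme may differ.
bundle_files() {
    lane_dir=$1
    index=$2
    first=$((index * 4 + 1))
    last=$((index * 4 + 4))
    for sub in "$lane_dir"/*/; do
        [ -d "$sub" ] || continue
        position=0
        # Glob results are sorted, so position tracks the sorted order.
        for file in "$sub"*; do
            [ -f "$file" ] || continue
            position=$((position + 1))
            if [ "$position" -ge "$first" ] && [ "$position" -le "$last" ]; then
                printf '%s\n' "$file"
            fi
        done
    done
}
```

With the layout described above, bundle_files pgp2-FC_00037_L002 0 would print 144 paths, four from each of the 36 directories, which could then be handed to the upload tool as one addressable unit.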