Schumer lab: Depositing data on Oak

From OpenWetWare


Oak is a large storage resource where we keep large raw and processed data files. You can link to files on Oak from your local directory on Sherlock.

For example, if you wanted to link to this file on Oak, you could navigate to your local directory and type:

ln -s /oak/stanford/groups/schumer/data/High_coverage_whole_genome_quail_data/Xcortezi_whole_genome_data_sc_project_HUIC_sc_and_wt/HUICXI17JM06wt_read_1_allcombined.fastq.gz ./
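Before relying on a link, it can help to confirm that it resolves to a real file. The sketch below uses a scratch directory and a stand-in file rather than real Oak paths; `workdir`, `target`, and the file name are hypothetical and only serve the illustration.

```shell
# Stand-in for a file on Oak (hypothetical; substitute a real Oak path).
workdir=$(mktemp -d)
target="$workdir/oak_standin/sample_read_1.fastq.gz"
mkdir -p "$(dirname "$target")"
: > "$target"

# Create the link inside a "local" directory, as with the ln -s command above.
mkdir -p "$workdir/local"
ln -s "$target" "$workdir/local/"

# [ -e ] follows the link, so the test fails for a dangling link.
link="$workdir/local/sample_read_1.fastq.gz"
if [ -e "$link" ]; then
    echo "link OK: $(readlink "$link")"
else
    echo "dangling link: $link" >&2
fi
```

If the check reports a dangling link, the most likely cause is a typo in the Oak path or a file that has since been moved.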

Examples of file types that should be deposited on Oak:

  • fastq.gz files
  • associated metadata files for the fastq.gz files
  • sam/bam files
  • Ancestry tsv files from completed ancestryHMM runs. Make sure to use an informative name and copy a matching cfg file.
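
One way to keep an ancestry tsv file and its cfg file together is to copy both under a shared, descriptive prefix. This is only a sketch: the helper name `deposit_ancestry_run`, the prefix, the file extensions, and the scratch directories are hypothetical, and the real destination would be the lab's ancestry folder on Oak.

```shell
# Hypothetical helper: copy a tsv and its cfg to one destination under a shared prefix.
deposit_ancestry_run() {
    tsv=$1; cfg=$2; dest=$3; prefix=$4
    # Refuse to deposit a tsv without its configuration file.
    if [ ! -f "$tsv" ] || [ ! -f "$cfg" ]; then
        echo "missing tsv or cfg file" >&2
        return 1
    fi
    mkdir -p "$dest"
    cp "$tsv" "$dest/$prefix.tsv"
    cp "$cfg" "$dest/$prefix.cfg"
}

# Demonstration with stand-in files in a scratch directory.
scratch=$(mktemp -d)
echo "placeholder ancestry output" > "$scratch/run.tsv"
echo "placeholder configuration" > "$scratch/run.cfg"
deposit_ancestry_run "$scratch/run.tsv" "$scratch/run.cfg" "$scratch/dest" "example_population_2024"
ls "$scratch/dest"
```

Using the same prefix for both files means anyone who finds the tsv can locate the settings that produced it.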

Ancestry tsv files deposited on Oak can be found here:


sam/bam files and vcf files will be in different subfolders of Processed_files depending on the reference they were mapped to. For example:


Current organization system on Oak

Oak is currently organized by data type, with separate folders for raw and processed data.

We are in the process of documenting all the data available on Oak here:

Raw data:









Processed data:



Processed_files contains resources for anyone in the lab to use; lab_member_folders contains personal backups.

Moving data to or from our lab directory on Oak

Important! Create a new data folder for each deposited dataset, and give your raw data folder an informative name.

For example:


This name gives the species, the technology used for library prep, the company that did the sequencing, and the date of sequencing.
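
As an illustration of this naming scheme, a folder name can be assembled from those four fields. The values, their order, and the underscore separator below are all hypothetical placeholders, not the lab's actual convention.

```shell
# All four values are made-up placeholders for illustration only.
species="Xexample"
library_prep="Tn5"
sequencing_company="ExampleSeqCo"
sequencing_date="Jan_2024"

folder_name="${species}_${library_prep}_${sequencing_company}_${sequencing_date}"

# Create the new dataset folder in a scratch location
# (substitute the appropriate raw data directory on Oak).
parent=$(mktemp -d)
mkdir "$parent/$folder_name"
echo "$folder_name"
```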

  • Note: swordtail Tn5 data is stored in the following subdirectory:


There are two ways to move data to our lab directory on Oak:

1) Using scp:

To move files to Sherlock (replacing with your user name):

scp myfile

To move files from Sherlock to a local directory:

scp ./
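
The general shape of these two commands is sketched below. The user name `your_sunetid`, the host `login.sherlock.stanford.edu`, and the destination folder are assumptions for illustration, so check them against the lab's instructions before use. The commands are printed rather than run, so the sketch is safe to try as-is.

```shell
# Assumed values (not taken from this page): replace with your own.
user="your_sunetid"
host="login.sherlock.stanford.edu"
# The group data path comes from the ln -s example above; the subfolder is hypothetical.
remote_dir="/oak/stanford/groups/schumer/data/my_new_dataset_folder"

# Local machine -> Oak (through Sherlock).
upload_cmd="scp myfile ${user}@${host}:${remote_dir}/"
# Oak (through Sherlock) -> current local directory.
download_cmd="scp ${user}@${host}:${remote_dir}/myfile ./"

echo "$upload_cmd"
echo "$download_cmd"
```

Both commands are run from your local machine; only the order of source and destination changes.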

2) Using Globus

Sign up for a Globus account.

Oak is linked to Globus, so you can navigate and download files through the Globus interface.

Important: Documenting data you have uploaded

If you are the one downloading new data to Oak, place it in the appropriate directory with the full library name, data type, and month and year of sequencing. For example:




Important! Please update the Google spreadsheet so that others can easily find the data.

If you're depositing fastq files, make sure to deposit the appropriate metadata files in the same directory!

For example, if your fastq files are named:





and are the result of two plates of Tn5 prep, put a single i5 file and two i7 files with informative names in the same directory. e.g.: