Schumer lab: Depositing data on Oak

From OpenWetWare

Introduction

Oak is a large storage resource where we keep large raw and processed data files. You can link to files on Oak from your local directory on Sherlock.

For example, to link to the following file on Oak, navigate to your local directory and type:

ln -s /oak/stanford/groups/schumer/data/Xcortezi_whole_genome_data_sc_project_HUIC_sc_and_wt/HUICXI17JM06wt_read_1_allcombined.fastq.gz ./
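Because Oak paths are long, a typo can silently create a dangling link. The wrapper below is a minimal sketch that checks the source exists before linking. The function name link_from_oak is hypothetical and not part of any lab tooling, and it assumes you pass an absolute path.

```shell
# Hypothetical helper (not an existing lab script): link one file from Oak
# into the current directory, refusing to create a dangling link.
# Assumes "$1" is an absolute path such as /oak/stanford/groups/schumer/data/...
link_from_oak() {
    src="$1"
    if [ ! -e "$src" ]; then
        echo "No such file: $src" >&2
        return 1
    fi
    # Same command as above: the link takes the file's own name.
    ln -s "$src" ./
}

# Usage, with the path from the example above:
# link_from_oak /oak/stanford/groups/schumer/data/Xcortezi_whole_genome_data_sc_project_HUIC_sc_and_wt/HUICXI17JM06wt_read_1_allcombined.fastq.gz
# ls -l HUICXI17JM06wt_read_1_allcombined.fastq.gz   # shows where the link points
```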


Examples of file types that should be deposited on Oak:

  • fastq.gz files
  • associated i5 and i7 files for the fastq.gz files
  • sam/bam files
  • Ancestry tsv files from completed ancestryHMM runs. Make sure to use an informative name and copy a matching cfg file.

Moving data to or from our lab directory on Oak

Important!!! Make sure to create a new data folder for each deposited dataset, and give your raw data folder an informative name.

For example:

Xbirchmanni_10Xchromium_Hudsonalpha_July2018_raw_data

This name gives the species, the technology used for library prep, the company that did the sequencing, and the date of sequencing.
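The naming pattern can be sketched as a small shell function. The function name make_dataset_dirname and its argument order are hypothetical. Only the four parts and the _raw_data suffix come from the example above.

```shell
# Hypothetical helper: build a dataset folder name from its four parts
# (species, library prep technology, sequencing company, sequencing date).
make_dataset_dirname() {
    species="$1"
    technology="$2"
    company="$3"
    date="$4"
    printf '%s_%s_%s_%s_raw_data\n' "$species" "$technology" "$company" "$date"
}

# Usage (run on Sherlock, where /oak is mounted):
# mkdir "/oak/stanford/groups/schumer/data/$(make_dataset_dirname Xbirchmanni 10Xchromium Hudsonalpha July2018)"
```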

  • Note: swordtail Tn5 data is stored in the following subdirectory:

/oak/stanford/groups/schumer/data/All_swordtail_low_coverage_Tn5_data


There are two ways to move data to our lab directory on Oak:


1) Using scp:

To move files from your local machine to Oak through Sherlock (replace "user" with your Sherlock user name):

scp myfile user@login.sherlock.stanford.edu:/oak/stanford/groups/schumer/data/mydirectory

To move files from Sherlock to a local directory:

scp user@login.sherlock.stanford.edu:/oak/stanford/groups/schumer/data/mydirectory/myfile ./
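To avoid retyping the long destination, the remote target can be assembled by a small function. This is only a sketch: the name oak_target and the two variables are hypothetical, and the host and path are copied from the scp examples above.

```shell
# Values copied from the scp examples above.
SHERLOCK_HOST="login.sherlock.stanford.edu"
OAK_DATA_DIR="/oak/stanford/groups/schumer/data"

# Hypothetical helper: print the scp target for a dataset folder on Oak.
oak_target() {
    user="$1"
    dataset_dir="$2"
    printf '%s@%s:%s/%s\n' "$user" "$SHERLOCK_HOST" "$OAK_DATA_DIR" "$dataset_dir"
}

# Usage: copy a single file into the dataset folder
# scp myfile "$(oak_target user mydirectory)"
# Usage: copy a whole local folder (scp -r copies directories recursively)
# scp -r my_local_folder "$(oak_target user mydirectory)"
```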


2) Using Globus:

Sign up for a Globus account.

Oak is linked to Globus, so you can browse and transfer files through the Globus interface.

Documenting data you have uploaded

Important: whenever you deposit data, record the path to its location in the lab inventory on Box!!!: Schumer_lab_Inventory_Log.xlsx

If this is a library produced by our lab, click on the "Libraries_and_data_paths" tab and fill in the path on Oak.


If this library was produced in another lab (or predates our inventory system), click on the "Paths_to_other_sequence_data" tab and fill in the sample description and path on Oak.


If you're depositing fastq files, make sure to deposit the appropriate i5 and i7 files in the same directory!

For example, if your fastq files are named:

COACVI2018_CHAFV2018_ACUAV2018_ACUAVI2015_Tn5_S0_I1_alllanes_combined.fastq.gz

COACVI2018_CHAFV2018_ACUAV2018_ACUAVI2015_Tn5_S0_I2_alllanes_combined.fastq.gz

COACVI2018_CHAFV2018_ACUAV2018_ACUAVI2015_Tn5_S0_R1_alllanes_combined.fastq.gz

COACVI2018_CHAFV2018_ACUAV2018_ACUAVI2015_Tn5_S0_R2_alllanes_combined.fastq.gz


and are the result of two plates of Tn5 prep, put a single i5 file and two i7 files with informative names in the same directory. For example:

i5_library_COACVI2018_CHAFV2018_ACUAV2018_ACUAVI2015

20180626_Tn5_library_COAC_VI_2018_i7_barcodes.txt

20180718_Tn5_library_ACUA_VI_2015_CHAF_XI_2017_COAC_XI_2017_i7_barcodes.txt
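Before logging a deposit, a quick check that the barcode files sit next to the fastq files can catch omissions. The sketch below assumes, as in the example names above, that barcode file names contain the lowercase strings "i5" and "i7". The function name check_barcode_files is hypothetical.

```shell
# Hypothetical check: count files whose names contain "i5" or "i7" in a
# dataset directory, and succeed only if at least one of each is present.
# Assumes barcode files are named as in the examples above.
check_barcode_files() {
    dir="$1"
    n_i5=$(find "$dir" -maxdepth 1 -type f -name '*i5*' | wc -l | tr -d ' ')
    n_i7=$(find "$dir" -maxdepth 1 -type f -name '*i7*' | wc -l | tr -d ' ')
    echo "i5 files: $n_i5, i7 files: $n_i7"
    [ "$n_i5" -ge 1 ] && [ "$n_i7" -ge 1 ]
}

# Usage (mydirectory stands for your dataset folder):
# check_barcode_files /oak/stanford/groups/schumer/data/mydirectory || echo "Barcode files missing"
```

For a library made from two plates, the counts reported should be one i5 file and two i7 files.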