Schumer lab: Lab group on Sherlock
Please remember to clean up files you no longer need, and keep most of your temporary files on /scratch! It is really easy to forget about this and use up all of our space, so make a habit of routinely removing temporary files.
Lab organization on Sherlock
Useful basic info
To see a copy of a presentation given by Cheyenne and Quinn in November 2021:
Dropbox/Schumer_lab_resources/Shared_lab_resources/Common_commands_and_pipelines/sherlock-slurm-oak_intro.pdf
To see a recording of the Zoom meeting led by Cheyenne and Quinn:
https://drive.google.com/file/d/19FHu2Rm1p19ibrmwiBdbmE5cO3dHaYf7/view?usp=drive_link
To see a copy of the onboarding presentation from SRCC in October 2021:
Dropbox/Schumer_lab_resources/Shared_lab_resources/Common_commands_and_pipelines/sherlock_onboarding-10-2021.pdf
Basics of logging in
!!! NOTE: These directions are for Mac users specifically !!!
Open Terminal to gain access to Sherlock directly from your computer. To log in, use the following command:
ssh [your_SUnetID]@login.sherlock.stanford.edu
You'll be asked for your password, and then asked how you want to do two-factor authentication:
Enter a passcode or select one of the following options:
1. Duo Push to XXX-XXX-[####]
2. Phone call to XXX-XXX-[####]
3. SMS passcodes to XXX-XXX-[####]
Passcode or option (1-3):
Enter the number for your option and confirm your login.
This will start you in your home directory
$HOME
which is the equivalent path to:
/home/users/[your_SUnetID]
Overall organization
Files and folders containing active scripts or small data files for analyses can be kept in:
$GROUP_HOME
which is the equivalent path to:
/home/groups/schumer
This directory does not have much space (1 TB shared) so should not be used for big files of any kind. A better place for that is:
$GROUP_SCRATCH
which is the equivalent path to:
/scratch/groups/schumer/
$GROUP_SCRATCH files are automatically deleted after being inactive for 90+ days, so make sure to back up to $GROUP_HOME or $OAK if you need files or scripts long-term.
Long-term data storage and large files should be kept in our lab directory:
$OAK
which is the equivalent path to:
/oak/stanford/groups/schumer
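Because $GROUP_SCRATCH files are deleted after 90+ days of inactivity, it can help to check which of your files are getting close to that limit. A minimal sketch, assuming GNU `find` and assuming that a file's modification time is a reasonable proxy for "inactive"; the function name `list_stale_files` and the 60-day default are illustrative, not lab conventions:

```shell
# list_stale_files DIR [DAYS]
# Print regular files under DIR that were last modified more than
# DAYS days ago (default 60). Hypothetical helper, not a Sherlock tool.
list_stale_files() {
    local dir="$1"
    local days="${2:-60}"
    find "$dir" -type f -mtime +"$days" -print
}

# Example (on Sherlock):
# list_stale_files "$GROUP_SCRATCH/[your_name]" 60
```

Anything listed that you still need can then be copied to $GROUP_HOME or $OAK before it is removed.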
Some useful basic Linux commands
pwd
"print working directory" provides the path to the current directory you're in
cd [directory_path]
"change directory" allows you to move to different directories. Some useful commands using `cd` include:
cd $GROUP_SCRATCH
to navigate to the lab's group scratch directory /scratch/groups/schumer/
It would be equivalent to use the following command:
cd /scratch/groups/schumer/
ls -or- ls [directory_path]
"list" allows you to see all of the files and directories in the directory you're in OR a directory of your choice
less [file_name]
"less" allows you to view a particular file, use `q` to leave the view window
mkdir [name_of_new_directory]
"make directory" allows you to create a new directory with a name of your choice. This will create the directory IN the directory you're currently in, unless you include a path.
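du -sh
"disk usage" estimates how much storage files and directories take up. With `-s` it prints a single total for the current directory (or a path you give it), and `-h` prints sizes in human-readable units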
exit
"exit" allows you to leave your Sherlock session
Some Sherlock specific commands that are useful include:
squeue -u [SUNetID]
"Slurm queue" allows you to see the jobs you currently have queued or running on Sherlock. (`-u` stands for "user" and you can actually run this command with any SUNetID, so e.g., if you need to check if an undergrad's script is running you can do that too!)
sh_quota
"Sherlock quota" allows you to see how much storage space you and the group are using, and how much is left, on each file system.
Making your directory
When you log into Sherlock for the first time, a good first step is to create a directory where you can perform analyses and store files.
Navigate to $GROUP_SCRATCH using `cd`:
cd $GROUP_SCRATCH
If you use `ls` you can list out the files that are present in this directory and you will see that many current (and former) lab members have their directories there!
ls
Next, make a directory for yourself!
mkdir [your_name]
Using `ls` you should now see your new directory.
ls
You can move into your new directory using `cd`:
cd [your_name]
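The steps above can be condensed into one command. A sketch in which `make_workdir` is a hypothetical helper name; `mkdir -p` does not complain if the directory already exists, so it is safe to run more than once:

```shell
# make_workdir NAME [BASE]
# Create BASE/NAME if it does not exist yet, then move into it.
# BASE defaults to $GROUP_SCRATCH, which is set for you on Sherlock.
make_workdir() {
    local name="$1"
    local base="${2:-${GROUP_SCRATCH:-}}"
    if [ -z "$name" ] || [ -z "$base" ]; then
        echo "usage: make_workdir NAME [BASE]" >&2
        return 1
    fi
    mkdir -p "$base/$name" && cd "$base/$name"
}

# Example (on Sherlock):
# make_workdir [your_name]
```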
Where to find specific files or programs
Genome assemblies, annotation files, recombination maps and other shared resources can be found here:
/home/groups/schumer/shared_bin/shared_resources
Shared scripts and the lab git repository are here:
/home/groups/schumer/shared_bin/Lab_shared_scripts
Commonly used programs in the lab that are not available via module loading can be found here:
/home/groups/schumer/shared_bin
To make these programs globally available, you can export the path to this folder:
export PATH="/home/groups/schumer/shared_bin:$PATH"
To avoid doing this every time you log in to Sherlock, you can edit your .bashrc file, which can be found here:
/home/users/$USER/.bashrc
and add the export command below the line that says:
# User specific aliases and functions
export PATH="/home/groups/schumer/shared_bin:$PATH"
In addition to the above command you might also want to automatically load certain modules, e.g.:
module load perl
module load R
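If you would rather add these lines from the command line than open an editor, the sketch below appends a line only when it is not already present, so running it twice does not duplicate the entry. `add_line_once` is a hypothetical helper name, and note that it appends to the end of the file rather than directly below the comment line mentioned above:

```shell
# add_line_once LINE FILE
# Append LINE to FILE unless an identical line is already there.
add_line_once() {
    local line="$1"
    local file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (single quotes keep $PATH from being expanded too early):
# add_line_once 'export PATH="/home/groups/schumer/shared_bin:$PATH"' "$HOME/.bashrc"
# add_line_once 'module load perl' "$HOME/.bashrc"
```

Changes to .bashrc take effect the next time you log in, or immediately if you run `source ~/.bashrc`.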
Getting started
Make your own personal folder inside our lab group folder here:
/home/groups/schumer/lab_member_folders
Storing data and data backup
Our lab raw data repository and repository for large files is on Oak, and can be accessed in three ways:
1) cd /oak/stanford/groups/schumer/data
2) cd $OAK/data
3) I've put a link to this folder in our lab Sherlock directory: cd $GROUP_HOME/data
- NOTE: Oak is not regularly backed up and cannot be seen as a permanent data repository. Other options include external hard drives and NCBI SRA. It is possible to set the release date on the SRA so that release occurs after publication if necessary; this can be edited at any time.
- For directions on how to deposit on the SRA, see this file in Dropbox: depositing_data_to_NCBIs_SRA.txt
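When copying a directory to Oak (or to an external drive), it is worth confirming that the copy is complete before deleting the original. A minimal sketch using only `cp` and `diff`; `copy_and_verify` is a hypothetical helper name, and for very large datasets a tool such as `rsync` may be a better fit:

```shell
# copy_and_verify SRC_DIR DEST_DIR
# Copy the contents of SRC_DIR into DEST_DIR (preserving timestamps and
# permissions), then compare the two directory trees.
copy_and_verify() {
    local src="$1"
    local dest="$2"
    mkdir -p "$dest" || return 1
    cp -a "$src"/. "$dest"/ || return 1
    # diff -r exits non-zero if the trees differ in any way,
    # including extra files that were already present in DEST_DIR
    diff -r "$src" "$dest" >/dev/null
}

# Example with placeholder paths:
# copy_and_verify "$GROUP_SCRATCH/[your_name]/[results_dir]" "$OAK/[destination_path]" \
#     && echo "copy verified"
```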
Interactive jobs
Before running any programs on Sherlock, remember to request an interactive job rather than running them on the login node.
You can use our dedicated nodes:
srun --pty -p schumer -t 0-2 bash
or request general nodes:
srun --pty -t 0-2 bash
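In both commands, `-t 0-2` sets a time limit of 0 days and 2 hours, and `-p` selects the partition.
You can also submit batch jobs; this is covered elsewhere (see "Submitting slurm jobs").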
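A quick way to tell whether your current shell is inside a job or still on a login node is to check for the `SLURM_JOB_ID` environment variable, which Slurm sets inside jobs. A sketch (`in_slurm_job` is a hypothetical helper name):

```shell
# in_slurm_job
# Succeeds (exit status 0) when run inside a Slurm job, fails otherwise.
in_slurm_job() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Inside Slurm job $SLURM_JOB_ID"
else
    echo "Not inside a Slurm job; request an interactive job before running programs"
fi
```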
Common programs/dependencies and how to load them
Many programs on Sherlock are available as modules. To see the modules that are available, type:
module avail
Lots of biology-specific modules can only be loaded after loading the general biology module:
module load biology
Any module can be loaded by typing:
module load [module_name]
after which that module's commands will be available for the rest of your session.
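The full `module avail` listing is long, so it is often easier to filter it. A sketch assuming the `module` command is defined in your shell (it is on Sherlock) and that it prints its listing to standard error, as module systems typically do; `find_module` is a hypothetical helper name:

```shell
# find_module PATTERN
# Case-insensitive search of the `module avail` listing.
find_module() {
    local pattern="$1"
    # 2>&1 merges standard error into standard output so grep sees the listing
    module avail 2>&1 | grep -i -- "$pattern"
}

# Example:
# find_module samtools
```

If the module system is Lmod, `module spider [module_name]` is another option, and it also reports modules that only become visible after loading something else first.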
R
You will probably want to run R interactively on Sherlock at some point, to do quick analyses of your data or to run analyses and generate figures on a dataset too large to load quickly on your own computer.
To access R, first load the module:
module load R
To start R from the command line, simply type R and press Enter.
To quit, type: q()
perl
Perl is globally available on Sherlock, but to install packages you need to load the perl module to use cpan:
module load perl
Many lab scripts depend on the following perl modules:
Math::Random
List::Util
List::MoreUtils
You can install perl modules by running the following from your terminal:
cpan MODULE
for example:
cpan Math::Random