Wayne:High Throughput Sequencing Resources: Difference between revisions

From OpenWetWare
<div align="right">[http://openwetware.org/wiki/Wayne_Lab Wayne Lab Home]</div>
<div align="right">[http://openwetware.org/wiki/Wayne:Laboratory_Protocols Laboratory Protocols] </div>


== Basic server commands (for Sirius) ==

Revision as of 18:15, 15 February 2013

Basic server commands (for Sirius)

Here is a list of commonly used Linux commands:

Command                           Usage
pwd                               Print working directory (your current location)
ls                                List the contents of the current directory
ls [options]                      ls -a (include hidden files), ls -l (long/detailed list), ls -t (sort by time modified instead of name)
cd /give/path                     Change to the given directory
cd ..                             Go up one directory
mkdir directoryName               Make a new directory
rmdir directoryName               Remove a directory (must be empty). Remember that you cannot undo this!
rm -r directoryName               Recursively remove a directory and the files it contains. Remember that you cannot undo this!
rm filename                       Remove the specified file. Remember that you cannot undo this!
head filename                     Print to screen the first 10 lines of the specified file
tail filename                     Print to screen the last 10 lines of the specified file
more filename                     Send file contents or piped output to the screen one page at a time
less filename                     Like more, but also allows scrolling backward and searching
wc filename                       Print line, word, and byte counts
wc [options] filename             -c (bytes); -l (lines); -w (words, delimited by whitespace or newline)
whereis command                   Locate the binary, source, and manual page files for a command
mv                                Move (akin to cut/paste): the file no longer exists in its original location; also used to rename files. Usage: mv current/path/filename destination/path/filename
cp                                Copy: the original stays in its current path. Usage: cp current/path/filename destination/path/filename
nohup command &                   Start a no-hangup background job that keeps running after you log out
screen                            Start a new screen session, which keeps running after you disconnect and can be reattached later
tar -xzf filename.tar.gz          Decompress and extract a tar.gz archive
gzip -c filename > filename.gz    Compress a single file to gzip format (not tar.gz) and keep the original; the ">" redirects the output to filename.gz
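The commands in the table above can be tried out safely in a throwaway directory. This is a minimal sketch rather than a lab procedure: the directory and file names (reads, sample.txt, extracted) are made up for illustration, and tar -czf (create an archive) is used here only so that there is something to extract.

```shell
# Work in a temporary directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                                          # show where we are

mkdir reads                                  # make a new directory
printf 'line1\nline2\nline3\n' > reads/sample.txt   # hypothetical example file

ls -l reads                                  # long/detailed listing
head -n 2 reads/sample.txt                   # first 2 lines
line_count=$(wc -l < reads/sample.txt)       # number of lines in the file
echo "sample.txt has $line_count lines"

cp reads/sample.txt reads/copy.txt           # copy: the original stays in place
mv reads/copy.txt reads/renamed.txt          # move within a directory = rename

gzip -c reads/sample.txt > reads/sample.txt.gz   # compress one file, keep the original
tar -czf reads.tar.gz reads                  # bundle and compress the whole directory
mkdir extracted
tar -xzf reads.tar.gz -C extracted           # decompress and extract into extracted/

rm reads/renamed.txt                         # remove a single file (cannot be undone)
rm -r reads                                  # remove a directory and its contents (cannot be undone)

cd - > /dev/null                             # return to the previous directory
```

Note that rmdir would have refused to delete reads while it still contained files, which is why rm -r is used for the final cleanup.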



Here is a list of commonly used Linux commands for monitoring CPU utilization:

Command          Usage
top              Displays an ongoing, real-time view of processor activity, listing the most CPU-intensive tasks on the system. It provides an interactive interface for manipulating processes and can sort tasks by CPU usage, memory usage, or runtime.
mpstat           Reports processor-related statistics.
mpstat -P ALL    Displays the activity of each available processor individually, processor 0 being the first one. Global average activity across all processors is also reported.
sar              Displays the contents of selected cumulative activity counters in the operating system.
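For logging or scripting, it can help to take a single non-interactive snapshot instead of watching top live. The sketch below assumes a Linux system with the procps version of top; mpstat and sar come from the sysstat package and may not be installed, so each call is guarded. The function name cpu_snapshot is made up for this example.

```shell
# Print one non-interactive snapshot from each monitoring tool that is available.
cpu_snapshot() {
    echo "== top (batch mode, first 12 lines) =="
    if command -v top >/dev/null 2>&1; then
        # -b: batch (non-interactive) mode; -n 1: a single iteration
        top -b -n 1 2>/dev/null | head -n 12 || true
    else
        echo "top not found"
    fi

    echo "== mpstat -P ALL (per-processor statistics) =="
    if command -v mpstat >/dev/null 2>&1; then
        # one report over a 1-second interval
        mpstat -P ALL 1 1 || true
    else
        echo "mpstat not found (part of the sysstat package)"
    fi

    echo "== sar -u (CPU utilization) =="
    if command -v sar >/dev/null 2>&1; then
        # one CPU utilization report over a 1-second interval
        sar -u 1 1 || true
    else
        echo "sar not found (part of the sysstat package)"
    fi
    return 0
}

cpu_report=$(cpu_snapshot)
echo "$cpu_report"
```

Redirecting the output to a file (for example cpu_snapshot > cpu.log) gives a record that can be compared before and during a long-running job.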


High-throughput (HT) platforms and read types

  • Illumina single-end vs. paired-end
  • 454 Roche
  • SOLiD
  • MiSeq
  • Ion Torrent


File formats and conversions

  • bcl
  • qseq
  • fastq
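As a quick illustration of the fastq format, each read is normally stored as four lines: a header starting with "@", the sequence, a "+" separator line, and a quality string of the same length as the sequence. The sketch below writes a tiny made-up file and counts its reads with wc. It assumes the common four-lines-per-record layout; files with line-wrapped sequences would need a real parser.

```shell
# Write a tiny example FASTQ file (read names and sequences are made up).
fastq=$(mktemp)
cat > "$fastq" <<'EOF'
@read1
ACGTACGTAC
+
IIIIIIIIII
@read2
TTGCATGCAA
+
IIIIHHHHGG
EOF

# With four lines per record, the number of reads is the line count divided by 4.
total_lines=$(wc -l < "$fastq")
n_reads=$((total_lines / 4))
echo "$n_reads reads in $fastq"

# Show the first record: header, sequence, separator, quality string.
head -n 4 "$fastq"
```

For a gzip-compressed file, the same count can be obtained by piping the output of gzip -dc into wc -l instead of reading the file directly.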



Deplexing using barcoded sequence tags

  • Edit distance (or Hamming distance, for barcodes of equal length)
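The Hamming distance between two barcodes of equal length is the number of positions at which they differ. A minimal sketch of how it could be used for deplexing is shown below: an observed tag is assigned to the closest known barcode if it is within an allowed number of mismatches. The function names, the barcodes, and the tie handling (the first listed barcode wins) are assumptions made for this example, not the behaviour of any particular deplexing tool.

```shell
# Hamming distance: count positions where two equal-length strings differ.
# Prints "NA" and returns non-zero if the lengths differ.
hamming() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (length(a) != length(b)) { print "NA"; exit 1 }
        d = 0
        for (i = 1; i <= length(a); i++)
            if (substr(a, i, 1) != substr(b, i, 1)) d++
        print d
    }'
}

# Assign an observed tag to the closest known barcode, allowing at most
# max_mismatch differences. Prints "unassigned" if no barcode is close enough.
# Usage: assign_barcode TAG MAX_MISMATCH BARCODE [BARCODE ...]
assign_barcode() {
    tag=$1
    max_mismatch=$2
    shift 2
    best=unassigned
    best_d=$((max_mismatch + 1))
    for bc in "$@"; do
        d=$(hamming "$tag" "$bc") || continue   # skip barcodes of a different length
        if [ "$d" -lt "$best_d" ]; then
            best=$bc
            best_d=$d
        fi
    done
    echo "$best"
}

hamming ACGTAC ACGTTC                      # one mismatch, at position 5
assign_barcode ACGTTC 1 ACGTAC TTGCAA      # closest barcode within 1 mismatch
assign_barcode GGGGGG 1 ACGTAC TTGCAA      # nothing within 1 mismatch
```

Barcode sets are usually designed so that any two barcodes differ at several positions, which is what makes it possible to tolerate a sequencing error in the tag without confusing samples.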


Quality control

  • Fastx tools
  • Using mapping as the quality control for reads



Trimming and clipping

  • Trim based on low quality scores per nucleotide position within a read
  • Clip sequence artefacts (e.g. adapters, primers)
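To make the clipping step concrete, the sketch below removes an adapter and everything after it from each read, truncating the quality string to the same length. It is deliberately simplified: it only finds exact, full-length adapter matches and assumes four lines per fastq record, whereas dedicated clipping tools also handle partial and mismatched adapters. Quality-based trimming is not shown. The function name clip_adapter, the reads, and the adapter string are used here purely as example values.

```shell
# Clip an adapter (exact match only) and everything 3' of it from each read.
# Usage: clip_adapter ADAPTER FASTQ_FILE
clip_adapter() {
    awk -v adapter="$1" '
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 { seq = $0 }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 {
            qual = $0
            pos = index(seq, adapter)          # 0 if the adapter is absent
            if (pos > 0) {
                seq  = substr(seq, 1, pos - 1)   # keep bases before the adapter
                qual = substr(qual, 1, pos - 1)  # keep the matching quality values
            }
            # Drop reads that become empty after clipping.
            if (length(seq) > 0) print header "\n" seq "\n" plus "\n" qual
        }' "$2"
}

# Example input: read1 ends in the adapter, read2 does not contain it.
fq=$(mktemp)
cat > "$fq" <<'EOF'
@read1
ACGTACGTAGATCGGAAGAGC
+
IIIIIIIIHHHHHHHHHHHHH
@read2
TTGCATGCAA
+
IIIIHHHHGG
EOF

clipped=$(clip_adapter AGATCGGAAGAGC "$fq")
echo "$clipped"
```

After clipping, read1 keeps only the eight bases upstream of the adapter and read2 is passed through unchanged.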


R basics


HT sequence analysis using R (and Bioconductor)


DNA sequence analysis


RNA-seq analysis

Common objectives of transcriptome analysis:


SOLiD software tools