Wayne:High Throughput Sequencing Resources

From OpenWetWare

Revision as of 17:30, 15 February 2013

High-throughput (HT) platforms and read types

  • Illumina single-end vs. paired-end
  • 454 Roche
  • SOLiD
  • MiSeq
  • Ion Torrent

File formats and conversions

  • bcl
  • qseq
  • fastq
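
As a small illustration of the fastq format, the sketch below counts the reads in a file. It assumes the common layout of exactly four lines per record (header, sequence, "+" separator, quality string) with no line wrapping; the function name count_fastq_reads and the example reads are made up for this illustration.

```shell
#!/bin/sh
# Count reads in a FASTQ file, assuming exactly four lines per record
# (header, sequence, "+" separator, quality string) and no line wrapping.
count_fastq_reads() {
    # "wc -l < file" prints only the line count, without the file name
    fq_lines=$(wc -l < "$1")
    echo $((fq_lines / 4))
}

# Hypothetical example file holding two made-up reads
example_fastq=$(mktemp)
printf '@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n' > "$example_fastq"
count_fastq_reads "$example_fastq"
```

For the two-read example file this prints 2. Files that wrap sequences over several lines would need a real parser rather than a line count.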



Deplexing using barcoded sequence tags

  • Edit (or Hamming) distance
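
For barcodes of equal length, the Hamming distance is the number of positions at which two sequences differ; edit distance additionally allows insertions and deletions. The following is a minimal sketch in POSIX shell, not the method of any particular deplexing tool; the function name hamming and the barcodes are hypothetical.

```shell
#!/bin/sh
# Hamming distance between two equal-length sequences.
# Note: POSIX sh has no local variables, so a, b and dist are global.
hamming() {
    a=$1
    b=$2
    if [ "${#a}" -ne "${#b}" ]; then
        echo "error: sequences differ in length" >&2
        return 1
    fi
    dist=0
    while [ -n "$a" ]; do
        # ${a%"${a#?}"} is the first character of what remains of a
        [ "${a%"${a#?}"}" = "${b%"${b#?}"}" ] || dist=$((dist + 1))
        # Drop the first character of each sequence and continue
        a=${a#?}
        b=${b#?}
    done
    echo "$dist"
}

# Made-up barcodes that differ only at the fifth position
hamming ACGTAC ACGTTC
```

The example call prints 1. When assigning reads to barcodes, a read is typically accepted only if its distance to one known barcode is below a chosen threshold and smaller than its distance to all others.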



Quality control

  • Fastx tools
  • Using mapping as the quality control for reads



Trimming and clipping

  • Trim based on low quality scores per nucleotide position within a read
  • Clip sequence artefacts (e.g. adapters, primers)



DNA sequence analysis



RNA-seq analysis

  • Quantifying and annotating aligned reads
  • DESeq
  • edgeR

A variety of additional R packages are available for normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG):

  • easyRNASeq (simplifies read counting per genome feature)
  • DEXSeq (inference of differential exon usage)
  • DEGseq
  • baySeq (also see: segmentSeq)
  • Genominator (Bullard et al. 2010)


Basic server commands (for Sirius)

Here is a list of commonly used Linux commands:

Command                         Usage
pwd                             Print working directory (your current location)
ls                              List the contents of the current location
ls [options]                    ls -a (include hidden files), ls -l (long/detailed list), ls -t (sort by time modified instead of name)
cd /give/path                   Change directory
cd ..                           Go up one directory
mkdir directoryName             Make a new directory
rmdir directoryName             Remove directory (must be empty). Remember that you cannot undo this!
rm -r directoryName             Recursively remove a directory and the files it contains. Remember that you cannot undo this!
rm filename                     Remove the specified file. Remember that you cannot undo this!
head filename                   Print the first 10 lines of the specified file to the screen
tail filename                   Print the last 10 lines of the specified file to the screen
more filename                   Send file contents or piped output to the screen one page at a time
less filename                   Like more, but also allows scrolling backward through the file
wc filename                     Print line, word, and byte counts
wc [options] filename           -c (bytes); -l (lines); -w (words, delimited by whitespace or newline)
whereis command                 Locate the binary, source, and manual page files for a command
mv                              Move (akin to cut/paste); the file no longer exists in its original location. Also used to rename files. Usage: mv current/path/filename destination/path/filename
cp                              Copy (akin to copy/paste); the original stays in its current path. Usage: cp current/path/filename destination/path/filename
nohup command &                 Start a no-hangup background job (it keeps running after you log out)
screen                          Start a new screen session, in which jobs keep running after you disconnect
tar -xzf filename.tar.gz        Decompress and extract a tar.gz archive
gzip -c filename > filename.gz  Compress a file into gz format; -c writes to standard output and ">" redirects it to the outfile filename.gz, so the original file is kept
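
Several of the commands above can be tried together in a short session. This is only a sketch: it works inside a throwaway directory created with mktemp, and the directory and file names (reads, sample.txt, and so on) are hypothetical.

```shell
#!/bin/sh
# Work inside a throwaway directory so nothing else on the server is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir reads                                        # make a new directory
printf 'line1\nline2\nline3\n' > reads/sample.txt  # small made-up file
cp reads/sample.txt reads/copy.txt                 # copy: the original stays in place
mv reads/copy.txt reads/renamed.txt                # move/rename: copy.txt no longer exists
n_lines=$(wc -l < reads/sample.txt)                # line count only

gzip -c reads/sample.txt > reads/sample.txt.gz     # compress, keeping the original
tar -czf reads.tar.gz reads                        # bundle the directory into a tar.gz archive
rm -r reads                                        # remove directory and contents (no undo!)
tar -xzf reads.tar.gz                              # extract the archive again
ls -l reads                                        # long listing of the restored directory
```

Because rm -r cannot be undone, it is worth running pwd and ls first to confirm the current location before removing anything.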


Here is a list of commonly used Linux commands for using top and learning about server usage:

Command Usage


R basics

HT sequence analysis using R (and Bioconductor)