Wayne:High Throughput Sequencing Resources: Difference between revisions

Revision as of 18:40, 15 February 2013

Basic server commands (for Sirius)

Here is a list of commonly used Linux commands:

Command	Usage
pwd	Print working directory (your current location)
ls	List the contents of the current location (hidden files are not shown by default)
ls options	ls -a (hidden files), ls -l (long/detailed list), ls -t (sorted by time modified instead of name)
cd /give/path	Change directories
cd ..	Go up one directory
mkdir directoryName	Make a new directory
rmdir directoryName	Remove directory (must be empty). Remember that you cannot undo this!
rm -r directoryName	Recursively remove a directory and the files it contains. Remember that you cannot undo this!
rm filename	Remove the specified file. Remember that you cannot undo this!
head filename	Print the first 10 lines (the default) of the specified file to the screen
tail filename	Print the last 10 lines (the default) of the specified file to the screen
more filename	Allows file contents or piped output to be sent to the screen one page at a time
less filename	Like more, but also lets you scroll backward through the file
wc filename	Print byte, word, and line counts
wc [options] filename	-c (bytes); -l (lines); -w (words, delimited by whitespace)
whereis command	Locates the binary, source, and manual page files for the specified command
mv	Move (akin to cut/paste): the file no longer exists in its original location; also used to rename a file by moving it to a new name within the same path; Usage: mv current/path/filename destination/path/filename
cp	Copy (akin to copy/paste): the original stays in its current path; Usage: cp current/path/filename destination/path/filename
nohup commands &	To initiate a no-hangup background job
screen	To initiate a new screen session to start a new background job
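tar -xzf filename.tar.gz	Decompress and extract a tar.gz archive
gzip -c filename >filename.gz	Compress a file into gzip (.gz) format; -c writes the compressed data to standard output and ">" redirects it to the outfile filename.gz, so the original file is kept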
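
As an illustration of how the commands in the table fit together, here is a minimal sketch of a short session. The directory and file names (project, notes.txt) are made up for the example, and everything happens inside a temporary directory so that nothing real is touched.

```shell
# Work in a throwaway directory (hypothetical example data only).
workdir=$(mktemp -d)
cd "$workdir"
pwd                                   # confirm the current location

mkdir project                         # make a new directory
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > project/notes.txt

head -n 3 project/notes.txt           # first 3 lines instead of the default 10
tail -n 2 project/notes.txt           # last 2 lines
line_count=$(wc -l < project/notes.txt)
echo "lines: $line_count"

cp project/notes.txt project/notes_copy.txt    # copy; the original stays
mv project/notes_copy.txt project/renamed.txt  # rename within the same path
rm project/renamed.txt                         # remove a single file (no undo)

gzip -c project/notes.txt > project/notes.txt.gz   # compress, keep the original
tar -czf project.tar.gz project                    # bundle and compress the directory

rm -r project                         # remove the directory and its contents (no undo)
tar -xzf project.tar.gz               # restore it from the archive
ls -l project
```

Note that rm -r is only safe here because the archive was created first; on a shared server it is worth running ls on the target before removing anything.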

Here is a list of commonly used Linux commands for monitoring CPU utilization:

Command	Usage
top	Displays the most CPU-intensive processes/jobs and provides an ongoing, real-time view of processor activity. It also provides an interactive interface for manipulating processes and can sort tasks by CPU usage, memory usage, and runtime.
mpstat	Reports processor-related statistics; without options it shows the average across all CPUs.
mpstat -P ALL	Displays activity for each available processor individually, processor 0 being the first. The global average activity across all processors is also reported.
sar	Displays the contents of selected cumulative activity counters in the operating system
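
Before starting a large job it can help to take a one-off, non-interactive look at the load. The sketch below is one possible way to do that; the function name cpu_snapshot is hypothetical, and the mpstat step assumes the sysstat package is installed (it is skipped otherwise).

```shell
# Print a single snapshot of CPU activity and return to the prompt.
cpu_snapshot() {
    # Number of processors available (nproc is part of GNU coreutils).
    echo "processors: $(nproc)"

    # top in batch mode (-b), one iteration (-n 1): summary plus the first few tasks.
    if command -v top >/dev/null 2>&1; then
        top -b -n 1 | head -n 12
    fi

    # Per-processor statistics, one 1-second sample; ignored if mpstat is missing or fails.
    if command -v mpstat >/dev/null 2>&1; then
        mpstat -P ALL 1 1 || true
    fi
}

cpu_snapshot
```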

Top

Wayne Lab Home

High throughput (HT) platform and read types

ABI-SOLiD
Illumina single-end vs. paired-end
Ion Torrent
MiSeq
Roche-454
Solexa

CBI Collaboratory

UCLA's Computational Biosciences Institute (CBI) Collaboratory hosts a variety of 3-day workshops, ranging from general introductions to genomics and bioinformatics to more advanced, focused workshops (e.g. ChIP-Seq, BS-Seq, exome sequencing). The workshops center on publicly available resources: the web-based bioinformatics platform Galaxy/UCLA, which supports HT workflows and gathers a variety of HT tools for multiple platforms and data types in one place, as well as tools such as R and Matlab. The introductory workshops do not require any programming experience, and the Collaboratory Fellows also serve as an advising resource for data analysis.

File formats and conversions

bcl
qseq
fastq
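
As a small illustration of working with the fastq format, the sketch below counts reads and converts FASTQ to FASTA. It assumes the common layout of exactly four lines per record (header, sequence, "+", qualities) with no wrapped sequences; the two reads and the function names count_reads and fastq_to_fasta are made up for the example.

```shell
# Build a tiny two-read FASTQ file (hypothetical reads, for illustration only).
fastq=$(mktemp)
cat > "$fastq" <<'EOF'
@read1
ACGTACGTAC
+
IIIIIIIIII
@read2
TTGCAATGCA
+
IIIIHHHGGF
EOF

# Each record is 4 lines, so the number of reads is the line count divided by 4.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# FASTQ to FASTA: keep the header line (replacing "@" with ">") and the sequence line.
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }
         NR % 4 == 2 { print }' "$1"
}

count_reads "$fastq"
fastq_to_fasta "$fastq"
```

Dedicated converters are a better choice for real data, since they also validate the records.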

Deplexing using barcoded sequence tags

Edit (or Hamming) distance
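
The Hamming distance between two barcodes of equal length is the number of positions at which they differ. A minimal sketch follows; the function name hamming and the example barcodes are hypothetical.

```shell
# Count mismatching positions between two equal-length sequences.
hamming() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (length(a) != length(b)) {
            print "error: sequences differ in length" > "/dev/stderr"
            exit 1
        }
        d = 0
        for (i = 1; i <= length(a); i++)
            if (substr(a, i, 1) != substr(b, i, 1))
                d++
        print d
    }'
}

hamming ACGTAC ACGTTC    # prints 1
hamming ACGTAC TCGTTG    # prints 3
```

When deplexing, a read is typically assigned to a sample only if its tag is within some small distance of exactly one known barcode, so the pairwise distances within a barcode set limit how many sequencing errors can be tolerated.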

Quality control

Fastx tools
Using mapping as the quality control for reads

Trimming and clipping

Trim based on low quality scores per nucleotide position within a read
Clip sequence artefacts (e.g. adapters, primers)
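
To make the clipping step concrete, here is a minimal sketch that removes a 3' adapter and everything after it from each read, shortening the quality string to match. It is not how dedicated clipping tools work internally: it assumes four-line FASTQ records, requires an exact adapter match, and does not discard reads that become empty. The function name clip_adapter, the adapter sequence, and the read are made up for the example.

```shell
# Clip an exact-match adapter (and all bases after it) from every read.
# $1 = adapter sequence, $2 = FASTQ file with 4 lines per record
clip_adapter() {
    awk -v adapter="$1" '
        NR % 4 == 2 {
            pos  = index($0, adapter)               # 0 if the adapter is absent
            keep = (pos > 0) ? pos - 1 : length($0) # bases to keep for this record
            print substr($0, 1, keep)
            next
        }
        NR % 4 == 0 { print substr($0, 1, keep); next }  # qualities: same length
        { print }                                        # header and "+" lines
    ' "$2"
}

# Hypothetical single-read input: 8 bases of insert followed by the adapter.
reads=$(mktemp)
cat > "$reads" <<'EOF'
@read1
ACGTACGTAGATCGGAAG
+
IIIIIIIIHHHHHHHHHH
EOF

clip_adapter AGATCGGAAG "$reads"
```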

FASTQC and FASTX tools

BED and SAM tools

GATK variant calling

R basics

HT sequence analysis using R (and Bioconductor)

DNA sequence analysis

RNA-seq analysis

Common objectives of transcriptome analysis:

Quantifying and annotating aligned reads
Normalizing RNA-Seq read count data and identifying differentially expressed genes (DEG) (R packages):
- easyRNASeq (simplifies read counting per genome feature)
- DEXSeq (inference of differential exon usage)
- DEGseq
- baySeq (also see: segmentSeq)
- Genominator (Bullard et al. 2010)
Detection of alternative splice junctions
- ERANGE
- TopHat
- SpliceMap
- SplitSeek

SOLiD software tools

SOLiD tools

@@ Line 133: / Line 133: @@
 == High throughput (HT) platform and read types ==
 <ul>
+<li> ABI-SOLiD
 <li> Illumina single-end vs. paired-end
-<li> 454 Roche
+<li> Ion Torrent
-<li> SOLiD
 <li> MiSeq
-<li>Ion Torrent
+<li> Roche-454
+<li> Solexa
 </ul>
@@ Line 189: / Line 190: @@
 <li> Clip sequence artefacts (e.g. adapters, primers)
 </ul>
+<br>
+<div align="right">[http://openwetware.org/wiki/Wayne:High_Throughput_Sequencing_Resources Top]</div>
+<div align="right">[http://openwetware.org/wiki/Wayne_Lab Wayne Lab Home]</div>
+== FASTQC and FASTX tools ==
+<br>
+<div align="right">[http://openwetware.org/wiki/Wayne:High_Throughput_Sequencing_Resources Top]</div>
+<div align="right">[http://openwetware.org/wiki/Wayne_Lab Wayne Lab Home]</div>
+== BED and SAM tools ==
+*<div>[http://code.google.com/p/bedtools/ BED tools]</div>
+*<div>[http://samtools.sourceforge.net SAMtools]</div>
+<br>
+<div align="right">[http://openwetware.org/wiki/Wayne:High_Throughput_Sequencing_Resources Top]</div>
+<div align="right">[http://openwetware.org/wiki/Wayne_Lab Wayne Lab Home]</div>
+== GATK variant calling ==
 <br>

Contents

Basic server commands (for Sirius)

High throughput (HT) platform and read types

CBI Collaboratory

File formats and conversions

Deplexing using barcoded sequence tags

Quality control

Trimming and clipping

FASTQC and FASTX tools

BED and SAM tools

GATK variant calling

R basics

HT sequence analysis using R (and Bioconductor)

DNA sequence analysis

RNA-seq analysis

SOLiD software tools
