Revision as of 15:48, 23 July 2010

Short read toolbox

This page lists resources for working with next-generation sequence data.

Online short-read resources

SEQanswers - Online forum for next generation sequencing.
SEQanswers software post - Post listing software available for next-generation sequence data.
SEQwiki - SEQanswers wiki list of bioinformatics applications.
De novo tips - Blog on de novo assembly.
UCSC Bioinformatics - UC Santa Cruz's bioinformatics server.
Cipres - Cyberinfrastructure for Phylogenetic Research (CIPRES).
GMOD - Generic Model Organism Database.

List of short-read quality control software

TileQC - Requires R, RMySQL and MySQL.
Short Read Toolbox - Scripts for quality control of Illumina data.

List of sequence format information

Short Read Toolbox - Scripts for quality control of Illumina data.
FASTQ - Wikipedia's FASTQ page.
FASTA - Wikipedia's FASTA page.

List of alignment format information

SAMtools - SAMtools.
AMOS - AMOS.
UCSC - UCSC's faq on file formats.

List of open source de novo assemblers

Velvet - Implements de Bruijn graphs in C. Requires a 64-bit Linux OS.
Edena - 32- and 64-bit Linux.
ABySS - Multi-threaded de novo assembly.
Ray - Multi-threaded de novo assembly.
QSRA - Uses quality scores.

List of open source reference guided assemblers

MAQ - Mapping and Assembly with Qualities.
Bowtie - Bowtie.
BWA - Burrows-Wheeler aligner.
RGA - Perl script that calls BLAT to assemble short reads.

List of assembly viewers

Tablet
SAMtools - SAMtools.

List of alignment programs

MAFFT - MAFFT.
T-Coffee - T-Coffee.
Muscle - Muscle.
MUMmer - MUMmer.

List of query programs

BLAST - BLAST.
BLAT - BLAT.

Perl

A very brief example of file input/output: it reads a FASTQ file and writes each four-line record as a single tab-delimited line.

Code:

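#!/usr/bin/perl
use strict;
use warnings;
my (@temp, $in, $out);
my $inf = "data.fq";
my $outf = "data_out.fq";
open($in, "<", $inf) or die "Can't open $inf: $!";
open($out, ">", $outf) or die "Can't open $outf: $!";
# Assumes unwrapped FASTQ: exactly four lines per record.
while(my $line = <$in>){
  $temp[0] = $line; # First line is the identifier (starts with '@').
  $temp[1] = <$in>; # Second line is the sequence.
  $temp[2] = <$in>; # Third line is the '+' separator (may repeat the identifier).
  $temp[3] = <$in>; # Fourth line is the quality string.
  die "Truncated record in $inf" if grep { !defined } @temp;
  chomp(@temp); # Strip the trailing newline from all four lines.
  print $out join("\t", @temp)."\n";
}
close $in or die "Can't close $inf: $!";
close $out or die "Can't close $outf: $!";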

perlintro - Introduction to Perl with links to other documentation.
BioPerl beginners - Introduction to BioPerl (be prepared for object-oriented code).

Python
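
As a counterpart to the Perl example above, the same file input/output task might be sketched in Python roughly as follows. This is only a minimal sketch: the function name fastq_to_tab is hypothetical (it does not come from any library), and the code assumes unwrapped FASTQ with exactly four lines per record.

```python
from itertools import zip_longest


def fastq_to_tab(in_path, out_path):
    """Write each four-line FASTQ record as one tab-delimited line.

    Hypothetical helper for illustration only. Assumes unwrapped FASTQ
    (exactly four lines per record). Returns the number of records written.
    """
    count = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        # Passing the same file iterator four times makes zip_longest
        # hand back the lines in consecutive groups of four.
        for record in zip_longest(*[fin] * 4):
            if None in record:
                # Fewer than four lines were left for the last record.
                raise ValueError(f"Truncated record in {in_path}")
            # Fields: identifier, sequence, '+' separator, quality string.
            fout.write("\t".join(line.rstrip("\n") for line in record) + "\n")
            count += 1
    return count
```

Calling fastq_to_tab("data.fq", "data_out.fq") would mirror the file names used in the Perl example. For real projects, an established parser such as the one in Biopython is likely a safer choice than hand-rolled code.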

R project

R project - Statistical programming environment.
Bioconductor - R for biologists (microarray and next-generation sequence data).
APE - Analysis of phylogenetics and evolution R package.

Useful links

User:Brian J. Knaus
Cronn Lab
Liston:Computer_Scripts
