Wikiomics:BLAST

From OpenWetWare

The Basic Local Alignment Search Tool (BLAST) rapidly identifies similar sequences. For a modification of the algorithm, see PSI-BLAST. For other sequence similarity search tools, see also FASTA and HMMER.

For a more detailed description, see Wikiomics:BLAST_tutorial and Wikipedia:BLAST.

Overview of the algorithm

The algorithm applies several heuristics (shortcuts) that allow it to provide very fast searches at reasonable accuracy. Most notably, it starts by searching for regions of high, uninterrupted similarity and only later connects them through low-similarity, gapped alignments. Other tricks include a statistically sound way of defining what is "similar enough" and an approximate (and often wrong!) way of calculating how likely an observed similarity is to have occurred by chance.
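The seed-and-extend idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the BLAST implementation: it uses exact word matches as seeds, a match/mismatch score instead of a substitution matrix, ungapped extension only, and it omits the later gapped stage. The function names, the word size, the scores and the drop-off threshold are all assumptions chosen for the example. The last function shows the Karlin-Altschul estimate E = K·m·n·exp(-λS) of how many alignments with score S are expected by chance; the values of λ and K depend on the scoring system and must be supplied by the caller.

```python
import math
from collections import defaultdict


def build_word_index(subject, w):
    """Map every length-w word in the subject to its start positions."""
    index = defaultdict(list)
    for i in range(len(subject) - w + 1):
        index[subject[i:i + w]].append(i)
    return index


def extend_ungapped(query, subject, q, s, w, match=1, mismatch=-3, x_drop=5):
    """Extend an exact word hit at query[q:q+w] / subject[s:s+w] without gaps.

    Extension in each direction stops once the running score has fallen
    more than x_drop below the best score seen so far.
    Returns (score, query_start, query_end, subject_start), end exclusive.
    """
    score = best = w * match
    right = i = 0
    while q + w + i < len(query) and s + w + i < len(subject):
        score += match if query[q + w + i] == subject[s + w + i] else mismatch
        i += 1
        if score > best:
            best, right = score, i
        elif best - score > x_drop:
            break
    score = best  # restart from the best right-hand extension
    left = i = 0
    while q - 1 - i >= 0 and s - 1 - i >= 0:
        score += match if query[q - 1 - i] == subject[s - 1 - i] else mismatch
        i += 1
        if score > best:
            best, left = score, i
        elif best - score > x_drop:
            break
    return best, q - left, q + w + right, s - left


def find_hsps(query, subject, w=4, min_score=8):
    """Seed with exact words, extend each seed, keep high-scoring pairs."""
    index = build_word_index(subject, w)
    hsps = set()  # seeds on the same diagonal extend to the same pair
    for q in range(len(query) - w + 1):
        for s in index.get(query[q:q + w], ()):
            hsp = extend_ungapped(query, subject, q, s, w)
            if hsp[0] >= min_score:
                hsps.add(hsp)
    return sorted(hsps, reverse=True)


def expected_hits(score, m, n, lam, k):
    """Karlin-Altschul estimate of chance alignments scoring >= score.

    m and n are the query and database lengths; lam and k are the
    statistical parameters of the scoring system (caller-supplied).
    """
    return k * m * n * math.exp(-lam * score)
```

For example, `find_hsps("TTACGTACGTTT", "GGGGACGTACGTGGGG")` reports a single pair covering the shared run ACGTACGT, because every seed inside that run extends to the same segment.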

Software availability

There are two main flavors and distributions of BLAST. One, called NCBI-BLAST, is provided by the NCBI; the other, WU-BLAST, is provided by Warren Gish at Washington University in St. Louis.

NCBI-BLAST is available as a binary as well as part of the NCBI toolbox. Note that much of the code is being ported over to a C++ version of the toolbox. Several optimized versions of NCBI-BLAST exist as RPMs at Biolinux and Scalable Informatics. Under Debian, the ncbi-tools-bin package provides the necessary tools.

WU-BLAST is freely available to academic labs, but a license agreement must be filled out. The source code for WU-BLAST is not available. More information can be found at the WU-BLAST site.

If you plan to do a moderate to large amount of sequence analysis with BLAST, it makes the most sense to download the tool and run it locally. This assumes you have sufficient compute resources and disk space: the non-redundant protein database NR is 600+ MB compressed. See the tips below on ways to speed up your searches.
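As a rough illustration of a local run, the sketch below builds the command lines for formatting a database and searching it. It assumes the C++ (BLAST+) command-line programs `makeblastdb` and `blastp` rather than the older toolbox programs, and the helper names and file names are hypothetical. Commands are only executed if the program is found on the PATH.

```python
import shutil
import subprocess


def makeblastdb_cmd(fasta_path, db_name, dbtype="prot"):
    """Command line that formats a FASTA file as a BLAST+ database."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", dbtype, "-out", db_name]


def blastp_cmd(query_path, db_name, out_path, evalue=1e-5, threads=1):
    """Command line for a protein search with tabular (-outfmt 6) output."""
    return [
        "blastp",
        "-query", query_path,
        "-db", db_name,
        "-evalue", str(evalue),
        "-outfmt", "6",
        "-num_threads", str(threads),
        "-out", out_path,
    ]


def run_if_installed(cmd):
    """Run cmd only when its executable is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    if run_if_installed(makeblastdb_cmd("proteins.fasta", "proteins")):
        run_if_installed(blastp_cmd("query.fasta", "proteins", "hits.tsv"))
```

Building the argument list separately from running it makes it easy to log or inspect the exact command before starting a long search.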

If you are not looking for homology between distantly related genes or proteins, there is also the BLAST-Like Alignment Tool (BLAT). The main difference is that BLAST was designed to be faster than the Smith-Waterman (SW) algorithm while still finding relatively weak homology, whereas BLAT is optimized for speed at the expense of abandoning weak homology.

Online BLAST Tools and Webservices

BLAST is available through several online services that allow quick and simple access to the tools. Some also provide corresponding scripts for automating submission and retrieval.
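Once results have been retrieved, automated scripts usually need to read them. The sketch below parses tab-separated results, assuming the default 12-column tabular layout written by BLAST+ with `-outfmt 6` (query id, subject id, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score). The `Hit` type and the function names are hypothetical, and only a few of the columns are kept.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float


def parse_tabular(text):
    """Parse 12-column tabular BLAST output into a list of Hit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(
            query_id=fields[0],
            subject_id=fields[1],
            percent_identity=float(fields[2]),
            alignment_length=int(fields[3]),
            evalue=float(fields[10]),
            bit_score=float(fields[11]),
        ))
    return hits


def best_hit_per_query(hits):
    """Keep, for each query, the hit with the highest bit score."""
    best = {}
    for hit in hits:
        current = best.get(hit.query_id)
        if current is None or hit.bit_score > current.bit_score:
            best[hit.query_id] = hit
    return best
```

Ranking by bit score is one possible choice; filtering on `evalue` first is an equally reasonable design, depending on the analysis.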

Genome Compiler

Genome Compiler is an all-in-one DNA design software platform. BLAST is embedded in the software, so you can send sequences or a whole part to BLAST directly from within it.

NCBI BLAST

NCBI BLAST - Download binaries

blastcl3 (BLAST client which accesses the newest NCBI BLAST search engine)

BioPerl Remote BLAST

PISE

PISE site

BioPerl PISE script pise doc and (link to BioPerl site).

Tips and FAQs

  1. STRAP is a front-end for BLAST: The program STRAP contains a comfortable front end for local BLAST programs WU-BLAST and NCBI-BLAST as well as for the BLAST server at EBI. The user can submit several proteins at once which are processed one after the other by the program. The results are permanently stored in a cache located in the local file system. When the same query is requested a second time then the BLAST result comes up immediately because it is found in the cache. --Christo 14:38, 12 January 2006 (PST)
  2. KoriBlast, graphical platform to mine Blast data: KoriBlast is a reliable graphical environment dedicated to sequence data mining. KoriBlast combines Blast searches with advanced data management capabilities and a state-of-the-art graphical user interface.
  3. Bioedit: the famous Bioedit for Windows also has an interface to BLAST.
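The result caching mentioned in the first tip can be illustrated with a small file-based cache. This is only a sketch of the general idea and not how STRAP implements it: the class name, the key layout and the `search` callable (any function that takes a program name, a database name and a sequence and returns the result text) are assumptions made for the example.

```python
import hashlib
from pathlib import Path


class BlastCache:
    """Store search results on disk so repeated queries return at once."""

    def __init__(self, directory, search):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.search = search  # callable(program, database, sequence) -> str

    def _path(self, program, database, sequence):
        # The key covers everything that can change the result.
        key = "\n".join((program, database, sequence.strip().upper()))
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / (digest + ".txt")

    def get(self, program, database, sequence):
        """Return the cached result, running the search only on a miss."""
        path = self._path(program, database, sequence)
        if path.exists():
            return path.read_text()
        result = self.search(program, database, sequence)
        path.write_text(result)
        return result
```

A real cache would also need to be cleared or versioned when the underlying database is updated, since a stored result reflects the database as it was at search time.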

See also

References

  1. Altschul SF, Gish W, Miller W, Myers EW, and Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990 Oct 5;215(3):403-10. DOI:10.1016/S0022-2836(05)80360-2 | PubMed ID:2231712 | HubMed [Altschul1990]
  2. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, and Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997 Sep 1;25(17):3389-402. DOI:10.1093/nar/25.17.3389 | PubMed ID:9254694 | HubMed [Altschul1997]
  3. McGinnis S and Madden TL. BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W20-5. DOI:10.1093/nar/gkh435 | PubMed ID:15215342 | HubMed [McGinnis2004]
  4. Korf I, Yandell M, and Bedell J. BLAST. O'Reilly & Associates, 2003. ISBN:0-596-00299-8 [BLASTbook]
  5. BLAST Wikipedia article [wikipedia]
  6. [BlastTutorial]
