Wikiomics:Multiple sequence alignment

From OpenWetWare


See also Wikiomics:Bioinfo_tutorial#Protein_Alignment

Multiple sequence alignment (MSA) is widely used in sequence analysis. It is more reliable and carries more information than the set of pairwise alignments produced by BLAST. An MSA allows identification of regions shared between proteins (including motifs), detection of conserved residues, and analysis of evolutionary relationships between sequences.
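As a rough illustration of what an alignment makes visible, the sketch below takes a small, already aligned set of sequences and reports the fully conserved columns and the pairwise percent identities. The sequences, their names, and both helper functions are hypothetical and chosen only for this example; real alignments would come from one of the programs listed below.

```python
from itertools import combinations

# Toy alignment (hypothetical sequences, already aligned; '-' marks a gap).
alignment = {
    "seqA": "MKT-LLVAG",
    "seqB": "MKTALLVSG",
    "seqC": "MRT-LLIAG",
}


def conserved_columns(aln):
    """Return 0-based indices of gap-free columns with a single residue type."""
    seqs = list(aln.values())
    return [
        i
        for i, column in enumerate(zip(*seqs))
        if "-" not in column and len(set(column)) == 1
    ]


def percent_identity(a, b):
    """Percent identity over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


cols = conserved_columns(alignment)
identities = {
    (m, n): percent_identity(alignment[m], alignment[n])
    for m, n in combinations(alignment, 2)
}

print("Conserved columns:", cols)
for (m, n), pid in identities.items():
    print(f"{m} vs {n}: {pid:.1f}% identity")
```

Pairwise identities like these are only a crude proxy for evolutionary distance; they are shown here to make the idea concrete, not as a substitute for a proper phylogenetic method.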

Software for producing multiple sequence alignments

  • Clustal [1]
  • T-Coffee
  • Probcons [2]
  • Muscle [3]
  • MAFFT [4]
  • Kalign [5]
  • PCMA [6]
  • TBA
  • MultiSeq [7]

Analysis of conservation in a multiple sequence alignment

  • AL2CO [8]
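To show what a positional conservation score looks like in practice, here is a minimal sketch of a Shannon entropy score per alignment column. It is a generic textbook measure and not AL2CO's implementation; the function name, the gap handling (gaps are simply ignored), and the example columns are assumptions made for illustration. Lower entropy means a more conserved column.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; returns None for an all-gap column.
    This gap treatment is an assumption of this sketch.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return None
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Hypothetical columns taken from a three-sequence alignment.
for column in ("MMM", "KKR", "-A-", "---"):
    print(column, column_entropy(column))
```

A fully conserved column scores 0 bits, while a column with two equally frequent residues scores 1 bit. Published tools typically add sequence weighting and substitution-aware scoring on top of a raw measure like this one.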

Databases of multiple sequence alignments

  • PFAM
  • InterPro
  • CDD


  1. Thompson JD, Higgins DG, and Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994 Nov 11;22(22):4673-80. DOI:10.1093/nar/22.22.4673 | PubMed ID:7984417
  2. Do CB, Mahabhashyam MS, Brudno M, and Batzoglou S. ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res. 2005 Feb;15(2):330-40. DOI:10.1101/gr.2821705 | PubMed ID:15687296
  3. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792-7. DOI:10.1093/nar/gkh340 | PubMed ID:15034147
  4. Katoh K, Kuma K, Toh H, and Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33(2):511-8. DOI:10.1093/nar/gki198 | PubMed ID:15661851
  5. Lassmann T and Sonnhammer EL. Kalign--an accurate and fast multiple sequence alignment algorithm. BMC Bioinformatics. 2005 Dec 12;6:298. DOI:10.1186/1471-2105-6-298 | PubMed ID:16343337
  6. Pei J, Sadreyev R, and Grishin NV. PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics. 2003 Feb 12;19(3):427-8. DOI:10.1093/bioinformatics/btg008 | PubMed ID:12584134
  7. Roberts E, Eargle J, Wright D, and Luthey-Schulten Z. MultiSeq: unifying sequence and structure data for evolutionary analysis. BMC Bioinformatics. 2006 Aug 16;7:382. DOI:10.1186/1471-2105-7-382 | PubMed ID:16914055
  8. Pei J and Grishin NV. AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics. 2001 Aug;17(8):700-12. DOI:10.1093/bioinformatics/17.8.700 | PubMed ID:11524371
