Revision as of 08:12, 26 November 2007

Protein mass spectrometry can be divided into:

identification of proteins/peptides
quantification

Protein/peptide identification

Peptide Mass Fingerprinting (PMF) or (MS)

Old method, superseded by MS/MS

algorithms:

- Mascot (gives probabilistic score)
- Aldente
- ProFound

caveats
- no sequence information
- journals started to require that at least one peptide of a protein identified by PMF should be confirmed by MS/MS

Peptide fragment fingerprinting (PFF) or (MS/MS)

algorithms (most commonly used):
- Sequest $$$
- Mascot $$$
- OMSSA (Open Mass Spectrometry Search Algorithm), open source
- X!Tandem, open source effort from Canada

algorithms (other/new/experimental):
- Spectrum Mill $$$
- MASPIC
  - this paper claims 5-15% more confident hits than Sequest: [1]
- InsPecT: a new search tool for variable modifications, from Pevzner & Tanner at UCSD (free?)

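filtering bad quality spectra
- SpectrumQuality (http://www.bioinfo.no/software/spectrumquality), see Flikka et al. 2006 (http://dx.doi.org/10.1002/pmic.200500309)
- msmsEval (http://proteomics.ucd.ie/msmseval/)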

filtering of the results
- Trans Proteomic Pipeline [2] (free?)
  - download from Sourceforge (TPP Cygwin Setup for Windows or 'Trans-Proteomic Pipeline' for Linux)
  - commercial offshoot: IPP
  - wiki devoted to TPP TPP_Wiki
  - dynamic newsgroup: spctools-discuss

- DTASelect: appears to be in a semi-frozen state (free for non-profit use, but requires a signed MTA)

Databases

Protein databases

Use (if possible):

IPI International Protein Index
always use a target-decoy database (i.e. concatenated: human_IPI + reversed_human_IPI)
decoy database creation methods:
- protein reversal (simple to perform; does not scramble palindromic sequences, which are fortunately quite rare)
  - MEGGAYGAGKAGGAFDPYTL => LTYPDFAGGAKGAGYAGGEM
- peptide pseudo-reversal (used in Sorcerer by Sage-N Research)
  - MEGGAYGAGKAGGAFDPYTL => (trypsin digest, using MS-Digest) MEGGAYGAGK AGGAFDPYTL => GAGYAGGEMK-ALTYPDFAGG (each peptide reversed, but the trypsin cleavage site preserved; guessed from the Elias 2007 paper)
- shuffled
  - MEGGAYGAGKAGGAFDPYTL => FYAGADEAGMGTYKGGAGLP (using SMS; results differ each time); recommended by EBI people
- random (i.e. creating a database of random proteins based on the frequency of amino acids in the source FASTA file)
to create a decoy database use DBToolkit (free, standalone Java application)
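
The four decoy methods listed above can be sketched in a few lines of Python. This is a minimal illustration, not the way DBToolkit or Sorcerer implement them: the function names, the "DECOY_" prefix and the simplified trypsin rule (cleave after K or R unless followed by P) are all assumptions. In particular, how pseudo-reversal treats a final peptide that does not end in K/R is a guess, so its output for the last peptide can differ from the hand-worked example above.

```python
import random
import re


def reverse_protein(seq):
    """Protein reversal: read the whole sequence backwards."""
    return seq[::-1]


def tryptic_peptides(seq):
    """Simplified trypsin digest: cleave after K or R unless followed by P."""
    return re.findall(r".*?[KR](?!P)|.+", seq)


def pseudo_reverse(seq):
    """Reverse each tryptic peptide but keep its C-terminal K/R in place."""
    out = []
    for pep in tryptic_peptides(seq):
        if pep[-1] in "KR":
            out.append(pep[:-1][::-1] + pep[-1])
        else:
            # Assumption: a peptide without terminal K/R is reversed whole.
            out.append(pep[::-1])
    return "".join(out)


def shuffle_protein(seq, rng):
    """Shuffled decoy: same residues in random order (differs per run/seed)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def random_protein(length, source_seqs, rng):
    """Random decoy: draw residues with the frequencies found in the source."""
    pool = "".join(source_seqs)
    return "".join(rng.choices(pool, k=length))


def target_decoy(entries, make_decoy=reverse_protein, prefix="DECOY_"):
    """Concatenate (name, sequence) targets with their decoy counterparts."""
    decoys = [(prefix + name, make_decoy(seq)) for name, seq in entries]
    return list(entries) + decoys


if __name__ == "__main__":
    protein = "MEGGAYGAGKAGGAFDPYTL"
    print(reverse_protein(protein))
    print(pseudo_reverse(protein))
    print(shuffle_protein(protein, random.Random(1)))
    print(target_decoy([("example_protein", protein)]))
```

Passing a seeded random.Random makes the shuffled and random decoys reproducible, which helps when the same decoy database has to be rebuilt later.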

Modification databases

Unimod (> 500 natural + labels)
Delta Mass: a database of protein post-translational modifications (in vivo)
RESID: detailed descriptions of > 400 modifications

Peptide Tag Searching

GutenTag free for non-profit, MTA required. Assigns fewer peptides than Sequest but with fewer false positives. Occupies a middle ground between mainstream search algorithms and de novo sequencing.

de novo sequence determination algorithms

PepNovo: (PDF)
Sherenga (PDF)
Peaks (PDF)
Lutefisk web

Spectral matching

The idea is that if one can match the spectrum of an unknown peptide to a very similar MS/MS spectrum in a database with a determined sequence/annotation, then one can annotate the unknown peptide, in a process similar to orthologue annotation in protein sequence databases. Caveat: bad annotations will also get propagated.
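
As a rough illustration of the matching step, the sketch below bins two peak lists and compares them with a normalized dot product (cosine similarity), then picks the best-scoring library entry. This is an assumed, generic scoring scheme and not the one used by P3, SpectraST or BiblioSpec; the function names, the 1.0 m/z bin width, the 0.7 score cut-off and the peak values and annotations in the usage example are all hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities into m/z bins; peaks is a list of (mz, intensity)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Normalized dot product of two binned spectra (0 = disjoint, 1 = same)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query, library, min_score=0.7):
    """Return (annotation, score) of the best library spectrum, or None.

    library maps an annotation (e.g. a peptide sequence) to a peak list.
    """
    scored = [(cosine_similarity(query, peaks), name)
              for name, peaks in library.items()]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score >= min_score else None


if __name__ == "__main__":
    # Made-up peak lists, for illustration only.
    query = [(175.1, 100.0), (262.2, 50.0), (375.3, 25.0)]
    library = {
        "annotated_peptide_1": [(175.1, 90.0), (262.2, 55.0), (375.3, 20.0)],
        "annotated_peptide_2": [(147.1, 100.0), (300.2, 10.0)],
    }
    print(best_library_match(query, library))
```

The score cut-off is where the caveat above bites: a permissive threshold transfers more annotations, including wrong ones.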

P3 (server) from the Global Proteome Machine (free)
- description
SpectraST from ISB, Seattle (not as many species/options as P3)
BiblioSpec from MacCoss lab. (free for non-profit, online licence)
- command line only

Protein quantification

approaches
- isotopic labeling (ICAT, iTRAQ, SILAC, 18O- or 15N-labeling)
- label-free methods

software
- ASAPRatio from the Trans-Proteomic Pipeline:

"calculates the relative abundances of proteins and the corresponding confidence intervals from ICAT-type ESI-LC/MS data"

- MSQuant Parser for Mascot results for quantitation (Windows only)
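
To make "relative abundance with a confidence interval" concrete for isotopic labeling, here is a minimal sketch. It is not the ASAPRatio algorithm: it simply averages log2(heavy/light) over the peptides of one protein and attaches a rough normal-approximation 95% interval. The function name, the input layout and the intensities in the usage example are assumptions.

```python
import math
import statistics


def protein_log2_ratio(peptide_pairs):
    """Estimate a protein's log2(heavy/light) ratio from its peptides.

    peptide_pairs is a list of (light_intensity, heavy_intensity) tuples.
    Returns (mean, lower, upper), where lower/upper bound a rough 95%
    interval based on the standard error of the mean (an approximation).
    """
    logs = [math.log2(heavy / light)
            for light, heavy in peptide_pairs
            if light > 0 and heavy > 0]
    if not logs:
        raise ValueError("no peptide with both intensities above zero")
    mean = statistics.mean(logs)
    if len(logs) < 2:
        # A single peptide gives no spread to estimate an interval from.
        return mean, mean, mean
    half_width = 1.96 * statistics.stdev(logs) / math.sqrt(len(logs))
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Made-up intensities for three peptides of one protein.
    pairs = [(1000.0, 2100.0), (500.0, 950.0), (250.0, 520.0)]
    mean, lower, upper = protein_log2_ratio(pairs)
    print(f"log2 ratio {mean:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Working in log space keeps a two-fold increase and a two-fold decrease symmetric around zero, which a plain average of ratios would not.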

Frameworks/pipelines

Trans Proteomic Pipeline (TPP) most popular, included in Sorcerer from Sage-N Research. Windows/Cygwin/Perl or Linux based.
Open-MS German, C++ based
Sorcerer $$$ FPGA-based fast hardware solution for SEQUEST & Tandem searches with TPP on top of it.

Web sites

UCSD (Pevzner)
U. of Washington (MacCoss)
Proteome Commons collection of tools & links
GenePattern proteomics modules from Broad Inst.

Reviews

For a good review of programs and aspects of protein identification by mass spectrometry see:

Hernandez et al. 2006 (HTML)

Palagi et al. 2006 (PDF)

Shadforth et al. 2005 (PDF)

Tutorials

Frédérique Lisacek's @Proteomics Web-based MS/MS Data Analysis on the web: Mascot, Phenyx and X!Tandem

Other tools to be sorted out

ProteinLynx Global SERVER $$$, from Waters Corporation
Phenyx from GeneBio (online web server)

DeNovoID web
SPIDER (PDF) de novo + homology search in other species
OpenSea (HTML) Java program available from authors

ModifiComb (HTML) (available from authors?)
MODi web server for PTMs discovery

SILVER view your spectra with LOD scores

Credits


Darek Kedra wrote this tutorial

Wikiomics:Protein mass spectrometry: Difference between revisions
