Difference between revisions of "Wikiomics:Protein mass spectrometry"

From OpenWetWare
m (1 revision(s))
Line 1: Line 1:
=Reviews=
+
Protein mass spectrometry can be divided into:
For a good review of programs and aspects of protein identification by mass spectrometry
+
* identification of proteins/peptides
see:
+
* quantification
* [http://www3.interscience.wiley.com/cgi-bin/fulltext/112139941/HTMLSTART Hernandez et al. 2006 (HTML)]
+
 
 +
 
 +
=Protein/peptide identification=
 +
==Peptide Mass Fingerprinting (PMF) or (MS)==
 +
Old method, superseded by MS/MS
 +
* algorithms:
 +
 
 +
** [http://www.matrixscience.com/home.html Mascot] (gives probabilistic score)
 +
** [http://www.expasy.org/tools/aldente/ Aldente]
 +
** [http://prowl.rockefeller.edu/prowl-cgi/profound.exe ProFound]
 +
 
 +
* caveats
 +
** no sequence information
 +
** journals have started to require that at least one peptide of each protein identified by PMF be confirmed by MS/MS
 +
 
 +
==Peptide fragment fingerprinting (PFF) or (MS/MS)==
 +
* algorithms (most commonly used):
 +
** [http://fields.scripps.edu/sequest/index.html Sequest] $$$
 +
** [http://www.matrixscience.com/home.html Mascot] $$$
 +
** [http://pubchem.ncbi.nlm.nih.gov/omssa/ OMSSA] Open Mass Spectrometry Search Algorithm, open source
 +
** [http://thegpm.org/ XTandem] open source effort from Canada
  
* [http://www3.interscience.wiley.com/cgi-bin/fulltext/113344091/PDFSTART Palagi et al. 2006 (PDF)]
+
* algorithms (other/new/experimental):
 +
** [http://www.chem.agilent.com/scripts/pds.asp?lpage=7771 Spectrum Mill] $$$
 +
** [http://compbio.ornl.gov/MASPIC/distribution/ MASPIC ]
 +
*** this paper claims 5-15% more confident hits than Sequest: [http://pubs.acs.org/cgi-bin/article.cgi/ancham/2005/77/i23/html/ac0501745.html]
 +
** [http://peptide.ucsd.edu/Software/Inspect.html  InsPecT] a new search tool for variable modifications from Pevzner & Tanner at UCSD (free?)
  
* [http://www3.interscience.wiley.com/cgi-bin/fulltext/112098427/PDFSTART Shadforth et al. 2005 (PDF)]
 
  
=Programs used in protein mass spectrometry=
+
* filtering of the results
 +
** Trans Proteomic Pipeline [http://tools.proteomecenter.org/TPP.php] (free?)
 +
*** download from  [http://sourceforge.net/project/showfiles.php?group_id=69281 Sourceforge] (TPP Cygwin Setup for Windows or 'Trans-Proteomic Pipeline' for Linux)
 +
*** commercial offshoot  [http://www.insilicos.com/IPP.html IPP]
 +
*** wiki devoted to TPP [http://tools.proteomecenter.org/wiki/index.php?title=Main_Page TPP_Wiki]
 +
*** dynamic newsgroup: [http://groups.google.com/group/spctools-discuss spctools-discuss]
  
==TPP==
+
** [http://fields.scripps.edu/DTASelect/index.html DTASelect] development seems to be semi-frozen (free for nonprofit use but requires a signed MTA)
Trans Proteomic Pipeline [http://tools.proteomecenter.org/TPP.php] and its commercial offshoot  [http://www.insilicos.com/IPP.html IPP]
 
There is also a new wiki devoted to TPP [http://tools.proteomecenter.org/wiki/index.php?title=Main_Page] as well as a dynamic newsgroup:
 
[http://groups.google.com/group/spctools-discuss]
 
  
==GPM & XTandem==
+
==Databases==
An open source effort from Canada: [http://thegpm.org/]  
+
Use (if possible):
 +
* [http://www.ebi.ac.uk/IPI/IPIhelp.html IPI] International Protein Index
 +
* always use a target-decoy database (i.e. concatenated: human_IPI + reversed_human_IPI)
 +
* decoy databases creation methods:
 +
** protein reversal (simple)
 +
*** MEGGAYGAGKAGGAFDPYTL => LTYPDFAGGAKGAGYAGGEM
 +
** peptide pseudo-reversal (used in Sorcerer by Sage-N Research)
 +
*** MEGGAYGAGKAGGAFDPYTL => (trypsin digest, using [http://169.230.19.26:8080/prospector/4.27.1/cgi-bin/msform.cgi?form=msdigest Ms-Digest]) MEGGAYGAGK AGGAFDPYTL => GAGYAGGEMK-ALTYPDFAGG (each peptide is reversed but the trypsin cleavage site is preserved; procedure guessed from the Elias 2007 paper)
 +
** shuffled
 +
*** MEGGAYGAGKAGGAFDPYTL => FYAGADEAGMGTYKGGAGLP (used [http://host9.bioinfo3.ifom-ieo-campus.it/sms2/shuffle_protein.html SMS],  results differ each time)
 +
** random (i.e. a database of random proteins generated from the amino acid frequencies in the source FASTA file)
 +
* to create a decoy database, use [http://genesis.ugent.be/dbtoolkit/ DBToolkit] (free, standalone Java application)
  
==InterAct==
+
==de novo sequence determination algorithms==
A new variable mods search from Pevzner & Tanner @UCSD [http://peptide.ucsd.edu/]
+
* PepNovo: [http://darwin.informatics.indiana.edu/col/meeting/2005_10/PepNovo.pdf (PDF)]
 +
* Sherenga [http://www.liebertonline.com/doi/pdfplus/10.1089/106652799318300 (PDF)]
 +
* Peaks [http://www.bioinformatics.uwaterloo.ca/papers/03peaks.pdf (PDF)]
 +
* Lutefisk [http://www.hairyfatguy.com/Lutefisk/ web]
  
==Other tools==
+
==Spectral matching ==
 +
* [http://p3.thegpm.org/tandem/ppp.html P3 (server)] from Global Proteomics Machine (free)
 +
** [http://www.thegpm.org/PPP/index.html description]
 +
* [http://www.peptideatlas.org/spectrast/ SpectraST] from ISB, Seattle (not as many species/options as P3)
 +
* [http://proteome.gs.washington.edu/software/bibliospec/documentation/index.html BiblioSpec] from MacCoss lab. (free for non-profit, online licence)
 +
** command line only
  
* massSorter [http://www.bioinfo.no/software/massSorter]
+
 +
=Web sites=
 +
* [http://peptide.ucsd.edu/Software.html UCSD (Pevzner)]
 +
* [http://proteome.gs.washington.edu/ U. of Washington (MacCoss)]
 +
* [http://www.proteomecommons.org/tools.jsp Proteome Commons] collection of tools & links
  
* Open Mass Spectrometry Search Algorithm (OMSSA) [http://pubchem.ncbi.nlm.nih.gov/omssa/]
+
=Reviews=
 +
For a good review of programs and aspects of protein identification by mass spectrometry
 +
see:
 +
* [http://www3.interscience.wiley.com/cgi-bin/fulltext/112139941/HTMLSTART Hernandez et al. 2006 (HTML)]
  
* DTASelect [http://fields.scripps.edu/DTASelect/index.html]
+
* [http://www3.interscience.wiley.com/cgi-bin/fulltext/113344091/PDFSTART Palagi et al. 2006 (PDF)]
it seems to be in a semi-frozen state.
 
  
* MASPIC [http://compbio.ornl.gov/MASPIC/distribution/index.html]
+
* [http://www3.interscience.wiley.com/cgi-bin/fulltext/112098427/PDFSTART Shadforth et al. 2005 (PDF)]
this paper claims 5-15% more confident hits than Sequest: [http://pubs.acs.org/cgi-bin/article.cgi/ancham/2005/77/i23/html/ac0501745.html]
 
  
* ProteinProspector [http://prospector.ucsf.edu/]
 
  
* ProFound [http://prowl.rockefeller.edu/prowl-cgi/profound.exe]
 
  
* Aldente [http://www.expasy.org/tools/aldente/]
+
==Other tools to be sorted out==
  
 +
* massSorter [http://www.bioinfo.no/software/massSorter]
 +
* ProteinProspector [http://prospector.ucsf.edu/]
 
* Sonar  [http://bioinformatics.genomicsolutions.com/service/prowl/sonar.html]
  
=de novo sequence determination algorithms=
 
* PepNovo: [http://darwin.informatics.indiana.edu/col/meeting/2005_10/PepNovo.pdf (PDF)]
 
* Sherenga [http://www.liebertonline.com/doi/pdfplus/10.1089/106652799318300 (PDF)]
 
* Peaks [http://www.bioinformatics.uwaterloo.ca/papers/03peaks.pdf (PDF)]
 
* Lutefisk [http://www.hairyfatguy.com/Lutefisk/ web]
 
 
===to be verified===
 
 
* DeNovoID [http://proteomics.mcw.edu/denovoid web]
 
* SPIDER [http://ieeexplore.ieee.org/iel5/9262/29416/01332434.pdf?tp=&isnumber=&arnumber=1332434 (PDF)] de novo + homology search in other species
 
* OpenSea [http://pubs.acs.org/cgi-bin/article.cgi/jprobs/2005/4/i02/html/pr049781j.html (HTML)] Java program available from authors
  
=Commercial Programs=
 
* Sequest [http://fields.scripps.edu/sequest/index.html]
 
* Mascot [http://www.matrixscience.com/home.html]
 
* Spectrum Mill  [http://www.chem.agilent.com/scripts/pds.asp?lpage=7771]
 
 
=New additions=
 
 
* MSQuant [http://msquant.sourceforge.net/ MSQuant] Parser for Mascot results for quantitation.
 
* ModifiComb [http://www.mcponline.org/cgi/content/full/5/5/935 (HTML)] (available from authors?)
Line 69: Line 105:
 
VEMS 3.0
 
MassSorter Eidhammer
->
+
 
  
 
{{stub}}-->

Revision as of 10:40, 23 November 2007

Protein mass spectrometry can be divided into:

  • identification of proteins/peptides
  • quantification


Protein/peptide identification

Peptide Mass Fingerprinting (PMF) or (MS)

Old method, superseded by MS/MS

  • algorithms:
    • Mascot (gives probabilistic score)
    • Aldente
    • ProFound
  • caveats
    • no sequence information
    • journals have started to require that at least one peptide of each protein identified by PMF be confirmed by MS/MS

Peptide fragment fingerprinting (PFF) or (MS/MS)

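  • algorithms (most commonly used):
    • Sequest $$$
    • Mascot $$$
    • OMSSA Open Mass Spectrometry Search Algorithm, open source
    • XTandem open source effort from Canada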
  • algorithms (other/new/experimental):
    • Spectrum Mill $$$
    • MASPIC
      • this paper claims 5-15% more confident hits than Sequest: [1]
    • InsPecT a new search tool for variable modifications from Pevzner & Tanner at UCSD (free?)


  • filtering of the results
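    • Trans Proteomic Pipeline [2] (free?)
      • download from Sourceforge (TPP Cygwin Setup for Windows or 'Trans-Proteomic Pipeline' for Linux)
      • commercial offshoot IPP
      • wiki devoted to TPP: TPP_Wiki
      • dynamic newsgroup: spctools-discuss
    • DTASelect: development seems to be semi-frozen (free for nonprofit use but requires a signed MTA)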

Databases

Use (if possible):

  • IPI International Protein Index
  • always use a target-decoy database (i.e. concatenated: human_IPI + reversed_human_IPI)
  • decoy databases creation methods:
    • protein reversal (simple)
      • MEGGAYGAGKAGGAFDPYTL => LTYPDFAGGAKGAGYAGGEM
    • peptide pseudo-reversal (used in Sorcerer by Sage-N Research)
      • MEGGAYGAGKAGGAFDPYTL => (trypsin digest, using Ms-Digest) MEGGAYGAGK AGGAFDPYTL => GAGYAGGEMK-ALTYPDFAGG (each peptide is reversed but the trypsin cleavage site is preserved; procedure guessed from the Elias 2007 paper)
    • shuffled
      • MEGGAYGAGKAGGAFDPYTL => FYAGADEAGMGTYKGGAGLP (used SMS, results differ each time)
    • random (i.e. a database of random proteins generated from the amino acid frequencies in the source FASTA file)
  • to create a decoy database, use DBToolkit (free, standalone Java application)
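
The four decoy creation methods listed above can be sketched in a few lines of Python. This is only an illustration, not how DBToolkit or Sorcerer actually do it: all function names and the "DECOY_" prefix are made up for the example, the trypsin rule is simplified (cut after K or R unless followed by P), and the pseudo-reversal keeps the last residue of every peptide in place, which is an assumption. For the C-terminal peptide, which has no K/R, this gives TYPDFAGGAL and not the ALTYPDFAGG shown in the example above.

```python
import random
import re
from collections import Counter


def reverse_protein(seq):
    """Protein reversal: read the whole sequence backwards."""
    return seq[::-1]


def tryptic_peptides(seq):
    """Simplified trypsin digest: cut after K or R unless followed by P."""
    return re.findall(r".*?[KR](?!P)|.+$", seq)


def pseudo_reverse(seq):
    """Reverse each tryptic peptide, leaving its last residue in place."""
    return "".join(p[:-1][::-1] + p[-1] for p in tryptic_peptides(seq))


def shuffle_protein(seq, rng=random):
    """Shuffled decoy: same residues in random order (differs on each call)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def residue_frequencies(sequences):
    """Count amino acids over all sequences of the source FASTA file."""
    return Counter("".join(sequences))


def random_protein(length, frequencies, rng=random):
    """Random decoy: draw residues according to the observed frequencies."""
    residues = sorted(frequencies)
    weights = [frequencies[r] for r in residues]
    return "".join(rng.choices(residues, weights=weights, k=length))


def target_decoy(records, make_decoy=reverse_protein, prefix="DECOY_"):
    """Concatenate (name, sequence) targets with their decoys.

    The prefix is a hypothetical naming convention for decoy entries.
    """
    decoys = [(prefix + name, make_decoy(seq)) for name, seq in records]
    return list(records) + decoys


protein = "MEGGAYGAGKAGGAFDPYTL"
print(reverse_protein(protein))   # LTYPDFAGGAKGAGYAGGEM
print(pseudo_reverse(protein))    # GAGYAGGEMKTYPDFAGGAL
print(shuffle_protein(protein, random.Random(0)))
print(random_protein(20, residue_frequencies([protein]), random.Random(0)))
```

Passing a seeded random.Random instance makes the shuffled and random decoys reproducible, which is convenient when the same database has to be rebuilt later.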

de novo sequence determination algorithms

  • PepNovo (PDF)
  • Sherenga (PDF)
  • Peaks (PDF)
  • Lutefisk web

Spectral matching

  • P3 (server) from Global Proteomics Machine (free)
  • SpectraST from ISB, Seattle (not as many species/options as P3)
  • BiblioSpec from MacCoss lab. (free for non-profit, online licence)
    • command line only


Web sites

  • UCSD (Pevzner)
  • U. of Washington (MacCoss)
  • Proteome Commons collection of tools & links

Reviews

For a good review of programs and aspects of protein identification by mass spectrometry see:

  • Hernandez et al. 2006 (HTML)
  • Palagi et al. 2006 (PDF)
  • Shadforth et al. 2005 (PDF)


Other tools to be sorted out

  • massSorter [3]
  • ProteinProspector [4]
  • Sonar [5]
  • DeNovoID web
  • SPIDER (PDF) de novo + homology search in other species
  • OpenSea (HTML) Java program available from authors
  • MSQuant parser of Mascot results for quantitation
  • ModifiComb (HTML) (available from authors?)
  • MODi [6] web server for PTMs discovery
  • UNIMOD [7] database of PTMs
  • SILVER view your spectra with LOD scores