Pride Wizard: generation of standards-compliant quantitative proteomics data
Author(s): Jenny Siepen, Neil Swainston, Andy Jones, Sarah Hart, Henning Hermjakob, Phil Jones, Simon Hubbard
Affiliations: University of Manchester, Manchester, UK. EMBL Outstation, European Bioinformatics Institute, Hinxton, Cambridge, UK.
Keywords: mass spectrometry, proteomics, iTRAQ, PRIDE
The introduction of the Proteomics Identifications Database (PRIDE) provided the proteomics community with a standards-compliant repository for proteomics data. PRIDE implements standards put forward by the Proteomics Standards Initiative (PSI), including mzData.
Many commonly used proteomics software packages do not currently support these standards, so formatting data to adhere to the PRIDE schema requires writing custom data parsers. To address this, the Pride Wizard has been introduced to perform the transformation steps required to convert commonly used file formats into documents that adhere to the PRIDE schema. Examples of such formats include .mgf files for peak lists and Mascot .dat files for peptide and protein identifications.
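To illustrate the kind of parsing step such a conversion involves, the following is a minimal Python sketch that reads spectra from Mascot Generic Format (.mgf) text. It is not the Pride Wizard's own code: the Spectrum class and the parse_mgf function are hypothetical names chosen for this example, the sample spectrum is invented, and the sketch assumes the usual BEGIN IONS / END IONS layout with KEY=value parameter lines followed by "m/z intensity" peak lines.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Spectrum:
    """One MS/MS spectrum read from an MGF file (hypothetical container)."""
    params: Dict[str, str] = field(default_factory=dict)
    peaks: List[Tuple[float, float]] = field(default_factory=list)


def parse_mgf(lines: Iterable[str]) -> List[Spectrum]:
    """Collect every BEGIN IONS ... END IONS block into a Spectrum."""
    spectra: List[Spectrum] = []
    current = None
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;!/":
            continue
        if line == "BEGIN IONS":
            current = Spectrum()
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is None:
            # Parameters outside a spectrum block are ignored in this sketch.
            continue
        elif "=" in line and not line[0].isdigit():
            # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
            key, value = line.split("=", 1)
            current.params[key.upper()] = value
        else:
            # Peak line: m/z followed by an optional intensity.
            fields = line.split()
            mz = float(fields[0])
            intensity = float(fields[1]) if len(fields) > 1 else 0.0
            current.peaks.append((mz, intensity))
    return spectra


# Invented example input, for illustration only.
example = """\
BEGIN IONS
TITLE=example spectrum
PEPMASS=500.25
CHARGE=2+
114.11 1200.0
115.11 950.5
END IONS
""".splitlines()

spectra = parse_mgf(example)
```

A real converter would then map each parsed spectrum, together with the matching identifications from the search engine output, onto the corresponding elements of the target schema. That mapping is omitted here because it depends on the schema details.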
The existing PRIDE schema has no provision for quantitative proteomics data. A new controlled vocabulary is introduced here to allow storage of quantitative iTRAQ data. It is envisaged that this vocabulary can be extended to allow submission of quantitative data from other labelling techniques.
The tool has been used to populate a PRIDE database with mass spectra and associated protein identifications and quantifications from three comprehensive sets of proteomics experiments.