Three de novo sequencing programs (Novor, PEAKS and PepNovo+) were used to identify the 48 individual human proteins that constitute the Universal Proteomics Standard Set 2 (UPS2; Sigma-Aldrich, USA). Experimental data were obtained by tandem mass spectrometry (MS/MS) of pure UPS2 and of UPS2 mixed with E. coli extract and with human plasma samples. A protein was considered detected if at least two peptides of 9 residues in length, or one peptide containing at least 13 residues, were identified. By these criteria, 13 (Novor), 20 (PEAKS) and 11 (PepNovo+) proteins were detected in the pure UPS2 sample. Protein identification in the mixed samples was comparable or worse. Better results (by ~20%) were obtained with predictions that included a high-quality identified fragment (TAG) of at least 7 residues together with unidentified additional masses at the N- and C-termini (PepNovo+). This approach confidently recognized mass-spectrometric artefacts (and probably PTMs). Atypical mass changes absent from the UNIMOD database were found (PepNovo+) to be statistically significant at the C-terminus (+23.02, +26.04 and +27.03). Using peptides containing these modifications and a milder detection threshold, 41 of the 48 UPS2 proteins were identified.
Keywords: shotgun proteomics, de novo sequencing, mass spectrometry, data processing
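
The protein detection rule stated in the abstract (two peptides of 9 residues, or one peptide of at least 13 residues) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input format (pairs of protein identifier and peptide sequence), and the reading of "9 residues" as "at least 9 residues" are assumptions made here for illustration only.

```python
from collections import defaultdict

# Thresholds taken from the abstract; reading "9 residues" as a minimum
# length is an assumption of this sketch.
MIN_PAIR_LEN = 9      # each of the two peptides must reach this length
MIN_SINGLE_LEN = 13   # a single peptide of this length is sufficient


def detected_proteins(peptide_hits):
    """Return the set of proteins that satisfy the detection rule.

    peptide_hits: iterable of (protein_id, peptide_sequence) pairs
    (a hypothetical input format, not taken from the paper).
    """
    # Collect distinct peptide sequences per protein.
    by_protein = defaultdict(set)
    for protein_id, peptide in peptide_hits:
        by_protein[protein_id].add(peptide)

    detected = set()
    for protein_id, peptides in by_protein.items():
        n_pair = sum(1 for p in peptides if len(p) >= MIN_PAIR_LEN)
        has_long = any(len(p) >= MIN_SINGLE_LEN for p in peptides)
        if n_pair >= 2 or has_long:
            detected.add(protein_id)
    return detected
```

Duplicate peptide sequences are counted once per protein, which is a design choice of this sketch rather than something the abstract specifies.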