Application of molecular descriptors for recognition of phosphorylation sites in amino acid sequences


1. Institute of Biomedical Chemistry, Moscow, Russia; Pirogov Russian National Research Medical University, Moscow, Russia
2. Institute of Biomedical Chemistry, Moscow, Russia
Type: OMICS-technologies
DOI: 10.18097/PBMC20176305423  PubMed Id: 29080875
Year: 2017  vol: 63  issue: 5  pages: 423-427
Abstract: Recognition of phosphorylation sites in proteins is required for reconstructing regulatory processes in living systems. The task is complicated because phosphorylation motifs in amino acid sequences are highly degenerate. To improve prediction, researchers often use additional descriptors intended to reflect the physicochemical features of the regions surrounding a site. We evaluated the soundness of this approach by using molecular descriptors (MNA) to represent peptide segments as structures. Comparative testing was performed with the prediction method PASS on two types of input data: sets of MNA descriptors, which represent peptides as chemical structures, and amino acid sequences written in one-letter code. Training sets were classified according to the established types of the enzymes (protein kinases) that modify the corresponding phosphorylation sites. The accuracy estimates obtained in validation differed significantly between substrate classes for both the letter and the molecular descriptors. With the letter description, prediction accuracy depended less on the length of the peptides in the training set, whereas with structural descriptors accuracy was determined by peptide size and descriptor characteristics (MNA levels). For different kinase families, maximal accuracy was achieved at different sizes of the molecular fragments covered by MNA descriptors of the corresponding levels. This evidently reflects structural differences in the surroundings of phosphorylation sites modified by different protein kinases. Molecular descriptors gave prediction results comparable with those obtained with the traditional letter representation. At higher accuracy rates, prediction accuracy depended less on the method used to describe the site-surrounding peptides. MNA descriptors can give better accuracy in cases where the letter description does not provide acceptable accuracy.
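The one-letter input type described in the abstract depends on cutting site-surrounding peptides of a chosen length out of a protein sequence. The following is a minimal sketch of that step only, under stated assumptions: the function name `site_windows`, the `-` padding character at the sequence termini, and the restriction of candidate sites to S, T and Y residues are illustrative choices, not details taken from the paper. It does not reproduce PASS or the MNA descriptors.

```python
from typing import Iterator, Tuple

# Assumption: candidate sites are limited to serine, threonine and tyrosine.
PHOSPHO_RESIDUES = frozenset("STY")


def site_windows(
    sequence: str, half_width: int, pad: str = "-"
) -> Iterator[Tuple[int, str]]:
    """Yield (1-based position, peptide) for every S/T/Y residue.

    Each peptide is 2 * half_width + 1 letters long and centred on the
    candidate site; positions beyond the termini are filled with `pad`.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    seq = sequence.upper()
    # Padding both ends lets every window be taken with one fixed-size slice.
    padded = pad * half_width + seq + pad * half_width
    for i, residue in enumerate(seq):
        if residue in PHOSPHO_RESIDUES:
            yield i + 1, padded[i : i + 2 * half_width + 1]


if __name__ == "__main__":
    # Toy sequence, not real data; vary half_width to change peptide length.
    for position, peptide in site_windows("MKRSPEYA", half_width=2):
        print(position, peptide)
```

Changing `half_width` gives peptides of different lengths, which is the parameter the abstract reports the letter description to be less sensitive to than the structural descriptors.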
Reference: Karasev D.A., Savosina P.I., Sobolev B.N., Filimonov D.A., Lagunin A.A., Application of molecular descriptors for recognition of phosphorylation sites in amino acid sequences, Biomeditsinskaya khimiya, 2017, vol: 63(5), 423-427.