Sequence PTM Prediction
Sequence-based predictions follow industry-standard practice by flagging known PTM motifs. Input sequences are scanned for canonical deamidation, isomerization, and glycosylation motifs.
Deamidation and isomerization motifs follow a known hierarchy of reactivity, so each hit is weighted accordingly to give a relative score for how likely the modification is to occur. Oxidation is excluded from sequence-based prediction because it depends on structural parameters that a sequence alone does not provide.
The final report returns these metrics along with residue mappings for identified motifs.
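The scan described above can be sketched in a few lines of Python. This is a minimal illustration and not the Cyrus implementation: the motif tables cover only the most commonly cited dipeptides, the numeric weights are hypothetical placeholders that only preserve the usual ordering (NG > NS > NT, DG > DS > DT), and the names `scan_sequence` and `MotifHit` are invented for this example. Glycosylation hits use the N-X-S/T sequon (X is any residue except Pro) and carry no score, since the hierarchy applies only to deamidation and isomerization.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical weights: only the ordering reflects the commonly cited
# motif hierarchy; the numeric values are placeholders for illustration.
DEAMIDATION_WEIGHTS = {"NG": 1.0, "NS": 0.5, "NT": 0.3}
ISOMERIZATION_WEIGHTS = {"DG": 1.0, "DS": 0.5, "DT": 0.3}

# N-linked glycosylation sequon: N, any residue except P, then S or T.
# The lookahead lets overlapping sequons be reported separately.
GLYCOSYLATION_SEQUON = re.compile(r"(?=N[^P][ST])")


class MotifHit(NamedTuple):
    ptm: str
    position: int           # 1-based position of the modified residue
    motif: str
    score: Optional[float]  # None where no hierarchy weight applies


def scan_sequence(sequence: str) -> List[MotifHit]:
    """Return motif hits for one protein sequence, ordered by position."""
    seq = sequence.upper()
    hits: List[MotifHit] = []

    # Dipeptide motifs: the first residue (N or D) is the liable one.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in DEAMIDATION_WEIGHTS:
            hits.append(MotifHit("deamidation", i + 1, pair,
                                 DEAMIDATION_WEIGHTS[pair]))
        elif pair in ISOMERIZATION_WEIGHTS:
            hits.append(MotifHit("isomerization", i + 1, pair,
                                 ISOMERIZATION_WEIGHTS[pair]))

    # Glycosylation sequons are flagged without a score.
    for match in GLYCOSYLATION_SEQUON.finditer(seq):
        start = match.start()
        hits.append(MotifHit("glycosylation", start + 1,
                             seq[start:start + 3], None))

    return sorted(hits, key=lambda hit: hit.position)


if __name__ == "__main__":
    for hit in scan_sequence("ANGSDGKNPTNAS"):
        print(hit)
```

Note that a single residue can be reported under more than one PTM: in the example sequence, the Asn at position 2 is both an NG deamidation site and the start of an NGS sequon.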
Structural PTM Prediction
Cyrus predictive models for Asn deamidation and Met photooxidation match the performance reported in the literature.
Baseline model performance was reproduced and benchmarked using available supplemental training and validation data.
Alternative feature sets were tested against the baseline models to better identify important predictors during feature selection.
Rosetta-generated structural data was substituted into the training and test data using available crystal structures [Deamidation only].
Feature sets were reduced based on the top predictors described in the literature and observed during model optimization [Deamidation and oxidation].
Running PTM Prediction
PTM prediction can be run with the following command:
cyrus run ptm-prediction input.fasta
The job takes either a FASTA or a PDB file as input.
Flags:
"--offset" adjust output residue numbering from pose numbering to the original numbering scheme
"--raw" output raw prediction data for each PTM
FASTA inputs return a report of motif hits, each scored for degradation propensity according to the motif hierarchy.
Structural inputs return the positions of liable residues and map them onto the structure via a PyMOL script.
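To illustrate what mapping residues onto a structure with a PyMOL script can look like, the sketch below writes a short script that loads a structure and highlights a list of residue positions. It is an assumption-laden example, not the script the job actually produces: the function name `build_pymol_script`, the selection name `ptm_liabilities`, and the colour choices are all hypothetical, and the residue numbers are assumed to already match the numbering in the structure file. Only standard PyMOL commands (`load`, `hide`, `show`, `color`, `select`, `zoom`) are used.

```python
from typing import Iterable


def build_pymol_script(structure_path: str,
                       chain: str,
                       residues: Iterable[int],
                       selection_name: str = "ptm_liabilities") -> str:
    """Return the text of a PyMOL script that highlights the given residues.

    All names here are hypothetical; residue numbers are assumed to use
    the same numbering as the structure file.
    """
    unique = sorted(set(residues))
    if not unique:
        raise ValueError("at least one residue position is required")

    # PyMOL accepts several residue numbers joined with '+', e.g. 30+55.
    resi = "+".join(str(number) for number in unique)

    lines = [
        f"load {structure_path}",
        "hide everything",
        "show cartoon",
        "color grey80",
        f"select {selection_name}, chain {chain} and resi {resi}",
        f"show sticks, {selection_name}",
        f"color red, {selection_name}",
        f"zoom {selection_name}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script; save it as a .pml file to open it in PyMOL.
    print(build_pymol_script("model.pdb", "A", [55, 30, 55]))
```

The resulting text can be saved with a `.pml` extension and opened in PyMOL, which runs the commands in order.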