CaMPDB :: Substrate


Enter your query sequence in FASTA-style format:

Select Prediction model:

Usage »

This Bayesian model requires the secondary structure prediction and its reliability scores from JPRED 4. If you already have both, enter them in the "Jnet" and "Jnet Rel" fields, respectively. Otherwise, follow the steps below to obtain them.

Enter your query sequence in the top text box.
(Optional) Enter your e-mail address in the 'E-mail to' text box. The JPRED 4 calculation may take several minutes or more. If you enter an e-mail address, a hyperlink will be sent to it so that you can return to this site with your query sequence and JPRED results even after closing the web browser.
(Optional) Click the 'E-mail Test' button to check that the e-mail address is correct. The test message may take several minutes to arrive.
Click the 'Get JPRED4 Prediction' button.
When the JPRED4 calculation finishes, the secondary structure prediction and its reliability scores are entered into the text boxes automatically.
Click the 'Submit' button.

E-mail to:

Only sequences of fewer than 800 residues are accepted.
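Before submitting, it can help to check a query locally against the two constraints stated on this page: FASTA-style input and a length below 800 residues. The following is a minimal sketch, not the site's own validation code; the function names `parse_fasta` and `check_query` are hypothetical, and it assumes a single record with an optional `>` header line.

```python
MAX_RESIDUES = 800  # the page states that the sequence must be shorter than this


def parse_fasta(text):
    """Return (header, sequence) for a single FASTA-style record."""
    header = ""
    seen_header = False
    residues = []
    for raw in text.strip().splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A second header, or a header after residues, means more than one record.
            if seen_header or residues:
                raise ValueError("expected a single sequence record")
            seen_header = True
            header = line[1:].strip()
        else:
            # Drop internal whitespace and normalise to upper case.
            residues.append("".join(line.split()).upper())
    return header, "".join(residues)


def check_query(text):
    """Parse the query and enforce the length limit (assumed to be strict)."""
    header, sequence = parse_fasta(text)
    if not sequence:
        raise ValueError("no residues found")
    if len(sequence) >= MAX_RESIDUES:
        raise ValueError(
            f"sequence has {len(sequence)} residues; must be fewer than {MAX_RESIDUES}"
        )
    return header, sequence
```

For example, `check_query(">example\nMKV\nLAA")` returns the header `example` and the joined sequence `MKVLAA`.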

Jnet - Final secondary structure prediction for query:
Jnet Rel - Jnet reliability of prediction accuracy:
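If you paste the "Jnet" and "Jnet Rel" lines by hand, a quick consistency check can catch copy errors before submission. The sketch below rests on assumptions that this page does not state: that both lines have one character per residue, that the prediction uses the characters `H`, `E` and `-`, and that reliability is a single digit from 0 to 9. The function name `check_jnet` is hypothetical.

```python
ALLOWED_STATES = set("HE-")        # assumed prediction alphabet: helix, strand, other
ALLOWED_RELIABILITY = set("0123456789")  # assumed one digit per residue


def check_jnet(sequence, jnet, jnet_rel):
    """Return a list of (residue, state, reliability) after basic sanity checks."""
    if not (len(sequence) == len(jnet) == len(jnet_rel)):
        raise ValueError(
            "length mismatch: sequence=%d, Jnet=%d, Jnet Rel=%d"
            % (len(sequence), len(jnet), len(jnet_rel))
        )
    bad_states = set(jnet) - ALLOWED_STATES
    if bad_states:
        raise ValueError("unexpected Jnet characters: %s" % sorted(bad_states))
    bad_rel = set(jnet_rel) - ALLOWED_RELIABILITY
    if bad_rel:
        raise ValueError("unexpected Jnet Rel characters: %s" % sorted(bad_rel))
    # Pair each residue with its predicted state and integer reliability.
    return [(aa, ss, int(rel)) for aa, ss, rel in zip(sequence, jnet, jnet_rel)]
```

Calling `check_jnet("MKVL", "-HH-", "9870")` gives one tuple per residue, which makes a shifted or truncated line easy to spot.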

About prediction models »

Below are AUC scores for our Calpacchopper (Bayesian model) and GPS (*), evaluated on a set of 20-mer sequences comprising 210 curated cleaved sequences from the literature that were not used for training either Calpacchopper or GPS. Their reversed sequences were used as negative samples (420 sequences in total):

Calpacchopper: 0.667
GPS: 0.605

(*) GPS-CCD is an on-line predictor for calpain cleavage sites. http://ccd.biocuckoo.org/
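The benchmark design described above (each cleaved 20-mer as a positive, its reversed sequence as a negative, and AUC as the metric) can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code behind the reported values: `make_benchmark` and `auc` are hypothetical names, and the predictor scores are whatever a model produces for each sequence.

```python
def make_benchmark(cleaved_sequences):
    """Label each cleaved sequence 1 and its reversed copy 0."""
    positives = [(seq, 1) for seq in cleaved_sequences]
    negatives = [(seq[::-1], 0) for seq in cleaved_sequences]
    return positives + negatives


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))
```

With 210 cleaved sequences, `make_benchmark` yields the 420 labelled samples mentioned above. An AUC of 0.5 corresponds to random ranking, which gives a baseline for reading the two scores listed.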

Below are AUC scores for each model, evaluated using 10x10 cross-validation on a curated dataset of 90 substrate sequences (220 cleavage sites). Please note that these values are not comparable to the AUC values above:

MKL: 0.834 (SEM: 0.0054)
SVM RBF: 0.769 (SEM: 0.011)
SVM Linear: < 0.71
PSSM: < 0.69
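For readers unfamiliar with the notation, the sketch below shows one common reading of "10x10 cross-validation" and of the SEM figures: 10 repetitions of 10-fold cross-validation, with the standard error of the mean taken over a list of AUC values. Both readings are assumptions, as is splitting at the level of substrate sequences; the page does not specify these details, and the names `repeated_kfold` and `sem` are hypothetical.

```python
import math
import random
import statistics


def repeated_kfold(n_items, k=10, repeats=10, seed=0):
    """Yield (repetition, fold, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        order = list(range(n_items))
        rng.shuffle(order)  # a fresh random partition for every repetition
        for fold in range(k):
            test = order[fold::k]  # every k-th item, starting at this fold's offset
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield rep, fold, train, test


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))
```

With 90 items this yields 100 train/test splits of 81 and 9 items each. Passing the resulting AUC values to `sem` gives a figure of the same kind as those reported above, although the exact aggregation used for them is not stated here.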

Although the MKL predictor tends to produce the best results, it takes considerably longer to run because the secondary structure of the input sequence must be predicted as a preliminary step.

References »

Fumiko Shinkai-Ouchi, Suguru Koyama, Yasuko Ono, Shoji Hata, Koichi Ojima, Mayumi Shindo, David duVerle, Mika Ueno, Fujiko Kitamura, Naoko Doi, Ichigaku Takigawa, Hiroshi Mamitsuka and Hiroyuki Sorimachi; Predictions of Cleavability of Calpain Proteolysis by Quantitative Structure-Activity Relationship Analysis Using Newly Determined Cleavage Sites and Catalytic Efficiencies of an Oligopeptide Array. Mol. Cell. Proteomics, 2016, 15, 1262-1280. (Bayesian model)
David A. duVerle, Yasuko Ono, Hiroyuki Sorimachi, Hiroshi Mamitsuka; Calpain Cleavage Prediction Using Multiple Kernel Learning. PLoS ONE, 2011, 6(5), e19035 (SVM, PSSM, and MKL model)