The Single Chain HM API creates homology models of single chain proteins using the Rosetta hybridize homology modeling method.  By default, the Rosetta hybridize method will automatically identify templates for modeling; however, a custom template can be specified by the user.  If the input sequence has cysteine residues capable of forming disulfide bonds, and those residues form disulfide bonds in the selected templates, Rosetta will attempt to maintain the disulfides in the homology model.  Additionally, pairs of cysteine residues can be explicitly marked as disulfides by the user.

Proteins of 20-500 residues are optimal for the single chain HM protocol. The API is limited to proteins of fewer than 1000 residues.  Alternative modeling approaches are better suited for modeling short peptides or very large structures.

This tool is a modification of the HM tool in Cyrus Bench, with added parameters for the disulfide list (see Notes for more information).  It should otherwise perform very similarly to that tool.

Table of Contents

Quickstart

Command Line Examples

Create a homology model for an input sequence:

Code Block
cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG

Create a homology model for an input sequence, using template.pdb as a custom template:

Code Block
cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG --template template.pdb

Create a homology model for an input sequence, with disulfides fixed between residues 3 and 21 and residues 10 and 24:

Code Block
cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG --disulfide-list 3:21,10:24

...

Code Block
from engine.single_chain_hm.client import SingleChainHmClient

client = SingleChainHmClient()
job_id = client.submit(sequence="SECVENGGFCPDPEKMGDWCCGRCIRNECRNG", disulfide_list=[(3,21), (10,24)])

Inputs

  • --sequence

    • Input protein sequence

    • CLI argument: --sequence SECVECGGFCPDPEKMGDWCCGRCIRNECRCG

    • Python submit() argument: sequence="SECVECGGFCPDPEKMGDWCCGRCIRNECRCG"

    • Only canonical amino acids are supported
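
The sequence constraints described on this page (canonical amino acids only, fewer than 1000 residues, 20-500 residues optimal) can be checked locally before submitting. The following is a minimal sketch, not part of the API: `check_sequence` is a hypothetical helper, and it assumes uppercase one-letter codes, since how the API treats lowercase letters is not stated here.

```python
# The 20 canonical amino acids as one-letter codes.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")


def check_sequence(sequence):
    """Return (errors, warnings) for a candidate input sequence.

    Hypothetical pre-submission check; the API may apply different
    or additional validation.
    """
    errors, warnings = [], []

    # Assumption: only uppercase one-letter codes count as canonical.
    for position, letter in enumerate(sequence, start=1):
        if letter not in CANONICAL:
            errors.append(
                f"position {position}: {letter!r} is not a canonical amino acid"
            )

    length = len(sequence)
    if length >= 1000:
        # The API is described as limited to fewer than 1000 residues.
        errors.append(f"{length} residues is not below the 1000 residue limit")
    elif length < 20 or length > 500:
        # Allowed, but outside the range described as optimal.
        warnings.append(f"{length} residues is outside the optimal 20-500 range")

    return errors, warnings
```

An empty `errors` list only means the sequence passes these basic checks; it says nothing about how well the protocol will model it.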

Options

  • --disulfide-list

    • A list of residue number pairs which should form disulfide bonds (See Notes for more details).

    • CLI argument: --disulfide-list "3:21,10:24"

    • Python submit() argument: disulfide_list=[(3,21), (10,24)]

  • --template

    • A PDB file to be used as a custom template.  The input sequence will be threaded along this template.

    • CLI argument: --template template.pdb

    • Python submit() argument: template="template.pdb"

    • The template structure should have high sequence homology to the input sequence.
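
To get a rough sense of whether a custom template resembles the input sequence, one option is to read the template's residue sequence from its PDB ATOM records and compare it with the input. The sketch below is only a crude screen under stated assumptions: `sequence_from_pdb` and `rough_similarity` are hypothetical helpers, the file is assumed to follow the standard fixed-column PDB layout, and `difflib.SequenceMatcher` is a generic string comparison, not a sequence alignment. A proper alignment tool would give a more meaningful identity value.

```python
from difflib import SequenceMatcher

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_pdb(pdb_text, chain=None):
    """Build a one-letter sequence from the CA atoms of a PDB file.

    Non-canonical residue names are reported as 'X'. Only the first
    model is read. If chain is None, all chains are concatenated.
    """
    letters = []
    seen = set()
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # stop after the first model
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        if line[12:16].strip() != "CA":  # columns 13-16: atom name
            continue
        chain_id = line[21]  # column 22: chain identifier
        if chain is not None and chain_id != chain:
            continue
        key = (chain_id, line[22:27])  # residue number plus insertion code
        if key in seen:  # skip alternate locations of the same residue
            continue
        seen.add(key)
        letters.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(letters)


def rough_similarity(sequence_a, sequence_b):
    """Return a 0-1 string similarity ratio (not an alignment identity)."""
    return SequenceMatcher(None, sequence_a, sequence_b, autojunk=False).ratio()
```

For example, `rough_similarity(input_sequence, sequence_from_pdb(open("template.pdb").read()))` gives a single number to eyeball; a low value suggests the template deserves a closer look before submission.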

Outputs

...

  • models (directory)

    • 5 PDB files representing the centers of the top-scoring clusters of models generated during the homology modeling process.  

  • score.sc

    • The Rosetta scores associated with the 5 cluster centers.
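
The score file can be loaded into Python for comparison across the cluster centers. This sketch assumes score.sc follows the usual Rosetta score file layout, where each relevant line starts with `SCORE:`, the first such line holds the column names, and a `total_score` column is present; none of this is confirmed on this page, so check the actual file first. `parse_score_file` and `rank_by` are hypothetical helper names.

```python
def parse_score_file(text):
    """Parse Rosetta-style score file text into a list of dict rows.

    Assumes lines of interest begin with 'SCORE:' and that the first
    such line is the header naming the columns.
    """
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # e.g. a leading 'SEQUENCE:' line
        fields = line.split()[1:]
        if header is None:
            header = fields
            continue
        if fields == header or len(fields) != len(header):
            continue  # repeated header or malformed row
        row = {}
        for name, value in zip(header, fields):
            if name == "description":
                row[name] = value  # keep the model name as text
                continue
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value
        rows.append(row)
    return rows


def rank_by(rows, column="total_score"):
    """Sort rows in ascending order of a numeric column."""
    return sorted(rows, key=lambda row: row[column])
```

Rosetta energies are conventionally read as lower being more favorable, so `rank_by(rows)[0]` would be the best-scoring cluster center under that convention.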

Notes

Defining disulfides in Single Chain HM

If the input sequence has cysteine residues capable of forming disulfide bonds, and those residues form disulfide bonds in the selected templates, Rosetta will attempt to maintain the disulfides in the homology model.  Additionally, pairs of cysteine residues can be explicitly marked as disulfides by the user.

Explicit disulfides with --disulfide-list

  • The numbering scheme is against the input sequence (the first residue is 1).

  • Each residue should only be referenced in one disulfide pair.

  • Each residue should be within range for the submitted sequence, and already a cysteine in that sequence.

  • Residues should be in a location such that they can reasonably participate in a disulfide bond without major movement of the protein backbone.

  • Rosetta will attempt to satisfy these disulfides but is not guaranteed to form them.

  • Other disulfides may be formed depending on the protein sequence and template selection.

  • Residues in the disulfide list should not already be in competing disulfides.  For example, if the templates have a disulfide from 10 to 30, do not submit 10:15 as a disulfide pair.

  • You may explicitly list disulfides you expect to form that are already in your templates as a reinforcement of the desired patterns.  This will increase the occurrence of those disulfides in your results if Rosetta finds the disulfides marginal.
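
The sequence-level rules above (1-based numbering, residues in range, each residue a cysteine, each residue used in only one pair) can be checked mechanically before submission. The following is a minimal sketch with hypothetical helper names; it cannot check the structural rules, such as backbone geometry or competing disulfides in the templates.

```python
def check_disulfide_list(sequence, disulfide_list):
    """Return a list of problems found in a list of (i, j) residue pairs.

    Residue numbers are 1-based against the input sequence.
    An empty list means no sequence-level problems were found.
    """
    problems = []
    used = set()
    for first, second in disulfide_list:
        for position in (first, second):
            if not 1 <= position <= len(sequence):
                problems.append(
                    f"residue {position} is outside the sequence (1-{len(sequence)})"
                )
            elif sequence[position - 1] != "C":
                problems.append(
                    f"residue {position} is {sequence[position - 1]}, not cysteine"
                )
            if position in used:
                problems.append(f"residue {position} is referenced more than once")
            used.add(position)
    return problems


def to_cli_argument(disulfide_list):
    """Format pairs in the 'i:j,k:l' form used by --disulfide-list."""
    return ",".join(f"{first}:{second}" for first, second in disulfide_list)
```

With the Quickstart sequence, `check_disulfide_list("SECVENGGFCPDPEKMGDWCCGRCIRNECRNG", [(3, 21), (10, 24)])` finds no problems, since residues 3, 10, 21 and 24 are all cysteines, and `to_cli_argument([(3, 21), (10, 24)])` produces the string used in the command line example.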

Output File interpretation

...

If there are 5 distinct predictions, it may mean that the default sampling is insufficient, or that this particular problem is harder than this API is able to accommodate. Please contact us and we can discuss other options for this type of modeling problem.