The Single Chain HM API creates homology models of single chain proteins using the Rosetta hybridize homology modeling method. By default, the Rosetta hybridize method automatically identifies templates for modeling; however, a custom template can be specified by the user. If the input sequence has cysteine residues capable of forming disulfide bonds, and those residues form disulfide bonds in the selected templates, Rosetta will attempt to maintain the disulfides in the homology model. Additionally, pairs of cysteine residues can be explicitly marked as disulfides by the user (see Notes for more information).
Proteins of 20 to 500 residues are optimal for the single chain HM protocol. The API is limited to proteins of fewer than 1000 residues. Alternative modeling approaches are better suited to short peptides or very large structures.
This tool is a modification of the HM tool in Cyrus Bench; it has added hooks for the disulfide list. It should otherwise perform very similarly to that tool.
Quickstart
Command Line Examples
Create a homology model for an input sequence:
Code Block |
---|
cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG |
Create a homology model for an input sequence, using template.pdb as a custom template:
Code Block |
---|
cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG --template template.pdb |
Create a homology model for an input sequence, with disulfides fixed between residues 3 and 21 and residues 10 and 24:
Code Block |
---|
cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG --disulfide-list 3:21,10:24 |
...
Code Block |
---|
from engine.single_chain_hm.client import SingleChainHmClient |
client = SingleChainHmClient() |
job_id = client.submit(sequence="SECVENGGFCPDPEKMGDWCCGRCIRNECRNG", disulfide_list=[(3,21), (10,24)]) |
Inputs
--sequence
Input protein sequence.
CLI argument:
--sequence SECVECGGFCPDPEKMGDWCCGRCIRNECRCG
Python submit() argument:
sequence="SECVECGGFCPDPEKMGDWCCGRCIRNECRCG"
Only canonical amino acids are supported.
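A sequence can be checked locally before submission. The sketch below is not part of the Cyrus client; `check_sequence` and its constants are hypothetical names, and it assumes the sequence is given as uppercase one-letter codes. The length thresholds come from the limits stated on this page (fewer than 1000 residues allowed, 20 to 500 optimal).

```python
# Hypothetical pre-submission check; not part of the Cyrus client library.
CANONICAL_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")
API_LENGTH_LIMIT = 1000          # the API accepts fewer than 1000 residues
OPTIMAL_MIN, OPTIMAL_MAX = 20, 500  # size range described as optimal


def check_sequence(sequence):
    """Return (errors, warnings) for a one-letter amino acid sequence."""
    errors = []
    warnings = []
    if not sequence:
        errors.append("sequence is empty")
    # Any character outside the 20 canonical one-letter codes is rejected.
    unsupported = sorted(set(sequence) - CANONICAL_AMINO_ACIDS)
    if unsupported:
        errors.append("unsupported residue codes: " + ", ".join(unsupported))
    if len(sequence) >= API_LENGTH_LIMIT:
        errors.append(f"sequence has {len(sequence)} residues; limit is below {API_LENGTH_LIMIT}")
    elif sequence and not OPTIMAL_MIN <= len(sequence) <= OPTIMAL_MAX:
        warnings.append(
            f"sequence has {len(sequence)} residues; "
            f"{OPTIMAL_MIN}-{OPTIMAL_MAX} is the optimal range"
        )
    return errors, warnings


if __name__ == "__main__":
    print(check_sequence("SECVECGGFCPDPEKMGDWCCGRCIRNECRCG"))
```

Errors indicate input the API is documented not to support; warnings only flag sizes outside the optimal range.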
Options
--disulfide-list
A list of residue number pairs which should form disulfide bonds (See Notes for more details).
CLI argument:
--disulfide-list "1:10,3:5"
Python submit() argument:
disulfide_list=[(1,10), (3,5)]
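The CLI and Python forms of the disulfide list carry the same information. As a minimal sketch, the two hypothetical helpers below (not part of the Cyrus client) convert between the comma-separated `first:second` string used on the command line and the list of tuples passed to submit().

```python
# Hypothetical helpers for converting between the two disulfide list forms.
def parse_disulfide_list(text):
    """Turn a CLI-style string such as "1:10,3:5" into [(1, 10), (3, 5)]."""
    pairs = []
    for item in text.split(","):
        # Each item must contain exactly one colon; otherwise ValueError is raised.
        first, second = item.strip().split(":")
        pairs.append((int(first), int(second)))
    return pairs


def format_disulfide_list(pairs):
    """Turn [(3, 21), (10, 24)] into the CLI-style string "3:21,10:24"."""
    return ",".join(f"{first}:{second}" for first, second in pairs)


if __name__ == "__main__":
    print(parse_disulfide_list("1:10,3:5"))
    print(format_disulfide_list([(3, 21), (10, 24)]))
```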
--template
A PDB file to be used as a custom template. The input sequence will be threaded along this template.
CLI argument:
--template template.pdb
Python submit() argument:
template="template.pdb"
The template structure should have high sequence homology to the input sequence.
Outputs
...
models
(directory) 5 PDB files representing the centers of the top-scoring clusters of models generated during the homology modeling process.
score.sc
The Rosetta scores associated with the 5 cluster centers.
Notes
Defining disulfides in Single Chain HM
If the input sequence has cysteine residues capable of forming disulfide bonds, and those residues form disulfide bonds in the selected templates, Rosetta will attempt to maintain the disulfides in the homology model. Additionally, pairs of cysteine residues can be explicitly marked as disulfides by the user.
Explicit disulfides with --disulfide-list
The numbering scheme is against the input sequence (the first residue is 1).
Each residue should be referenced in only one disulfide pair.
Each residue should be within range for the submitted sequence, and already a cysteine in that sequence.
Residues should be in a location such that they can reasonably participate in a disulfide bond without major movement of the protein backbone.
Rosetta will attempt to satisfy these disulfides but is not guaranteed to form them.
Other disulfides may be formed depending on the protein sequence and template selection.
Residues in the disulfide list should not already be in competing disulfides. For example, if the templates have a disulfide from 10 to 30, do not submit 10-15 as a disulfide pair.
You may explicitly list disulfides you expect to form that are already in your templates as a reinforcement of the desired patterns. This will increase the occurrence of those disulfides in your results if Rosetta finds the disulfides marginal.
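The mechanical rules above (1-based numbering, residues in range, residues already cysteine, each residue in only one pair) can be checked before a job is submitted. The following is a minimal sketch with a hypothetical helper name, `validate_disulfide_list`; it is not part of the Cyrus client, and it cannot check backbone geometry or conflicts with disulfides present in the templates.

```python
# Hypothetical pre-submission check of a disulfide list against a sequence.
def validate_disulfide_list(sequence, disulfide_list):
    """Return a list of problem descriptions; an empty list means no problems found."""
    problems = []
    seen = {}  # residue number -> first pair that referenced it
    for pair in disulfide_list:
        first, second = pair
        if first == second:
            problems.append(f"pair {pair} bonds residue {first} to itself")
        # dict.fromkeys removes a repeated position while keeping order.
        for position in dict.fromkeys(pair):
            # Numbering is 1-based against the input sequence.
            if not 1 <= position <= len(sequence):
                problems.append(
                    f"residue {position} in pair {pair} is outside 1..{len(sequence)}"
                )
            elif sequence[position - 1] != "C":
                problems.append(
                    f"residue {position} in pair {pair} is "
                    f"{sequence[position - 1]}, not cysteine"
                )
            if position in seen:
                problems.append(
                    f"residue {position} appears in both {seen[position]} and {pair}"
                )
            else:
                seen[position] = pair
    return problems


if __name__ == "__main__":
    sequence = "SECVENGGFCPDPEKMGDWCCGRCIRNECRNG"
    for problem in validate_disulfide_list(sequence, [(3, 21), (10, 24)]):
        print(problem)
```

Run against the Quickstart sequence with pairs (3,21) and (10,24), the check reports no problems. The pairs 1:10 and 3:5 used as a format example in the Options section would be flagged for the example sequence there, because residues 1 and 5 of that sequence are not cysteines.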
Output File interpretation
...
If there are 5 distinct predictions, it may mean that the default sampling is insufficient, or that this particular problem is harder than this API is able to accommodate. In that case, please contact us and we can discuss other options for this type of modeling problem.