The Single Chain HM API creates homology models of single chain proteins using the Rosetta hybridize method. By default, Rosetta hybridize will automatically identify templates to use for modeling; however, a custom template can be specified by the user.

The single chain HM protocol works best for proteins of 20 to 500 residues. The API is limited to proteins of fewer than 1000 residues. Alternative modeling approaches are better suited to short peptides or very large structures.

This tool is a modification of the HM tool in Bench, with added parameters for disulfide formation (see Notes for more information).

1 Quickstart
- 1.1 Command Line Examples
- 1.2 Python Examples
2 Inputs
3 Options
4 Outputs
5 Notes
- 5.1 Defining disulfides in Single Chain HM
  - 5.1.1 Explicit disulfides with --disulfide-list
- 5.2 Output File interpretation

Quickstart

Command Line Examples

Create a homology model for an input sequence:

cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG

Create a homology model for an input sequence, using template.pdb as a custom template:

cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG --template template.pdb

Create a homology model for an input sequence, with disulfides fixed between residues 3 and 21 and residues 10 and 24:

cyrus engine submit single-chain-hm SECVENGGFCPDPEKMGDWCCGRCIRNECRNG --disulfide-list 3:21,10:24

Python Examples

Create a homology model for an input sequence:

from engine.single_chain_hm.client import SingleChainHmClient

client = SingleChainHmClient()
job_id = client.submit(sequence="SECVENGGFCPDPEKMGDWCCGRCIRNECRNG")

Create a homology model for an input sequence, using template.pdb as a custom template:

from engine.single_chain_hm.client import SingleChainHmClient

client = SingleChainHmClient()
job_id = client.submit(sequence="SECVENGGFCPDPEKMGDWCCGRCIRNECRNG", template="template.pdb")

Create a homology model for an input sequence, with disulfides fixed between residues 3 and 21 and residues 10 and 24:

from engine.single_chain_hm.client import SingleChainHmClient

client = SingleChainHmClient()
job_id = client.submit(sequence="SECVENGGFCPDPEKMGDWCCGRCIRNECRNG", disulfide_list=[(3,21), (10,24)])

Inputs

--sequence
- Input protein sequence
- CLI argument: --sequence SECVECGGFCPDPEKMGDWCCGRCIRNECRCG
- Python submit() argument: sequence="SECVECGGFCPDPEKMGDWCCGRCIRNECRCG"
- Only canonical amino acids are supported
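
The sequence constraints described on this page (canonical amino acids only, fewer than 1000 residues, 20 to 500 residues optimal) can be checked locally before submitting a job. The following is a minimal sketch, not part of the Cyrus client: the function name check_sequence is hypothetical, and it assumes the sequence is given as uppercase one-letter codes.

```python
# Hypothetical pre-submission check; not part of the Cyrus client library.
# Assumes uppercase one-letter codes for the 20 canonical amino acids.
CANONICAL_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
OPTIMAL_MIN, OPTIMAL_MAX = 20, 500   # optimal size range stated in this document
HARD_LIMIT = 1000                    # API accepts fewer than 1000 residues


def check_sequence(sequence):
    """Return (errors, warnings) as two lists of human-readable strings."""
    errors, warnings = [], []
    if not sequence:
        errors.append("sequence is empty")
        return errors, warnings

    # Collect every distinct character that is not a canonical residue code.
    unsupported = sorted({aa for aa in sequence if aa not in CANONICAL_AMINO_ACIDS})
    if unsupported:
        errors.append("unsupported residue codes: " + ", ".join(unsupported))

    length = len(sequence)
    if length >= HARD_LIMIT:
        errors.append(f"{length} residues; the API requires fewer than {HARD_LIMIT}")
    elif not OPTIMAL_MIN <= length <= OPTIMAL_MAX:
        warnings.append(
            f"{length} residues; outside the optimal {OPTIMAL_MIN}-{OPTIMAL_MAX} range"
        )
    return errors, warnings
```

An empty errors list means the sequence passes these local checks; warnings only flag sizes for which another modeling approach may be a better fit.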

Options

--disulfide-list
- A list of residue number pairs which should form disulfide bonds (See Notes for more details).
- CLI argument: --disulfide-list "1:10,3:5"
- Python submit() argument: disulfide_list=[(1,10), (3,5)]
--template
- A PDB file to be used as a custom template. The input sequence will be threaded along this template.
- CLI argument: --template template.pdb
- Python submit() argument: template="template.pdb"
- The template structure should have high sequence homology to the input sequence.
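
The disulfide list is written as colon-separated pairs on the command line ("1:10,3:5") and as a list of tuples in Python ([(1,10), (3,5)]). The sketch below converts between the two forms, which can be useful when the same pairs are used in scripts and on the command line. Both helper names are hypothetical and are not part of the Cyrus client.

```python
# Hypothetical helpers; not part of the Cyrus client library.
def parse_disulfide_list(text):
    """Convert the CLI form '3:21,10:24' into [(3, 21), (10, 24)].

    Raises ValueError if an item is not two integers separated by a colon.
    """
    pairs = []
    for item in text.split(","):
        first, second = item.strip().split(":")
        pairs.append((int(first), int(second)))
    return pairs


def format_disulfide_list(pairs):
    """Convert [(3, 21), (10, 24)] into the CLI form '3:21,10:24'."""
    return ",".join(f"{first}:{second}" for first, second in pairs)
```

For example, parse_disulfide_list("3:21,10:24") returns [(3, 21), (10, 24)], which has the same shape as the disulfide_list argument shown in the Python examples.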

Outputs

models (directory)
- 5 PDB files representing the centers of the top-scoring clusters of models generated during the homology modeling process.
score.sc
- The Rosetta scores associated with the 5 cluster centers.
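
To compare the 5 cluster centers, the scores in score.sc can be loaded and sorted. The sketch below assumes score.sc follows the usual Rosetta score file layout: lines beginning with "SCORE:", the first of which names the columns, including total_score and description. That layout is an assumption about this tool's output and should be checked against an actual file. Both function names are hypothetical.

```python
# Hypothetical helpers; assume the common Rosetta score file layout, where
# the first "SCORE:" line holds column names and later ones hold values.
def read_score_file(text):
    """Return one dict per scored model, mapping column name to string value."""
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("SCORE:"):
            continue  # skip any other lines, such as a SEQUENCE: line
        fields = line.split()[1:]
        if header is None:
            header = fields  # first SCORE: line names the columns
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def rank_by_total_score(rows):
    """Sort models from lowest (most favorable) to highest total_score."""
    return sorted(rows, key=lambda row: float(row["total_score"]))
```

A typical use would be rank_by_total_score(read_score_file(open("score.sc").read())), with the path adjusted to wherever the job results were downloaded. Lower Rosetta energies are more favorable, so the first entry is the best-scoring cluster center.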

Notes

Defining disulfides in Single Chain HM

If the input sequence has cysteine residues capable of forming disulfide bonds, and those residues form disulfide bonds in the selected templates, Rosetta will attempt to maintain the disulfides in the homology model. Additionally, pairs of cysteine residues can be explicitly marked as disulfides by the user.

Explicit disulfides with --disulfide-list

- Residue numbers refer to positions in the input sequence (the first residue is 1).
- Each residue should be referenced in only one disulfide pair.
- Each residue should be within range for the submitted sequence, and already a cysteine in that sequence.
- Residues should be located such that they can reasonably participate in a disulfide bond without major movement of the protein backbone.
- Rosetta will attempt to satisfy these disulfides but is not guaranteed to form them.
- Other disulfides may be formed depending on the protein sequence and template selection.
- Residues in the disulfide list should not already be in competing disulfides. For example, if the templates have a disulfide from 10 to 30, do not submit 10-15 as a disulfide pair.
- You may explicitly list disulfides you expect to form that are already in your templates as a reinforcement of the desired patterns. This will increase the occurrence of those disulfides in your results if Rosetta finds the disulfides marginal.
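
The rules above that depend only on the sequence (numbering from 1, residues in range, residues already cysteine, each residue used once) can be checked before submission. The following is a minimal sketch with a hypothetical function name, check_disulfide_list. It does not assess geometry or competing template disulfides, which require structural information.

```python
# Hypothetical pre-submission check; not part of the Cyrus client library.
def check_disulfide_list(sequence, disulfide_list):
    """Return a list of problems found; an empty list means none were found.

    Residue numbers are 1-based positions in the input sequence.
    """
    problems = []
    seen = set()
    for pair in disulfide_list:
        for position in pair:
            if not 1 <= position <= len(sequence):
                problems.append(
                    f"{pair}: residue {position} is outside 1..{len(sequence)}"
                )
            elif sequence[position - 1] != "C":
                problems.append(
                    f"{pair}: residue {position} is {sequence[position - 1]}, "
                    "not cysteine"
                )
            # A residue may take part in only one disulfide pair.
            if position in seen:
                problems.append(f"{pair}: residue {position} is listed more than once")
            seen.add(position)
    return problems
```

With the quickstart sequence, check_disulfide_list("SECVENGGFCPDPEKMGDWCCGRCIRNECRNG", [(3, 21), (10, 24)]) returns an empty list, since residues 3, 10, 21 and 24 are all cysteines and each appears once.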

Output File interpretation

Cyrus’s HM tool returns 5 cluster centers after running a large number of HM trajectories. This clustering is balanced to return 5 models that have good energy within their structural cluster and represent different clusters.

If all 5 models are similar even after clustering, HM was highly converged and/or the template match was very high. This is a good sign: it means Rosetta has good confidence in this prediction.

If there are 5 distinct predictions, it may mean that the default sampling is insufficient, or that this particular problem is harder than this API is able to accommodate. Please contact us to discuss other options for this type of modeling problem.