...

Predict 1000 new sequences for an input PDB using an elevated sampling temperature (the default temperature is 0.1):

cyrus engine submit protein-mpnn 1ubq.pdb --n-mpnn-designs 1000 --sampling-temperature 0.2
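Why a higher sampling temperature gives more varied sequences can be illustrated with a small sketch. It assumes the model draws each residue from a temperature-scaled softmax over per-residue scores, as the open-source ProteinMPNN code does; the function name and the logits are made up for illustration and are not part of the cyrus CLI.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate amino acids at one position.
logits = [2.0, 1.0, 0.5]

for temperature in (0.1, 0.2, 1.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 4) for p in probs])
```

At low temperature nearly all of the probability sits on the top-scoring residue, so repeated samples look alike; raising the temperature moves probability toward the alternatives.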

...

  • --pdb-file

    • The path to a PDB file containing the protein backbone you want to design sequences for

  • --n-mpnn-designs

    • The number of sequences to design. Default is 1

  • --sampling-temperature

    • The sampling temperature of the model. Default is 0.1; higher values result in more variation among the output sequences

  • --gpu-type

    • The GPU to run the model on. Default is t4; set this to a100 if you are generating a very large number of sequences

  • --fixed-residue-positions

    • A space-separated list of residue positions to keep fixed, in the format below, e.g. 'A1-10 A12 A15 B1-10'

      Individual positions: <Chain><Position>
      Ex: 'A1 A2 A3 A4'
      Consecutive positions: <Chain><Start>-<End>
      Ex: 'A1-4'

  • --symmetric-chains

    • Chains that should be treated symmetrically, with their sequences matched at each position. Chains are given as space-separated sets, with the chains in each set joined by a hyphen (-). For example, a tetramer composed of two unique homodimers would be defined as "A-B Y-Z"

  • --tied-positions

    • Residues that should be tied together so that mutating one automatically mutates the other. Tied sets are separated by spaces; within a set, the per-chain elements are separated by a slash (/) and a position range is written with a dash (-), e.g. "A1-10/B1-10 C5-10/D5-10"


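The position formats accepted by --fixed-residue-positions and --tied-positions can be checked locally before submitting a job. The sketch below is a hypothetical helper, not part of the cyrus CLI. It assumes single-letter chain IDs, positive residue numbers without insertion codes, and that tied ranges are paired position by position (A1 with B1, A2 with B2, and so on).

```python
import re

# One token: a chain letter, a start position, and an optional "-<End>".
_TOKEN = re.compile(r"^([A-Za-z])(\d+)(?:-(\d+))?$")


def expand_positions(spec):
    """Expand 'A1-3 A5' into [('A', 1), ('A', 2), ('A', 3), ('A', 5)]."""
    residues = []
    for token in spec.split():
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized position token: {token!r}")
        chain = match.group(1)
        start = int(match.group(2))
        end = int(match.group(3)) if match.group(3) is not None else start
        if end < start:
            raise ValueError(f"range end before start: {token!r}")
        residues.extend((chain, i) for i in range(start, end + 1))
    return residues


def expand_tied_positions(spec):
    """Expand 'A1-2/B1-2' into [[('A', 1), ('B', 1)], [('A', 2), ('B', 2)]]."""
    groups = []
    for tied_set in spec.split():
        members = [expand_positions(part) for part in tied_set.split("/")]
        if len({len(m) for m in members}) != 1:
            raise ValueError(f"ranges differ in length: {tied_set!r}")
        # Assumption: the i-th residue of every range is tied together.
        groups.extend(list(column) for column in zip(*members))
    return groups


print(expand_positions("A1-10 A12 A15 B1-10")[:3])
print(expand_tied_positions("A1-10/B1-10 C5-10/D5-10")[0])
```

Rejecting malformed tokens and mismatched range lengths up front is a design choice that catches typos before any GPU time is spent.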
Outputs

  • designed_sequences.fasta

    • A FASTA file containing all designed sequences. The first record in the file is the native sequence of the protein in the PDB file. The headers of the FASTA file contain score and sequence recovery values for each designed sequence
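Ranking the designs by sequence recovery might look like the sketch below. The exact header layout is not documented here, so the code assumes comma-separated key=value pairs with a seq_recovery key, in the style of the open-source ProteinMPNN output. The function names, the example records, and all values in them are made up for illustration.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def header_fields(header):
    """Collect 'key=value' pairs from a comma-separated header (assumed layout)."""
    fields = {}
    for part in header.split(","):
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def designs_by_recovery(text):
    """Skip the first (native) record and sort designs by seq_recovery, highest first."""
    designs = []
    for header, sequence in read_fasta(text)[1:]:
        fields = header_fields(header)
        designs.append((float(fields["seq_recovery"]), sequence))
    return sorted(designs, reverse=True)


# Made-up records, only to exercise the parser.
EXAMPLE = """>native, score=1.52
MQIFVKTLTG
>T=0.2, sample=1, score=0.81, seq_recovery=0.50
MKIYVKTLSG
>T=0.2, sample=2, score=0.78, seq_recovery=0.60
MQIFVRTLTG
"""

for recovery, sequence in designs_by_recovery(EXAMPLE):
    print(recovery, sequence)
```

If the real headers use different key names, only the lookup in designs_by_recovery needs to change.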

...