
Interpreting your results:

1. Interpreting your results from RosettaDesign:
The energies, structure and sequence output by RosettaDesign are placed in a pdb file. The pdb file has the following sections:

1) Coordinates of the design structure.

2) A list of scores. Many of these scores are used in ab initio structure prediction and are not particularly relevant to protein design. The scores used during design with the default protocols are:

 total:       the total score using the design energy function (lower is better)
 LJatr:       attractive portion of the Lennard-Jones potential (rewards close contacts)
 LJrep:       Lennard-Jones repulsive (penalizes overlaps)
 LKsol:       Lazaridis-Karplus solvation model (penalizes buried polars)
 Erot:        internal energy of side chain rotamers as derived from Dunbrack's statistics
 Eintra:      intra-residue clashes
 Epair:       statistics-based pair term, favors salt bridges
 Eaa_phipsi:  P(aa|phi,psi), Ramachandran preferences
 hb_sc:       sidechain-sidechain and sidechain-backbone hydrogen bond energy
 hb_srbb:     backbone-backbone hbonds close in primary sequence
 hb_lrbb:     backbone-backbone hbonds distant in primary sequence

3) A table of energies for each residue in the protein (totals are at the bottom):

 LJatr:          Lennard-Jones attractive
 LJrep:          Lennard-Jones repulsive
 LKsol:          Lazaridis-Karplus solvation energy
 Eh2o_sol:       solvation using explicit water; in default mode this is not used
 Eaa_phipsi:     probability of an amino acid given phi and psi
 Erot:           rotamer internal energies
 Eintra:         internal clashes within residues
 Ehbnd:          total hydrogen bonding per residue
 Epair:          pair probabilities derived from the pdb database
 Eref:           reference energy for each amino acid
 Egb:            generalized Born solvation energy; in default mode this is not used
 Eh2o, Eh2o_hb:  energies from explicit waters
 Ecst:           constraint energies
 Eres:           total for that residue (lower is better)

example of residue energy table:

res   aa  LJatr  LJrep  LKsol  Eh2o_sol  Eaa_phipsi  Erot  Eintra  Ehbnd  Epair  Eref   Egb  Eh2o  Eh2o_hb  Ecst  Eres
  1  MET   -4.0    0.3    1.4       0.0         0.0   2.5     0.3   -0.8    0.0   0.3   0.0   0.0      0.0   0.0  -0.7
  2  GLN   -2.5    0.1    1.5       0.0         0.0   2.9     0.0   -0.7   -0.1   1.0   0.0   0.0      0.0   0.0   0.3
  3  ILE   -4.1    0.1    1.2       0.0        -0.2   0.1     0.4   -1.6    0.0  -0.2   0.0   0.0      0.0   0.0  -3.9
  4  PHE   -4.4    0.5    1.8       0.0        -0.3   0.2     0.0   -1.6    0.0  -0.6   0.0   0.0      0.0   0.0  -3.2
  5  THR   -3.5    0.0    1.8       0.0         0.0   0.0     0.0   -1.4    0.0   0.3   0.0   0.0      0.0   0.0  -3.3
  6  LYS   -3.0    0.0    1.5       0.0         0.1   0.9     0.1   -1.7    0.0   0.6   0.0   0.0      0.0   0.0  -2.8
  7  THR   -3.1    0.1    1.9       0.0        -0.7   0.3     0.0   -1.2    0.0   0.3   0.0   0.0      0.0   0.0  -3.0
  8  LEU   -1.3    0.0    0.5       0.0         0.0   1.6     0.5    0.0    0.0   0.1   0.0   0.0      0.0   0.0   1.2
  9  THR   -1.4    0.1    1.1       0.0        -0.1   0.8     0.0   -0.2    0.0   0.3   0.0   0.0      0.0   0.0   0.1
 10  GLY   -0.8    0.1    0.6       0.0        -1.5  -0.0     0.0   -0.3    0.0   0.2   0.0   0.0      0.0   0.0  -2.2
 11  LYS   -2.8    0.1    2.4       0.0         0.0   4.4     0.2   -0.7   -0.3   0.6   0.0   0.0      0.0   0.0   2.6

4) A table of measured energies minus expected energies (expected energies are derived by calculating the average energies of the different amino acids with a certain number of neighbors in a large set of proteins in the pdb). This table is useful for determining how well packed a residue is. The column Elj compares the actual Lennard-Jones energy of a residue to the expected value. Well-packed residues should have Elj scores near zero or negative.

 LJatr:       Lennard-Jones attractive
 LJrep:       Lennard-Jones repulsive
 LKsol:       Lazaridis-Karplus solvation
 Eaa_phipsi:  P(aa|phi,psi)
 Erot:        rotamer preferences from Dunbrack's library
 Eintra:      intra-residue clashes
 Ehbnd:       hydrogen bonding
 Epair:       statistics-based pair term
 Elj:         Lennard-Jones total
 Eres:        total per residue
 SASApack:    SASApack is related to the void volume in a protein. Surface areas are computed with a 1.4 angstrom probe and a 0.5 angstrom probe, and the difference (ASA_0.5 - ASA_1.4) is compared to the expected difference for a particular residue type in a particular environment. A negative value is favorable and indicates that the residue is more tightly packed than is seen in average pdb files.

example:

energies-average(in pdb) energies, AND rsd SASA packing score
res   aa  nb  LJatr  LJrep  LKsol  Eaa_phipsi  Erot  Eintra  Ehbnd  Epair   Elj  Eres  SASApack
  1  MET  15   -0.8   -0.1    0.0         0.0  -0.2     0.2   -0.1    0.0  -1.0  -1.7      2.38
  2  GLN  11    0.2   -0.2   -0.3         0.1   0.3     0.0   -0.1    0.0   0.0  -0.4      8.93
  3  ILE  22    0.3   -0.2   -0.3         0.0  -1.2     0.4   -0.6    0.0   0.1  -1.8     18.02
  4  PHE  14   -0.3    0.1    0.4        -0.2  -1.0    -0.1   -1.0    0.0  -0.2  -2.5      1.89
  5  THR  22    0.2   -0.3   -0.5         0.1  -0.5     0.0   -0.3    0.1   0.0  -1.0     11.62
  6  LYS  15    0.3   -0.4   -0.5         0.1  -2.2     0.1   -0.9    0.2  -0.1  -3.6      5.48
  7  THR  18    0.1   -0.2   -0.2        -0.6  -0.3     0.0   -0.2    0.1  -0.1  -1.1     10.38
  8  LEU  10    0.9   -0.2   -0.6         0.1   0.1     0.5    0.5    0.0   0.8   0.9      1.33

5) A table of total measured energies minus expected energies for residues in different environments: buried, middle and surface. When creating novel structures we have found it difficult to get Elj numbers that are zero or negative for the buried residues.

example:

actual-average(in pdb) energies per residue
        LJatr  LJrep    Elj
buried   -0.1   -0.2   -0.3
middle   -0.3   -0.1   -0.3
surfac    0.1   -0.1    0.0

6) A table of starting minus finishing chi angles, and absolute chi angles.

7) Phi, psi and omega angles for each residue.
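
As a rough illustration of reading the per-residue energy table, the Python sketch below collects the table rows and lists the residues with the highest Eres. It assumes that rows are whitespace-separated exactly as in the example above (residue number, three-letter residue name, then one number per column). The function names and the 0.0 threshold are hypothetical choices for this sketch, not part of RosettaDesign.

```python
# Column names copied from the example residue energy table.
COLUMNS = ["LJatr", "LJrep", "LKsol", "Eh2o_sol", "Eaa_phipsi", "Erot",
           "Eintra", "Ehbnd", "Epair", "Eref", "Egb", "Eh2o", "Eh2o_hb",
           "Ecst", "Eres"]


def parse_residue_rows(lines, columns=COLUMNS):
    """Collect rows shaped like: <res number> <aa> <one number per column>.

    Assumption: the table is whitespace-separated as in the example;
    every line that does not fit that shape is skipped.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != len(columns) + 2 or not fields[0].isdigit():
            continue
        try:
            values = [float(field) for field in fields[2:]]
        except ValueError:
            continue
        row = {"res": int(fields[0]), "aa": fields[1]}
        row.update(zip(columns, values))
        rows.append(row)
    return rows


def flag_residues(rows, key="Eres", threshold=0.0):
    """Return rows whose `key` value exceeds `threshold`, highest first.

    The threshold is an illustrative cutoff, not a RosettaDesign rule.
    """
    flagged = [row for row in rows if row[key] > threshold]
    return sorted(flagged, key=lambda row: row[key], reverse=True)


# A few rows taken from the example table above.
example = """\
res aa LJatr LJrep LKsol Eh2o_sol Eaa_phipsi Erot Eintra Ehbnd Epair Eref Egb Eh2o Eh2o_hb Ecst Eres
1 MET -4.0 0.3 1.4 0.0 0.0 2.5 0.3 -0.8 0.0 0.3 0.0 0.0 0.0 0.0 -0.7
2 GLN -2.5 0.1 1.5 0.0 0.0 2.9 0.0 -0.7 -0.1 1.0 0.0 0.0 0.0 0.0 0.3
3 ILE -4.1 0.1 1.2 0.0 -0.2 0.1 0.4 -1.6 0.0 -0.2 0.0 0.0 0.0 0.0 -3.9
8 LEU -1.3 0.0 0.5 0.0 0.0 1.6 0.5 0.0 0.0 0.1 0.0 0.0 0.0 0.0 1.2
11 LYS -2.8 0.1 2.4 0.0 0.0 4.4 0.2 -0.7 -0.3 0.6 0.0 0.0 0.0 0.0 2.6
"""

rows = parse_residue_rows(example.splitlines())
for row in flag_residues(rows):
    print(row["res"], row["aa"], row["Eres"])
```

The same parser could be pointed at the measured-minus-expected table by passing that table's column names and using key="Elj", since well-packed residues are expected to have Elj near zero or negative.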

Tendencies:

RosettaDesign Tendencies
In some cases RosettaDesign appears to make odd choices, and it helps
to know beforehand what these tendencies are. In these situations it
is probably best to use a resfile to steer Rosetta away from these
pitfalls.


1) The program likes to put amino acids with similar chemical properties near each other. 
This is primarily because polar residues can hydrogen bond with each other, and hydrophobics 
can pack without burying hbonding groups. The result is that in some cases you may observe a 
large cluster of hydrophobic residues on the surface of a protein, or a cluster of polars in 
the core. In some cases this can be avoided by forcing key residues to be polar or hydrophobic.

2) Sometimes polar groups are buried without a hydrogen bonding partner. The
energy function has been parameterized to try and avoid this, but there is
no filter that prevents it.

File formats:
1. PDB file format description:
See the PDB Format Description for more details.
Record Format:
COLUMNS        DATA TYPE       FIELD         DEFINITION
---------------------------------------------------------------------------------
 1 -  6        Record name     "ATOM  "

 7 - 11        Integer         serial        Atom serial number.

13 - 16        Atom            name          Atom name.

17             Character       altLoc        Alternate location indicator.
18 - 20        Residue name    resName       Residue name.

22             Character       chainID       Chain identifier.

23 - 26        Integer         resSeq        Residue sequence number.

27             AChar           iCode         Code for insertion of residues.

31 - 38        Real(8.3)       x             Orthogonal coordinates for X in
                                             Angstroms.

39 - 46        Real(8.3)       y             Orthogonal coordinates for Y in
                                             Angstroms.

47 - 54        Real(8.3)       z             Orthogonal coordinates for Z in
                                             Angstroms.

55 - 60        Real(6.2)       occupancy     Occupancy.

61 - 66        Real(6.2)       tempFactor    Temperature factor.

73 - 76        LString(4)      segID         Segment identifier, left-justified.

77 - 78        LString(2)      element       Element symbol, right-justified.

79 - 80        LString(2)      charge        Charge on the atom.
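
The column table above maps directly onto string slices. The following Python sketch reads one ATOM record using those column ranges; column numbers in the table are 1-based, so columns 31 - 38 become the slice [30:38]. The function name parse_atom_record and the sample coordinates are made up for illustration, and the sample line is built with format strings so that its columns line up exactly.

```python
def _optional_float(text):
    """Return a float, or None when the field is blank."""
    text = text.strip()
    return float(text) if text else None


def parse_atom_record(line):
    """Slice one ATOM record using the 1-based column ranges in the table."""
    if not line.startswith("ATOM  "):
        raise ValueError("not an ATOM record")
    line = line.rstrip("\n").ljust(80)  # pad so short lines slice safely
    return {
        "serial": int(line[6:11]),        # columns  7 - 11
        "name": line[12:16].strip(),      # columns 13 - 16
        "altLoc": line[16].strip(),       # column  17
        "resName": line[17:20].strip(),   # columns 18 - 20
        "chainID": line[21].strip(),      # column  22
        "resSeq": int(line[22:26]),       # columns 23 - 26
        "iCode": line[26].strip(),        # column  27
        "x": float(line[30:38]),          # columns 31 - 38
        "y": float(line[38:46]),          # columns 39 - 46
        "z": float(line[46:54]),          # columns 47 - 54
        "occupancy": _optional_float(line[54:60]),   # columns 55 - 60
        "tempFactor": _optional_float(line[60:66]),  # columns 61 - 66
        "segID": line[72:76].strip(),     # columns 73 - 76
        "element": line[76:78].strip(),   # columns 77 - 78
        "charge": line[78:80].strip(),    # columns 79 - 80
    }


# Illustrative record; the values are invented for this sketch.
sample = (
    "ATOM  " + f"{1:>5d}" + " " + f"{' N':<4s}" + " " + "MET" + " " + "A"
    + f"{1:>4d}" + " " + "   "
    + f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}"
    + f"{1.00:6.2f}{9.67:6.2f}"
)

atom = parse_atom_record(sample)
print(atom["resName"], atom["chainID"], atom["resSeq"], atom["x"])
```

Splitting on whitespace instead of slicing would be simpler, but it breaks when adjacent fields touch or an optional field is blank, which is why fixed columns are used here.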

2. Resfile format description:
This file specifies which residues will be varied.

 Column   2:    Chain
 Column   4-7:  sequential residue number
 Column   9-12: pdb residue number
 Column  14-18: id  (described below)
 Column  20-40: amino acids to be used

 NATAA  => use native amino acid
 ALLAA  => all amino acids
 NATRO  => native amino acid and rotamer
 PIKAA  => select individual amino acids
 POLAR  => polar amino acids
 APOLA  => apolar amino acids

 The following demo lines are in the proper format

 A    1    3 NATAA
 A    2    4 ALLAA
 A    3    6 NATRO
 A    4    7 NATAA
 B    5    1 PIKAA  DFLM
 B    6    2 PIKAA  HIL
 B    7    3 POLAR
 -------------------------------------------------
 start
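
To show how the column layout lines up with the demo lines, here is a small Python sketch that formats one residue line. The helper name resfile_line is hypothetical, and the sketch assumes that only PIKAA takes a list of amino acids and that the list starts two spaces after the id, as in the demo lines. It produces residue lines only and does not write the separator or "start" lines.

```python
# The six ids listed above.
VALID_IDS = {"NATAA", "ALLAA", "NATRO", "PIKAA", "POLAR", "APOLA"}


def resfile_line(chain, seq_num, pdb_num, res_id, amino_acids=""):
    """Format one residue line following the column layout above.

    Column 2: chain, 4-7: sequential residue number,
    9-12: pdb residue number, 14-18: id, 20-40: amino acids.
    """
    if len(chain) != 1:
        raise ValueError("chain must be a single character")
    if res_id not in VALID_IDS:
        raise ValueError(f"unknown id: {res_id}")
    # Assumption: PIKAA is the only id that needs an amino acid list.
    if res_id == "PIKAA" and not amino_acids:
        raise ValueError("PIKAA needs a list of amino acids")
    line = f" {chain} {seq_num:>4d} {pdb_num:>4d} {res_id:<5s}"
    if amino_acids:
        line += "  " + amino_acids
    return line


# Reproduce two of the demo lines.
print(resfile_line("A", 1, 3, "NATAA"))
print(resfile_line("B", 5, 1, "PIKAA", "DFLM"))
```

A line such as resfile_line("A", 12, 12, "POLAR") is one way to force a key position to be polar when a design shows a hydrophobic cluster on the surface.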

FAQs:
1. Can RosettaDesign calculate only the free energy of the native protein, without doing a design?
Yes, RosettaDesign can calculate the native free energy. The easiest way is to submit a job that does not vary the sequence
(use the "create list" option and make all residues fixed).

2. What is MAX_RES in the log file?
MAX_RES in the log file is the total number of residues in your native protein. The current limit for MAX_RES is 1000.

3. What is the difference between "MAX_RES" and "maximum residues can be varied"?
"MAX_RES" is the total number of residues in your native protein. The current limit for MAX_RES is 1000.
"maximum residues can be varied" is the number of residues you choose to redesign.

4. What does "can't find starting residue in pdb file" mean in the log file?
"can't find starting residue in pdb file" usually means that your uploaded file is not in the standard pdb format, so the server
cannot find the starting residue to redesign.

5. What is "unrecognized residue:"?
The server reads the three-letter codes of the 20 amino acid types from the pdb file. This message appears when the pdb file contains one or more
residue names that are not among the 20 amino acids, so the server cannot recognize them.

6. What is "missing backbone atoms"?
"missing backbone atoms" means that some backbone atoms are missing from the pdb file, so the server does not have the information it needs
for the redesign.
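
The last two messages can often be anticipated before submitting a job. The Python sketch below scans the ATOM records of a pdb file for residue names outside the 20 standard amino acids and for residues lacking backbone atoms. It is only an approximation of what the server does: the backbone atom set (N, CA, C, O), the restriction to ATOM records, and the function name check_pdb_lines are assumptions made for this sketch.

```python
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
# Assumption: these four atoms are what counts as the backbone.
BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def check_pdb_lines(lines):
    """Return (unrecognized residues, residues with missing backbone atoms).

    Residues are keyed by (chain, residue number + insertion code, name).
    Only ATOM records are inspected.
    """
    atoms_by_residue = {}
    for line in lines:
        if not line.startswith("ATOM  "):
            continue
        key = (line[21:22], line[22:27].strip(), line[17:20].strip())
        atoms_by_residue.setdefault(key, set()).add(line[12:16].strip())
    unrecognized = [key for key in atoms_by_residue
                    if key[2] not in STANDARD_RESIDUES]
    missing = {key: sorted(BACKBONE_ATOMS - atoms)
               for key, atoms in atoms_by_residue.items()
               if BACKBONE_ATOMS - atoms}
    return unrecognized, missing


def _atom(serial, name, res, chain, seq):
    """Build a minimal ATOM line for the demo (coordinates set to zero)."""
    return (f"ATOM  {serial:>5d} {name:<4s} {res:>3s} {chain}{seq:>4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Residue 1 is complete; residue 2 has a non-standard name and no C or O.
demo = [
    _atom(1, "N", "MET", "A", 1), _atom(2, "CA", "MET", "A", 1),
    _atom(3, "C", "MET", "A", 1), _atom(4, "O", "MET", "A", 1),
    _atom(5, "N", "MSE", "A", 2), _atom(6, "CA", "MSE", "A", 2),
]

unrecognized, missing = check_pdb_lines(demo)
print("unrecognized:", unrecognized)
print("missing backbone atoms:", missing)
```

To check a real file, pass the open file object, for example check_pdb_lines(open("input.pdb")), where input.pdb is a placeholder file name.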

Papers:
1: Design of a novel globular protein fold with atomic-level accuracy.
Kuhlman B, et al.
Science. 2003 Nov 21;302(5649):1364-8.
PMID: 14631033 [PubMed - in process]
 

2: A large scale test of computational protein design: folding and stability of nine completely redesigned globular proteins.
Dantas G, et al.
J Mol Biol. 2003 Sep 12;332(2):449-60.
PMID: 12948494 [PubMed - indexed for MEDLINE]
 

3: Crystal structures and increased stabilization of the protein G variants with switched folding pathways NuG1 and NuG2.
Nauli S, et al.
Protein Sci. 2002 Dec;11(12):2924-31.
PMID: 12441390 [PubMed - indexed for MEDLINE]
 

4: Accurate computer-based design of a new backbone conformation in the second turn of protein L.
Kuhlman B, et al.
J Mol Biol. 2002 Jan 18;315(3):471-7.
PMID: 11786026 [PubMed - indexed for MEDLINE]
 

5: Conversion of monomeric protein L to an obligate dimer by computational protein design.
Kuhlman B, et al.
Proc Natl Acad Sci U S A. 2001 Sep 11;98(19):10687-91. Epub 2001 Aug 28. Erratum in: Proc Natl Acad Sci U S A 2002 May 28;99(11):7809.
PMID: 11526208 [PubMed - indexed for MEDLINE]
 

6: Computer-based redesign of a protein folding pathway.
Nauli S, et al.
Nat Struct Biol. 2001 Jul;8(7):602-5.
PMID: 11427890 [PubMed - indexed for MEDLINE]
 

7: Native protein sequences are close to optimal for their structures.
Kuhlman B, et al.
Proc Natl Acad Sci U S A. 2000 Sep 12;97(19):10383-8. Erratum in: Proc Natl Acad Sci U S A. 2000 Nov 21;97(24):13460.
PMID: 10984534 [PubMed - indexed for MEDLINE]
 




RosettaDesign is available for NON-COMMERCIAL USE ONLY at this time