|
Knowledge-based potentials in YASARA
The most successful methods in structural
bioinformatics have usually been those that make extensive use of the
available knowledge, instead of trying to start from first principles.
When predicting a protein structure or analyzing the quality of a
homology model, it is an enormous help to peek at the thousands of
known structures deposited in the PDB to first get an idea of what a
real protein looks like. Then it becomes much easier to judge the
correctness of the model.
The standard way of 'getting an idea' is an
extensive statistical analysis of known protein structures, trying to
extract common structural features and preferences from the 3D
coordinates. Qualitative insights like 'hydrophobic side-chains
like to be in contact' can be converted to quantitative
energies thanks to Boltzmann's formula, which states that a
certain configuration occurs with a frequency that is proportional to
exp(-E/kT), where E is the energy, T the temperature and k the
Boltzmann constant ('exp' is the exponential function). So we only need
to extract the frequency (e.g. of two methyl groups at a distance of 5
Å) from the PDB, and obtain the corresponding energy by turning
Boltzmann's formula around: E ~ -log(frequency)*kT.
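The inversion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not YASARA's implementation: the function name `boltzmann_inversion`, the pseudocount, the uniform reference distribution and the example counts are all assumptions made here for demonstration.

```python
import math

# Boltzmann constant per mole (gas constant) in kcal/(mol*K)
K_B = 0.0019872041

def boltzmann_inversion(counts, temperature=298.0, pseudocount=1.0):
    """Convert observed counts per bin into relative energies (kcal/mol).

    Assumption: energies are measured relative to a uniform reference
    distribution, so a bin that is as populated as the average gets E = 0.
    The pseudocount keeps empty bins from producing log(0).
    """
    n_bins = len(counts)
    total = sum(counts) + pseudocount * n_bins
    uniform = 1.0 / n_bins
    energies = []
    for c in counts:
        freq = (c + pseudocount) / total
        # E = -kT * log(frequency / reference frequency)
        energies.append(-K_B * temperature * math.log(freq / uniform))
    return energies

# Hypothetical counts of methyl-methyl pairs in 1 Å distance bins
# (3-4 Å, 4-5 Å, 5-6 Å, 6-7 Å, 7-8 Å); not real PDB statistics.
counts = [20, 180, 400, 250, 150]
energies = boltzmann_inversion(counts)
for lower, e in zip(range(3, 8), energies):
    print(f"{lower}-{lower + 1} Å: {e:+.2f} kcal/mol")
```

The most populated bin receives the lowest energy, and bins that are rarer than the reference receive positive energies. Real distance-dependent potentials typically need a more careful reference state than the uniform one assumed here.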
The resulting energy functions are called
'knowledge-based potentials' and are widely used today, after
pioneering work in the 1990s on one-dimensional distance-dependent
potentials in ProSA[1] and three-dimensional direction-dependent
potentials in WHAT IF[2]. YASARA Structure builds on these
cornerstones and provides a number of innovations[3]:
- The statistical analysis is not limited to
proteins. Atom
types and knowledge-based potentials have been derived for all other
molecules in PDB files, so that energies can also be calculated for
DNA/RNA, metal ions and ligands, the latter being especially important
for pharmaceutical research.
- Knowledge-based potentials can be visualized
easily:
Example (A) on the right shows the 1D distance-dependent potential for
two methyl groups, one from Leu 18 in red, and one from Val 15 in
yellow. To aid visualization, the vertical energy axis is mirrored, so the
yellow top corresponds to the energy minimum. The red arrow marks the
current distance of 4.72 Å. The arrow adapts in real-time to atom
movements, for example during a simulation.
- Example (B) shows the 3D orientation-dependent
potential of a
carboxyl group carbon around the arginine side-chain; blue indicates
the unfavorable high-energy regions.
- Atomic contact analysis is not the only
application of
knowledge-based potentials. The distribution and interdependence of
dihedral angles can be analyzed equally well[4]. Example (C) shows the
potential of the backbone dihedral angle φ for threonine. The
yellow φ arrow points from -180 to +180 degrees.
- Two dihedral angles can be combined into a single 2D
potential: example (D) shows the φ/ψ potential of threonine. Again
the yellow tops are the low-energy regions, corresponding to
the preferred areas in the Ramachandran plot. The orange arrow
indicates the ψ axis.
- Finally, example (E) shows a 3D potential: the
combined
φ/ψ/χ1 potential of threonine, which captures the interdependence
between the backbone and side-chain conformation. The long red arrow
indicates the χ1 axis.
- Knowledge-based potentials like the ones shown above have
been incorporated into two new force fields, available exclusively in
YASARA Structure. The contact potentials allow the calculation of highly
informative knowledge-based energies, while the dihedral angle
potentials are differentiable and thus also permit force calculations,
resulting in the most accurate force fields for structure prediction
and refinement that YASARA has to offer.
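To illustrate how a dihedral angle potential can yield forces as well as energies, the following Python sketch builds a periodic 2D φ/ψ potential from angle pairs and estimates its slope along φ. It is a rough sketch under stated assumptions, not the method used in YASARA: the function names, the 30 degree bins, the uniform reference distribution, the sample angles and the finite-difference gradient (a crude stand-in for a properly differentiable potential) are all hypothetical.

```python
import math

# kT in kcal/mol at 298 K
KT = 0.0019872041 * 298.0

def dihedral_potential(angles, bin_width=30, pseudocount=1.0):
    """Build a periodic 2D potential E[phi_bin][psi_bin] in kcal/mol.

    `angles` is a list of (phi, psi) pairs in degrees. Energies are
    relative to a uniform reference distribution (an assumption).
    """
    n = 360 // bin_width
    counts = [[pseudocount] * n for _ in range(n)]
    for phi, psi in angles:
        # Shift to [0, 360) so that -180 and +180 fall into the same bin
        i = int((phi + 180.0) % 360.0 // bin_width)
        j = int((psi + 180.0) % 360.0 // bin_width)
        counts[i][j] += 1.0
    total = sum(map(sum, counts))
    uniform = 1.0 / (n * n)
    return [[-KT * math.log(c / total / uniform) for c in row]
            for row in counts]

def phi_gradient(energy, i, j, bin_width=30):
    """Central finite difference dE/dphi (kcal/mol per degree).

    Indices wrap around because dihedral angles are periodic.
    The force along phi is the negative of this value.
    """
    n = len(energy)
    return (energy[(i + 1) % n][j] - energy[(i - 1) % n][j]) / (2.0 * bin_width)

# Hypothetical sample: helical, extended and a few left-handed conformations
angles = [(-60, -45)] * 50 + [(-120, 130)] * 30 + [(60, 45)] * 2
energy = dihedral_potential(angles)

print(f"E(helical bin)  = {energy[4][4]:+.2f} kcal/mol")
print(f"E(extended bin) = {energy[2][10]:+.2f} kcal/mol")
print(f"dE/dphi next to the helical bin = {phi_gradient(energy, 3, 4):+.4f}")
```

A negative slope next to the helical bin means the energy drops as φ increases towards it, so the corresponding force pushes the angle into the preferred region. A production force field would interpolate smoothly between bins instead of differencing a coarse grid.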
REFERENCES
[1] Recognition of Errors in Three-Dimensional Structures of Proteins
Sippl MJ (1993) Proteins 17, 355-362
[2] Quality control of protein models: Directional atomic contact analysis
Vriend G, Sander C (1993) J. Appl. Cryst. 26, 47-60
[3] Improving physical realism, stereochemistry, and side-chain accuracy in homology modeling: Four approaches that performed well in CASP8
Krieger E, Joo K, Lee J, Lee J, Raman S, Thompson J, Tyka M, Baker D, Karplus K (2009) Proteins 77 Suppl 9, 114-122
[4] Improvements and Extensions in the Conformational Database Potential for the Refinement of NMR and X-ray Structures of Proteins and Nucleic Acids
Kuszewski J, Gronenborn AM, Clore GM (1997) Journal of Magnetic Resonance 125, 171-177
|