Homology modeling in YASARA
YASARA Structure features a complete homology modeling module that
automatically performs every step from an amino acid sequence to a
refined high-resolution model, using a CASP-approved protocol.
YASARA also writes a detailed report about the individual modeling
steps. User-supplied hints (template alignments) can be included if
available. The individual modeling steps can be summarized as follows:
- The target sequence is PSI-BLASTed
against UniProt to build a position-specific scoring matrix (PSSM) from
related sequences; this profile is then used to search the PDB for
potential modeling templates. Common protein purification tags are
excluded to avoid false positives. If the homology is too remote to be
detected by PSI-BLAST, the target is considered difficult and templates
have to be provided manually (for example using one of the many fold
recognition servers on the web).
- The templates are ranked based on the alignment score and
the structural quality according to WHAT_CHECK,
obtained from the PDBFinder2 database.
Usually models built using high-resolution X-ray templates are
more accurate than those created from lower resolution X-ray or NMR
templates, even if the latter share a higher percentage sequence
identity. Models are built for the top scoring templates.
- If structure factors have been deposited at the PDB,
re-refined template structures are also included, provided that they
are already part of the PDB_REDO database.
- Gene fusion events are detected, where the
target sequence spans more than one template molecule. These are
automatically fused in the correct order.
- For each available template, the alignment with the target sequence is obtained
using large amounts of additional information: sequence-based profiles of target
and template are calculated from related UniProt sequences, optionally augmented
with structure-based profiles from related template structures.
The alignment also considers structural information contained
in the template (avoiding gaps in secondary structure elements, keeping polar residues
exposed etc.), as well as the predicted target secondary structure.
This structure-based alignment correction is partly based on SSALN scoring matrices.
Alternatively, manual alignments can of course also be provided.
- If the alignment is not certain, alternative
high-scoring alignments are created using a stochastic approach,
and models are built for all of them.
- If templates exist in oligomeric states (according to
the PQS database),
models may be built in the same state, so that interactions
between side-chains across the interface can be considered. This
includes all kinds of hetero-oligomers, e.g. a
homo-dimer of two hetero-dimers.
- In case of insertions and deletions, an indexed version
of the PDB is used to determine the optimal loop anchor points and
possible loop conformations.
- If templates contain ligands, these are
parameterized and fully considered in the homology modeling
procedure, including hydrogen
bonding and other interactions with the peptide chain.
- A graph of the side-chain rotamer network is built, and
dead-end elimination is used to find an initial rotamer solution in
the context of a simple repulsive energy function.
- The loops are optimized by trying hundreds of
conformations and re-optimizing the side-chains for all of them.
- Side-chain rotamers
electrostatic and knowledge-based packing
interactions as well as
- The model's hydrogen bonding network is optimized,
including pH-dependence and ligands.
- An unrestrained high-resolution refinement with
solvent molecules is run, using the latest force
fields. The result is validated to ensure
that the refinement did not move the model in
the wrong direction.
- The tasks above are performed for all
templates and alignments, per-residue quality indicators for the
- A hybrid model is built: bad regions in the top
model are iteratively replaced with corresponding fragments from the
other models.
- A scientific
report with details about all the steps
above is written automatically, which can serve as the basis for a
Ray-traced figures and per-residue quality plots are included, as
well as an overall judgment of the model quality, ranging from
- CASP evaluation results are
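
The template ranking step above combines an alignment score with a structural quality measure. The following is a minimal sketch of that idea only. The `Template` class, the `quality` value in [0, 1], and the product used as combined score are assumptions made for illustration; the source does not state how YASARA weights the two terms.

```python
from dataclasses import dataclass


@dataclass
class Template:
    pdb_id: str             # placeholder identifier, not a real PDB entry
    alignment_score: float  # e.g. a PSI-BLAST bit score, higher is better
    quality: float          # hypothetical structural quality in [0, 1]


def rank_templates(templates, top_n=5):
    """Sort templates by a combined score and keep the best top_n."""
    def combined(t):
        # Assumed rule: alignment score weighted by structural quality.
        return t.alignment_score * t.quality
    return sorted(templates, key=combined, reverse=True)[:top_n]


candidates = [
    Template("tmplA", alignment_score=220.0, quality=0.55),  # e.g. NMR template
    Template("tmplB", alignment_score=180.0, quality=0.90),  # e.g. high-resolution X-ray
    Template("tmplC", alignment_score=90.0, quality=0.95),
]
ranked = rank_templates(candidates, top_n=2)
print([t.pdb_id for t in ranked])  # -> ['tmplB', 'tmplA']
```

With these made-up numbers the high-quality template outranks the one with the better alignment score, which mirrors the observation that high-resolution X-ray templates often give more accurate models.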
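
The initial rotamer search above relies on dead-end elimination. As a rough illustration of that technique (not of YASARA's implementation), the sketch below applies the Goldstein elimination criterion to a toy problem with three positions. All energies are invented numbers, and the helper names `pair`, `total_energy` and `dead_end_elimination` are hypothetical.

```python
from itertools import product


def total_energy(assignment, self_e, pair_e):
    """Energy of one rotamer per position: self terms plus pair terms."""
    n = len(assignment)
    e = sum(self_e[i][assignment[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[(i, j)][assignment[i]][assignment[j]]
    return e


def pair(pair_e, i, r, j, s):
    """Look up a pair energy regardless of the order of the two positions."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def dead_end_elimination(self_e, pair_e):
    """Return the surviving rotamer indices per position (Goldstein criterion)."""
    n = len(self_e)
    alive = [set(range(len(e))) for e in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    # r is a dead end if t is better in every possible context.
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j != i:
                            gap += min(pair(pair_e, i, r, j, s) - pair(pair_e, i, t, j, s)
                                       for s in alive[j])
                    if gap > 0:
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


# Toy energies (assumed values): self_e[i][r] and pair_e[(i, j)][r][s] with i < j.
self_e = [[0.0, 1.0, 5.0], [0.5, 0.0], [0.0, 2.0, 0.3]]
pair_e = {
    (0, 1): [[0.0, 0.2], [0.1, 0.0], [0.0, 0.0]],
    (0, 2): [[0.0, 0.1, 0.0], [0.3, 0.0, 0.0], [0.0, 0.0, 0.0]],
    (1, 2): [[0.0, 0.0, 0.2], [0.1, 0.0, 0.0]],
}
alive = dead_end_elimination(self_e, pair_e)
# Enumerate only the surviving rotamers to find the best combination.
best = min(product(*[sorted(a) for a in alive]),
           key=lambda a: total_energy(a, self_e, pair_e))
print(alive, best)
```

Because a rotamer is removed only when another rotamer at the same position is strictly better in every context, the lowest-energy combination always survives the pruning.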
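
The hybrid model step above can be pictured as a per-residue bookkeeping problem. The sketch below is an assumption-laden illustration: the per-residue quality values, the threshold of -1.0, the minimum region length, and the function names are all hypothetical. It makes a single pass, whereas the text describes an iterative procedure, and it only records which model each residue would be copied from, without touching coordinates.

```python
def find_bad_regions(scores, threshold, min_len=3):
    """Return (start, end) ranges, end exclusive, where scores stay below threshold."""
    regions, start = [], None
    for i, s in enumerate(scores):
        if s < threshold and start is None:
            start = i
        elif s >= threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        regions.append((start, len(scores)))
    return regions


def build_hybrid(quality_by_model, threshold=-1.0, min_len=3):
    """Decide, per residue, which model the hybrid takes its fragment from."""
    # Start from the model with the best mean per-residue quality.
    top = max(quality_by_model,
              key=lambda m: sum(quality_by_model[m]) / len(quality_by_model[m]))
    source = [top] * len(quality_by_model[top])
    hybrid_q = list(quality_by_model[top])
    for start, end in find_bad_regions(hybrid_q, threshold, min_len):
        # Pick the donor whose fragment scores best over this region.
        donor = max(quality_by_model,
                    key=lambda m: sum(quality_by_model[m][start:end]))
        if sum(quality_by_model[donor][start:end]) > sum(hybrid_q[start:end]):
            source[start:end] = [donor] * (end - start)
            hybrid_q[start:end] = quality_by_model[donor][start:end]
    return source, hybrid_q


# Invented per-residue quality values for two models of the same target.
quality = {
    "model1": [0.5, 0.4, -2.0, -2.5, -1.8, 0.3, 0.6, 0.2],
    "model2": [-0.9, -0.8, -0.2, 0.1, -0.3, -0.9, -0.9, -0.9],
}
source, hybrid_q = build_hybrid(quality)
print(source)
```

Here residues 2 to 4 of the top model fall below the threshold and are taken from the second model, while the rest of the chain stays with the top model.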
REFERENCES
Improving physical realism, stereochemistry, and side-chain accuracy in homology
modeling: Four approaches that performed well in CASP8
Krieger E, Joo K, Lee J, Lee J, Raman S, Thompson J, Tyka M, Baker D and
Karplus K (2009) Proteins 77 Suppl 9, 114-122
Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs
Altschul SF, Madden TL, Schaeffer AA, Zhang J, Zhang Z, Miller W and
Lipman DJ (1997) Nucleic Acids Res. 25, 3389-3402
Errors in protein structures
Hooft RWW, Vriend G, Sander C and Abola EE (1996) Nature 381, 272
The PDBFINDER database: A summary of PDB, DSSP and HSSP
information with added value
Hooft RWW, Sander C and Vriend G (1996) CABIOS/Bioinformatics 12, 525-529
Identification and application of the
concepts important for accurate and reliable protein secondary
structure prediction
King RD and Sternberg MJE (1996) Protein Sci. 5, 2298-2310
SSALN: An alignment algorithm
using structure-dependent substitution matrices and gap penalties
learned from structurally aligned protein pairs
Qiu J and Elber R (2006) Proteins 62, 881-891
Stochastic pairwise alignments
Mueckstein U, Hofacker IL and Stadler PF (2002) Bioinformatics 18 Suppl 2, 153-160
A graph-theory algorithm for
rapid protein side-chain prediction
Canutescu AA, Shelenkov AA and Dunbrack RL Jr. (2003) Protein Sci. 12, 2001-2014