
Molecular dynamics simulation setup in YASARA

Taking a structure from the PDB and running a simulation can be surprisingly difficult, often requiring help and manual intervention from a simulation expert. During the past decades, we have iteratively taught YASARA how to run simulations for most PDB files (and of course your private structures) fully automatically, with just four mouse clicks.

The initial clean up step can perform these tasks as needed:

  • Delete NMR dummy Q atoms.
  • Detect missing covalent bonds and add them.
  • Detect missing dative bonds to metal ions and add them, but delete them all if there is no real binding site.
  • Rebuild side-chains with missing atoms.
  • Delete terminal residues with an incomplete backbone, which often occur in X-ray structures when the chain enters a disordered region.
  • Delete atoms that are present more than once at alternate locations, keeping those with the highest occupancy.
  • Delete molecules that overlap significantly with other molecules and are most likely the result of incorrect PDB format usage, like 1GTV.
  • Delete residues that overlap significantly with other residues and are most likely slightly different ligands bound to the same active site, e.g. BTN/BTQ in 2F01.
  • Delete unknown ligands, i.e. residues named 'UNL' that consist of oxygen only (e.g. 2HBW, 3MPR).
  • Delete water molecules that overlap significantly with other atoms (e.g. in 1US0).
  • Delete alternate amino acids at the same sequence position that are lacking an alternate location indicator (e.g. Cys 123 in 1DIN).
  • Delete metal ions with missing AltLoc indicators on top of each other (1C35).
  • Delete hydrogens that are incorrectly close (e.g. GLN 22 in 1GQV).
  • Delete wrong surplus bonds at hydrogens, e.g. in 1A08 revision 2.
  • Delete covalent bonds involving metal ions; these should be dative pseudo-bonds instead.
  • Delete 'TER' entries between covalently bound residues with the same molecule name.
  • Make external single atoms bound to a residue part of this residue, for example the three oxygens bound to Cys 25 in 9PAP.
  • Fuse 'residues' consisting of single atoms, which are sometimes generated by small molecule modeling programs that do not have a concept of residues.
  • Add missing backbone atoms, depending on the presence of neighboring residues and the side-chain.
  • Add terminal oxygens.
  • Add capping groups to internal chain breaks.
  • Reassign bond orders.
  • Add missing hydrogens.
  • Correct incorrectly shifted atom names.
  • Correct atom names that appear more than once per residue.
  • Correct hydrogen names to ensure that they match the bound heavy atom.
  • Correct hydrogen naming conventions for methylene bridges, amide and guanidine hydrogens to facilitate stereospecific NOE assignments.
  • Correct hydrogen ordering in water molecules (must follow the oxygen to apply fast water simulation algorithms).
  • Correct wrongly defined chemical elements, like OD2 ASP A 186 in 1F6W which is indicated as nitrogen.
  • Check and correct flipped CG and CD atom names in Val and Leu residues.
  • Check and correct OT1/OT2 atom names in amino acids.
  • Check and correct bond orders and oxygen atom names in (de)protonated carboxyl groups.
  • Replace seleno-methionine with methionine.
  • Replace selenium and tellurium atoms with sulfur and rename them accordingly.
  • Add cysteine bridges between close CYS SG atoms that do not already carry a hydrogen.
  • Sort atoms in amino acids and nucleotides in their default order.
  • Sort residues, e.g. re-insert unusual amino acids collected at the end of the soup in the middle of the molecule, where they belong.
  • Split oligosaccharide super-residues (=multiple sugars in a single residue like BCD 371 in 1DMB) into separate residues so that GLYCAM parameters can be used.
  • Merge covalently bound residues with the same ID (residue name, number, insertion code, molecule name).
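
To make one of the cleanup steps above concrete, here is a minimal Python sketch of the alternate-location rule (keep the copy of each atom with the highest occupancy). It is not YASARA's implementation: the function name `keep_highest_occupancy` is hypothetical, the input is assumed to be fixed-column PDB `ATOM`/`HETATM` records, and ties are resolved by keeping the first copy encountered.

```python
def keep_highest_occupancy(pdb_lines):
    """Return ATOM/HETATM lines with one copy per atom, keeping the highest occupancy.

    Hypothetical helper for illustration; assumes standard fixed-column PDB records.
    Other record types (REMARK, TER, ...) are not returned.
    """
    best = {}    # atom key -> (occupancy, line)
    order = []   # keys in order of first appearance
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Key: chain ID, residue number + insertion code, residue name, atom name.
        key = (line[21], line[22:27], line[17:20], line[12:16])
        occupancy = float(line[54:60])
        if key not in best:
            best[key] = (occupancy, line)
            order.append(key)
        elif occupancy > best[key][0]:
            best[key] = (occupancy, line)
    # Blank the altLoc column (column 17) of the surviving copies.
    return [best[k][1][:16] + " " + best[k][1][17:] for k in order]
```

Including the residue name in the key means that two different residues sharing one sequence position are left for the separate cleanup steps that deal with overlapping residues.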


After these initial cleanup steps have been completed, the simulation setup continues with:

  • pH-dependent hydrogen bonding network optimization.[1]
  • pKa prediction and protonation state assignment.[2]
  • Fully automatic embedding for membrane proteins.
  • Fully automatic force field parameter assignment.
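
As a rough illustration of what protonation state assignment means once a pKa is known, the sketch below applies the Henderson-Hasselbalch relation to each titratable site. This is a simplification, not the method of the cited papers, which couple pKa prediction with hydrogen bonding network optimization. The function names, the default pH of 7.4 and the pKa values in the usage example are assumptions made for illustration only.

```python
def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch: fraction of a site that carries the proton at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def assign_protonation(pka_by_site, ph=7.4):
    """Mark a site as protonated when more than half of its population is protonated.

    Hypothetical helper; the default pH of 7.4 is an assumption for this example.
    """
    return {site: protonated_fraction(pka, ph) > 0.5
            for site, pka in pka_by_site.items()}


# Usage with made-up pKa values (not predictions for any real structure):
example = {"ASP 25": 3.9, "HIS 57": 6.8, "LYS 10": 10.5}
states = assign_protonation(example, ph=7.4)
```

With these values only the lysine is marked as protonated, because it is the only site whose pKa lies above the chosen pH.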

References

    [1] Assignment of protonation states in proteins and ligands: combining pKa prediction with hydrogen bonding network optimization.
    Krieger E, Dunbrack RL Jr, Hooft RW, Krieger B (2012) Methods Mol. Biol. 819, 405-421.
    [2] Fast empirical pKa prediction by Ewald summation.
    Krieger E, Nielsen JE, Spronk CA, Vriend G (2006) J. Mol. Graph. Model. 25, 481-486.