Endonuclease PvuII (1PVI) DNA - GATTACAGATTACA
CAP - Catabolite gene Activating Protein (1BER)
DNA - GATTACAGATTACAGATTACA Endonuclease PvuII bound to palindromic DNA recognition site CAGCTG (1PVI) DNA - GATTACAGATTACAGATTACA TBP - TATA box Binding Protein (1C9B)
CAP - Catabolite gene Activating Protein (1BER)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
TBP - TATA box Binding Protein (1C9B)
 

Multi CPU support in YASARA

Parallel MD

Back in 2004, the exponential increase in desktop computing power came to a sudden end. After more than two decades of strict adherence to Moore's law, the laws of physics had their revenge and prevented CPU manufacturers from raising the clock frequency beyond 3 to 4 GHz. The solution to the problem has been borrowed from the supercomputing world: combining multiple CPUs, initially on the same mainboard, and in the mean time also on the same silicon die. Starting with dual core CPUs in 2005, we arrived at quad core CPUs in 2006. On high-end mainboards with four CPU sockets, that sums up to a massive 16 CPU cores in total, which all need to be kept busy.

Even though some compilers provide rudimentary support for multiple CPUs, in practice the application itself must divide and conquer, i.e. break the task into multiple subtasks ("threads") and distribute them among the available CPUs ("multi-threading").

YASARA provides advanced multi-threading functions, that avoid slow-downs caused by the operating system's too general and thus sub-optimal process scheduler and memory allocation interface. As an example, the figure on the right shows the helical SNARE protein complex, kindly provided by Dr. Marc Baaden at the Institut de Biologie Physico-Chimique, CNRS Paris, who is studying SNARE's involvement in membrane fusion. During a parallel simulation, this large system of 360000 atoms is dynamically split into segments, which are assigned to the available CPU cores. The plot below shows the speedups obtained for this simulation when using one to four processors on a workstation with two dual-core Opteron 265 CPUs:

Parallel MD

Figure 1: Number of simulation steps (2 fs) completed per second as a function of available CPUs. AMBER99 force field, periodic boundaries, 7.8 Å cutoff, 360000 protein, membrane and water atoms.

An inherent disadvantage of multi-threading is the loss of reproducibility: if multiple CPUs work in parallel, certain instructions (e.g. additions) will inevitably be executed in a different order. While this does not make a difference from the mathematician's point of view, it makes a tiny small difference from the CPU's point of view, where additions are not completely associative: A+(B+C) may yield a slightly different result than (A+B)+C. The practical consequence is that molecular dynamics simulations become non-reproducible, i.e. running the same simulation a second time will result in a different trajectory. YASARA contains special functionality to avoid this problem and ensure that two simulations run with the same parameters on the same number of CPUs always yield identical trajectories, a feature which is crucial for some important applications of molecular dynamics simulations.