Coarse-grained modeling and simulations with YASARA

Experimental structure determination depends on the availability of large numbers of structurally identical copies that can be averaged, either in an X-ray crystal, in an NMR solution, or on a cryo-EM layer. On the mesoscale, where proteins, nucleic acids, and lipids assemble to form enveloped viruses, vesicles, bacterial cells, or eukaryotic cell compartments, the inherent randomness of the assembly processes ensures that no two structures are the same. Electron microscopic images show a large variety of sizes and shapes, which precludes high-resolution structure determination. Molecular modeling offers an alternative route to fill these images with all-atom life, which is needed to improve our structural understanding of these large assemblies, to test hypotheses, to create starting structures for simulation on supercomputers, or for educational purposes.

YASARA Dynamics and YASARA Structure include the functionality to construct such mesoscale models from modular building blocks and to visualize them using all common molecular graphics styles. The model building process relies on an intermediate coarse-grained "pet world" representation whenever all-atom details would be computationally too costly. In the pet world, molecules are scaled to 1/10th of their normal size and represented by pet atoms, which requires only 2% of the original number of atoms (see Figure 2). This allows for efficient collision detection to generate tight packings, as well as large-scale coarse-grained molecular dynamics simulations, e.g. to pack a hypothetical virus genome (see Figure 1). Nucleic acids and entire genomes with binding proteins can be built directly from a FASTA file, which can contain a secondary structure assignment in dot-bracket notation for single-stranded nucleic acids.
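To illustrate what a dot-bracket secondary structure assignment encodes, here is a minimal Python sketch that reads a FASTA-like text and lists the paired positions. It is not part of YASARA and does not use its macros. The file layout (a structure line made only of '.', '(' and ')' following the sequence) and the function names parse_fasta_with_structure and base_pairs are assumptions made for this example; the exact format YASARA expects is described in its user manual.

```python
def parse_fasta_with_structure(text):
    """Parse FASTA records from a string.

    Assumption for this sketch: a line consisting only of '.', '(' and ')'
    is the dot-bracket structure of the current record.
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = {"name": line[1:].strip(), "sequence": "", "structure": ""}
            records.append(current)
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        elif set(line) <= set(".()"):
            current["structure"] += line
        else:
            current["sequence"] += line.upper()
    for rec in records:
        if rec["structure"] and len(rec["structure"]) != len(rec["sequence"]):
            raise ValueError("structure and sequence lengths differ in " + rec["name"])
    return records


def base_pairs(structure):
    """Return sorted 0-based index pairs (i, j) for matching brackets."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


# Hypothetical single-stranded RNA hairpin, used only for illustration.
example = """>hairpin
GGGAAAUCCC
(((....)))
"""

records = parse_fasta_with_structure(example)
pairs = base_pairs(records[0]["structure"])
print(records[0]["name"], pairs)
```

Each '(' pairs with its matching ')', while '.' marks an unpaired nucleotide, so the hairpin above has three base pairs enclosing a four-nucleotide loop.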

The model building procedure is fully automated and employs a collection of open source YASARA macros, which can easily be adapted to build your own models. Coarse-grained pet proteins are mostly rigid during a simulation, so that all proteins of a certain type keep the same initial shape and can later be visualized with identical instances of one single all-atom model, yielding the data compression needed to handle gigastructures with billions of atoms. Pet world simulations are therefore not a faster replacement for all-atom simulations, but are aimed at model building only.
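The data compression gained from identical instances can be sketched in a few lines of Python. This is an illustration of the general instancing idea only, not YASARA's actual data structure. The function names, the tiny two-atom template, and the protein size and copy number in the storage estimate are all hypothetical.

```python
import math


def rotation_z(angle):
    """3x3 matrix for a rotation about the Z axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def transform(point, rotation, translation):
    """Apply a rigid-body transformation (rotate, then translate) to a point."""
    x, y, z = point
    return tuple(
        rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z + translation[r]
        for r in range(3)
    )


def instance_atoms(template, instances):
    """Yield world coordinates on the fly instead of storing every copy."""
    for rotation, translation in instances:
        for atom in template:
            yield transform(atom, rotation, translation)


def stored_numbers(n_atoms, n_instances):
    """Coordinates stored explicitly vs. one template plus 12 numbers per instance."""
    explicit = 3 * n_atoms * n_instances
    instanced = 3 * n_atoms + 12 * n_instances  # 9 rotation + 3 translation values
    return explicit, instanced


# One shared template (two atoms here) and two placements of it.
template = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
instances = [
    (rotation_z(0.0), (0.0, 0.0, 0.0)),
    (rotation_z(math.pi / 2), (10.0, 0.0, 0.0)),
]
coords = list(instance_atoms(template, instances))

# Hypothetical numbers: a 5000-atom protein present in one million copies.
explicit, instanced = stored_numbers(5000, 1000000)
print(len(coords), explicit, instanced)
```

Because each rigid copy needs only a rotation and a translation, the storage cost grows with the number of instances rather than with the total number of atoms.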

An infrastructure for sharing these macros and the generated mesoscale models has been set up as the PetWorld Database, and you are welcome to contribute your own models in exchange for free access to all YASARA stages. Building mesoscale models is also a fun task for molecular modeling courses, where each participant takes care of modeling one of the proteins, and then they join forces to assemble the complete model.

Additional information:

How YASARA visualizes the gigastructures in real-time, including a video of a model with 3.6 billion atoms.

All the details are described in our open access article on the topic [1].

The PetWorld database with a growing collection of mega- and gigastructures.

Detailed building instructions and information on how to get YASARA for free in return for your contribution can be found in the user manual of any YASARA stage (including the free YASARA View) if you browse to Recipes > Build a gigastructure.

References

[1] Assembly of biomolecular gigastructures and visualization with the Vulkan graphics API
Ozvoldik K, Stockner T, Rammner B, and Krieger E (2021). Journal of Chemical Information and Modeling 61, 5293-5303

Figure 1: The video shows the coarse-grained MD simulation of the packaging of a hypothetical self-assembling viral nucleocapsid (consisting of a nucleic acid wrapped around a protein) into a budding particle. The yellow sphere represents the growing membrane. The movie is a time lapse; the computation takes a few hours.

Figure 2: The screenshot shows the transformation from an all-atom model of the Holliday junction (PDB ID 1XNS) with bound DNA to a coarse-grained pet model (scaled up for comparability). Pet atoms are colored by size from blue (0.2 Å) via magenta, red, orange, yellow, and green to cyan (1.1 Å) and gray (1.2 Å). Since the nucleic acid is partly single-stranded, the coarse-grained model uses one red pet atom for each nucleotide.