Tutorial I.IV - Notes, Limitations, Summary, and Outlook

This section is part of "Martini 3 protein models - a practical introduction to different structure bias models and their comparison".

In case of issues, please contact duve@fias.uni-frankfurt.de, luis.borges@ens-lyon.fr, or thallmair@fias.uni-frankfurt.de.

If you would like to refer to information of this tutorial, please cite T. Duve, L. Wang, L. Borges-Araújo, S. J. Marrink, P. C. T. Souza, S. Thallmair, Martini 3 Protein Models - A Practical Introduction to Different Structure Bias Models and their Comparison, bioRxiv (2025), doi: 10.1101/2025.03.17.643608.

I.IV.1 Notes and Limitations

Before summarizing, we provide some additional notes pointing the reader to advanced aspects of Martini 3 protein models. We then discuss their most important limitations.

I.IV.1.1 Notes

Note 1 – Further optimization of protein structures. In Section I.II.1.1, it was demonstrated how the biasing layer of Martini proteins can be adjusted by changing global settings such as the cut-off or the bias strength. With more time spent on optimization, these aspects can also be tuned locally, i.e., by removing or adjusting only part of the biasing network. This can be used, for instance, to improve regions that are too rigid or too flexible in comparison to all-atom reference simulations (as measured by, e.g., RMSF analysis), or to improve the shape fluctuations of a binding pocket or entrance pathway. Although in principle this can often be done manually, the MAD server offers a graphical interface for adding or removing bonds [40].
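As a minimal sketch of how such regions could be identified, the following Python snippet compares per-residue RMSF profiles from a CG and an all-atom simulation and labels residues whose difference exceeds a tolerance. The function name, the tolerance of 0.05 nm, and the RMSF values are illustrative assumptions and not part of the tutorial workflow; in practice, the profiles would come from an RMSF analysis of the respective trajectories.

```python
def flag_residues(rmsf_cg, rmsf_aa, tolerance=0.05):
    """Label each residue by comparing CG and all-atom RMSF values (in nm).

    `tolerance` is a hypothetical threshold chosen for illustration only.
    """
    if len(rmsf_cg) != len(rmsf_aa):
        raise ValueError("RMSF profiles must have the same length")
    flags = []
    for cg, aa in zip(rmsf_cg, rmsf_aa):
        diff = cg - aa
        if diff < -tolerance:
            flags.append("too rigid")      # CG fluctuates less than the reference
        elif diff > tolerance:
            flags.append("too flexible")   # CG fluctuates more than the reference
        else:
            flags.append("ok")
    return flags


# Made-up per-residue RMSF values in nm, for illustration only.
rmsf_aa = [0.08, 0.10, 0.25, 0.30, 0.09]
rmsf_cg = [0.07, 0.11, 0.12, 0.45, 0.10]

flags = flag_residues(rmsf_cg, rmsf_aa)
for resid, flag in enumerate(flags, start=1):
    print(resid, flag)
```

Residues labelled "too rigid" are candidates for weakening or removing local bias terms, while residues labelled "too flexible" may benefit from additional or stronger ones.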

Note 2 – Multiple reference structures in one local minimum. In Section I.I, a single reference structure was used to generate the protein model. Potential artifacts and rare conformations of this reference, such as specific conformations of flexible loops or side chain orientations influenced by the crystal packing, are thus incorporated in the model. One way to improve this is to use multiple reference structures, which can come either from experiment, for instance NMR structures, or from atomistic reference simulations. Based on the contact frequency within the set of reference structures, contacts can be removed from the structure bias model if they do not occur in the majority of the reference structures. Using atomistic reference simulations has been shown to improve in particular the flexibility of loops for an exemplary test set of six proteins with the GōMartini 3 protein model [13]. The OLIVES approach currently allows the use of multiple references for the generation of the contact map [17].
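The majority criterion can be illustrated with a short Python sketch. It assumes that a contact list (pairs of residue indices) has already been computed for each reference structure; the function name, the threshold of 0.5, and the example contacts are hypothetical and do not reproduce the implementation used in [13] or [17].

```python
from collections import Counter


def majority_contacts(contact_sets, min_fraction=0.5):
    """Return the contacts found in more than `min_fraction` of the references."""
    if not contact_sets:
        raise ValueError("at least one reference contact set is required")
    counts = Counter()
    for contacts in contact_sets:
        # Sort each pair so that (i, j) and (j, i) are counted as the same contact.
        counts.update({tuple(sorted(pair)) for pair in contacts})
    n_refs = len(contact_sets)
    return {pair for pair, n in counts.items() if n / n_refs > min_fraction}


# Residue-index pairs in contact in three hypothetical reference structures.
references = [
    {(1, 5), (2, 8), (3, 9)},
    {(1, 5), (2, 8)},
    {(1, 5), (9, 3), (4, 10)},
]

kept = majority_contacts(references)
print(sorted(kept))
```

Here the contact (4, 10) occurs in only one of three references and is dropped, whereas the other contacts occur in at least two references and are kept.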

Note 3 – Reference structures from two conformational states. The multiple-basin GōMartini approach combines two GōMartini 3 models for two distinct states of the protein using an exponential mixing scheme [41]. This extension makes it possible to investigate conformational transitions of proteins and to identify intermediate conformational states. Note that the program package OpenMM [42] is required for the multiple-basin GōMartini approach. A simpler way of modelling conformational transitions is to switch between two GōMartini models [43]. Reference structures from two conformational states can also be used to generate an OLIVES model [17].
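To give an impression of what exponential mixing of two potentials means, the sketch below implements one commonly used generic form, exp(-beta*V) = exp(-beta*(V_A + eps_A)) + exp(-beta*(V_B + eps_B)). This form, the parameter names (beta, eps_a, eps_b), and the example values are assumptions made for illustration; the exact functional form and parameters of the multiple-basin GōMartini approach should be taken from [41].

```python
import math


def mixed_energy(v_a, v_b, beta=1.0, eps_a=0.0, eps_b=0.0):
    """Combine two single-basin energies by exponential mixing.

    Assumed generic form (not necessarily that of [41]):
        V = -(1/beta) * ln(exp(-beta*(v_a + eps_a)) + exp(-beta*(v_b + eps_b)))
    `beta` controls how sharply the basins are mixed; `eps_a` and `eps_b`
    shift the relative energies of the two basins.
    """
    a = -beta * (v_a + eps_a)
    b = -beta * (v_b + eps_b)
    m = max(a, b)  # log-sum-exp trick to avoid overflow for large energies
    return -(m + math.log(math.exp(a - m) + math.exp(b - m))) / beta


# Far from the transition region, the mixed energy follows the lower basin.
print(mixed_energy(-100.0, 0.0))
# Where both basins have equal energy, the mixed energy lies slightly below it.
print(mixed_energy(0.0, 0.0))
```

The mixed surface is smooth and always lies at or below the lower of the two single-basin energies, which is what allows transitions between the two states.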

Note 4 – OliGōmers. An extension to the GōMartini 3 model named OliGōmers offers a computationally efficient way to model protein oligomers using the Gō-like model as structure bias model [44]. Multimeric proteins require the definition of separate .itp files for each monomer with the GōMartini 3 model. This is necessary because the virtual sites which take care of the Gō-like interactions do not distinguish between the intra- and intermolecular interactions. Thus, large oligomers such as fibrils are challenging to describe. This issue is solved via a multi-layer virtual site scheme in the OliGōmers implementation.

Note 5 – In silico single molecule force spectroscopy. The use of Gō-like models as structure bias models for Martini proteins opened the way to efficient single molecule force spectroscopy calculations [13, 18, 45, 46]. The key feature of these Martini protein models was the ability of proteins to unfold due to breakable Gō-like interactions. To faithfully reproduce experimental force-extension data of protein complexes, it is often necessary to add Gō-like interactions at the protein-protein interfaces [13, 47].

Note 6 – Visualization of protein structures. Most visualization packages, including the popular VMD [48], cannot draw correct bonds based on Martini structure files alone and require topology files to be loaded as well. However, many Martini models (including proteins) now make extensive use of interaction types like virtual sites, requiring dedicated methods to allow VMD to correctly depict the CG topology. The MartiniGlass package, freely available on GitHub (https://github.com/Martini-Force-Field-Initiative/MartiniGlass), provides a suite of tools and associated scripts to enable visualization of Martini systems in VMD. The program has a particular focus on visualizing protein secondary/tertiary structure networks, although MartiniGlass can in fact be used to reconstruct bonded networks of any Martini molecule.

I.IV.1.2 Limitations

Limitation 1 – Changes in secondary structure. Currently, the Martini 3 protein model is not able to capture folding processes and cannot fully capture changes in secondary structure, at least when using the standard protocols to build the CG protein models. One potential reason for this limitation is the lack of directionality, which is important for certain interactions such as hydrogen bonds and T-shaped interactions of aromatic units [6]. In addition, some of the bonded terms of the current Martini 3 protein model depend on the secondary structure if secondary structure information from DSSP is used for the model. One promising development towards secondary structure changes with respect to Martini 2 proteins is, however, that the particle type of the backbone no longer depends on the secondary structure [10, 13].

Limitation 2 – Protonation states and Martini sour. In standard molecular dynamics protocols, it is common to assume a fixed protonation state of titratable groups such as the side chains of arginine, glutamic acid, and histidine. However, this is not always realistic, for instance if the pH is close to the pKa value of the titratable group or if the environment modulates its pKa. Constant pH approaches or the titratable Martini model offer ways to introduce changes of protonation states [49–51]. In titratable Martini, a dedicated class of beads, so-called titratable beads, can bind or release a proton particle mimicking H+. The titratable Martini model and constant pH approaches pave the way to study the impact of pH on protein conformations [52].
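Why a fixed protonation state becomes questionable near the pKa can be seen from the Henderson-Hasselbalch relation, which gives the equilibrium fraction of protonated groups at a given pH. The snippet below is a minimal sketch for an ideal, isolated titratable group; the function name and the pKa of 6.5 are illustrative assumptions, and real pKa values in proteins shift with the local environment.

```python
def protonated_fraction(ph, pka):
    """Fraction of a titratable group in its protonated form.

    Follows from the Henderson-Hasselbalch relation for an ideal,
    non-interacting site: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical side chain with pKa = 6.5 (illustrative value only).
pka = 6.5
for ph in (4.5, 6.5, 7.4, 8.5):
    print(f"pH {ph}: protonated fraction = {protonated_fraction(ph, pka):.3f}")
```

At pH = pKa, half of the groups are protonated, so neither fixed state is representative. Two pH units away from the pKa, one state dominates with about 99 % population and a fixed protonation state is a reasonable approximation.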

Limitation 3 – Effective time scale. Inherent to coarse-graining is the loss of degrees of freedom. This smoothens the potential energy surface and results in an effective time scale for CG simulations, which complicates direct comparison to experimental and atomistic simulation data. A challenge is that the scaling factor between real time and the effective CG time is system dependent. Here, we estimated a scaling factor of 4 to compare atomistic and CG data [17, 34]. However, please be aware that this is just an estimate and no universal scaling factor for Martini 3.
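In practice, applying the scaling factor is a simple multiplication. The sketch below assumes the usual Martini convention that the nominal CG simulation time is multiplied by the factor to obtain the effective time that is compared with atomistic data; the function and constant names are hypothetical, and the factor of 4 is the estimate used in this tutorial rather than a universal value.

```python
# Estimate used in this tutorial; system dependent and not universal for Martini 3.
SCALING_FACTOR = 4.0


def effective_time_ns(cg_time_ns, scaling_factor=SCALING_FACTOR):
    """Convert nominal CG simulation time (ns) to effective time (ns).

    Assumes that CG dynamics is faster than atomistic dynamics, so the
    nominal CG time is multiplied by the scaling factor.
    """
    return cg_time_ns * scaling_factor


# A nominal 250 ns CG trajectory would then be compared with
# roughly 1000 ns of atomistic simulation.
print(effective_time_ns(250.0))
```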

Limitation 4 – Hydrophilicity of Martini 3 proteins. In the Martini 3 protein model, the backbone is mapped to one bead type independent of the secondary structure. This is a step towards a protein model which is able to change its secondary structure. However, it also means that the partitioning of the backbone can no longer be used to compensate for possible differences in hydrophilicity of secondary structure elements. There have been reports of overly hydrophilic single-pass transmembrane helices and β-sheet dimerizing peptides [53, 54], and of overly hydrophobic intrinsically disordered proteins [36, 55, 56]. Adapting the backbone-water interactions (see Section 4.1) and refining the dihedral potentials in the Martini3-IDP force field, respectively, offer ways to improve the behaviour of Martini 3 proteins. Note that alternative approaches such as rescaling of the full protein-protein or protein-water interactions have to be applied with care [36, 53, 55, 56], because they also modify the well-balanced side chain partitioning [13].

Limitation 5 – Non-canonical amino acids and post-translational modifications. While simple post-translational modifications such as disulfide bridges, phosphorylation, and capped termini can be straightforwardly included in Martini 3 protein models via Martinize2 [15], more complex modifications remain a challenge. Lipidation is available in Martini 3 [57], as it was in Martini 2 [58], and PEGylation can be incorporated using Polyply [20, 39]. However, glycosylation is still under development, alongside the development of carbohydrate models [59, 60]. Furthermore, differences in chirality, such as modelling D-amino acids, are not yet supported, as current protein models only consider L-amino acids. The same applies to other chemical, non-biological modifications often used in the design of biomimetic peptides. These limitations restrict the ability to study non-canonical and heavily modified proteins without extensive manual parameterization or specialized tools.

I.IV.2 Summary and Outlook

In this tutorial, we introduced the reader to setting up and characterizing a Martini 3 protein model with different structure bias models, namely Elastic Network [3, 15, 16], GōMartini [13], and OLIVES [17]. In addition, we introduced two options to model IDRs and to model multi-domain proteins with structured and intrinsically disordered regions [13, 38]. Finally, we discussed some advanced aspects of modelling CG proteins as well as limitations of the current Martini 3 protein model.

The Martini 3 protein model represents a significant advancement in coarse-grained molecular dynamics, offering improved flexibility and accuracy compared to its predecessors. With its ability to model a wide range of biomolecular systems, including structured and intrinsically disordered proteins, Martini 3 opens the door to studying complex biological phenomena on time and length scales inaccessible to atomistic simulations, extending its potential toward simulations of entire organelles and even cells [61, 62]. However, continued efforts are required to address existing limitations, such as modelling structural changes ranging from local dynamics to global transitions, including the accurate representation of multiple folded states of the same protein. Additionally, further developments are needed to incorporate non-canonical amino acids and fully integrate diverse post-translational modifications. Advances in tools like Martinize2 [15] and Polyply [39], together with ongoing developments of improved models like Martini3-IDP [38], promise to expand the applicability of Martini 3 to even more diverse systems, including biomimetic designs and systems influenced by dynamic environmental conditions like pH. As these developments unfold, Martini 3 is poised to become an even more powerful tool for researchers across biophysics, drug development, and biomaterials design, bridging the gap between detailed atomistic insights and large-scale biological phenomena.