# We install `gromacs` and `dssp` binaries, pre-built for the colab runtime. Moreover, we install some Python packages, and import the ones we'll need:
# 1. The `vermouth` (provides the `martinize2` script) and `insane` packages from the Martini team
# 2. The `MDAnalysis` package
# 3. The `nglview` molecular viewer (requires enabling custom notebook widgets, which we also do below)
import os
if os.getenv("COLAB_RELEASE_TAG"):
!wget https://www.itqb.unl.pt/labs/multiscale-modeling/downloads/gromacs-2023.3-gcolab_build.tar.gz -q -O - | tar --absolute-names -xzf -
!wget https://www.itqb.unl.pt/labs/multiscale-modeling/downloads/dssp-3.1.4-gcolab_build.tar.gz -q -O - | tar --absolute-names -xzf -
!pip install -q vermouth mdanalysis insane
!pip install -q ipywidgets==7.7.2 nglview
# Needed for nglview to work in colab
from google.colab import output
output.enable_custom_widget_manager()
Tutorials designed by Martin Calvelo (martin.calvelo@gmail.com) and Manuel N. Melo (m.n.melo@itqb.unl.pt). Check out their repository.
These tutorials were designed to run in Collab. Alternatively you can follow along in your own system.
Ceramide — VDAC1 binding in a membrane
Tutorial notes
In this tutorial you’ll find two directories. Directory
worked
has all the intermediate steps and results, whereas directoryminimal
has only the essential files to get you going, and you’ll have to do most of the work yourself. Feel free to follow the tutorial using either approach, or even a mix of the two.This tutorial has been designed to be run in a T4 GPU Colab runtime (you can check and change this in the upper right ‘Connect’ menu). Click on “Change runtime type” and select Python 3 (default) and T4 GPU.
Alternatively, you can download it and run it in your own system. You will need to have
gromacs
anddssp
installed, as well as the Python librariesvermouth
,mdanalysis
,insane
, andnglview
.Much of the tutorial involves running shell commands. In Colab, this is achieved by prepending an exclamation mark to the command (
!some shell_command
). You can see this exemplified below in the installation. Note that if you need to change directories, the exclamation mark method won’t work; you need to use a percent sign, and run the command in its own cell (%cd target_dir
).To run any of the following cells of code, you can type Shift + Enter to excute the code in a cell.
Step 0 - Downloading and installing of the tutorial software and materials.
The text output of these steps was minimized so as not to overcrowd the notebook. Feel free to take out the -q
flags or add the -v
flag to tar
if you want full output or for debugging purposes.
import nglview as nv
import MDAnalysis as mda
We download and unpack the tutorial’s material.
!wget https://zenodo.org/records/13760505/files/1-Tutorial_CER-VDAC1.zip
!unzip 1-Tutorial_CER-VDAC1.zip
We will start in subdirectory minimal
, but feel free to change to worked
.
%cd minimal
Step 1 - Introduction
The role of mitochondria in apoptosis is well established, and a number of protein–protein interactions in its outer membrane has been linked to apoptotic pathways. The VDAC (voltage-dependent anionic channels) family of proteins is central to this binding, having been implicated in the retention of apoptotic proteins in the outer mitochondrial membrane\(^{1}\) and self-oligomerization\(^{2}\) — an event also linked to apoptosis.
In a recent study we have ascertained that ceramide binds VDAC proteins (via a binding site involving VDAC1’s Glu73) and that such binding somehow triggers apoptosis.\(^{3}\) An immediate hypothesis is that the presence of ceramide affects the binding profile of VDAC and promotes apoptosis-associated oligomerizations. Since that work was done using the (now outdated) Martini 2 model, we will recreate in this tutorial the same ceramide binding simulations, but using the state-of-the-art Martini 3 model.\(^{4}\)
Sci. Rep. 6, 32994; doi: 10.1038/srep32994 (2016)
Mol. Cell. Biol. 30, 5698; doi: 10.1128/MCB.00165-10 (2010)
Nat. Commun. 10, 1832; doi: 10.1038/s41467-019-09654-4 (2019)
Nat. Methods 18, 382; doi: 10.1038/s41592-021-01098-3 (2021)
Step 2 - System preparation
In your directories you’ll find the 4c69.pdb file, containing the X-ray structure of the murine VDAC1 protein. As it is a beta-barrel, it should be clear how it orients in a membrane.
Before coarse graining anything we visualize our starting protein structure using nglview. Other viewers, such as pymol or VMD are also great, if you run this workflow outside Colab.
= nv.NGLWidget()
view = view.add_component(nv.FileStructure('4c69.pdb'))
at view
Coarse-graining with martinize2
For CG MD simulation of a protein one needs the CG coordinates and the CG topology. Both can be obtained with the martinize2
tool (despite its name, it works with Martini 3).
Run martinize2
with the -h
flag for help deciding how to run it. You’ll need to: * specify 4c69.pdb
as the input structure; * specify an output CG structure (let’s call it VDAC1_cg.pdb
); * specify an output .top
topology master file (let’s call it topol.top
); * specify the target forcefield. Martini 3 has code martini3001
; * ask for an elastic network to be set, restraining secondary/tertiary structure; * ask for any disulfide bridges to be automatically detected; * specify where martinize2
can find a DSSP executable, so that secondary structure per residue can be determined (in the current install it is in your path, under the name mkdssp
); * ask for some non-protein residues that are present in the pdb structure to be ignored (since 4c69.pdb
is being used without any prior cleaning, several extra molecules are present, namely HOH
, ATP
, MC3
, and LDA
).
Additionally, if needed, you can ask martinize2
to ignore some occurring warnings: you can expect several about multiple conformations in 4c69.pdb
, and one about the version of DSSP being used.
#Try yourself!
#Uncomment to run
#!martinize2 -f ...
# @title Solution
!martinize2 -f 4c69.pdb \
-x VDAC1_cg.pdb \
-o topol.top \
-ff martini3001 \
-elastic \
-scfix \
-cys auto \
-dssp mkdssp \
-ignore HOH ATP MC3 LDA \
-maxwarn 42
Have a look at the produced files with and editor or a molecular structure visualizer. Observe the atomic structure and the CG beads overlaid (in white and yellow spheres).
view.remove_component(at)
= nv.FileStructure('4c69.pdb')
aa = nv.FileStructure('VDAC1_cg.pdb')
cg
=False)
view.add_component(aa, default=False)
view.add_component(cg, default
=0)
view.clear_representations(component=1)
view.clear_representations(component
='Protein', component=0)
view.add_licorice(selection=7.5, opacity=0.45, component=1)
view.add_ball_and_stick(aspectRatio
view.center() view
Embedding a protein in a membrane using insane
The insane
script creates CG membranes by placing lipids in a grid. It can also embed proteins as it constructs a membrane. Insane leverages the robustness of CG to start from artificial conditions and eventually equilibrate.
Run insane
with the -h
flag for help deciding how to run it. You’ll need to: * specify VDAC1_cg.pdb
as the input protein to embed; * specify an output .top
topology master file (let’s call it membrane.top
); * specify an output structure (let’s call it membrane_prot.gro
); * specify lipid proportions (95:5 POPC:ceramide) with -l
; * specify the dimensions of the simulation box (x=12, y=12 and z=10) with the -x
, -y
and -z
flags; * ask for centering the protein in the vertical axis; * ask for orienting the protein tilt in the membrane; * add waters (-sol W
) as solute.; * ask for an ionic strength of 0.15 M; * neturalize the system with additional counterions.
#Try yourself!
#Uncomment to run
#!insane -f ...
# @title Solution
!insane -f VDAC1_cg.pdb \
-l POPC:95 -l DPCE:5 \
-x 12 -y 12 -z 10 \
-o membrane_prot.gro \
-p membrane.top \
-center \
-orient \
-sol W \
-salt 0.15 \
-charge auto
You should obtain a protein–membrane structure. Open it with a molecular visualizator. Here, you can see the protein in the membrane, with the POPC lipids in grey and the ceramide in green. Note that, for clarity, water and ions were not displayed.
= nv.NGLWidget()
view_membrane_prot 'membrane_prot.gro'))
view_membrane_prot.add_component(nv.FileStructure('licorice', selection='protein')
view_membrane_prot.add_representation('surface', selection='POPC', color='grey')
view_membrane_prot.add_representation('surface', selection='DPCE', color='green')
view_membrane_prot.add_representation( view_membrane_prot
You may notice that despite the use of the -orient flag, insane was not able to completely align VDAC’s axis with the membrane normal. This is ok, again because the robustness of CG will allow the system to quickly evolve into the correct orientation.
Cleaning up run-files
At this point you’ll have the structure needed to run simulations, but the topologies/topology headers need some cleaning:
- Martinize2 gives not very informative default names to files and molecules. Rename
molecule_0.itp
toVDAC1.itp
. Inside it, also change the molecule name frommolecule_0
toVDAC1
:
!cp molecule_0.itp VDAC1.itp
!awk '{gsub(/molecule_0/, "VDAC1"); print}' VDAC1.itp > temp && mv temp VDAC1.itp
- In
topol.top
:
- See how the file looks:
!cat topol.top
- Add the last 7 lines from membrane.top. These are the number of molecules of each type created by insane:
!tail -n 7 membrane.top >> topol.top
- Change the molecule reference from
molecule_0
toVDAC1
in the molecules section:
!awk '{gsub(/molecule_0/, "VDAC1"); print}' topol.top > temp && mv temp topol.top
- At the top, change the #include statement so it points to
martini_v3.0.0.itp
(this .itp must always be the first one to include):
!awk '{gsub(/include "martini\.itp"/, "include \"martini_v3.0.0.itp\""); print}' topol.top > temp && mv temp topol.top
- Add extra
.itp
files for the lipid, ceramide, water and ions:
!awk 'NR==4 {print "#include \"martini_v3.0.0_phospholipids_v1.itp\""; print "#include \"DPCE.itp\""; print "#include \"martini_v3.0.0_solvents_v1.itp\""; print "#include \"martini_v3.0.0_ions_v1.itp\""} 1' topol.top > temp && mv temp topol.top
- Change the ion names from NA+/CL- to Na and CL, respectively (insane uses older Martini 2 nomenclature):
!awk '{gsub(/NA\+/, "Na "); gsub(/CL-/, "CL "); print}' topol.top > temp && mv temp topol.top
- See how
topol.top
looks now and compare it with before:
!cat topol.top
Step 3 - Simulation
You can now run the simulations. In CG it is enough to perform a short energy minimization and a single pressure/temperature equilibration step before production. In GROMACS, we need to use the command gmx grompp
to create a processed binary file (with the extension .tpr
) for running minimization/MD calculations. The instructions for the calculation are in .mdp
files.
Energy minimization
The em.mdp
file has the instructions for the energy minimization. Create a .tpr
file (let’s call it em.tpr
) for performing it. Use the flag -h
if you need help. Use the flag maxwarn 1
to ignore a warning that crops up because of unimportant mismatches in the way insane names atoms and how they’re defined in topologies.
#Try yourself!
#Uncomment to run
#!gmx grompp ...
# @title Solution
!gmx grompp -f em.mdp -p topol.top -c membrane_prot.gro -o em.tpr -maxwarn 1
Run now the minimization with:
!gmx mdrun -deffnm em
Equilibration
The equilibration step makes use of the eq.mdp
runfile. Prior to using it with grompp
, you’ll have to create an index file defining the beads that are part of the protein/membrane and those that are part of the solvent. This is needed because the temperature of the two phases will be thermostatted independently and we need to tell grompp which atoms get temperature-coupled together. To make the index run:
%%bash
-f em.tpr <<EOF
gmx make_ndx "W" | "ION"
17 Solvent
name ! 17
18 Bilayer
name
a BB
q EOF
Armed with this index and the minimized structure, you can use grompp
and the file eq.mdp
for genereting the .tpr
file to perform the equilibration. Use the flat -n
to provide the index.ndx
file.
#Try yourself!
#Uncomment to run
#!gmx grompp ...
# @title Solution
!gmx grompp -f eq.mdp -p topol.top -c em.gro -n index.ndx -o eq.tpr -maxwarn 1
Run now the equilibration:
!gmx mdrun -v -deffnm eq
Production
Finally, we can create the .tpr
and run the production using the md.mdp
file.
#Try yourself!
#Uncomment to run
#!gmx grompp ...
# @title Solution
!gmx grompp -f md.mdp -p topol.top -c eq.gro -n index.ndx -o md.tpr
Since this step is quite computationally expensive, we’ll take the trajectory file (md.xtc
) from the worked
directory (also, some additional files to avoid incompatibilities).
!cp ../worked/md.xtc .
!cp ../worked/index.ndx .
!cp ../worked/eq.gro .
!cp ../worked/md.tpr .
Step 4 - Analysis
Trajectory fitting
It is useful — mostly for visualization — to have all the protein in all simulation frames centered and fit, so that all observed movement is relative to it, rather than to the box. At the same time, we will output only the lipids/protein to make the resulting trajectories lighter. This can be done with the following series of commands:
%%bash
'Protein' 'Bilayer' | gmx trjconv -f md.xtc -s md.tpr -center -pbc mol -o md_pbc.xtc -n index.ndx echo
With the previous command we are centering on the Protein
and only inlcuding the group Bilayer
(Lipids+Protein, without water and ions) in the output. The flag -pbc mol
will make molecules whole.
Let’s do the same as above but only on the starting structure of the production, producing one centered structure (useful later for use with VMD).
%%bash
'Protein' 'Bilayer' | gmx trjconv -f eq.gro -s md.tpr -center -pbc mol -o md_pbc.gro -n index.ndx echo
Let’s convert the production tpr file into another with fewer atoms compatibl with our centered trajectory (since we now ignore the solvent).
%%bash
'Bilayer' | gmx convert-tpr -s md.tpr -n index.ndx -o fit.tpr echo
Now, fits each frame’s protein to the conformation in fit.tpr
, by aligning backbone beads.
%%bash
'BB' 'Bilayer' | gmx trjconv -f md_pbc.xtc -s fit.tpr -fit rotxy+transxy -o md_fit.xtc -n index.ndx echo
Download md_pbc.gro
and md_fit.xtc
and take a look at the resulting trajectory with VMD — possibly showing only VDAC1 and the ceramides — and see how trajectory fitting highlights protein–ceramide interactions (see section 4.3 for visualization tips and advanced analyses using VMD)
Ceramide contact counting
To count contacts we first get a list of closest distances between any ceramide and each residue of the protein, over time, using gmx mindist
. Display first the help options with the -h
flag.
!gmx mindist -h
Try now to calculate the contacts using gmx mindist. You will need to select the groups Protein
and DPCE
.
#Try yourself!
#Uncomment to run
#%%bash
#echo 'Protein' 'DPCE' | gmx mindist ...
# @title Solution
!echo 'Protein' 'DPCE' | gmx mindist -respertime -s fit.tpr -f md_fit.xtc -or res_dists.xvg
The included script contact_fraction.py
converts these distances into contacts, and represents them as the fraction of the total trajectory time during which a given residue was in contact with a ceramide. It assumes a 6 Å cutoff for considering contacts. Run it like this:
!python3 contact_fraction.py
import numpy as np
import matplotlib.pyplot as plt
= np.loadtxt('res_dists.xvg', comments=('@', '#'))
mindists = mindists[:,1:] <= 0.6
contacts = contacts.mean(axis=0)
avg_contact_fraction 1, len(avg_contact_fraction) + 1), avg_contact_fraction)
plt.plot(np.arange( plt.show()
The script’s output already shows some regions of preferred contact, but it is not so clear among all the contacts whether there is a binding site (ceramides do seem to also bind on the region around Glu73, which is encouraging). In the next section we perform a more visual contact analysis.
Conclusion
Congratulations! You have completed this tutorial! If you want, you can continue by clicking here and performing some analyses with the VMD software that you will find in sections 4.3 and 4.4.
If not, you can take another one of our tutorials. What do you think about the topic of lipidomics and antimicrobial peptides?
For more information, visit the official Martini web https://cgmartini.nl/.