Tricks of the trade

[FORTRAN-90] PDB file format for FORTRAN coding

---------------------------------------------------------------------------

Field |    Column    | FORTRAN |                                         

  No. |     range    | format  | Description                                   

---------------------------------------------------------------------------

   1. |    1 -  6    |   A6    | Record ID (eg ATOM, HETATM)       

   2. |    7 - 11    |   I5    | Atom serial number                            

   -  |   12 - 12    |   1X    | Blank                                         

   3. |   13 - 16    |   A4    | Atom name (eg " CA " , " ND1")   

   4. |   17 - 17    |   A1    | Alternative location code (if any)            

   5. |   18 - 20    |   A3    | Standard 3-letter amino acid code for residue 

   -  |   21 - 21    |   1X    | Blank                                         

   6. |   22 - 22    |   A1    | Chain identifier code                         

   7. |   23 - 26    |   I4    | Residue sequence number                       

   8. |   27 - 27    |   A1    | Insertion code (if any)                       

   -  |   28 - 30    |   3X    | Blank                                         

   9. |   31 - 38    |  F8.3   | Atom's x-coordinate                         

  10. |   39 - 46    |  F8.3   | Atom's y-coordinate                         

  11. |   47 - 54    |  F8.3   | Atom's z-coordinate                         

  12. |   55 - 60    |  F6.2   | Occupancy value for atom                      

  13. |   61 - 66    |  F6.2   | B-value (thermal factor)                    

   -  |   67 - 67    |   1X    | Blank                                         

  14. |   68 - 70    |   I3    | Footnote number                               

---------------------------------------------------------------------------

[FORTRAN-90] A generic model FORTRAN-90 program to read and analyze gro 

!-------------------------------------

! Describe the function of the program

! Language: FORTRAN 90

!-------------------------------------

! Author: Dr. Saumyak Mukherjee

! Date  : 

!-------------------------------------


module error   

  implicit none

  save

  character(1024), parameter :: err1 = 'ERROR! Incorrect input format!'

  character(1024), parameter :: err2 = 'executable <in-file.gro> <out-file> <btime(ps)> <etime(ps)> <dt(ps)> <npoint>'

end module error


program main

  implicit none


  integer         , parameter :: ncode =

  character(*)    , parameter :: gro = '(i5, 2a5, i5, 3f8.3, 3f8.4)'

  double precision, parameter ::


  real             :: start, finish, toe, t1, t2, time

  integer          :: i, j, k

  integer          :: n2, natom, nsol, npoint, nion

  integer          :: bstep, estep, nstep, nskip

  integer          :: resid, atomid

  logical          :: ex

  character (5)    :: res, atom

  character (1024) :: fgro, fout

  double precision :: btime, etime, dt

  double precision :: lx, ly, lz

  double precision :: rx, ry, rz, r


  integer, dimension (ncode) :: code


  double precision, allocatable, dimension (:,:) :: x, y, z


  call system ("rm -f *.mod")

  call cpu_time (start)

  call timeDate


  print*, '----------------------------'

  print*, '     program function       '

  print*, '----------------------------'

  print*, ''


  call argChar (1, fgro  )

  call argChar (2, fout  )

  call argReal (3, btime )

  call argReal (4, etime )

  call argReal (5, dt    )

  call argInt  (6, npoint)


  inquire (file = trim (fgro), exist = ex)

  if (ex .eqv. .false.) then

     print*, ''

     print*, 'ERROR! '//trim (fgro)//' does not exist in this directory!'

     print*, ''

     stop

  endif


  inquire (file = trim (fout), exist = ex)

  if (ex .eqv. .true.) call system ("mv "//trim(fout)//" copy_"//trim(fout))


  open (unit = 1 , file = trim (fgro), action = 'read' , status = 'old'    )

  open (unit = 11, file = trim (fout), action = 'write', status = 'unknown')


  bstep = anint (btime / dt) + 1

  estep = anint (etime / dt) + 1

  nstep  = estep - bstep + 1


  nskip = bstep - 1


  read (1,*)          !

  read (1,*) n2       ! READING THE NUMBER OF ATOMS/WATER MOLECULES

  nsol = n2 / npoint  ! FOR ONLY WATER IN THE SYSTEM

  rewind (1)          !


  call groParams (1, natom, nsol, nion, npoint)  ! READING SYSTEM PARAMETERS FOR PROTEINS SYSTEMS


  print*, '--------------------------------------------'

  print'(x,a33,x,i10)'  , "Total number of atoms           =", n2

  print'(x,a33,x,i10)'  , "Total number of protein atoms   =", natom

  print'(x,a33,x,i10)'  , "Total number of water molecules =", nsol

  print'(x,a33,x,i10)'  , "Total number of ions            =", nion

  print'(x,a33,x,f10.3)', "Start time (ps)                 =", btime

  print'(x,a33,x,f10.3)', "End time (ps)                   =", etime

  print'(x,a33,x,f10.3)', "Time step (ps)                  =", dt

  print'(x,a33,x,f10.3)', "Total calulation time (ps)      =", (nstep-1)*dt

  print'(x,a33,x,i10)'  , "Start time step                 =", bstep

  print'(x,a33,x,i10)'  , "End time step                   =", estep

  print'(x,a33,x,i10)'  , "Number of steps to calculate    =", nstep

  print'(x,a33,x,i10)'  , "Number of steps to skip         =", nskip

  print*, '--------------------------------------------'


  allocate (x(), y(), z(), stat = code(1))


  do i = 1,ncode

     if (code(i) /= 0) then

        print*, "ERROR IN MEMORY ALLOCATION!", i

        print*, ""

        stop

     endif

  enddo


  if (bstep /= 1) then

     print*, 'Skipping initial steps ...'


     do k = 1, nskip

        if (k == 1) call cpu_time (t1)


        read (1,*)

        read (1,*) n2


        do i = 1, n2+1

           read (1,*)

        enddo


        if (k == 1) then

           call cpu_time (t2)

           time = (t2 - t1) * float (nskip)


           if (time >= 60.0 .and. time < 3600.0) then

              time = time / 60.0

              print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "min"

           elseif (time >= 3600.0) then

              time = time / 3600.0

              print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "hrs"

           else

              print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "sec"

           endif

        endif


        call progress (k,nskip)

     enddo

  endif


  do k = 1, nstep

     if (k == 1) call cpu_time (t1)


     read (1,*)

     read (1,*)


     do i = 1, natom

        read (1,gro) resid, res, atom, atomid, x, y, z

     enddo


     do i = 1, nsol

        do j = 1, npoint

           read (1,gro) resid, res, atom, atomid, x, y, z

        enddo

     enddo


     do i = 1, nion

        read (1,gro) resid, res, atom, atomid, x, y, z

     enddo


     read (1,*) lx, ly, lz


     rx = rx - lx * anint (rx / lx)

     ry = ry - ly * anint (ry / ly)

     rz = rz - lz * anint (rz / lz)

     

     r  = dsqrt (rx*rx + ry*ry + rz*rz)


     if (k == 1) then

        call cpu_time (t2)

        time = (t2 - t1) * float (nstep)


        if (time >= 60.0 .and. time < 3600.0) then

           time = time / 60.0

           print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "min"

        elseif (time >= 3600.0) then

           time = time / 3600.0

           print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "hrs"

        else

           print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "sec"

        endif

     endif


     call progress (k, nstep)

  enddo


  call cpu_time (finish)

  toe = finish - start

  print*, ''


  if (toe >= 60.0 .and. toe < 3600.0) then

     toe = toe / 60.0

     print"(x,a28,x,f7.2,x,a3)","Time taken for computation =", toe, "min"

  elseif (toe >= 3600.0) then

     toe = toe / 3600.0

     print"(x,a28,x,f7.2,x,a3)","Time taken for computation =", toe, "hrs"

  else

     print"(x,a28,x,f7.2,x,a3)","Time taken for computation =", toe, "sec"

  endif


  print*, ""

  print*, "File ", trim (fout), " generated!"

end program main


subroutine progress(i, imax)

   implicit none


   integer :: i, imax

   integer :: j, k


   double precision   :: prog

   character(len=117) ::bar


   bar="???% |                                                                                                    |"


   prog = dble (i) / dble(imax)

   j = INT(prog * 100)

   write(unit=bar(1:3),fmt="(i3)") j


   do k=1, j

   bar(6+k:6+k)=">"

   end do


   write(unit=6,fmt="(a1,a117)",advance="no") char(13), bar

   if (i .lt. imax) then

      flush(unit=6)

   else

      write(unit=6,fmt=*)

   end if


   return

end subroutine progress


subroutine argChar (n,file_name)

  use error


  implicit none


  integer        , intent(in)  :: n

  character(1024), intent(out) :: file_name

  character(1024)              :: arg


  call getarg(n,arg)

  if (len_trim(arg) == 0) then

     print*, trim(err1)

     print*, trim(err2)

     stop

  endif

  read(arg,*) file_name

end subroutine argChar


subroutine argInt(n,val)

  use error


  implicit none


  integer     , intent(in)  :: n

  integer     , intent(out) :: val

  character(1024)           :: arg


  call getarg(n,arg)

  if (len_trim(arg) == 0) then

     print*, trim(err1)

     print*, trim(err2)

     stop

  endif

  read(arg,*) val

end subroutine argInt


subroutine argReal (n,val)

  use error


  implicit none


  integer         , intent(in)  :: n

  double precision, intent(out) :: val

  character(1024)               :: arg


  call getarg(n,arg)

  if (len_trim(arg) == 0) then

     print*, trim(err1)

     print*, trim(err2)

     stop

  endif

  read(arg,*) val

end subroutine argReal


subroutine groParams (file_unit, atomnum, solnum, ionnum, npoint)

  implicit none


  character (*), parameter :: gro = "(i5, 2a5, i5, 3f8.3, 3f8.4)"


  integer       :: n2, i, resid

  character (5) :: res


  integer, intent (in)  :: file_unit, npoint

  integer, intent (out) :: atomnum, solnum, ionnum


  read (file_unit,*)

  read (file_unit,*) n2


  atomnum = 0

  solnum  = 0

  ionnum  = 0


  do i = 1, n2

     read (file_unit,gro) resid, res


     if (trim(res) == "SOL") then

        solnum = solnum + 1

     elseif (trim(res) == "CL" .or. trim(res) == "NA") then

        ionnum = ionnum + 1

     else

        atomnum = atomnum + 1

     endif

  enddo


  read (file_unit,*)


  rewind (file_unit)


  if (n2 /= atomnum + solnum + ionnum) then

     print*, "ERROR! INCORRECT PARAMETER COUNT IN GRO FILE!"

     print*, ""

     stop

  endif


  solnum = solnum / npoint

end subroutine groParams


subroutine timeDate

  implicit none

  character(8)  :: date

  character(10) :: time

  character(5)  :: zone


  integer :: hour, minute, second, day, month, year


  integer, dimension(8) :: val


  call date_and_time(date, time, zone, val)


  hour   = val(5)

  minute = val(6)

  second = val(7)

  day    = val(3)

  month  = val(2)

  year   = val(1)


  print*,""

  print"(x,a23,x,i2,a,i2,a,i2,a,x,i2,a,i2,a,i4)", &

       "Calculation started at:",hour,":",minute,":",second,";", &

       day,"/",month,"/",year

  print*,""

end subroutine timeDate

[FORTRAN-90, GROMACS] Process/Analyze GROMACS xtc files in FORTRAN 90

(1) Install libxdrfile

(1.1) Download

wget https://github.com/wesbarnett/libxdrfile/archive/refs/heads/master.zip

(1.2) Install

unzip master.zip

cd libxdrfile-master

mkdir build

cd build

cmake .. -DCMAKE_INSTALL_PREFIX=installation_directory_path (by default /usr/local/) 

make

make install

make test

(2) Install libgmxfort

(2.1) Download 

wget https://github.com/wesbarnett/libgmxfort/archive/refs/heads/master.zip

(2.2) Install

unzip master.zip

cd libgmxfort-master

mkdir build

cd build

cmake .. -DCMAKE_INSTALL_PREFIX=installation_directory_path # (by default /usr/local/) 

make

make test

make install

(3) Add the following line to ~/.basrc and compile.

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path-to-LIBGMXFORT/LIBGMXFORT/libgmxfort-master/build/lib

(4) Sample code

Task: Read 1 frame from the file traj_PBC.xtc and print the coordinates of atom 1.

program main

  use gmxfort_trajectory

  implicit none

  type (Trajectory) :: trj

  real :: myatom(3)

  integer :: n

  call trj%open("traj_PBC.xtc", "index.ndx")

  n = trj%read_next(1)

  call trj%close()

  myatom = trj%x(1, 1)

  print*, myatom(1), myatom(2), myatom(3)

end program main

(5) Compilation

gfortran read_xtc.f90 -I /path-to-LIBGMXFORT/LIBGMXFORT/libgmxfort-master/build/include -L /path-to-LIBGMXFORT/LIBGMXFORT/libgmxfort-master/build/lib -lgmxfort -o readXTC

(6) Execution

./readXTC

(7) Output

Opened traj_PBC.xtc for reading.

48057 atoms present in system.

20001 frames present in trajectory file.

Frames saved: 1

Closed xtc file.

   4.56200027       2.66300011       1.66200006

(8) Further information: https://github.com/wesbarnett/libgmxfort 

N.B. Change "path-to-LIBGMXFORT" according to your destination directory!

[GROMACS] Calculating properties with dynamic indexing

There is no straight-forward way in GROMACS to calculate properties based on dynamic indexing, yet. To explain the concept, let us consider the total dipole moment of protein hydration layers. 

The first step would be to get the index file for hydration layer water atoms

gmx select -f traj.xtc -s topol.tpr -on phl_0.3nm.ndx -select "same residue as name OW and within 0.3 of group protein"

This phl_0.3nm.ndx file has N number of groups where N is the number of time-frames in the trajectory file (traj.xtc). Each group contains the atom indices for the water molecules within 0.3 nm of the protein (aka, the hydration layer) at the respective time frame. Now, if you want to use this for calculating dipole moment using gmx dipoles, it will not work because you will be prompted to select one of the N groups, whose dipole moment trajectory will be calculated. 

There are two ways to deal with the problem.

[GROMACS] gmx select syntax

To select water molecules within 0.3 nm from the surface of protein

gmx select -f traj.xtc -s topol.tpr -oi phl_0.3nm.dat -on phl_0.3nm.ndx -select "same residue as resname SOL and within 0.3 of group protein"

For detailed discussion, click on the following link: selections in GROMACS

[GROMACS] Merge/Stitch two boxes in GROMACS

Suppose you have two simulation boxes that you need to merge to create a larger box with an interface. You need to follow the following steps.

gmx editconf -f box-1.gro -o box-1.pdb

Let's say the boxes are to be stitched along the z-axis. Then you need to shift the coordinates of the atoms in box-2 by the z-length of box-1. x and y coordinates should not be changed.

gmx editconf -f box-2.gro -o box-2.pdb -translate 0 0 box-1_z

cat box-1.pdb box-2.pdb > final_box.pdb

Delete the few lines (4-6 lines) that mark the beginning of the second box in final_box.pdb. Then continue with the following.

gmx editconf -f final_box.pdb -o final_box.gro

gmx genconf -f final_box.gro -o final_box.gro -renumber

[GROMACS] Parameterize a small molecule with GAFF and run on gromacs

Suppose you want to simulate a small molecule in gromacs using GAFF. Youo have the pdb file molecule.pdb. Here are the steps for the process.

#1. If your pdb file does not have hydrogens, then first add H atoms using open babel:

obabel molecule.pdb -O molecule-H.pdb -h

Of course you may use other tools like VMD, PyMol, etc.

#2. Generate AMBER Parameters with Antechamber: Generate mol2 file.

antechamber -i molecule-H.pdb -fi pdb -o molecule.mol2 -fo mol2 -c bcc -s 2

Here charges are added using the AM1-BCC method, which is generally suitable for small organic molecules. For more details click here.

#3. Generate GAFF Force Field Parameter File using Parmchk2

parmchk2 -i molecule.mol2 -f mol2 -o molecule.frcmod

#4. Convert AMBER Parameters to GROMACS Format Using ACPYPE

acpype -i molecule.mol2 -b molecule

-b molecule is the base name for the output files. Click here for details.

Now you are ready to simulate the molecule in GROMACS.

[LINUX] "convert" command in Linux

Conversion between two image formats

convert input.jpg output.png

Making figures backgroundless

convert input.jpg -transparent white output.png

This example considers that the background color is white.

Trimming out white regions in a figure

convert -trim input.jpg output.jpg

For other useful options, type convert -help

[LINUX] Create a video from multiple images in Linux

Make a video by stitching image files using ffmpeg.

a) When the image files are not named sequentially.

ffmpeg -f image2 -framerate 10 -pattern_type glob -i "file_*.jpg" -s 1920x1080 -vcodec libx264 -crf 20 -pix_fmt yuv420p video.mp4

b) When the image files are named sequentially (file_001.jpg, file_002.jpg, ...). 

ffmpeg -f image2 -framerate 10 -start_number 0 -i "file_%03d.jpg" -s 1920x1080 -vcodec libx264 -crf 20 -pix_fmt yuv420p video.mp4

[LINUX] Compress pdf files from Linux terminal

You can use ghostscript to do this. Install it by using the following command:

sudo apt install ghostscript

Now run the following command:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

-dPDFSETTINGS determines the output file size. Following are the options:

[LINUX] Find files with a given "pattern" inside in a hierarchical manner including all sub-directories

find . -type f -exec grep -iF "pattern" /dev/null {} +

[LINUX] parallel: Using all cores of a node in parallel

Suppose you need to run several jobs in a loop. For example, let's say you want to use gmx msd for 1000 groups. You can use a bash for loop to do that as follows.

for i in {1..1000}

do

   echo ${i} | gmx msd -f traj.xtc -s topol.tpr -n index.ndx -o msd-${i}.xvg

done

In this case, each of the 1000 commands are executed o after the other. These are serial jobs. So, if your node has 16 cores, 1 core is used in one iteration. The rest 15 cores are idle. You can accelerate the process by using parallel as follows.

parallel 'echo {} | gmx msd -f traj.xtc -s topol.tpr -n index.ndx -o msd-{}.xvg' ::: $(seq 1 1000)

In this case, all the 16 cores will be used at a time. Which means 16 iterations will run simultaneously.

[LINUX] Shorten remote login details

Suppose you need to login to a remote host named hostname.xxx.yyy.zzz for the user username. You can use ssh in the following way:

ssh -XC username@hostname.xxx.yyy.zzz

Or, if you need to send a file to this remote host, you may use scp in the following way:

scp -r filename username@hostname.xxx.yyy.zzz:/path/to/desired/directory/

However, it is annoying to write the full login details each time. To shorten it, add a file called config to the .ssh directory in home 

vi ~/.ssh/config

and add the following lines:

Host hostname

    HostName hostname.xxx.yyy.zzz

    User username

Save this and next time use the following commands:

ssh -XC hostname

scp -r filename hostname:/path/to/desired/directory/

[MARTINI] Backmapping from MARTINI CG to AA

All related files/scripts/folders can be downloaded from here: link. Details about the usage are also given here.

Backmapping Martini coarse-grained structure to all-atom description

./initram-v5.sh -f initial.gro -o aa_backmapped.gro -to amber99sb -p topol_initial.top

Here, initial.gro is the CG input structure. aa_backmapped.gro is the backmapped output structure. topol_initial.top is the all atom topology file containing only protein atoms. Water and ions should not be included.

[PYTHON] Convert float variable to scientific notation

This function can be used to convert any floating number to *10^x notation in python codes.

(Thanks to Mr. Tobias Prass for collaboration)

def sci_not (inp):

    superscript = {

    "0": "\u2070",

    "1": "\u00B9",

    "2": "\u00B2",

    "3": "\u00B3",

    "4": "\u2074",

    "5": "\u2075",

    "6": "\u2076",

    "7": "\u2077",

    "8": "\u2078",

    "9": "\u2079",

    "-": "\u207B",

    }

    neg = False

    if inp < 0:

        neg = True

        inp = inp * -1

    num = inp < 1

    if num:

        sig = inp

        exp = 0

        while sig < 1:

            sig = sig * 10

            exp += 1

        out = f"{sig:.2f} x 10"+superscript["-"]+superscript[str(exp)]

    elif not num:

        sig = inp

        exp = 0

        while sig > 10:

            sig = sig / 10

            exp += 1

        out = f"{sig:.2f} x 10"+superscript[str(exp)]

    if neg:

        sign = "-"

    elif not neg:

        sign = ""

    return str(sign)+out

[PYTHON] Convert Python-2 scripts to Python-3 syntax

Details in the following link: link

[PYTHON] Parsing command line arguments with flags

This is a python code to generate the input files necessary for running the DoSPT program that calculates entropy using the 2PT method. It uses the python module argparse to read in command line arguments using flags. 

#!/usr/bin/env python

import argparse

import sys

import os


# Command line arguments parsed with flags

parser = argparse.ArgumentParser(description="This python scripts generates all the files necessary for running the DoSPT program", usage="%(prog)s -w <water-model> -ns <num. of steps> -t <traj. length> -T <temperature>", epilog="Copyright reserved by Saumyak Mukherjee")


parser.add_argument('-w' , '--water', type=str, help='Water model (spc/spce/tip3p/tip4p/tip4p-2005/tip4p-ice/disp) [default: %(default)s]', default='spce', metavar='')

parser.add_argument('-ns', '--nstep', type=int, help='Number of steps in the trajectory [default: %(default)s]', default=5001, metavar='')

parser.add_argument('-t' , '--trun', type=float, help='Total length of the trajectory in ps [default: %(default)s]', default=20.0, metavar='')

parser.add_argument('-T' , '--temp', type=float, help='Temperature of the system in K [default: %(default)s]', default=300.0, metavar='')


args = parser.parse_args()


wModel = args.water

nStep  = args.nstep

tRun   = args.trun

temp   = args.temp


s = 2 # symmetry number


# Input and output files

fTraj  = 'traj.gro'

fInput = 'input'

fMass  = 'masses'

fGrp   = 'groups'

fSgrp  = 'supergroups'


# Check for the existence of the traj.gro file

if not os.path.exists(fTraj):

    print(f"Error: The {fTraj} file does not exist in the current directory!")

    sys.exit(1)


# Number of water molecules and box dimensions

nAtoms = {'spc':3, 'spce':3, 'tip3p':3, 'tip4p':4, 'tip4p-2005':4, 'tip4p-ice':4, 'disp':4}

nPoint = nAtoms[wModel]


boxLine = None

with open(fTraj, 'r') as inp:

    for i, line in enumerate(inp):

        if i==1:

            nWat = int(line)

            boxLine = nWat + 2

            nWat = int(nWat / nPoint)

        elif i==boxLine:

            box = line.split()

            lx, ly, lz = box[0], box[1], box[2]

            break


fmt = fTraj.split('.')[-1]


# Generate the "masses" file

def write_mass(model, filename):

    masses = {

        'spc'       : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080)],

        'spce'      : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080)],

        'tip3p'     : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080)],

        'tip4p'     : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080), ('MW', 0.0000)],

        'tip4p-2005': [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080), ('MW', 0.0000)],

        'tip4p-ice' : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080), ('MW', 0.0000)],

        'disp'      : [('OW', 15.9994), ('HW2', 1.0080), ('HW3', 1.0080), ('MW4', 0.0000)]

    }

    if model in masses:

        with open(filename, 'w') as f:

            for atom, mass in masses[model]:

                f.write(f'{atom}  {mass}\n')

    else:

        print(f"Water model '{model}' not supported!")


write_mass(wModel, fMass)

print (f'file "{fMass}" generated"!')


# Generate "input" file

with open(fInput, 'w') as file:

    file.write(f'points = {nStep}\n')

    file.write(f'tau = {tRun:.3f}\n')

    file.write(f'cell = {lx} {ly} {lz}\n')

    file.write(f'temperature = {temp:.2f}\n')

    file.write(f'format = {fmt}\n')

    file.write('estimate_velocities = .true.\n')

    

print (f'file "{fInput}" generated"!')

    

# Generate "groups" file

with open(fGrp, 'w') as file:

    file.write(f"{nWat*nPoint} {nWat}\n")

    

    for i in range(1,nWat+1):

        file.write(f"{nPoint} {s}\n")

        

        atomnums = [str(nPoint*(i-1)+j) for j in range(1, nPoint+1)]

        file.write(" ".join(atomnums)+"\n")

        

print (f'file "{fGrp}" generated"!')        


# Generate "supergroups" file

with open(fSgrp, 'w') as file:

    file.write(f'1-{nWat}')

        

print (f'file "{fSgrp}" generated"!')

To know about the usage of the code, type in

setupDoSPT.py -h

The output should look like the following:

usage: setupDoSPT.py -w <water-model> -ns <num. of steps> -t <traj. length> -T <temperature>


This python scripts generates all the files necessary for running the DoSPT program


optional arguments:

  -h, --help      show this help message and exit

  -w , --water    Water model (spc/spce/tip3p/tip4p/tip4p-2005/tip4p-ice/disp) [default: spce]

  -ns , --nstep   Number of steps in the trajectory [default: 5001]

  -t , --trun     Total length of the trajectory in ps [default: 20.0]

  -T , --temp     Temperature of the system in K [default: 300.0]


Copyright reserved by Saumyak Mukherjee

For details on the DoSPT program follow this link: DoSPT

[PYTHON] Plotting graphs with matplotlib

These are some generic templates that I often have to use. This of course is not exhaustive and needs to be adjusted or modified according to requirement. One may think of this as a basic template.

1. Useful packages/modules to be imported:

%matplotlib inline

import numpy as np # array manipulations

import matplotlib.pyplot as plt # Data plotting

import matplotlib as mpl # Matplotlib

import pandas as pd # I/O handling and much more

from scipy import interpolate # Interpolation or spline fitting

from scipy import optimize as op # Fitting data with a function

from string import ascii_lowercase as alc # Alphabets in alphabetical order in a loop

import matplotlib.patches as patches # Introduce geometric shapes in graphs

from mpl_toolkits.axes_grid1.inset_locator import (inset_axes, InsetPosition, mark_inset) # Inset

import os # For using bash commands in python

import glob # for file handling

import seaborn as sns # Plotting continuous distribution

Some initial lines:

warnings.filterwarnings('ignore') # To not show warnings

mpl.rcParams['axes.linewidth'] = 0.5 # Width of the graph frame

2. Reading in data from a file:

2.1. Using numpy for a single input file

file = "input-file.dat" # Let's say the file has two columns separated by space

data = np.loadtxt(file, unpack=True, skiprows=17)

x = data[0] # as numpy array

y = data[1] # as numpy array

2.2. Using numpy for multiple input files with sequential numbering (let's say 10, 15, 20, ..., 95, 100)

for i in range(10, 100, 5):

 file = f"input-file_{i}.dat"

 # or, an alternative way

 file = "input-file_"+str(i)+".dat"

 data = np.loadtxt(file, unpack=True, skiprows=17)

 x = data[0] # as numpy array

 y = data[1] # as numpy array

2.3. Using numpy for multiple input files with non-sequential numbering

files = glob.glob("input-file_*.xvg")

for file in files:

 data = np.loadtxt(file, unpack=True, skiprows=17)

 x = data[0] # as numpy array

 y = data[1] # as numpy array

2.4. Using python (same method for multiple files as before)

file = "input-file.dat"

data = []

with open(file, "r") as file:

   for line in file:

    if line.startswith("#") or line.startswith("@"): # Skipping comments

         continue

      line = line.split()

      x, y = float(line[0]), float(line[1])

      data.append((x, y))

3. Plotting the data:

3.1. Using matplotlib.pyplot.subplots for a single graph from a single input file.

fig, ax = plt.subplots(1, 1, dpi=300, figsize=(2.5,2), constrained_layout=True)


# plot data

ax.plot(x, y, "--o", lw=0.8, ms=1, color='k', label="xxx")

# plot data along with errorbar

ax.errorbar(x, y, yerr=dy,  #dy to be read from input file or calculated 

            color = 'k', 

            ecolor='k', 

            elinewidth=0.5, 

            lw = 0.8, 

            label='Mean', 

            marker = '.')

# plot histogram

ax.hist(y, bins=100, 

        density=True, 

        color='dodgerblue',

        edgecolor='dodgerblue',

        lw=0.3, 

        alpha=0.4)

# plot continuous distribution

sns.kdeplot(y, linewidth=1, fill=True)


# plot bar-plot

ax.bar(x, y, label="bar label", color='b', edgecolor='k', lw = 0.5, width = 0.5)


ax.set_xlim(0, 100)

ax.set_ylim(0, 1)

ax.set_xlabel('x-axis label', fontsize=10)

ax.set_ylabel('y-axis label', fontsize=10)

ax.tick_params(axis='both', which='major', labelsize=6, length=1.5, width=0.5)

ax.ticklabel_format(style='sci', axis='y', scilimits=(4,4), useMathText=True)

ax.yaxis.offsetText.set_fontsize(6)

ax.yaxis.set_label_coords(-0.2, 1.2) # adjust y-axis label position

ax.grid(visible=True, which='major', axis='both', ls='--', lw=0.2)

ax.set_title("Graph title", fontsize=8, pad=5, y=1.0)

ax.legend(frameon=False, loc=1, fontsize=10, shadow = False, edgecolor='k')

ax.axvline(x=50, color='r', ls="-", lw=0.8)

ax.axhline(y=0, color='k', ls='--', lw=0.8)

ax.text(50,0.5, "Text to be added", fontsize=5, color='b', rotation='vertical')

ax.spines['top'].set_visible(False) # do not show the top line in graph frame

ax.spines['right'].set_visible(False) # do not show the right line in graph frame

plt.setp(ax.get_xticklabels(), visible=False) # do not show the x-ticks

ax.arrow(25,790,0,10, head_width=15, head_length=5, fc='r', ec='r') # Draw downward arrow

ax.arrow(420,780,0,-10, head_width=15, head_length=5, fc='b', ec='b') # Draw upward arrow

ax.fill_between(x, y, 0, color='r', alpha=0.6) # Fill area under a curve with color


# add inset

left, bottom, width, height = [0.6, 0.65, 0.4, 0.35]

ax1 = ax.inset_axes([left, bottom, width, height])

ax1.plot(x, y, lw=0.6, ms=1, "--s", color='r')

loc = [0, 0.2, 0.5, 0.8, 1.0] # x-ticks at given positions

ax1.set_xticks(loc)

ax1.set_ylim([0.2, 0.4])

ax1.set_xlim([0, 1])

ax1.set_facecolor("lavender")

rect = patches.Rectangle((0, 0), 50, 0.8, lw=0, edgecolor='w', facecolor='w') # add a rectangle

ax1.add_patch(rect)


figFile = 'output.jpg'

fig.savefig(figFile,format="jpg", dpi=300, bbox_inches='tight')

os.popen(f'convert -trim {figFile} {figFile}')

3.2. Using matplotlib.pyplot.subplots for a multi-panel graph from multiple input files.

fig, axs = plt.subplots(3,2,dpi=300,figsize=(10,12),constrained_layout=True,sharex=True,sharey=True)


for ax, x, y, ser in zip(axs.flatten(), xVals, yVals, alc): # xVals and yVals are arrays

   ax.plot(x, y, lw=0.6, ms=1, "--s", color='r', label=ser) # ser gives the panel numbering from alc

   add other options as shown before

4. Fitting the data using scipy.optimize 

# triexponential fitting

def func (x, a1, t1, a2, t2, a3, t3):

    return a1*np.exp(-x/t1) + a2*np.exp(-x/t2) + a3*np.exp(-x/t3) + (1-a1-a2-a3)


guess = [0.5,1,0.5,10,0.5,100]

lo = (0,0,0,0,0,0)

hi = (1,10,1,100,1,1000)

maxiter = 1000001


popt, pcov = op.curve_fit(func, x, y, check_finite=True, p0=guess, bounds=(lo,hi), maxfev = maxiter)


ax.plot(x, y, 'ro', lw=1, label='raw data')

ax.plot(x, func(x, *popt), 'r-', lw = 0.8, , label='fitted data')

[PYTHON, LATEX] Generate latex compatible tables from python

You can generate Latex formatted tables from python using pandas. Suppose you have a file input.txt with three columns which you want to put into a latex document. This is a sample piece of code to do that. As a bonus this code also shows how to put multiple index names in a pandas DataFrame.

import numpy as np

import pandas as pd


vals = np.loadtxt (input.txt)


head = [('Index-1a','Index-1b'), ('Index-2a','Index-2b'), ('Index-3a','Index-3b')]

form = ["{:.0f}".format, "{:.0f}".format, "{:.0f}".format] # all integers

cols = pd.MultiIndex.from_tuples(head)

df = pd.DataFrame(vals, columns=cols)

out = df.to_latex(

    index=False, 

    caption='Table caption.', 

    escape=False, 

    formatters=form, 

    multirow=True, 

    column_format = 'lrr',

    label='latex_label')

print(out)

For details click on the following link.

[VMD] Change default options in vmd

Download the .vmdrc file in your user home: download

Change the various parameters in this file as you wish. Be careful to maintain the format. Enjoy!

N.B. The name of the file should be .vmdrc (mind the '.')!

[VMD] Render frames as images after loading molecule in VMD

The following TCL script does the job. Save the code as render.tcl and run the following command on the TK Console:

source render.tcl

Code:

set output_dir "frames"

file mkdir $output_dir 


render TachyonInternal on 

display depthcue off

display antialias on

axes location off


set render_width 1920 

set render_height 1080 

set frame_interval 10


set molID [molinfo top]

set num_frames [molinfo $molID get numframes]


for {set i 0} {$i < $num_frames} {incr i $frame_interval} {

    animate goto $i


    set img_file [format "%s/frame_%04d.tga" $output_dir $i]


    render TachyonInternal $img_file

    puts "Rendered frame $i to $img_file"

}

Use ffmpeg to make a video from the generate images. For details see Create a video from multiple images in Linux above.

[VMD] Select and display whole water molecules within cutoff in VMD

In VMD, for displaying water molecules within a given distance (let's say 5 Angstrom) from protein, we use:

water within 5 of protein

However, this selects atoms of water, whereby you might see only H or O atoms of some waters. To display the whole molecules, use:

same fragment as water within 5 of protein

[XMGRACE] Change default options in xmgrace

(1) Create the following directory in user home.

mkdir .grace/

(2) Inside .grace/ create a file gracerc.user with the following contents.

HARDCOPY DEVICE "JPEG"

DEVICE "JPEG" DPI 300

Here you can change the options for the default output settings for figures.

(3) Here, also create folder templates. Inside templates download the Default.agr file: download

You may change the various parameters in the Default.agr file that you downloaded as per your choices, and you are ready.

Enjoy plotting!