Tricks of the trade
[FORTRAN-90] PDB file format for FORTRAN coding
---------------------------------------------------------------------------
Field | Column | FORTRAN |
No. | range | format | Description
---------------------------------------------------------------------------
1. | 1 - 6 | A6 | Record ID (eg ATOM, HETATM)
2. | 7 - 11 | I5 | Atom serial number
- | 12 - 12 | 1X | Blank
3. | 13 - 16 | A4 | Atom name (eg " CA " , " ND1")
4. | 17 - 17 | A1 | Alternative location code (if any)
5. | 18 - 20 | A3 | Standard 3-letter amino acid code for residue
- | 21 - 21 | 1X | Blank
6. | 22 - 22 | A1 | Chain identifier code
7. | 23 - 26 | I4 | Residue sequence number
8. | 27 - 27 | A1 | Insertion code (if any)
- | 28 - 30 | 3X | Blank
9. | 31 - 38 | F8.3 | Atom's x-coordinate
10. | 39 - 46 | F8.3 | Atom's y-coordinate
11. | 47 - 54 | F8.3 | Atom's z-coordinate
12. | 55 - 60 | F6.2 | Occupancy value for atom
13. | 61 - 66 | F6.2 | B-value (thermal factor)
- | 67 - 67 | 1X | Blank
14. | 68 - 70 | I3 | Footnote number
---------------------------------------------------------------------------
[FORTRAN-90] A generic model FORTRAN-90 program to read and analyze gro
!-------------------------------------
! Describe the function of the program
! Language: FORTRAN 90
!-------------------------------------
! Author: Dr. Saumyak Mukherjee
! Date :
!-------------------------------------
module error
implicit none
save
character(1024), parameter :: err1 = 'ERROR! Incorrect input format!'
character(1024), parameter :: err2 = 'executable <in-file.gro> <out-file> <btime(ps)> <etime(ps)> <dt(ps)> <npoint>'
end module error
program main
implicit none
integer , parameter :: ncode =
character(*) , parameter :: gro = '(i5, 2a5, i5, 3f8.3, 3f8.4)'
double precision, parameter ::
real :: start, finish, toe, t1, t2, time
integer :: i, j, k
integer :: n2, natom, nsol, npoint, nion
integer :: bstep, estep, nstep, nskip
integer :: resid, atomid
logical :: ex
character (5) :: res, atom
character (1024) :: fgro, fout
double precision :: btime, etime, dt
double precision :: lx, ly, lz
double precision :: rx, ry, rz, r
integer, dimension (ncode) :: code
double precision, allocatable, dimension (:,:) :: x, y, z
call system ("rm -f *.mod")
call cpu_time (start)
call timeDate
print*, '----------------------------'
print*, ' program function '
print*, '----------------------------'
print*, ''
call argChar (1, fgro )
call argChar (2, fout )
call argReal (3, btime )
call argReal (4, etime )
call argReal (5, dt )
call argInt (6, npoint)
inquire (file = trim (fgro), exist = ex)
if (ex .eqv. .false.) then
print*, ''
print*, 'ERROR! '//trim (fgro)//' does not exist in this directory!'
print*, ''
stop
endif
inquire (file = trim (fout), exist = ex)
if (ex .eqv. .true.) call system ("mv "//trim(fout)//" copy_"//trim(fout))
open (unit = 1 , file = trim (fgro), action = 'read' , status = 'old' )
open (unit = 11, file = trim (fout), action = 'write', status = 'unknown')
bstep = anint (btime / dt) + 1
estep = anint (etime / dt) + 1
nstep = estep - bstep + 1
nskip = bstep - 1
read (1,*) !
read (1,*) n2 ! READING THE NUMBER OF ATOMS/WATER MOLECULES
nsol = n2 / npoint ! FOR ONLY WATER IN THE SYSTEM
rewind (1) !
call groParams (1, natom, nsol, nion, npoint) ! READING SYSTEM PARAMETERS FOR PROTEINS SYSTEMS
print*, '--------------------------------------------'
print'(x,a33,x,i10)' , "Total number of atoms =", n2
print'(x,a33,x,i10)' , "Total number of protein atoms =", natom
print'(x,a33,x,i10)' , "Total number of water molecules =", nsol
print'(x,a33,x,i10)' , "Total number of ions =", nion
print'(x,a33,x,f10.3)', "Start time (ps) =", btime
print'(x,a33,x,f10.3)', "End time (ps) =", etime
print'(x,a33,x,f10.3)', "Time step (ps) =", dt
print'(x,a33,x,f10.3)', "Total calulation time (ps) =", (nstep-1)*dt
print'(x,a33,x,i10)' , "Start time step =", bstep
print'(x,a33,x,i10)' , "End time step =", estep
print'(x,a33,x,i10)' , "Number of steps to calculate =", nstep
print'(x,a33,x,i10)' , "Number of steps to skip =", nskip
print*, '--------------------------------------------'
allocate (x(), y(), z(), stat = code(1))
do i = 1,ncode
if (code(i) /= 0) then
print*, "ERROR IN MEMORY ALLOCATION!", i
print*, ""
stop
endif
enddo
if (bstep /= 1) then
print*, 'Skipping initial steps ...'
do k = 1, nskip
if (k == 1) call cpu_time (t1)
read (1,*)
read (1,*) n2
do i = 1, n2+1
read (1,*)
enddo
if (k == 1) then
call cpu_time (t2)
time = (t2 - t1) * float (nskip)
if (time >= 60.0 .and. time < 3600.0) then
time = time / 60.0
print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "min"
elseif (time >= 3600.0) then
time = time / 3600.0
print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "hrs"
else
print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "sec"
endif
endif
call progress (k,nskip)
enddo
endif
do k = 1, nstep
if (k == 1) call cpu_time (t1)
read (1,*)
read (1,*)
do i = 1, natom
read (1,gro) resid, res, atom, atomid, x, y, z
enddo
do i = 1, nsol
do j = 1, npoint
read (1,gro) resid, res, atom, atomid, x, y, z
enddo
enddo
do i = 1, nion
read (1,gro) resid, res, atom, atomid, x, y, z
enddo
read (1,*) lx, ly, lz
rx = rx - lx * anint (rx / lx)
ry = ry - ly * anint (ry / ly)
rz = rz - lz * anint (rz / lz)
r = dsqrt (rx*rx + ry*ry + rz*rz)
if (k == 1) then
call cpu_time (t2)
time = (t2 - t1) * float (nstep)
if (time >= 60.0 .and. time < 3600.0) then
time = time / 60.0
print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "min"
elseif (time >= 3600.0) then
time = time / 3600.0
print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "hrs"
else
print"(x,a16,x,f7.2,x,a3)", "Estimated time =", time, "sec"
endif
endif
call progress (k, nstep)
enddo
call cpu_time (finish)
toe = finish - start
print*, ''
if (toe >= 60.0 .and. toe < 3600.0) then
toe = toe / 60.0
print"(x,a28,x,f7.2,x,a3)","Time taken for computation =", toe, "min"
elseif (toe >= 3600.0) then
toe = toe / 3600.0
print"(x,a28,x,f7.2,x,a3)","Time taken for computation =", toe, "hrs"
else
print"(x,a28,x,f7.2,x,a3)","Time taken for computation =", toe, "sec"
endif
print*, ""
print*, "File ", trim (fout), " generated!"
end program main
subroutine progress(i, imax)
implicit none
integer :: i, imax
integer :: j, k
double precision :: prog
character(len=117) ::bar
bar="???% | |"
prog = dble (i) / dble(imax)
j = INT(prog * 100)
write(unit=bar(1:3),fmt="(i3)") j
do k=1, j
bar(6+k:6+k)=">"
end do
write(unit=6,fmt="(a1,a117)",advance="no") char(13), bar
if (i .lt. imax) then
flush(unit=6)
else
write(unit=6,fmt=*)
end if
return
end subroutine progress
subroutine argChar (n,file_name)
use error
implicit none
integer , intent(in) :: n
character(1024), intent(out) :: file_name
character(1024) :: arg
call getarg(n,arg)
if (len_trim(arg) == 0) then
print*, trim(err1)
print*, trim(err2)
stop
endif
read(arg,*) file_name
end subroutine argChar
subroutine argInt(n,val)
use error
implicit none
integer , intent(in) :: n
integer , intent(out) :: val
character(1024) :: arg
call getarg(n,arg)
if (len_trim(arg) == 0) then
print*, trim(err1)
print*, trim(err2)
stop
endif
read(arg,*) val
end subroutine argInt
subroutine argReal (n,val)
use error
implicit none
integer , intent(in) :: n
double precision, intent(out) :: val
character(1024) :: arg
call getarg(n,arg)
if (len_trim(arg) == 0) then
print*, trim(err1)
print*, trim(err2)
stop
endif
read(arg,*) val
end subroutine argReal
subroutine groParams (file_unit, atomnum, solnum, ionnum, npoint)
implicit none
character (*), parameter :: gro = "(i5, 2a5, i5, 3f8.3, 3f8.4)"
integer :: n2, i, resid
character (5) :: res
integer, intent (in) :: file_unit, npoint
integer, intent (out) :: atomnum, solnum, ionnum
read (file_unit,*)
read (file_unit,*) n2
atomnum = 0
solnum = 0
ionnum = 0
do i = 1, n2
read (file_unit,gro) resid, res
if (trim(res) == "SOL") then
solnum = solnum + 1
elseif (trim(res) == "CL" .or. trim(res) == "NA") then
ionnum = ionnum + 1
else
atomnum = atomnum + 1
endif
enddo
read (file_unit,*)
rewind (file_unit)
if (n2 /= atomnum + solnum + ionnum) then
print*, "ERROR! INCORRECT PARAMETER COUNT IN GRO FILE!"
print*, ""
stop
endif
solnum = solnum / npoint
end subroutine groParams
subroutine timeDate
implicit none
character(8) :: date
character(10) :: time
character(5) :: zone
integer :: hour, minute, second, day, month, year
integer, dimension(8) :: val
call date_and_time(date, time, zone, val)
hour = val(5)
minute = val(6)
second = val(7)
day = val(3)
month = val(2)
year = val(1)
print*,""
print"(x,a23,x,i2,a,i2,a,i2,a,x,i2,a,i2,a,i4)", &
"Calculation started at:",hour,":",minute,":",second,";", &
day,"/",month,"/",year
print*,""
end subroutine timeDate
[FORTRAN-90, GROMACS] Process/Analyze GROMACS xtc files in FORTRAN 90
(1) Install libxdrfile
(1.1) Download
wget https://github.com/wesbarnett/libxdrfile/archive/refs/heads/master.zip
(1.2) Install
unzip master.zip
cd libxdrfile-master
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=installation_directory_path (by default /usr/local/)
make
make install
make test
(2) Install libgmxfort
(2.1) Download
wget https://github.com/wesbarnett/libgmxfort/archive/refs/heads/master.zip
(2.2) Install
unzip master.zip
cd libgmxfort-master
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=installation_directory_path # (by default /usr/local/)
make
make test
make install
(3) Add the following line to ~/.basrc and compile.
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path-to-LIBGMXFORT/LIBGMXFORT/libgmxfort-master/build/lib
(4) Sample code
Task: Read 1 frame from the file traj_PBC.xtc and print the coordinates of atom 1.
program main
use gmxfort_trajectory
implicit none
type (Trajectory) :: trj
real :: myatom(3)
integer :: n
call trj%open("traj_PBC.xtc", "index.ndx")
n = trj%read_next(1)
call trj%close()
myatom = trj%x(1, 1)
print*, myatom(1), myatom(2), myatom(3)
end program main
(5) Compilation
gfortran read_xtc.f90 -I /path-to-LIBGMXFORT/LIBGMXFORT/libgmxfort-master/build/include -L /path-to-LIBGMXFORT/LIBGMXFORT/libgmxfort-master/build/lib -lgmxfort -o readXTC
(6) Execution
./readXTC
(7) Output
Opened traj_PBC.xtc for reading.
48057 atoms present in system.
20001 frames present in trajectory file.
Frames saved: 1
Closed xtc file.
4.56200027 2.66300011 1.66200006
(8) Further information: https://github.com/wesbarnett/libgmxfort
N.B. Change "path-to-LIBGMXFORT" according to your destination directory!
[GROMACS] Calculating properties with dynamic indexing
There is no straight-forward way in GROMACS to calculate properties based on dynamic indexing, yet. To explain the concept, let us consider the total dipole moment of protein hydration layers.
The first step would be to get the index file for hydration layer water atoms
gmx select -f traj.xtc -s topol.tpr -on phl_0.3nm.ndx -select "same residue as name OW and within 0.3 of group protein"
This phl_0.3nm.ndx file has N number of groups where N is the number of time-frames in the trajectory file (traj.xtc). Each group contains the atom indices for the water molecules within 0.3 nm of the protein (aka, the hydration layer) at the respective time frame. Now, if you want to use this for calculating dipole moment using gmx dipoles, it will not work because you will be prompted to select one of the N groups, whose dipole moment trajectory will be calculated.
There are two ways to deal with the problem.
[GROMACS] gmx select syntax
To select water molecules within 0.3 nm from the surface of protein
gmx select -f traj.xtc -s topol.tpr -oi phl_0.3nm.dat -on phl_0.3nm.ndx -select "same residue as resname SOL and within 0.3 of group protein"
For detailed discussion, click on the following link: selections in GROMACS
[GROMACS] Merge/Stitch two boxes in GROMACS
Suppose you have two simulation boxes that you need to merge to create a larger box with an interface. You need to follow the following steps.
gmx editconf -f box-1.gro -o box-1.pdb
Let's say the boxes are to be stitched along the z-axis. Then you need to shift the coordinates of the atoms in box-2 by the z-length of box-1. x and y coordinates should not be changed.
gmx editconf -f box-2.gro -o box-2.pdb -translate 0 0 box-1_z
cat box-1.pdb box-2.pdb > final_box.pdb
Delete the few lines (4-6 lines) that mark the beginning of the second box in final_box.pdb. Then continue with the following.
gmx editconf -f final_box.pdb -o final_box.gro
gmx genconf -f final_box.gro -o final_box.gro -renumber
[GROMACS] Parameterize a small molecule with GAFF and run on gromacs
Suppose you want to simulate a small molecule in gromacs using GAFF. Youo have the pdb file molecule.pdb. Here are the steps for the process.
#1. If your pdb file does not have hydrogens, then first add H atoms using open babel:
obabel molecule.pdb -O molecule-H.pdb -h
Of course you may use other tools like VMD, PyMol, etc.
#2. Generate AMBER Parameters with Antechamber: Generate mol2 file.
antechamber -i molecule-H.pdb -fi pdb -o molecule.mol2 -fo mol2 -c bcc -s 2
Here charges are added using the AM1-BCC method, which is generally suitable for small organic molecules. For more details click here.
#3. Generate GAFF Force Field Parameter File using Parmchk2
parmchk2 -i molecule.mol2 -f mol2 -o molecule.frcmod
#4. Convert AMBER Parameters to GROMACS Format Using ACPYPE
acpype -i molecule.mol2 -b molecule
-b molecule is the base name for the output files. Click here for details.
Now you are ready to simulate the molecule in GROMACS.
[LINUX] "convert" command in Linux
Conversion between two image formats
convert input.jpg output.png
Making figures backgroundless
convert input.jpg -transparent white output.png
This example considers that the background color is white.
Trimming out white regions in a figure
convert -trim input.jpg output.jpg
For other useful options, type convert -help
[LINUX] Create a video from multiple images in Linux
Make a video by stitching image files using ffmpeg.
a) When the image files are not named sequentially.
ffmpeg -f image2 -framerate 10 -pattern_type glob -i "file_*.jpg" -s 1920x1080 -vcodec libx264 -crf 20 -pix_fmt yuv420p video.mp4
b) When the image files are named sequentially (file_001.jpg, file_002.jpg, ...).
ffmpeg -f image2 -framerate 10 -start_number 0 -i "file_%03d.jpg" -s 1920x1080 -vcodec libx264 -crf 20 -pix_fmt yuv420p video.mp4
[LINUX] Compress pdf files from Linux terminal
You can use ghostscript to do this. Install it by using the following command:
sudo apt install ghostscript
Now run the following command:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
-dPDFSETTINGS determines the output file size. Following are the options:
-dPDFSETTINGS=/screen — Low quality (72 dpi)
-dPDFSETTINGS=/ebook — Slightly better quality (150 dpi)
-dPDFSETTINGS=/prepress — High quality (300 dpi)
-dPDFSETTINGS=/default — The system chooses the best output
[LINUX] Find files with a given "pattern" inside in a hierarchical manner including all sub-directories
find . -type f -exec grep -iF "pattern" /dev/null {} +
[LINUX] parallel: Using all cores of a node in parallel
Suppose you need to run several jobs in a loop. For example, let's say you want to use gmx msd for 1000 groups. You can use a bash for loop to do that as follows.
for i in {1..1000}
do
echo ${i} | gmx msd -f traj.xtc -s topol.tpr -n index.ndx -o msd-${i}.xvg
done
In this case, each of the 1000 commands are executed o after the other. These are serial jobs. So, if your node has 16 cores, 1 core is used in one iteration. The rest 15 cores are idle. You can accelerate the process by using parallel as follows.
parallel 'echo {} | gmx msd -f traj.xtc -s topol.tpr -n index.ndx -o msd-{}.xvg' ::: $(seq 1 1000)
In this case, all the 16 cores will be used at a time. Which means 16 iterations will run simultaneously.
[LINUX] Shorten remote login details
Suppose you need to login to a remote host named hostname.xxx.yyy.zzz for the user username. You can use ssh in the following way:
ssh -XC username@hostname.xxx.yyy.zzz
Or, if you need to send a file to this remote host, you may use scp in the following way:
scp -r filename username@hostname.xxx.yyy.zzz:/path/to/desired/directory/
However, it is annoying to write the full login details each time. To shorten it, add a file called config to the .ssh directory in home
vi ~/.ssh/config
and add the following lines:
Host hostname
HostName hostname.xxx.yyy.zzz
User username
Save this and next time use the following commands:
ssh -XC hostname
scp -r filename hostname:/path/to/desired/directory/
[MARTINI] Backmapping from MARTINI CG to AA
All related files/scripts/folders can be downloaded from here: link. Details about the usage are also given here.
Backmapping Martini coarse-grained structure to all-atom description
./initram-v5.sh -f initial.gro -o aa_backmapped.gro -to amber99sb -p topol_initial.top
Here, initial.gro is the CG input structure. aa_backmapped.gro is the backmapped output structure. topol_initial.top is the all atom topology file containing only protein atoms. Water and ions should not be included.
[PYTHON] Convert float variable to scientific notation
This function can be used to convert any floating number to *10^x notation in python codes.
(Thanks to Mr. Tobias Prass for collaboration)
def sci_not (inp):
superscript = {
"0": "\u2070",
"1": "\u00B9",
"2": "\u00B2",
"3": "\u00B3",
"4": "\u2074",
"5": "\u2075",
"6": "\u2076",
"7": "\u2077",
"8": "\u2078",
"9": "\u2079",
"-": "\u207B",
}
neg = False
if inp < 0:
neg = True
inp = inp * -1
num = inp < 1
if num:
sig = inp
exp = 0
while sig < 1:
sig = sig * 10
exp += 1
out = f"{sig:.2f} x 10"+superscript["-"]+superscript[str(exp)]
elif not num:
sig = inp
exp = 0
while sig > 10:
sig = sig / 10
exp += 1
out = f"{sig:.2f} x 10"+superscript[str(exp)]
if neg:
sign = "-"
elif not neg:
sign = ""
return str(sign)+out
[PYTHON] Convert Python-2 scripts to Python-3 syntax
Details in the following link: link
[PYTHON] Parsing command line arguments with flags
This is a python code to generate the input files necessary for running the DoSPT program that calculates entropy using the 2PT method. It uses the python module argparse to read in command line arguments using flags.
#!/usr/bin/env python
import argparse
import sys
import os
# Command line arguments parsed with flags
parser = argparse.ArgumentParser(description="This python scripts generates all the files necessary for running the DoSPT program", usage="%(prog)s -w <water-model> -ns <num. of steps> -t <traj. length> -T <temperature>", epilog="Copyright reserved by Saumyak Mukherjee")
parser.add_argument('-w' , '--water', type=str, help='Water model (spc/spce/tip3p/tip4p/tip4p-2005/tip4p-ice/disp) [default: %(default)s]', default='spce', metavar='')
parser.add_argument('-ns', '--nstep', type=int, help='Number of steps in the trajectory [default: %(default)s]', default=5001, metavar='')
parser.add_argument('-t' , '--trun', type=float, help='Total length of the trajectory in ps [default: %(default)s]', default=20.0, metavar='')
parser.add_argument('-T' , '--temp', type=float, help='Temperature of the system in K [default: %(default)s]', default=300.0, metavar='')
args = parser.parse_args()
wModel = args.water
nStep = args.nstep
tRun = args.trun
temp = args.temp
s = 2 # symmetry number
# Input and output files
fTraj = 'traj.gro'
fInput = 'input'
fMass = 'masses'
fGrp = 'groups'
fSgrp = 'supergroups'
# Check for the existence of the traj.gro file
if not os.path.exists(fTraj):
print(f"Error: The {fTraj} file does not exist in the current directory!")
sys.exit(1)
# Number of water molecules and box dimensions
nAtoms = {'spc':3, 'spce':3, 'tip3p':3, 'tip4p':4, 'tip4p-2005':4, 'tip4p-ice':4, 'disp':4}
nPoint = nAtoms[wModel]
boxLine = None
with open(fTraj, 'r') as inp:
for i, line in enumerate(inp):
if i==1:
nWat = int(line)
boxLine = nWat + 2
nWat = int(nWat / nPoint)
elif i==boxLine:
box = line.split()
lx, ly, lz = box[0], box[1], box[2]
break
fmt = fTraj.split('.')[-1]
# Generate the "masses" file
def write_mass(model, filename):
masses = {
'spc' : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080)],
'spce' : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080)],
'tip3p' : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080)],
'tip4p' : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080), ('MW', 0.0000)],
'tip4p-2005': [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080), ('MW', 0.0000)],
'tip4p-ice' : [('OW', 15.9994), ('HW1', 1.0080), ('HW2', 1.0080), ('MW', 0.0000)],
'disp' : [('OW', 15.9994), ('HW2', 1.0080), ('HW3', 1.0080), ('MW4', 0.0000)]
}
if model in masses:
with open(filename, 'w') as f:
for atom, mass in masses[model]:
f.write(f'{atom} {mass}\n')
else:
print(f"Water model '{model}' not supported!")
write_mass(wModel, fMass)
print (f'file "{fMass}" generated"!')
# Generate "input" file
with open(fInput, 'w') as file:
file.write(f'points = {nStep}\n')
file.write(f'tau = {tRun:.3f}\n')
file.write(f'cell = {lx} {ly} {lz}\n')
file.write(f'temperature = {temp:.2f}\n')
file.write(f'format = {fmt}\n')
file.write('estimate_velocities = .true.\n')
print (f'file "{fInput}" generated"!')
# Generate "groups" file
with open(fGrp, 'w') as file:
file.write(f"{nWat*nPoint} {nWat}\n")
for i in range(1,nWat+1):
file.write(f"{nPoint} {s}\n")
atomnums = [str(nPoint*(i-1)+j) for j in range(1, nPoint+1)]
file.write(" ".join(atomnums)+"\n")
print (f'file "{fGrp}" generated"!')
# Generate "supergroups" file
with open(fSgrp, 'w') as file:
file.write(f'1-{nWat}')
print (f'file "{fSgrp}" generated"!')
To know about the usage of the code, type in
setupDoSPT.py -h
The output should look like the following:
usage: setupDoSPT.py -w <water-model> -ns <num. of steps> -t <traj. length> -T <temperature>
This python scripts generates all the files necessary for running the DoSPT program
optional arguments:
-h, --help show this help message and exit
-w , --water Water model (spc/spce/tip3p/tip4p/tip4p-2005/tip4p-ice/disp) [default: spce]
-ns , --nstep Number of steps in the trajectory [default: 5001]
-t , --trun Total length of the trajectory in ps [default: 20.0]
-T , --temp Temperature of the system in K [default: 300.0]
Copyright reserved by Saumyak Mukherjee
For details on the DoSPT program follow this link: DoSPT
[PYTHON] Plotting graphs with matplotlib
These are some generic templates that I often have to use. This of course is not exhaustive and needs to be adjusted or modified according to requirement. One may think of this as a basic template.
1. Useful packages/modules to be imported:
%matplotlib inline
import numpy as np # array manipulations
import matplotlib.pyplot as plt # Data plotting
import matplotlib as mpl # Matplotlib
import pandas as pd # I/O handling and much more
from scipy import interpolate # Interpolation or spline fitting
from scipy import optimize as op # Fitting data with a function
from string import ascii_lowercase as alc # Alphabets in alphabetical order in a loop
import matplotlib.patches as patches # Introduce geometric shapes in graphs
from mpl_toolkits.axes_grid1.inset_locator import (inset_axes, InsetPosition, mark_inset) # Inset
import os # For using bash commands in python
import glob # for file handling
import seaborn as sns # Plotting continuous distribution
Some initial lines:
warnings.filterwarnings('ignore') # To not show warnings
mpl.rcParams['axes.linewidth'] = 0.5 # Width of the graph frame
2. Reading in data from a file:
2.1. Using numpy for a single input file
file = "input-file.dat" # Let's say the file has two columns separated by space
data = np.loadtxt(file, unpack=True, skiprows=17)
x = data[0] # as numpy array
y = data[1] # as numpy array
2.2. Using numpy for multiple input files with sequential numbering (let's say 10, 15, 20, ..., 95, 100)
for i in range(10, 100, 5):
file = f"input-file_{i}.dat"
# or, an alternative way
file = "input-file_"+str(i)+".dat"
data = np.loadtxt(file, unpack=True, skiprows=17)
x = data[0] # as numpy array
y = data[1] # as numpy array
2.3. Using numpy for multiple input files with non-sequential numbering
files = glob.glob("input-file_*.xvg")
for file in files:
data = np.loadtxt(file, unpack=True, skiprows=17)
x = data[0] # as numpy array
y = data[1] # as numpy array
2.4. Using python (same method for multiple files as before)
file = "input-file.dat"
data = []
with open(file, "r") as file:
for line in file:
if line.startswith("#") or line.startswith("@"): # Skipping comments
continue
line = line.split()
x, y = float(line[0]), float(line[1])
data.append((x, y))
3. Plotting the data:
3.1. Using matplotlib.pyplot.subplots for a single graph from a single input file.
fig, ax = plt.subplots(1, 1, dpi=300, figsize=(2.5,2), constrained_layout=True)
# plot data
ax.plot(x, y, "--o", lw=0.8, ms=1, color='k', label="xxx")
# plot data along with errorbar
ax.errorbar(x, y, yerr=dy, #dy to be read from input file or calculated
color = 'k',
ecolor='k',
elinewidth=0.5,
lw = 0.8,
label='Mean',
marker = '.')
# plot histogram
ax.hist(y, bins=100,
density=True,
color='dodgerblue',
edgecolor='dodgerblue',
lw=0.3,
alpha=0.4)
# plot continuous distribution
sns.kdeplot(y, linewidth=1, fill=True)
# plot bar-plot
ax.bar(x, y, label="bar label", color='b', edgecolor='k', lw = 0.5, width = 0.5)
ax.set_xlim(0, 100)
ax.set_ylim(0, 1)
ax.set_xlabel('x-axis label', fontsize=10)
ax.set_ylabel('y-axis label', fontsize=10)
ax.tick_params(axis='both', which='major', labelsize=6, length=1.5, width=0.5)
ax.ticklabel_format(style='sci', axis='y', scilimits=(4,4), useMathText=True)
ax.yaxis.offsetText.set_fontsize(6)
ax.yaxis.set_label_coords(-0.2, 1.2) # adjust y-axis label position
ax.grid(visible=True, which='major', axis='both', ls='--', lw=0.2)
ax.set_title("Graph title", fontsize=8, pad=5, y=1.0)
ax.legend(frameon=False, loc=1, fontsize=10, shadow = False, edgecolor='k')
ax.axvline(x=50, color='r', ls="-", lw=0.8)
ax.axhline(y=0, color='k', ls='--', lw=0.8)
ax.text(50,0.5, "Text to be added", fontsize=5, color='b', rotation='vertical')
ax.spines['top'].set_visible(False) # do not show the top line in graph frame
ax.spines['right'].set_visible(False) # do not show the right line in graph frame
plt.setp(ax.get_xticklabels(), visible=False) # do not show the x-ticks
ax.arrow(25,790,0,10, head_width=15, head_length=5, fc='r', ec='r') # Draw downward arrow
ax.arrow(420,780,0,-10, head_width=15, head_length=5, fc='b', ec='b') # Draw upward arrow
ax.fill_between(x, y, 0, color='r', alpha=0.6) # Fill area under a curve with color
# add inset
left, bottom, width, height = [0.6, 0.65, 0.4, 0.35]
ax1 = ax.inset_axes([left, bottom, width, height])
ax1.plot(x, y, lw=0.6, ms=1, "--s", color='r')
loc = [0, 0.2, 0.5, 0.8, 1.0] # x-ticks at given positions
ax1.set_xticks(loc)
ax1.set_ylim([0.2, 0.4])
ax1.set_xlim([0, 1])
ax1.set_facecolor("lavender")
rect = patches.Rectangle((0, 0), 50, 0.8, lw=0, edgecolor='w', facecolor='w') # add a rectangle
ax1.add_patch(rect)
figFile = 'output.jpg'
fig.savefig(figFile,format="jpg", dpi=300, bbox_inches='tight')
os.popen(f'convert -trim {figFile} {figFile}')
3.2. Using matplotlib.pyplot.subplots for a multi-panel graph from multiple input files.
fig, axs = plt.subplots(3,2,dpi=300,figsize=(10,12),constrained_layout=True,sharex=True,sharey=True)
for ax, x, y, ser in zip(axs.flatten(), xVals, yVals, alc): # xVals and yVals are arrays
ax.plot(x, y, lw=0.6, ms=1, "--s", color='r', label=ser) # ser gives the panel numbering from alc
add other options as shown before
4. Fitting the data using scipy.optimize
# triexponential fitting
def func (x, a1, t1, a2, t2, a3, t3):
return a1*np.exp(-x/t1) + a2*np.exp(-x/t2) + a3*np.exp(-x/t3) + (1-a1-a2-a3)
guess = [0.5,1,0.5,10,0.5,100]
lo = (0,0,0,0,0,0)
hi = (1,10,1,100,1,1000)
maxiter = 1000001
popt, pcov = op.curve_fit(func, x, y, check_finite=True, p0=guess, bounds=(lo,hi), maxfev = maxiter)
ax.plot(x, y, 'ro', lw=1, label='raw data')
ax.plot(x, func(x, *popt), 'r-', lw = 0.8, , label='fitted data')
[PYTHON, LATEX] Generate latex compatible tables from python
You can generate Latex formatted tables from python using pandas. Suppose you have a file input.txt with three columns which you want to put into a latex document. This is a sample piece of code to do that. As a bonus this code also shows how to put multiple index names in a pandas DataFrame.
import numpy as np
import pandas as pd
vals = np.loadtxt (input.txt)
head = [('Index-1a','Index-1b'), ('Index-2a','Index-2b'), ('Index-3a','Index-3b')]
form = ["{:.0f}".format, "{:.0f}".format, "{:.0f}".format] # all integers
cols = pd.MultiIndex.from_tuples(head)
df = pd.DataFrame(vals, columns=cols)
out = df.to_latex(
index=False,
caption='Table caption.',
escape=False,
formatters=form,
multirow=True,
column_format = 'lrr',
label='latex_label')
print(out)
For details click on the following link.
[VMD] Change default options in vmd
Download the .vmdrc file in your user home: download
Change the various parameters in this file as you wish. Be careful to maintain the format. Enjoy!
N.B. The name of the file should be .vmdrc (mind the '.')!
[VMD] Render frames as images after loading molecule in VMD
The following TCL script does the job. Save the code as render.tcl and run the following command on the TK Console:
source render.tcl
Code:
set output_dir "frames"
file mkdir $output_dir
render TachyonInternal on
display depthcue off
display antialias on
axes location off
set render_width 1920
set render_height 1080
set frame_interval 10
set molID [molinfo top]
set num_frames [molinfo $molID get numframes]
for {set i 0} {$i < $num_frames} {incr i $frame_interval} {
animate goto $i
set img_file [format "%s/frame_%04d.tga" $output_dir $i]
render TachyonInternal $img_file
puts "Rendered frame $i to $img_file"
}
Use ffmpeg to make a video from the generate images. For details see Create a video from multiple images in Linux above.
[VMD] Select and display whole water molecules within cutoff in VMD
In VMD, for displaying water molecules within a given distance (let's say 5 Angstrom) from protein, we use:
water within 5 of protein
However, this selects atoms of water, whereby you might see only H or O atoms of some waters. To display the whole molecules, use:
same fragment as water within 5 of protein
[XMGRACE] Change default options in xmgrace
(1) Create the following directory in user home.
mkdir .grace/
(2) Inside .grace/ create a file gracerc.user with the following contents.
HARDCOPY DEVICE "JPEG"
DEVICE "JPEG" DPI 300
Here you can change the options for the default output settings for figures.
(3) Here, also create folder templates. Inside templates download the Default.agr file: download
You may change the various parameters in the Default.agr file that you downloaded as per your choices, and you are ready.
Enjoy plotting!