In [1]:
%load_ext memory_profiler
%load_ext snakeviz
%load_ext cython
import holoviews as hv
hv.extension('bokeh','matplotlib')
from IPython.core import debugger
ist = debugger.set_trace
We load in the position and box information created in the intro notebook. If you haven't run that notebook, this line will not work! (You don't have to read the wall of text, just run the cells...)
In [2]:
import numpy as np
pos = np.loadtxt('data/positions.dat')
box = np.loadtxt('data/box.dat')
#In order to make this test fair and 'pure python',
#we need to convert the numpy position and box size arrays to python lists.
x,y,z = map(list,pos.T)
lx,ly,lz = box
print('Read {:d} positions.'.format(pos.shape[0]))
print('x min/max: {:+4.2f}/{:+4.2f}'.format(pos.min(0)[0],pos.max(0)[0]))
print('y min/max: {:+4.2f}/{:+4.2f}'.format(pos.min(0)[1],pos.max(0)[1]))
print('z min/max: {:+4.2f}/{:+4.2f}'.format(pos.min(0)[2],pos.max(0)[2]))
We are going to use a soft potential without a hard-core here because our random generation of positions is going to leave us with overlaps. This potential doesn't diverge at the origin and will essentially represent our coordinate positions as soft blobs. We implement this here in pure python.
In [3]:
from math import cos,pi
def potentialEnergyFunk(r,width=1.0,height=10.0):
'''
Calculates the (soft) potential energy between two atoms
Parameters
----------
r: float
separation distance between two atoms
height: float
breadth of the potential i.e. where the potential goes to zero
width: float
strength/height of the potential
'''
if r<width:
return 0.5 * height * (1 + cos(pi*r/width))
else:
return 0
Let's plot the soft potential using some fancy HoloViews + Bokeh magic!
In [4]:
%%opts Curve [width=600,show_grid=True,height=350]
dr = 0.05 # spacing of r points
rmax = 10.0 # maximum r value
pts = int(rmax/dr) # number of r points
r = [dr*i for i in range(pts)]
def plotFunk(width,height,label='dynamic'):
# Need to wrap potentialEnergyFunk for map call below to behave
funk = lambda r: potentialEnergyFunk(r,width,height)
U = list(map(funk,r))
return hv.Curve((r,U),kdims=['Separation Distance'],vdims=['Potential Energy'],label=label)
dmap = hv.DynamicMap(plotFunk,kdims=['width','height'])
dmap = dmap.redim.range(width=((1.0,10.0)),height=((1.0,5.0)))
dmap*plotFunk(10.0,5.0,label='width: 10., height: 5.')*plotFunk(1.0,1.0,label='width: 1., height: 1.')
Out[4]:
In [5]:
from math import sqrt
def calcTotalEnergy(x,y,z,lx,ly,lz):
'''
Parameters
----------
x,y,z: lists of floats
1-D lists of cartesian coordinate positions
lx,ly,lz: floats
simulation box dimensions
'''
#sanity check
assert len(x) == len(y) == len(z)
# store half box lengths for minimum image convention
lx2 = lx/2.0
ly2 = ly/2.0
lz2 = lz/2.0
U = 0
#The next two loops simply loop over every element in the x, y, and z arrays
for i,(x1,y1,z1) in enumerate(zip(x,y,z)):
for j,(x2,y2,z2) in enumerate(zip(x,y,z)):
# We only need to consider each pair once
if i>=j:
continue
dx = abs(x1 - x2)
dy = abs(y1 - y2)
dz = abs(z1 - z2)
# The next few lines take care of applying the minimum image convention
# This is neccesary because many/most molecular simulations use periodic boundary conditions
if dx > lx2:
dx -= lx
if dy > ly2:
dy -= ly
if dz > lz2:
dz -= lz
dist = sqrt(dx*dx + dy*dy + dz*dz)
U += potentialEnergyFunk(dist)
return U
In the cell below we are setting up a profiling environment (via the cell magic %%prun) and storing the profiling information in prof/python.prof.
To visualize profiling output: From cmd line, in the same directory as this notebook run (after running the cell):
snakeviz --server prof/python.prof
We then run the calcTotalEnergy function to profile its execution performance.
In [6]:
%%prun -D prof/python.prof
energy = calcTotalEnergy(x,y,z,lx,ly,lz)
While the execution time is extremely important, equally as important is the memory usuage. While we are lucky enough to work on supercomputers which generally have massive amounts of memory, the memory usage in this calculation can scale as poorly as the execution time. In particular, we will see that while certain approaches run significantly faster than pure Python for these (relatively small) system sizes, they would not be sustainable at scale.
Below we profile the memory usage of the python function, and then write the max memory and increment to a file.
In [7]:
memprof = %memit -o calcTotalEnergy(x,y,z,lx,ly,lz)
usage = memprof.mem_usage[0]
incr = memprof.mem_usage[0] - memprof.baseline
with open('prof/python.memprof','w') as f:
f.write('{}\n{}\n'.format(usage,incr))
In [8]:
with open('energy/python.dat','w') as f:
f.write('{}\n'.format(energy))
In [ ]:
In [ ]: