script available on GitHub

Installation and Setup

Installation

Refer to this how-to.

Manage Python Packages

Python has its own package manager "pip" to keep Python self-contained. pip also allows access to new packges even if your OS is out-of-date. If you installed Python using Anaconda, then your package manager will be "conda", which also has a nice documentation.

Running Python

interactive mode

Open a terminal and type "python"

run a python script

  1. Put Python code (eg. print("hello world")) in a file, say "hello.py".
  2. Invoke the Python interpreter with the file as its first argument.
    python hello.py
    

If you plan to do computational research in the future, please pick either emacs or vim. They are excellent command-line editors with many academic users. Command-line editors have steeper learning curves than GUI-based ones, but you can do way more with them over time (good luck using GUI editors on supercomputers). Many excellent online tutorials exist.

  1. Atom
  2. Sublime
  3. Emacs
  4. Vim

Rules!

Rule #1: Write Comments

The more the better!


In [4]:
# I don't know how to write a program but I am charming, 
#  so I will write down the equations to be implemented 
#  and find a friend to write it :)
""" 
It is annoying to have to start each comment with a #, 
 triple quotation allows multi-line comments. 
 
It is always a good idea to write lots of comment to lay out the 
cohesive idea you had while starting to write a piece of code. 
More often than not, we forget that impressive grand plan we started
with as we fight with syntax error and other nitty-gritty of
talking to a computer.
""";

Rule #2: Follow Best Practices

An excellent paper by Greg Wilson et. al. concisely summarizes the best practices of scientific computing. I will steal the most relavant section from the summary paragraph here:

  1. Write programs for people, not computers
    1. A program should not require its readers to hold more than a handful of facts in memory at once.
    2. Make names consistent, distinctive, and meaningful.
    3. Make code style and formatting consistent.
  2. Let the computer do the work.
    1. Make the computer repeat tasks.
    2. Save recent commands in a file for re-use.
    3. Use a build tool (or Jupyter notebook) to automate and save workflows.
  3. Make incremental changes.
    1. Work in small steps with frequent feedback and course correction.
    2. Use a version control system (eg. git,subversion)
    3. Upload all work into version control system
  4. Don't repeat yourself (or others)
    1. Every piece of data must have a single authoritative representation in the system.
    2. Modularize code rather than copying and pasting.
    3. Re-use code (yours or others) instead of rewriting it.
  5. Plan for mistakes
    1. Add assertions to programs to check their operation.
    2. Use an off-the-shelf unit testing library.
    3. Turn bugs into test cases.

Basic Use Cases

Much of the following can be found on "A Beginner's Python Tutorial"

Using python as a calculator

basic arithmetics are built in


In [5]:
1 + 1


Out[5]:
2

In [6]:
2*3


Out[6]:
6

In [7]:
2**3


Out[7]:
8

In [8]:
7/2 # gotcha !


Out[8]:
3

In [9]:
7./2


Out[9]:
3.5

In [10]:
5%2 # modulo


Out[10]:
1

more advanced functions can be accessed using the numpy package


In [11]:
import numpy as np

In [12]:
np.exp(1j)


Out[12]:
(0.54030230586813977+0.8414709848078965j)

In [13]:
np.cos(1) + 1j*np.sin(1)


Out[13]:
(0.54030230586813977+0.8414709848078965j)

In [14]:
np.sqrt(144)


Out[14]:
12.0

Loop and Condition


In [15]:
for my_index in [1,2,3]:
    print(my_index)
# end for


1
2
3

In [16]:
for my_index in range(3):
    print(my_index)
# end for


0
1
2

In [17]:
# while loop may not terminate
my_index = 1
while my_index < 4:
    print(my_index)
    my_index += 1 # try comment this out... JK don't do it!
# end while


1
2
3

python uses indentation to determine blocks, you could easily have done

my_index = 1
while my_index < 4:
    print(my_index)
my_index += 1

that would be a big oopsy

introducing break


In [18]:
# for loop always terminates, thus it is preferred
for my_index in range(10):
    if (my_index>0) and (my_index<=3):
        print(my_index)
    elif (my_index>3):
        break
    # end if
# end for


1
2
3

Functions

define modular functions to maximize code reusability and readability


In [19]:
def boltzmann_factor(energy_in_J,temperature_in_K):
    # 1 joule = 7.243e22 K *kB
    kB = 1.38e-23 # m^2 kg/ s^2 K
    return np.exp(-float(energy_in_J)/kB/temperature_in_K)
# end def

In [20]:
def fermi_dirac_dist(energy_in_J,temperature_in_K,chemical_pot_in_J):
    denomimator = 1.0/boltzmann_factor(
        energy_in_J-chemical_pot_in_J
        ,temperature_in_K
      ) + 1.0
    return 1.0/denomimator
# end def

In [21]:
def bose_einstein_dist(energy_in_J,temperature_in_K,chemical_pot_in_J):
    denomimator = 1.0/boltzmann_factor(
        energy_in_J-chemical_pot_in_J
        ,temperature_in_K
      ) - 1.0
    return 1.0/denomimator
# end def

In [22]:
# 50% occupation near chemical potential
fermi_dirac_dist(1.01e-22,300,1e-22)


Out[22]:
0.49993961352686361

In [23]:
# divergent occupation near chemical potential
bose_einstein_dist(1.01e-22,300,1e-22)


Out[23]:
4139.5000201305456

Tuples, Lists, Dictionaries

list: iterable, extendable, mutable and ordered array of elements

tuple: immutable list

dictionary: iterable, extendable, mutable and un-ordered key-value pairs


In [24]:
mylist = [5,4,2,3,1]
for item in mylist:
    print(item)
# end for


5
4
2
3
1

In [25]:
mylist[2] = 100
mylist.insert(0,50)

In [26]:
for i in range(len(mylist)):
    print( mylist[i] )
# end for


50
5
4
100
3
1

In [27]:
mytuple = (5,4,2,3,1)
for item in mytuple:
    print(item)
# end for


5
4
2
3
1

In [28]:
mytuple[2] = 100
# oopsy-daisies


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-ce2db551d84e> in <module>()
----> 1 mytuple[2] = 100
      2 # oopsy-daisies

TypeError: 'tuple' object does not support item assignment

In [29]:
mydict = {0:5,1:4,2:2,3:3,4:1}
for i in range(len(mydict)):
    print( mydict[i] )
# end for


5
4
2
3
1

In [30]:
mydict = {
    "name":"Paul"
    ,"favorite number":42
    ,"where abouts":"elusive"
    ,"hobbies":["coffee","code"]
}
mydict.keys()


Out[30]:
['favorite number', 'hobbies', 'name', 'where abouts']

In [31]:
mydict["where abouts"]


Out[31]:
'elusive'

In [32]:
mydict["new entry"] = False

In [33]:
for key,value in mydict.iteritems():
    print( "%s : %s" % (str(key),str(value)) )
# end for


favorite number : 42
hobbies : ['coffee', 'code']
name : Paul
new entry : False
where abouts : elusive

List Comprehension


In [34]:
mylist = [5,4,2,3,1]
[item**2 for item in mylist]


Out[34]:
[25, 16, 4, 9, 1]

In [35]:
square_and_shift = lambda x,y:x**2+y
[square_and_shift(item,50) for item in mylist]


Out[35]:
[75, 66, 54, 59, 51]

List splicing


In [36]:
# from index 1 to -2 (wrap around)
mylist[1:-2]


Out[36]:
[4, 2]

In [37]:
# all even indices
mylist[::2]


Out[37]:
[5, 2, 1]

In [38]:
# all odd indices
mylist[1::2]


Out[38]:
[4, 3]

gotcha!


In [39]:
mylist = [5,4,2,3,1]
entry = [1,2]
mylist.append(entry)
# only a reference to entry is saved, NOT a copy, which means ...
#  entry can be changed elsewhere without mylist knowing

In [40]:
mylist


Out[40]:
[5, 4, 2, 3, 1, [1, 2]]

In [41]:
entry[0] = 10
mylist


Out[41]:
[5, 4, 2, 3, 1, [10, 2]]

In [42]:
# use a deep copy to avoid the above problem
from copy import deepcopy
mylist = [5,4,2,3,1]
entry = [1,2]
mylist.append( deepcopy(entry) )
entry[0] = 10
mylist


Out[42]:
[5, 4, 2, 3, 1, [1, 2]]

Variables and Scope

The scope of a variable is the union of all places in the code where the variable can be accessed. Variables in a function are "local" and cannot be access by other parts of the program unless returned.


In [43]:
demon_burn_my_soul = 50.0

def firey_hell(demon_burn_my_soul):
    demon_burn_my_soul += 10.
    
firey_hell(20)
print(demon_burn_my_soul)


50.0

In [44]:
# you can use a global variable, but this is NOT recommended
#  see classes for better solution
global demon_burn_my_soul 
demon_burn_my_soul = 50.0

def firey_hell():
    # side effect! bad! bad! bad!
    global demon_burn_my_soul
    demon_burn_my_soul += 10.
    
firey_hell()
print(demon_burn_my_soul)


60.0

Classes

Classes help bundle together related variables and functions. Well-designed classes are sensible abstract objects that will allow higher level programming without the need to worry about details of implementation.


In [45]:
class RockStar:
    def __init__(self):
        self.demon_burn_my_soul = 50.0
    # end def init
    
    def firey_hell(self):
        self.demon_burn_my_soul += 10.0
    # end def
    
    def cry_my_veins(self):
        return self.demon_burn_my_soul
    # end def cry_my_veins
# end class RockStar

In [46]:
me = RockStar()

In [47]:
me.cry_my_veins()


Out[47]:
50.0

In [48]:
me.firey_hell()
me.cry_my_veins()


Out[48]:
60.0

Basic Plotting


In [49]:
trace_text = """-7.436823 -7.290942 -7.271528 -7.282786 -7.283622 -7.268156 -7.401003
 -7.304412 -7.211659 -7.231061 -7.27238  -7.287718 -7.240896 -7.121189
 -7.098841 -7.169402 -7.16689  -7.161854 -7.204029 -7.284694 -7.260288
 -7.368507 -7.472383 -7.442443 -7.448409 -7.409199 -7.353145 -7.242572
 -7.277459 -7.24589  -7.159036 -7.268178 -7.234837 -7.165567 -7.165357
 -7.137534 -7.231942 -7.225935 -7.16142  -7.183465 -7.257877 -7.279006
 -7.284249 -7.306481 -7.240192 -7.286245 -7.316336 -7.251441 -7.192566
 -7.191351 -7.065362 -7.050815 -7.116456 -7.186705 -7.242357 -7.240123
 -7.284564 -7.385903 -7.468834 -7.427641 -7.378051 -7.315574 -7.287397
 -7.262906 -7.197077 -7.187754 -7.136347 -7.149802 -7.301047 -7.281932
 -7.353314 -7.434607 -7.375526 -7.397572 -7.433974 -7.477175 -7.471739
 -7.474228 -7.51791  -7.525722 -7.52028  -7.534158 -7.539559 -7.53915
 -7.533163 -7.426446 -7.417031 -7.475554 -7.41521  -7.377752 -7.319138
 -7.20372  -7.294216 -7.290163 -7.310827 -7.302531 -7.339285 -7.252367
 -7.232718 -7.275662"""

In [50]:
trace = map(float,trace_text.split())

In [51]:
import matplotlib.pyplot as plt
%matplotlib inline 
# Jupyter-specific magic command, ignore for regular script

In [52]:
stuff = plt.plot(trace)
# plt.show() needed for regular script



In [53]:
# suppose the entries have error
err = np.std(trace) * np.random.rand(len(trace))
plt.errorbar(range(len(trace)),trace,err)


Out[53]:
<Container object of 3 artists>

In [54]:
# see trend (correlation) with exponential smoothing
import pandas as pd
data = pd.Series(trace)
plt.plot( trace )
plt.plot( data.ewm(span=5).mean(),ls="--",lw=2 )


Out[54]:
[<matplotlib.lines.Line2D at 0x7f06321474d0>]

Intermediate Use Cases

vectorized operations with numpy array

python for loops are VERY slow

numpy vectorized operations are about as fast as fortran (LAPACK under the hood)


In [1]:
import numpy as np

def get_mat_vec(nsize):
    mat = np.random.rand(nsize,nsize)
    vec = np.random.rand(nsize)
    return mat,vec
# end def

def mat_vec_np(mat,vec):

    prod = np.dot(mat,vec)
    
    return prod
# end def

def mat_vec_naive(mat,vec):

    prod = np.zeros(nsize)
    for i in range(nsize):
        for j in range(nsize):
            prod[i] += mat[i,j]*vec[j]
        # end for j
    # end for i
    
    return prod
# end def

In [2]:
# verify correctness
nsize = 100
mat,vec = get_mat_vec(nsize)

p1 = mat_vec_np(mat,vec)
p2 = mat_vec_naive(mat,vec)

np.allclose(p1,p2)


Out[2]:
True

In [3]:
# time it
nsize = 1000
mat,vec = get_mat_vec(nsize)

%timeit mat_vec_np(mat,vec)


The slowest run took 36.17 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 163 µs per loop

In [4]:
%timeit -n 10 mat_vec_naive(mat,vec)


10 loops, best of 3: 512 ms per loop

3 orders of magnitude speed difference!

particle swarm optimization example


In [1]:
import numpy as np
from copy import deepcopy
import matplotlib.pyplot as plt
%matplotlib inline

def rastrigin2d(rvec,A=10.):
    ndim  = len(rvec)
    const = A * ndim
    tosum = rvec**2. - A*np.cos(2*np.pi*rvec)
    return const + tosum.sum()
# end def

# put function on a grid for visualization
minx = -5.12
maxx = 5.15
nx   = 100
x = np.linspace(minx,maxx,nx)
grid = np.apply_along_axis(rastrigin2d,1
    ,[np.array([myx,myy]) for myx in x for myy in x] ) # vectorized
grid = grid.reshape(nx,nx) # reshape for plotting

# visualize
fig = plt.figure()
ax  = fig.add_subplot(111,aspect=1)
ax.set_xlabel("x")
ax.set_ylabel("y")

cs = ax.contourf(x,x,grid.T,cmap=plt.cm.magma)
# transpose is needed because matrix index direction and plot axes 
# directions are opposite of one another.

plt.colorbar(cs)
# below I will use pso to find the minimum of this function


Out[1]:
<matplotlib.colorbar.Colorbar at 0x7f01b7a77a10>

In [18]:
# initialize population
pop_size = 20
dim  = 2

pop = (maxx-minx) * np.random.rand(pop_size,dim) + minx

# find personal best
individual_best = np.apply_along_axis(rastrigin2d,1,pop) # vectorized
individual_best_pos = deepcopy(pop) # deep copy for array of arrays

# find population best
min_idx = np.argmin(individual_best)  # find minimum index 
global_best = individual_best[min_idx].copy() # find minimum
global_best_pos = pop[min_idx].copy() # shalow copy sufficient for array

# initialize hopping sizes and directions
max_hop = 0.3
hop = max_hop * np.random.rand(pop_size,dim)

In [19]:
background = plt.figure()
ax = background.add_subplot(111,aspect=1)
ax.set_xlabel("x")
ax.set_ylabel("y")

cs = ax.contourf(x,x,grid.T,alpha=0.3,cmap=plt.cm.magma)
ax.scatter(pop.T[0],pop.T[1],label="current positions")

ax.scatter(individual_best_pos.T[0],individual_best_pos.T[1]
        ,c="g",alpha=0.7,label="individual best",marker='^',s=40)
ax.scatter(global_best_pos[0],global_best_pos[1],color="r"
        ,label="global best",marker="*",s=80)

ax.legend(scatterpoints = 1,fontsize=10,loc="best")
background.colorbar(cs)


Out[19]:
<matplotlib.colorbar.Colorbar at 0x7f01b6f9bd90>

In [49]:
c1 = 2
c2 = 2
max_it = 5
for istep in range(max_it):
    
    # evaluate fitness of population
    fitness = np.apply_along_axis(rastrigin2d,1,pop)
    
    # calculate global best
    min_idx = np.argmin(fitness)
    current_best = fitness[min_idx]
    if current_best < global_best:
        global_best = current_best
        global_best_pos = pop[min_idx].copy()
    # end if
    
    # update individual best
    idx = np.where( np.array(fitness) < np.array(individual_best) )
    individual_best[idx]     = fitness[idx]
    individual_best_pos[idx] = deepcopy( pop[idx] )
    
    # update hopping
    hop += c1*np.random.rand()*(individual_best_pos-pop) + \
           c2*np.random.rand()*(global_best_pos-pop)
    idx = np.where( abs(hop) > max_hop )
    hop[idx] = np.sign(hop[idx])*max_hop
    
    # update populaton
    pop += hop
    
# end for istep

In [50]:
background = plt.figure()
ax = background.add_subplot(111,aspect=1)
ax.set_xlabel("x")
ax.set_ylabel("y")

cs = ax.contourf(x,x,grid.T,alpha=0.3,cmap=plt.cm.magma)
ax.scatter(pop.T[0],pop.T[1],label="current positions")

ax.scatter(individual_best_pos.T[0],individual_best_pos.T[1]
        ,c="g",alpha=0.7,label="individual best",marker='^',s=40)
ax.scatter(global_best_pos[0],global_best_pos[1],color="r"
        ,label="global best",marker="*",s=80)

ax.legend(scatterpoints = 1,fontsize=10,loc="best")
background.colorbar(cs)


Out[50]:
<matplotlib.colorbar.Colorbar at 0x7f01b5ccca50>

In [51]:
global_best


Out[51]:
0.0049272768116175314

In [52]:
global_best_pos


Out[52]:
array([ 0.00051146, -0.00495746])

Text Parsing

plain text file


In [16]:
%%writefile output.txt
# I am an ugly output file, but I have many hidden treasures

BEGIN PASSWORDS OF EVERYONE
test123
1234567890
abcde
hello
passwd
password
END PASSWORDS OF EVERYONE

data follows
3
1.0 2.0 3.0
4.0 5.0 6.0
7.0 8.0 9.0


Overwriting output.txt

In [17]:
# one text block
fhandle = open("output.txt",'r')
# now you have to parse this ugly text
fhandle.read()


Out[17]:
'# I am an ugly output file, but I have many hidden treasures\n\nBEGIN PASSWORDS OF EVERYONE\ntest123\n1234567890\nabcde\nhello\npasswd\npassword\nEND PASSWORDS OF EVERYONE\n\ndata follows\n3\n1.0 2.0 3.0\n4.0 5.0 6.0\n7.0 8.0 9.0'

In [18]:
# line by line
fhandle = open("output.txt",'r')
for line in fhandle:
    print line


# I am an ugly output file, but I have many hidden treasures



BEGIN PASSWORDS OF EVERYONE

test123

1234567890

abcde

hello

passwd

password

END PASSWORDS OF EVERYONE



data follows

3

1.0 2.0 3.0

4.0 5.0 6.0

7.0 8.0 9.0

In [23]:
# smart search
from mmap import mmap
fhandle = open("output.txt",'r+')

mm = mmap(fhandle.fileno(),0) # 0 means read from beginning

In [24]:
# read block
begin_idx = mm.find("BEGIN")
end_idx   = mm.find("END")
good_lines= mm[begin_idx:end_idx].split("\n")[1:-1]
good_lines


Out[24]:
['test123', '1234567890', 'abcde', 'hello', 'passwd', 'password']

In [27]:
# read data section
mm.seek(0)    # go to beginning of file
idx = mm.find("data follows")
mm.seek(idx)  # goto data line
mm.readline() # skip header
ndata = int(mm.readline())
data  = []
for idata in range(ndata):
    data.append( map(float,mm.readline().split()) )
# end for idata
data


Out[27]:
[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

Database


In [37]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [5]:
# loaded databases is easy
dft = pd.read_json("dft.json")
qmc = pd.read_json("qmc.json")

In [10]:
# first thing to do is look at it? not so useful
dft


Out[10]:
disp energy failed forces imode istep kgrid path pressure stress system volume walltime
0 0.028634 -3.866050 False [[[-0.00172999, -0.00274825, -0.00054565], [-0... 7 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode7-istep1/scf 5034.46 [[0.03353439, -1.233e-05, -0.0001901, 4933.08,... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001414
1 -0.062492 -3.865429 False [[[-0.00664132, 0.00063007, 0.00157616], [0.00... 5 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode5-istep-2/scf 5040.26 [[0.033448, 7e-08, -1.247e-05, 4920.37, 0.01, ... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001408
10 -0.038264 -3.864851 False [[[0.0, 0.0, 0.02720424], [0.0, 0.0, -0.027204... 9 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode9-istep-2/scf 5040.08 [[0.03362799, 0.0, 0.0, 4946.85, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000372
11 -0.019132 -3.865879 False [[[0.0, 0.0, 0.01316107], [0.0, 0.0, -0.013161... 9 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode9-istep-1/scf 5034.29 [[0.03360131, 0.0, 0.0, 4942.92, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000386
12 0.031246 -3.866017 False [[[0.00330763, -0.00031488, 0.00039593], [-0.0... 5 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode5-istep1/scf 5034.34 [[0.03355486, 2e-08, 8.3e-06, 4936.09, 0.0, 1.... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001400
13 0.068304 -3.865809 False [[[0.00173368, -0.00258813, -0.00015099], [-0.... 3 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode3-istep2/scf 5028.84 [[0.03363014, -4.3e-07, 5.7e-06, 4947.16, -0.0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001406
14 0.036879 -3.864642 False [[[0.0, 0.0, 0.02259444], [0.0, 0.0, 0.0225938... 11 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode11-istep2/scf 5045.31 [[0.03413987, 0.0, 0.0, 5022.15, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000372
15 0.034152 -3.866112 False [[[0.00086344, -0.0012863, -3.894e-05], [-0.00... 3 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode3-istep1/scf 5031.47 [[0.03360219, -1.1e-07, 3e-06, 4943.05, -0.02,... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001431
16 -0.019132 -3.865879 False [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0,... 10 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode10-istep-1/scf 5034.29 [[0.03360131, 0.0, 0.0, 4942.92, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000242
17 -0.068304 -3.865809 False [[[0.00258813, 0.00173368, 0.00015099], [-0.00... 4 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode4-istep-2/scf 5028.84 [[0.03355983, 4.3e-07, 8.48e-06, 4936.82, 0.06... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001419
18 0.038264 -3.864851 False [[[0.0, 0.0, -0.02720426], [0.0, 0.0, 0.027204... 9 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode9-istep2/scf 5040.08 [[0.03362802, 0.0, 0.0, 4946.85, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000381
19 0.031246 -3.866017 False [[[-0.00031488, -0.00330763, -0.00039593], [0.... 6 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode6-istep1/scf 5034.34 [[0.03360891, -2e-08, -8.6e-07, 4944.04, 0.0, ... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001425
2 0.000000 -3.866213 False [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0,... 0 0 [12.0, 12.0, 12.0] nsize2-tgrid1/reference/scf 5032.35 [[0.03359274, 0.0, 0.0, 4941.66, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000253
20 0.034152 -3.866112 False [[[-0.0012863, -0.00086344, 3.894e-05], [0.001... 4 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode4-istep1/scf 5031.47 [[0.03358447, 1.1e-07, -4.5e-06, 4940.44, 0.02... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001411
21 -0.068304 -3.865809 False [[[-0.00173368, 0.00258813, -0.00015099], [0.0... 3 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode3-istep-2/scf 5028.84 [[0.03363014, -4.3e-07, -5.7e-06, 4947.16, -0.... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001425
22 0.057268 -3.865525 False [[[-0.00538222, 0.00323031, 0.00071572], [-0.0... 8 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode8-istep2/scf 5037.49 [[0.0343124, -7.393e-05, 0.00029029, 5047.53, ... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001411
23 -0.031246 -3.866017 False [[[0.00031488, 0.00330763, -0.00039593], [-0.0... 6 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode6-istep-1/scf 5034.34 [[0.03360891, -2e-08, 8.6e-07, 4944.04, 0.0, 0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001442
24 0.062492 -3.865429 False [[[0.00664132, -0.00063007, 0.00157616], [-0.0... 5 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode5-istep2/scf 5040.26 [[0.033448, 7e-08, 1.247e-05, 4920.37, 0.01, 1... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001414
25 -0.028634 -3.866050 False [[[0.00172999, 0.00274825, -0.00054565], [0.00... 7 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode7-istep-1/scf 5034.46 [[0.03353439, -1.233e-05, 0.0001901, 4933.08, ... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001439
26 -0.018440 -3.865806 False [[[0.0, 0.0, -0.01133915], [0.0, 0.0, -0.01133... 11 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode11-istep-1/scf 5036.19 [[0.0336313, 0.0, 0.0, 4947.33, 0.0, 0.0], [0.... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000389
27 -0.028634 -3.866050 False [[[0.00274825, -0.00172999, 0.00054566], [0.00... 8 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode8-istep-1/scf 5034.46 [[0.03407057, 1.233e-05, -0.00013811, 5011.95,... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001414
28 -0.034152 -3.866112 False [[[0.0012863, 0.00086344, 3.894e-05], [-0.0012... 4 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode4-istep-1/scf 5031.47 [[0.03358447, 1.1e-07, 4.5e-06, 4940.44, 0.02,... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001400
29 0.057268 -3.865525 False [[[-0.00323031, -0.00538222, -0.00071571], [-0... 7 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode7-istep2/scf 5037.49 [[0.03334483, 7.393e-05, -0.00013511, 4905.19,... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001422
3 -0.034152 -3.866112 False [[[-0.00086344, 0.0012863, -3.894e-05], [0.000... 3 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode3-istep-1/scf 5031.47 [[0.03360219, -1.1e-07, -3e-06, 4943.05, -0.02... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001431
30 -0.062492 -3.865429 False [[[0.00063007, 0.00664132, -0.00157616], [-0.0... 6 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode6-istep-2/scf 5040.26 [[0.03365002, -7e-08, 1.33e-06, 4950.09, -0.01... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001403
31 0.038263 -3.864851 False [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0,... 10 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode10-istep2/scf 5040.07 [[0.03362798, 0.0, 0.0, 4946.85, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000247
32 0.028634 -3.866050 False [[[-0.00274825, 0.00172999, 0.00054566], [-0.0... 8 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode8-istep1/scf 5034.46 [[0.03407057, 1.233e-05, 0.00013811, 5011.95, ... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001411
33 -0.036879 -3.864642 False [[[0.0, 0.0, -0.0225944], [0.0, 0.0, -0.022593... 11 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode11-istep-2/scf 5045.31 [[0.03395917, 0.0, 0.0, 4995.57, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000378
34 -0.057268 -3.865525 False [[[0.00323031, 0.00538222, -0.00071571], [0.00... 7 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode7-istep-2/scf 5037.49 [[0.03334483, 7.393e-05, 0.00013511, 4905.19, ... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001400
35 -0.057268 -3.865525 False [[[0.00538222, -0.00323031, 0.00071571], [0.00... 8 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode8-istep-2/scf 5037.49 [[0.0343124, -7.393e-05, -0.00029029, 5047.53,... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001425
36 -0.038263 -3.864851 False [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0,... 10 -2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode10-istep-2/scf 5040.08 [[0.03362801, 0.0, 0.0, 4946.85, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000250
4 0.018440 -3.865806 False [[[0.0, 0.0, 0.01133924], [0.0, 0.0, 0.0113389... 11 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode11-istep1/scf 5036.19 [[0.03395998, 0.0, 0.0, 4995.68, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000378
5 0.019132 -3.865879 False [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0,... 10 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode10-istep1/scf 5034.29 [[0.03360131, 0.0, 0.0, 4942.92, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000244
6 0.068304 -3.865809 False [[[-0.00258813, -0.00173368, 0.00015099], [0.0... 4 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode4-istep2/scf 5028.84 [[0.03355983, 4.3e-07, -8.48e-06, 4936.82, 0.0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001403
7 0.019132 -3.865879 False [[[0.0, 0.0, -0.01316107], [0.0, 0.0, 0.013161... 9 1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode9-istep1/scf 5034.29 [[0.03360132, 0.0, 0.0, 4942.92, 0.0, 0.0], [0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.000372
8 0.062492 -3.865429 False [[[-0.00063007, -0.00664132, -0.00157616], [0.... 6 2 [12.0, 12.0, 12.0] nsize2-tgrid1/imode6-istep2/scf 5040.26 [[0.03365002, -7e-08, -1.33e-06, 4950.09, -0.0... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001408
9 -0.031246 -3.866017 False [[[-0.00330763, 0.00031488, 0.00039593], [0.00... 5 -1 [12.0, 12.0, 12.0] nsize2-tgrid1/imode5-istep-1/scf 5034.34 [[0.03355486, 2e-08, -8.3e-06, 4936.09, 0.0, -... {u'ecutwfc': 40.0, u'tot_charge': 0, u'input_d... 30.4311 0.001397

In [11]:
# look at columns
dft.columns


Out[11]:
Index([u'disp', u'energy', u'failed', u'forces', u'imode', u'istep', u'kgrid',
       u'path', u'pressure', u'stress', u'system', u'volume', u'walltime'],
      dtype='object')

In [30]:
# access interesting columns
dft[["energy","pressure"]]


Out[30]:
energy pressure
0 -3.866050 5034.46
1 -3.865429 5040.26
10 -3.864851 5040.08
11 -3.865879 5034.29
12 -3.866017 5034.34
13 -3.865809 5028.84
14 -3.864642 5045.31
15 -3.866112 5031.47
16 -3.865879 5034.29
17 -3.865809 5028.84
18 -3.864851 5040.08
19 -3.866017 5034.34
2 -3.866213 5032.35
20 -3.866112 5031.47
21 -3.865809 5028.84
22 -3.865525 5037.49
23 -3.866017 5034.34
24 -3.865429 5040.26
25 -3.866050 5034.46
26 -3.865806 5036.19
27 -3.866050 5034.46
28 -3.866112 5031.47
29 -3.865525 5037.49
3 -3.866112 5031.47
30 -3.865429 5040.26
31 -3.864851 5040.07
32 -3.866050 5034.46
33 -3.864642 5045.31
34 -3.865525 5037.49
35 -3.865525 5037.49
36 -3.864851 5040.08
4 -3.865806 5036.19
5 -3.865879 5034.29
6 -3.865809 5028.84
7 -3.865879 5034.29
8 -3.865429 5040.26
9 -3.866017 5034.34

In [21]:
# plot energy vs. displacement
xlabel = "istep"
ylabel = "energy"
plt.xlabel(xlabel)
plt.ylabel(ylabel)
plt.scatter(dft[xlabel],dft[ylabel])
plt.ylim(-3.8665,-3.865)


Out[21]:
(-3.8665, -3.865)

In [23]:
dmc = qmc[qmc["iqmc"]==4]
vmc = qmc[qmc["iqmc"]==0]

In [24]:
xlabel = "istep"
ylabel = "LocalEnergy_mean"
plt.xlabel(xlabel)
plt.ylabel(ylabel)
plt.scatter(dmc[xlabel],dmc[ylabel])


Out[24]:
<matplotlib.collections.PathCollection at 0x7fc4438754d0>

In [31]:
fig = plt.figure()
ax  = fig.add_subplot(111)
ax.set_xlabel("displacement (bohr)")
ax.set_ylabel("Varaince (Ha$^2$)")

marker_style = {0:"s",1:"^"}
colors = {0:"g",1:"r"}

rjas = 1 # use reference jastrow
for rorb in [0,1]:
    mydf = vmc[ vmc["rorb"] == rorb ]
    ax.errorbar(mydf["disp"],mydf["Variance_mean"],mydf["Variance_error"].values,ls="",marker=marker_style[rorb]
              ,color=colors[rorb],label="ref. orb %d"%rorb)
# end for
ax.legend(loc="best",scatterpoints=1)
#ax.set_ylim(1.1,1.5)
fig.tight_layout()
#plt.savefig("variance_vs_disp-rjas1.eps")



In [34]:
# drop bad runs
sel1 = (qmc["imode"]==5) & (qmc["istep"]==-2)
qmc  = qmc.drop(qmc[sel1].index)
sel2 = (qmc["imode"]==10) & (qmc["istep"]==2)
qmc  = qmc.drop(qmc[sel2].index)
dmc = qmc[qmc["iqmc"]==4]
vmc = qmc[qmc["iqmc"]==0]

In [35]:
fig = plt.figure()
ax  = fig.add_subplot(111)
ax.set_xlabel("displacement (bohr)")
ax.set_ylabel("Varaince (Ha$^2$)")

marker_style = {0:"s",1:"^"}
colors = {0:"g",1:"r"}

rjas = 1 # use reference jastrow
for rorb in [0,1]:
    mydf = vmc[ vmc["rorb"] == rorb ]
    ax.errorbar(mydf["disp"],mydf["Variance_mean"],mydf["Variance_error"].values,ls="",marker=marker_style[rorb]
              ,color=colors[rorb],label="ref. orb %d"%rorb)
# end for
ax.legend(loc="best",scatterpoints=1)
#ax.set_ylim(1.1,1.5)
fig.tight_layout()
#plt.savefig("variance_vs_disp-rjas1.eps")



In [38]:
dmc.groupby(["rorb","rjas"]).apply(np.mean)[["LocalEnergy_mean","Variance_mean"]]


Out[38]:
LocalEnergy_mean Variance_mean
rorb rjas
0 0 -7.262998 1.409185
1 -7.264088 1.370508
1 0 -7.238160 1.265554
1 -7.238364 1.252123

In [ ]: