This notebook contains a tutorial to perform MD simulations of an Octamethyl Silsesquioxane molecule with NAMD and the AMBER force field
Octa-methyl Silsesqioxane
Working with python scripts
Fitting of Methyl dihedral barrier of Octamethyl Silsesqiuoxane
Exercise: Obtaining an initial guess through a $\chi^2$ plot
View/Edit the contents on an ASCII file
(Top)
(Top)
Scripts are written in the python language. Any text editor can be used but for convenience, we will use the script editor available in MantidPlot. It has the advantage that all algorithms are automatically loaded into the namespace, thus simplifying the coding. The script editor is opened from the View menu: View $\rightarrow$ Script Window:
The script window has an upper pannel to write your code, and a lower pannel where output is printed.
Commands in the editor are executed from the Execute menu: Execute $\rightarrow$ Execute all. There's also the option to execute only the commands that are higlighted via Execute $\rightarrow$ Execute selection:
In the output panel, only two
and three
are printed. Notice also the green arrow indicating the last line in the script that was executed.
(Top)
The output from the Sassena calculations are files fqt_inc_run1_rms2first.h5, located in in subdirectories $USER/SHUG2015/mantid/refinement/data/K$K/T_200/
, where $K
is any one of the dihedral potential barriers.
The fitting proces involves the following steps:
All these steps are coded in python script refinement.py, which can be copied to the MantidPlot script window, and executed.
In [ ]:
rootd='/SNSlocal/scratch/jbq/me8t8/data'
LoadNexus(Filename='{0}/elastic.nxs'.format(rootd), OutputWorkspace='elastic')
LoadNexus(Filename='{0}/exp200K.nxs'.format(rootd), OutputWorkspace='exp200K')
parametervalues='0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14'
for K in parametervalues.split():
LoadSassena(Filename='{0}/K{1}/T_200/fqt_inc_run1_rms2first.h5'.format(rootd,K), TimeUnit=1.0, OutputWorkspace='incSMK{0}'.format(K))
if K=='0.11':
Transpose(InputWorkspace='incSMK0.11_fqt.Re', OutputWorkspace='incSMK0.11_fqt.Re') #from I(Q,t) to I(t,Q)
Rebin(InputWorkspace='incSMK0.11_fqt.Re', Params=[0.2,0.2,2.0], OutputWorkspace='incSMK0.11_fqt.Re') #Rebin in Q to (0.3, 0.5,..,1.9)
Transpose(InputWorkspace='incSMK0.11_fqt.Re', OutputWorkspace='incSMK0.11_fqt.Re') # from I(t,Q) back to I(Q,t)
Scale(InputWorkspace='incSMK0.11_fqt.Re', factor=0.5, Operation='Multiply', OutputWorkspace='incSMK0.11_fqt.Re')
SassenaFFT(InputWorkspace='incSMK{0}'.format(K), FFTonlyRealpart=1, DetailedBalance=1, Temp=200)
Rebin(InputWorkspace='incSMK{0}_sqw'.format(K), Params=[-0.2,0.0004,0.2], OutputWorkspace='incSMK{0}_sqw'.format(K))
ws=mtd['incSMK{0}_sqw'.format(K)] # simulated S(Q,E)
for iQ in range(9):
Q=0.03+0.02*iQ #not neccessary but just to remind the Q-value
SQE=ws.dataY(iQ)
SQE[499]=SQE[498]
SQE[500]=SQE[501]
for K in parametervalues.split():
ConvolveWorkspaces(Workspace1='elastic', Workspace2='incSMK{0}_sqw'.format(K), OutputWorkspace='simSMK{0}'.format(K))
for K in parametervalues.split():
Scale(InputWorkspace='simSMK{0}'.format(K), Factor=1.0e-05,Operation='Multiply', OutputWorkspace='simSMK{0}'.format(K))
guess={'Scaling':1.0, 'Intensity':1.0, 'K':0.11}
inputworkspaces=' '.join( ['simSMK{0}'.format(K) for K in parametervalues.split()])
for iw in range(8,-1,-1):
Q=0.3+iw*0.2
fit_string = 'name=TabulatedFunction,Workspace=elastic,WorkspaceIndex={0},Scaling={1},constraints=(0.0001<Scaling);'.format(iw,guess['Scaling'])+\
'name=DSFinterp1DFit,InputWorkspaces="{0}",ParameterValues="{1}",'.format(inputworkspaces,parametervalues) +\
'LoadErrors=0,LocalRegression=1,RegressionType=quadratic,RegressionWindow=4,' +\
'WorkspaceIndex={0},Intensity={1},TargetParameter={2},'.format(iw,guess['Intensity'],guess['K']) +\
'constraints=(0.0001<Intensity);' +\
'name=LinearBackground,A0=0.0,A1=0.0'
Fit(fit_string, InputWorkspace='exp200K', WorkspaceIndex=iw, StartX=-0.13, EndX=0.1, CreateOutput = 1, Output='fit{0}'.format(iw) )
ws=mtd['fit{0}_Parameters'.format(iw)]
guess['Scaling']=ws.row(0)['Value']
guess['Intensity']=ws.row(2)['Value']
Koptimal=ws.row(3)['Value']
print 'Q=', Q, 'L={0:5.1f}'.format( 6.2832/Q), 'K={0:6.4f}'.format(Koptimal), 'Chi2={0:3.1f}'.format(ws.row(6)['Value'])
A detailed explanation of this script follows:
In [ ]:
rootd='/SNSlocal/scratch/jbq/me8t8/data'
Working directory.
In [ ]:
LoadNexus(Filename='{0}/elastic.nxs'.format(rootd), OutputWorkspace='elastic')
LoadNexus(Filename='{0}/exp200K.nxs'.format(rootd), OutputWorkspace='exp200K')
Load into MantidPlot the experimental resolution file (elastic.nxs) and the experimental structure factor file exp200K.nxs. The file contains $S(Q,E)$ for a powser sample of mPOSS at T=200K. It contains 9 spectra, each at a different Q value. ($Q=$0.3, 0.5, 0.7,..,1.9) and defined in the energy domain $E=$[-0.15, 0.15]meV.
In [ ]:
parametervalues='0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14'
The list of values for the dihedral potential barrier $K$ for which we did simulations
In [ ]:
for K in parametervalues.split():
We will perform a series of commands for each of the simulated $I(Q,t)$
In [ ]:
LoadSassena(Filename='{0}/K{1}/T_200/fqt_inc_run1_rms2first.h5'.format(rootd,K), TimeUnit=1.0, OutputWorkspace='incSMK{0}'.format(K))
In [ ]:
if K=='0.11':
Transpose(InputWorkspace='incSMK0.11_fqt.Re', OutputWorkspace='incSMK0.11_fqt.Re') #from I(Q,t) to I(t,Q)
Rebin(InputWorkspace='incSMK0.11_fqt.Re', Params=[0.2,0.2,2.0], OutputWorkspace='incSMK0.11_fqt.Re') #Rebin in Q to (0.3, 0.5,..,1.9)
Transpose(InputWorkspace='incSMK0.11_fqt.Re', OutputWorkspace='incSMK0.11_fqt.Re') # from I(t,Q) back to I(Q,t)
Scale(InputWorkspace='incSMK0.11_fqt.Re', factor=0.5, Operation='Multiply', OutputWorkspace='incSMK0.11_fqt.Re')
What happens if our simulation has different Q-values than the experiment? This could happen if we did the simulations before the experiment. The solution is to Rebin
our simulated $I(Q,t)$ along the Q-axis, so that the binning corresponds to that of the experiment.
We showcase this problem with the simulation that was done for $K=0.11$.
The required steps are:
Rebin
algorithm on $I(t,Q)$. Params=[0.2, 0.2, 2.0]
indicates a bin width $w=0.2$ (the second value), and initial $Q_i=0.2+w/2=0.3$ and a final $Q_f=2.0-w/2=1.9$.
In [ ]:
SassenaFFT(InputWorkspace='incSMK{0}'.format(K), FFTonlyRealpart=1, DetailedBalance=1, Temp=200)
Fourier transform $I(Q,t) \rightarrow S(Q,E)$ with algorithm SassenaFFT. It will read in group workspace incSMK0.2
and append workspace incSMK0.02_sqw
In [ ]:
Rebin(InputWorkspace='incSMK{0}_sqw'.format(K), Params=[-0.2,0.0004,0.2], OutputWorkspace='incSMK{0}_sqw'.format(K))
The energy binning of the computed $S(Q,E)$ goes from $E_i=-2meV$ to $E_f=2meV$ with bin width $0.1\mu eV$. This binning is different than the experimental binning ($E_i=-0.15meV$, $E_f=0.15meV$, bin-width=$0.4\mu eV$), hence the rebinning step. Note that our final binnin has the same width than experiment, but over a slightly broader dynamic range. This is neccessary for the convolution step that will be performed later.
In the picture above it is shown $S(Q=1.9,E, K=0.2)$, with a prominent elastic line and a quasi-elastic broadening (notice the Log-scale).
The elastic line represents the long-time self-correlation of the hydrogen atoms. Accurately reproducing this long-time self-correlation is the weakest point in our simulations due to the disparity in simulated and experimental environment surrounding each mPOSS molecule:
Our simulations cannot reproduce the diffusive motions of the molecule CoM. Thus, our elastic line will have a higher intensity than in experiments.
In [ ]:
ws=mtd['incSMK{0}_sqw'.format(K)] # simulated S(Q,E)
for iQ in range(9):
Q=0.03+0.02*iQ #not neccessary but just to remind the Q-value
SQE=ws.dataY(iQ)
SQE[499]=SQE[498]
SQE[500]=SQE[501]
Remove the elastic line from the computed $S(Q,E)$. Later we will include an elastic line in our fitting model. We will treat the elastic line as an additional fitting parameter.
incSMK0.02_sqw
In [ ]:
for K in parametervalues.split():
ConvolveWorkspaces(Workspace1='elastic', Workspace2='incSMK{0}_sqw'.format(K), OutputWorkspace='simSMK{0}'.format(K))
In [ ]:
for K in parametervalues.split():
Scale(InputWorkspace='simSMK{0}'.format(K), Factor=1.0e-05,Operation='Multiply', OutputWorkspace='simSMK{0}'.format(K))
In [ ]:
guess={'Scaling':1.0, 'Intensity':1.0, 'K':0.11}
The initial guess, $K=0.11$ was the default value in the force-field
In [ ]:
for iw in range(8,-1,-1):
loop over the 9 spectra, beginning with spectrum of index=8 (corresponding to $Q=1.9$) and ending with spectrum of index=0 (corresponding to $Q=0.3$)
In [ ]:
fit_string = 'name=TabulatedFunction,Workspace=elastic,' +\
'WorkspaceIndex={0},Scaling={1},constraints=(0.0001<Scaling),XScaling=1.0,ties=(XScaling=1.0);'.format(iw,guess['Scaling'])+\
'name=DSFinterp1DFit,InputWorkspaces="{0}",ParameterValues="{1}",'.format(inputworkspaces,parametervalues) +\
'LoadErrors=0,LocalRegression=1,RegressionType=quadratic,RegressionWindow=4,' +\
'WorkspaceIndex={0},Intensity={1},TargetParameter={2},'.format(iw,guess['Intensity'],guess['K']) +\
'constraints=(0.0001<Intensity);' +\
'name=LinearBackground,A0=0.0,A1=0.0'
This is our fitting model. Instead of using the graphical interface, we can write it out as a (complex) string.
$I_{el}\cdot R(Q,E)$ is the elastic line which we removed from the computed $S(Q,E)$
$I_{QE}\cdot S_{sim}(Q,E)$ is the interpolated structure factor via fitting function DSFinterp1DFit
. The inputs for this fitting function are:
InputWorkspaces
: a list of simulated structure factors, convolved with the experimental resolution.ParameterValues
: a list of K-values for each of the InputWorkspaces
.LocalRegression
: perform a regression for each $S(Q,E,K)$ versus $K$.RegressionType
, RegressionWindow
: regression settings
In [ ]:
Fit(fit_string, InputWorkspace='exp200K', WorkspaceIndex=iw, StartX=-0.13, EndX=0.1, CreateOutput = 1, Output='fit{0}'.format(iw) )
Calling the fitting algorithm Fit
. Results are saved in the following workspaces:
Fit8_Workspaces
: workspace containing experimental expectrum 8, best fit, and residuals.Fit8_NormalisedCovarianceMatrix
: correlation matrix between the fitting parameters.Fit8_Parameters
: optimized parameters for spectrum 8.
In [ ]:
ws=mtd['fit{0}_Parameters'.format(iw)]
guess['Scaling']=ws.row(0)['Value']
guess['Intensity']=ws.row(2)['Value']
Get a handle (ws
) to the workspace optimized parameters, and retrieve optimized values for Scaling
and Intensity
. These optimized values will be the initial guess for the optimization of the next spectrum. This scheme is a sequential fitting of the spectra.
In [ ]:
Koptimal=ws.row(3)['Value']
print 'Q=', Q, 'L={0:5.1f}'.format( 6.2832/Q), 'K={0:6.4f}'.format(Koptimal), 'Chi2={0:3.1f}'.format(ws.row(6)['Value'])
Print some results of the fitting to the lower panel of the fitting window. The optimal dihedral potential barrier $K$ is quite conserved along the spectra.
The value of $K$ averaged over all spectra is $\bar{K}=0.0643$. The dihedral potential barrier is actually twice this quantity.
Each methyl gropu is subject to 9 different dihedrals thus the total activation energy to methyl rotation is $E_a=9\cdot 2\cdot \bar{K}=1.16Kcal/mol$. By comparison, $E_a$ obtained from a temperature scan of $S(Q,E)$ at the BASIS beamline was $1.22Kcal/mol$.
(Top)
Load the simulated $S(Q,E)$ and convolve with experimental resolution
Open the script window in MantidPlot, and paste the relevant lines from refinement.py. Be sure to actualize value of variable rootd
.
Use algorith [DSFinterp](http://archive.mantidproject.org/DSFinterp) to produce 20 interpolated structure factors
Below are the python line to call algorith DSFinterp with all needed arguments. Most of them need to be filled up.
In [ ]:
DSFinterp(Workspaces= ,LoadErrors=0, TargetParameters=, OutputWorkspaces=, LocalRegression=1, RegressionWindow=4, RegressionType='quadratic')
Workspaces
needs to be passed a python list containing the names of the simulated structure after convolved with the experimental resolutio and rescaled.ParameterValues
needs to be passed a python list containing the values of $K$ corresponding to Workspace
. NOTE: the list is make up of real numbers, not strings.OutputWorkspaces
needs to be passed a python list containing the names of the workspaces where you will store the interpolated structure factors. For instance, somethink like [interp0,...,interp19]
TargetParameterValues
are the values of $K$ corresponding to the output workspaces. For instance, the 20 values [0.03, 0.035,..,0.125]
.Fit spectrum with index 8 of the simulated structure factors against experimental spectrum with index 8
We will do a fit against experiment for each of the 20 interpolated structure factors, hence we need a loop:
In [ ]:
for i in range(20):
#command 1
#command 2
#....
It's our task to write code for those commands within the loop.
We start writing a python string for the fitting model. Our fitting model is as before:
elastic
We will use fitting function TabulatedFunction, which loads a particular spectrum from a workspace. Below is the string containing the model. There are two fitting functions. The first is for the elastic line and the second is for one of the interpolated structure factors. Substitute the symbols ???
with the approriate workspace names.
In [ ]:
fit_string ='name=TabulatedFunction,Workspace=???,WorkspaceIndex=8,Scaling=1.0,Shift=0.0;'
fit_string +='name=TabulatedFunction,Workspace=???,WorkspaceIndex=8,Scaling=1.0,Shift=0.0;'.format(i)
fit_string +='name=LinearBackground,A0=0.0,A1=0.0'
Next we need to call the Fit
algorithm. Inspect the line of refinement.py and then write a python line calling Fit
to fit each of our 20 models agains spectrum with index 8 of the experimental structure factor in the energy range [-0.13, 0.10]. Pass the following value to argument Output
to store the output of the fits
In [ ]:
Output='fit{0}'.format(i)
Next, get a handle to the workspace containing the fit parameters. Identify the row containing the Cost function value
, which is the $\chi^2$ value, and save to variable chi2
. Finally, print it along with the corresponding $K$ value
In [ ]:
print targetvalues[i], chi2
These are all the commands that have to be included withing the "for i in range(20)
" loop.
When you run the script, $K$ and $chi^2$ values will be printed in two columns in the lower panel of the script window:
Copy these values and save them in a file. Plot the file with your favorite Linux plotting tool. I like xmgrace
which can be invoked from the terminal:
In [ ]:
%%bash
xmgrace myfile.dat
Estimate the value of $K$ for the minimum of $\chi^2(K)$
Solution: script initial_guess.py
(Top)
1 File is shown in the terminal:
In [ ]:
%%bash
cat file #will dump all contents of the file to the terminal
less file #will fill the terminal with beginning of the file, and wait for user input
vi file #the best Linux text editor according to vi fans
emacs -nw file #the best Linux text editor according to emacs fans
2 File is shown in a separate window
In [ ]:
%%bash
emacs file &
gedit file & #recommended for those familiar with windows, similar to notepad
openoffice.org file & #like WORD, maybe to much for a simple ASCII file