In [1]:
import chemcoord as cc
import time
In [2]:
water = cc.Cartesian.read_xyz('water_dimer.xyz', start_index=1).get_zmat()
small = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1).get_zmat()
Let's have a look at it:
In [3]:
water
Out[3]:
This is the normal zmatrix with the only difference being that the upper right triangle is filled with references to the origin and canonical unit vectors. Chemically speaking this fixes translational and rotational degrees of freedom and allows to preserve the information about the absolute position in space.
In this tutorial we concentrate on operations you can do with zmatrices.
Keep in mind, that there is an own tutorial dedicated to transformation from cartesian space to internal coordinates and back.
The slicing operations are the same as for pandas.DataFrames
. (http://pandas.pydata.org/pandas-docs/stable/indexing.html)
If the 'x'
axis is of particular interest you can slice it out with:
In [4]:
water['bond']
# or explicit label based indexing
water.loc[:, 'bond']
# or explicit integer based indexing
water.iloc[:, 2]
Out[4]:
Now it is very easy to e.g. count the atoms:
In [5]:
water['atom'].value_counts()
Out[5]:
The indexing behaves like Indexing and Selecting data in
Pandas.
You can slice with Zmat.loc[key]
, Zmat.iloc[keys]
, and Zmat[key]
.
The only question is about the return type.
If the information in the columns is enough to draw a molecule,
an instance of the own class (e.g. Zmat
)
is returned.
If the information in the columns is not enough to draw a molecule, there
are two cases to consider:
pandas.Series
instance is returned for one dimensional slicesA pandas.DataFrame
instance is returned in all other cases:
molecule.loc[:, ['atom', 'b', 'bond', 'a', 'angle', 'd', 'dihedral']] returns a `Zmat`.
molecule.loc[:, ['atom', 'b']]`` returns a `pandas.DataFrame`.
molecule.loc[:, 'atom']`` returns a `pandas.Series`.
If rows are omitted, there is never a Zmat
instance returned.
There exist four different methods to perform assignments:
safe_loc
, unsafe_loc
, safe_iloc
, and unsafe_iloc
.
As in pandas safe_loc
and unsafe_loc
are used for label based indexing and
safe_iloc
and unsafe_iloc
for row based indexing.
The safe methods assert that assignments do not lead to zmatrices, which can't be transformed back to cartesian coordinates. They also insert dummy atoms where necessary.
The unsafe methods are wrappers around their pandas.DataFrame
counterparts and are a lot faster than the safe assignments.
Unless there is a good (performance) reason, it is recommended to use the safe assignment methods!
It is possible to use symbolic expressions from sympy.
In [6]:
import sympy
sympy.init_printing()
d = sympy.Symbol('d')
In [7]:
symb_water = water.copy()
In [8]:
symb_water.safe_loc[4, 'bond'] = d
In [9]:
symb_water
Out[9]:
In [10]:
symb_water.subs(d, 2)
Out[10]:
In [11]:
cc.xyz_functions.view([symb_water.subs(d, i).get_cartesian() for i in range(2, 5)])
# Uncomment if viewer cannot open molden files
# for i in range(2, 5):
# symb_water.subs(d, i).get_cartesian().view()
# time.sleep(1)
The general rule is that mathematical operations using the binary operators
(+ - * /)
and the unary operators (+ - abs)
are only applied to the ['bond', 'angle', 'dihedral']
columns.
Addition/Subtraction/Multiplication/Division:
The most common case is to add another Zmat instance.
In this case it is tested, if the used references are the same.
Afterwards the addition in the ['bond', 'angle', 'dihedral']
columns
is performed.
If you add a scalar to a Zmat it is added elementwise onto the
['bond', 'angle', 'dihedral']
columns.
If you add a 3-dimensional vector, list, tuple... the first element of this
vector is added elementwise to the 'bond'
column of the
Zmat instance and so on.
The third possibility is to add a matrix with
shape=(len(Zmat), 3)
which is again added elementwise.
The same rules are true for subtraction, division and multiplication.
In [12]:
distortion = water.copy()
distortion.unsafe_loc[:, ['bond', 'angle', 'dihedral']] = 0
distortion.safe_loc[4, 'bond'] = d
In [13]:
water + distortion
Out[13]:
In [14]:
(water + distortion).subs(d, 3)
Out[14]:
In [15]:
(water + distortion).subs(d, 3).get_cartesian().view()
In [16]:
cartesians = cc.xyz_functions.read_molden('MIL53_ht_lt_movement.molden')
STEPS = 5
A typical approximation is the small angular approximation when calculating activation barriers for the transition from one allotrope to another.
Let's import the following allotropes:
In [17]:
m1, m2 = cartesians[0], cartesians[-1]
The easiest approach for obtaining structures for the movement from m1
to m2
, is to linearize the movement in cartesian space.
For geometric reasons this leads to shortened bond lengths, when bond angles change. The larger the change in angles, the larger this effect which leads to an overestimation of the activation energy.
In [18]:
cc.xyz_functions.view([m1 + (m2 - m1) * i / STEPS for i in range(STEPS)])
# Uncomment if viewer cannot open molden files
# for molecule in [m1 + (m2 - m1) * i / STEPS for i in range(STEPS)]:
# molecule.view()
# time.sleep(1)
Another approach is to build two Zmatrices with the exact same references/construction table and linearise the movement in the space of internal coordinates.
In [19]:
zm1 = m1.get_zmat()
c_table = zm1.loc[:, ['b', 'a', 'd']]
zm2 = m2.get_zmat(c_table)
The term zm2 - zm1
is not convertible to cartesian coordinates.
For this reason we have to switch of the testing for the validity of zmatrices when using operators.
The minimize_dihedrals
method chooses the minimal absolute value representation of an angle equivalence class.
So it will move e.g. by -30° instead of 270°.
In [20]:
with cc.TestOperators(False):
zmats = [zm1 + (zm2 - zm1).minimize_dihedrals() * i / STEPS for i in range(STEPS)]
The resulting structures are a lot more reasonable for the interconversion of allotropes.
(Look for example at the C-H
distance in methyl groups).
In [21]:
cc.xyz_functions.view([x.get_cartesian() for x in zmats])
# Uncomment if viewer cannot open molden files
# for molecule in [x.get_cartesian() for x in zmats]:
# molecule.view()
# time.sleep(1)
In [ ]: