Introduction

Welcome to the tutorial for ChemCoord (http://chemcoord.readthedocs.org/).

The manipulation of the coordinates is a lot easier, if you can view them on the fly. So please install a molecule viewer, which opens xyz-files. A non complete list includes: molcas gv, avogadro, vmd, and pymol

Cartesian


In [1]:
import chemcoord as cc
from chemcoord.xyz_functions import get_rotation_matrix
import numpy as np
import time

In [2]:
water = cc.Cartesian.read_xyz('water_dimer.xyz', start_index=1)
small = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1)
middle = cc.Cartesian.read_xyz('MIL53_middle.xyz', start_index=1)

Let's have a look at it:


In [3]:
water


Out[3]:
Cartesian
atom x y z
1 O 0.000000 0.0 0.000000
2 H 0.758602 0.0 0.504284
3 H 0.260455 0.0 -0.872893
4 O 3.000000 0.5 0.000000
5 H 3.758602 0.5 0.504284
6 H 3.260455 0.5 -0.872893

It is also possible to open it with an external viewer. I use Molcas gv.exe so you have to change it accordingly to your program of choice.


In [4]:
water.view(viewer='gv.exe')

To make this setting permament, execute:


In [5]:
cc.settings['defaults']['viewer'] = 'gv.exe'  # replace by your viewer of choice

Slicing

The slicing operations are the same as for pandas.DataFrames. (http://pandas.pydata.org/pandas-docs/stable/indexing.html)

If the 'x' axis is of particular interest you can slice it out with:


In [6]:
water['x']
# or explicit label based indexing
water.loc[:, 'x']
# or explicit integer based indexing
water.iloc[:, 1]


Out[6]:
1    0.000000
2    0.758602
3    0.260455
4    3.000000
5    3.758602
6    3.260455
Name: x, dtype: float64

With boolean slicing it is very easy to cut all the oxygens away:


In [7]:
water[water['atom'] != 'O'].view()

This can be combined with other selections:


In [8]:
water[(water['atom'] != 'O') & (water['x'] < 1)].view()

Returned type

The indexing behaves like Indexing and Selecting data in Pandas. You can slice with Cartesian.loc[key], Cartesian.iloc[keys], and Cartesian[key]. The only question is about the return type. If the information in the columns is enough to draw a molecule, an instance of the own class (e.g. Cartesian) is returned. If the information in the columns is not enough to draw a molecule, there are two cases to consider:

  • A pandas.Series instance is returned for one dimensional slices
  • A pandas.DataFrame instance is returned in all other cases:

      molecule.loc[:, ['atom', 'x', 'y', 'z']] returns a `Cartesian`.
    
      molecule.loc[:, ['atom', 'x']]`` returns a `pandas.DataFrame`.
    
      molecule.loc[:, 'atom']`` returns a `pandas.Series`.

Sideeffects and Method chaining

Two general rules are:

  1. All functions are sideeffect free unless stated otherwise in the documentation.
  2. Where possible the methods return an instance of the own class, to allow method chaining.

Have a look at the unmodified molecule


In [9]:
middle.view()

Chain the methods:


In [10]:
middle.cut_sphere(radius=5, preserve_bonds=False).view()

The molecule itself remains unchanged.


In [11]:
middle.view()

Chemical bonds

One really important method is get_bonds(). It returns a connectivity table, which is represented by a dictionary. Each index points to set of indices, that are connected to it.


In [12]:
water.get_bonds()


Out[12]:
{1: {2, 3}, 2: {1}, 3: {1}, 4: {5, 6}, 5: {4}, 6: {4}}

Now the focus switches to another molecule (MIL53_middle)

Let's explore the coordinationsphere of the Cr atom with the index 7.


In [13]:
for i in range(3):
    middle.get_coordination_sphere(13, n_sphere=i, only_surface=False).view()
    time.sleep(1)

Binary operators

Mathematical Operations:

Binary operators are supported in the logic of the scipy stack, but you need python3.x for using the matrix multiplication operator @.

The general rule is that mathematical operations using the binary operators (+ - * / @) and the unary operatos (+ - abs) are only applied to the ['x', 'y', 'z'] columns.

Addition/Subtraction/Multiplication/Division: If you add a scalar to a Cartesian it is added elementwise onto the ['x', 'y', 'z'] columns. If you add a 3-dimensional vector, list, tuple... the first element of this vector is added elementwise to the 'x' column of the Cartesian instance and so on. The last possibility is to add a matrix with shape=(len(Cartesian), 3) which is again added elementwise. The same rules are true for subtraction, division and multiplication.

Matrixmultiplication: Only leftsided multiplication with a matrix of shape=(n, 3), where n is a natural number, is supported. The usual usecase is for example np.diag([1, 1, -1]) @ cartesian_instance to mirror on the x-y plane. Note that if A is the matrix which is multiplied from the left, and X is the shape=(n, 3)-matrix consisting of the ['x', 'y', 'z'] columns. The actual calculation is: (A @ X.T).T


In [14]:
(water + 3).view()

In [15]:
(get_rotation_matrix([1, 0, 0], np.radians(90)) @ water).view()
# If you use python2.x the @ operator is not supported. then you have to use xyz_functions.dot

Comparison:

The comparison operators == and != are supported and require molecules indexed in the same way:

In some cases it is better to test for numerical equality $ |a - b| < \epsilon$. This is done using allclose or isclose (elementwise)


In [16]:
water == water + 1e-15


Out[16]:
atom x y z
1 True False False False
2 True False False False
3 True False False False
4 True False False False
5 True False False False
6 True False False False

In [17]:
cc.xyz_functions.isclose(water, water + 1e-15)


Out[17]:
atom x y z
1 True True True True
2 True True True True
3 True True True True
4 True True True True
5 True True True True
6 True True True True

In [18]:
cc.xyz_functions.allclose(water, water + 1e-15)


Out[18]:
True

Symbolic evaluation

It is possible to use symbolic expressions from sympy.


In [19]:
import sympy
sympy.init_printing()
x = sympy.Symbol('x')

In [20]:
symb_water = water.copy()

In [21]:
symb_water['x'] = [x + i for i in range(len(symb_water))]

In [22]:
symb_water


Out[22]:
Cartesian
atom x y z
1 O $x$ 0.0 0.000000
2 H $x + 1$ 0.0 0.504284
3 H $x + 2$ 0.0 -0.872893
4 O $x + 3$ 0.5 0.000000
5 H $x + 4$ 0.5 0.504284
6 H $x + 5$ 0.5 -0.872893

In [23]:
symb_water.subs(x, 2)


Out[23]:
Cartesian
atom x y z
1 O 2.0 0.0 0.000000
2 H 3.0 0.0 0.504284
3 H 4.0 0.0 -0.872893
4 O 5.0 0.5 0.000000
5 H 6.0 0.5 0.504284
6 H 7.0 0.5 -0.872893

In [24]:
symb_water.subs(x, 2).view()

Alignment


In [25]:
moved = get_rotation_matrix([1, 2, 3], 1.1) @ middle + 15

In [26]:
moved.view()

In [27]:
m1, m2 = middle.align(moved)

In [28]:
cc.xyz_functions.view([m1, m2])
# If your viewer of choice does not support molden files, you have to call separately:
# m1.view()
# m2.view()

Symmetry

It is possible to detect the point group and symmetrize a molecule. Let's distort a $C_{2,v}$ symmetric molecule and symmetrize it back:


In [29]:
np.random.seed(77)
dist_molecule = small.copy()
dist_molecule += np.random.randn(len(dist_molecule), 3) / 25

In [30]:
dist_molecule.get_pointgroup(tolerance=0.1)


Out[30]:
C1

In [31]:
eq = dist_molecule.symmetrize(max_n=25, tolerance=0.3, epsilon=1e-5)

In [32]:
eq['sym_mol'].get_pointgroup(tolerance=0.1)


Out[32]:
C2v

In [33]:
a, b = small.align(dist_molecule)
a, c = small.align(eq['sym_mol'])
d1 = (a - b).get_distance_to()
d2 = (a - c).get_distance_to()

In [34]:
cc.xyz_functions.view([a, b, c])
# If your viewer of choice does not support molden files, you have to call separately:
# a.view()
# b.view()
# c.view()

As we can see, the symmetrised molecule is a lot more similar to the original molecule. The average deviation from the original positions decreased by 35 %.


In [35]:
(d1['distance'].sum() - d2['distance'].sum()) / d1['distance'].sum()


Out[35]:
$$0.34609079359070377$$

In [ ]: