Forte Tutorial 1: Running psi4 jobs in Jupyter notebooks


In this tutorial we are going to explore how to run psi4 in Jupyter notebooks via psi4's Python API.

Import psi4

The first step necessary to run psi4 in Jupyter is to import psi4


In [ ]:
import psi4

Specify the molecular geometry

Next, we specify the molecular geometry. We will consider an hydrogen molecule with a bond distance of 1 Å and specify the geometry using the zmat format. We also specify the charge (0) and multiplicity (1 = singlet) at the very top of the geometry input.


In [ ]:
charge = 0
multp = 1
rHH = 1.0 # Ångstrom
geom = f'{charge} {multp}\nH\nH 1 {rHH:.6f}' # we use formatted strings
print(geom)

Creating the molecule object and accessing its properties

We can now pass the geometry to psi4 and build a Molecule object. (see psi4/src/libmints/molecule.h)


In [ ]:
mol = psi4.geometry(geom)

The molecule object can queried for information


In [ ]:
print(f"Number of atoms = {mol.natom()}")
print(f"Nuclear repulsion energy = {mol.nuclear_repulsion_energy()}")

and we can even list information about all functions


In [ ]:
help(mol)

Generate Hartree-Fock orbitals using psi4

Using the molecule object we can now compute the MOs using psi4. We first set the options:

  1. basis: the basis set (string)
  2. scf_type: the type of SCF computation (string). 'pk' stands for the SCF algorithm with conventional integrals

In [ ]:
# set basis/options
basis = 'cc-pVDZ'
reference = 'rhf'

psi4.core.clean()

psi4.set_options({'basis': basis,'scf_type': 'pk', 'reference' : reference})

# pipe output to the file output.dat
psi4.core.set_output_file('output.dat', False)

# run scf and return the energy and a wavefunction object (will work only if pass return_wfn=True)
E_scf, wfn = psi4.energy('scf', return_wfn=True)

In [ ]:
print(f'SCF Energy: {E_scf}')

Extracting useful information from psi4

The wavefunction object returned by psi4 is full of useful information. For example, we can ask how many orbitals are there in total


In [ ]:
print(f"Number of orbitals = {wfn.nmo()}")
print(f"Number of alpha electrons = {wfn.nalpha()}")

or we can get information about symmetry


In [ ]:
print(f"Number of irreducible representations (irreps) = {wfn.nirrep()}")
nirrep = wfn.nirrep()

In this case, psi4 detects D2h symmetry, which has eight irreducible representations. We can also find out how many orbitals there are for each irrep.


In [ ]:
# number of occupied molecular orbitals per irrep (mopi). Stored as a Dimension object
nmopi = wfn.nmopi()

# here we convert the psi4 Dimension object to a python tuple
print(f"Number of orbitals in each irreducible representation = {nmopi.to_tuple()}")

A digression on symmetry in psi4

General symmetry information can be found in the molecule object (because this information is independent of the details of the computation like basis set, method, etc). The information is found in the PointGroup object contained in the Molecule class.


In [ ]:
point_group = mol.point_group()
print(f'Point group = {point_group.symbol()}')
point_group_symbol = point_group.symbol()

With a bit of work, we can even extract the symbols associated to each irrep


In [ ]:
char_table = point_group.char_table()
for h in range(nirrep):
    print(f'Irrep {h} = {char_table.gamma(h).symbol()}')

# let's save the irrep labels
irrep_labels = [char_table.gamma(h).symbol() for h in range(nirrep)]

The irreps are arranged according to Cotton ordering. The product of two irredicible representations can be compute using the bitwise exclusive-OR operator (^). For example, the product of the B1g (1) and B2g (2) irreps is the B3g irrep


In [ ]:
irrep_labels[1 ^ 2]

and compute the group product table


In [ ]:
line = [f"{irrep_labels[h]:3s}" for h in range(nirrep)]
print(f"{point_group_symbol} group product table")
print(f"      {'  '.join(line)}\n")
for h1 in range(nirrep):
    line = [f"{irrep_labels[h1 ^ h2]:3s}" for h2 in range(nirrep)]
    print(f"{irrep_labels[h1]:3s}   {'  '.join(line)}")

Later we will need to know how the electrons are distributed in the orbitals, that is, how many orbitals are occupied in each irrep. This information is found in wfn


In [ ]:
print(f"Number of alpha electrons = {wfn.nalpha()}")
print(f"Number of beta electrons = {wfn.nbeta()}")

In [ ]:
print(f"Number of alpha electrons in each irreducible representation = {wfn.nalphapi().to_tuple()}")
print(f"Number of beta electrons in each irreducible representation = {wfn.nbetapi().to_tuple()}")

Excercises

  • Copy this notebook and modify the input to run a computation on water triplet in a linear symmetric geometry. After running the SCF procedure use the orbital occupation numbers (stored in wfn.nalphapi()/wfn.nbetapi()) to compute the symmetry of the final state ($\Gamma_\mathrm{tot}$). You can evaluate this property by computing the product of all the irreps of the occupied orbitals as $$ \Gamma_\mathrm{tot} = \prod_i^\mathrm{occ} \Gamma_i, $$ where $\Gamma_i$ is the irrep of occupied orbital $\phi_i$.
  • Repeate the computation above for the quartet state of water cation.

Appendix. Psi4 Conventions for orbitals, electrons, and bases

Orbital Dimensions

There are a number of different names used to refer to the basis set size. These may seem redundant, but they have subtly different meanings, as detailed below.

A calculation can use either pure (5D, 7F, 9G, etc.) basis functions or Cartesian (6D, 10F, 15G, etc.), as dictated by the input file / basis set specification. Also, the basis can be represented in terms of atomic orbitals (AO) or symmetry-adapted orbitals (SO). Further complications come from the fact that a nearly linearly-dependent basis set will have functions removed from it to prevent redundancies. With all of these factors in mind, here are the conventions used internally:

  • nao The number of atomic orbitals in Cartesian representation.
  • nso The number of atomic orbitals but in the pure representation if the current basis uses pure functions, number of Cartesian AOs otherwise.
  • nbf The number of basis functions, which is the same as nso.
  • nmo The number of basis functions, after projecting out redundancies in the basis.

When molecular symmetry is utilized, a small array of sizes per irrep is usually allocated on the stack, and is named by augmenting the name above with a pi (per-irrep), e.g. nmopi. Note that the number of irreps is always the singular nirrep, and that the index variable h is always used in a for-loop traverse of irreps.

Electronic Dimensions

As with basis sets, a number of names are used to refer to refer to the quantity of electrons, virtuals, and active sub-quantities of a PSI4 calculation. All of these can be defined per irrep as above. Some common conventions are:

  • nelec The number of electrons, rarely used due to specialization of alphas and betas or soccs and doccs.
  • nalpha The number of alpha electrons.
  • nbeta The number of beta electrons
  • docc The number of doubly-occupied orbitals
  • socc The number of singly-occupied orbitals (Almost always alpha, we don't like open-shell singlets much).
  • nvir The number of virtual orbitals

Common Bases

Many different working bases (the internal linear algebraic basis, not the name of the Gaussian basis) are used within PSI4, each with a unique and important purpose. It is critical to keep them all distinct to prevent weird results from occurring.

AO (Atomic Orbitals): Cartesian Gaussians (6D, 10F, etc.), (L + 1)(L + 2)/2 functions per shell of angular momentum L. The ordering of Cartesian exponents for a given L is given by the standard ordering below (MATLAB code):

ncart = (L + 1) * (L + 2) / 2;
exps = zeros(ncart,3);
index = 1;
for i = 0:L
    for j = 0:i
        lx = L - i;
        ly = i - j;
        lz = j;
        exps(index,:) = [lx ly lz];
      index = index + 1;
    end
end
  • SO (Spherical Atomic Orbitals): Pure Gaussians (5D, 7F, etc.) or Cartesian Gaussians, as determined by the user. This is typically the first layer encountered, Libmints handles the transform from AO to SO automatically. If Cartesian functions are used, the number of functions per shell remains (L + 1)(L + 2)/2, and the ordering remains the same as above. Note that the individual functions are not normalized for angular momentum as in most codes: the self-overlap of a PSI4 Cartesian D or higher function with more than one nonzero Cartesian exponent (e.g., lx = 1, ly = 1, lz = 0) will be less than one. If Spherical Harmonics are used, 2L + 1 real combinations of the spherical harmonics are built from the (L+1)(L+2)/2 Cartesian Gaussians, according to H. Schlegel and M. Frish, IJQC, 54, 83-87, 1995. Unlike Cartesian functions these functions are all strictly normalized. Note that in PSI4, the real combinations of spherical harmonic functions (see the paragraph below Eq. 15 in the Schlegel paper) are ordered as: 0, 1+, 1-, 2+, 2-, ....

  • USO (Unique Symmetry-Adapted Orbitals): Spatial symmetry-adapted combinations of SOs, blocked according to irrep. The total number of USOs is the same as the number of SOs, but the number of USOs within each irrep is usually much smaller, which can lead to significant performance improvements. Note that this basis is sometimes unfortunately referred to as the SO basis, so it's a bit context specific.

  • OSO (Orthogonal Symmetry-Adapted Orbitals): USOs orthogonalized by Symmetric or Canonical Orthogonalization. The number of OSOs may be slightly smaller than the total number of USOs, due to removal of linear dependencies via Canonical Orthogonalization. The OSOs are rarely encountered, as usually we go straight from USOs to MOs.

  • MO (Molecular Orbitals): The combination of OSOs that diagonalizes the Fock Matrix, so each basis function is a Hartree-Fock (or Kohn-Sham) molecular orbital. The number of OSOs and MOs is always the same. MOs are orthonormal.

  • LO (Localized Orbitals): Localized occupied orbitals, a different combination of the occupied molecular orbitals which enhances spatial locality. LOs do not diagonalize the occ-occ block of the Fock Matrix, but remain orthonormal to each other and the virtual space.