Jeff's initial writeup -- 2019_01_21

Comparing molecule loading using RDKit and OpenEye

It's really important that both RDKitToolkitWrapper and OpenEyeToolkitWrapper load the same file to equivalent OFFMol objects. If the ToolkitWrappers don't create the same OFFMol from a single file/external molecule representation, it's possible that the OpenForceField toolkit will assign the same molecule different parameters depending on whether OE or RDK is running on the backend(1). This notebook is designed to help us see which molecule formats can be reliably loaded using both toolkits.

The first test that is performed is loading from 3D. We have OpenEye load MiniDrugBank_tripos.mol2. RDKit can not load mol2, but can load SDF, so I've used openbabel(2) to convert MiniDrugBank_tripos.mol2 to SDF (MiniDrugBank.sdf), and then we have RDKit load that.

How do we compare whether two OFFMols are equivalent?

This is, surprisingly, an unresolved question. "Equivalent" for our purposes means "would have the same SMIRKS-based parameters applied, for any valid SMIRKS". Since we don't have an offxml file containing parameters for all valid SMIRKS (because there are infinity valid SMIRKS, and because we'd also need an infinitely large molecule test set to contain representatives of all those infinity substructures), I'd like to find some sort of test other than SMIRKS matching that will answer this question.

This notebook uses three comparison methods:

  1. graph isomorphism: Convert the molecules to NetworkX graphs where nodes are atoms (described by element, is_aromatic, stereochemistry) and the edges are bonds (described by order, is_aromatic, stereochemistry). Then test to see if the graphs are isomorphic.
  2. SMILES comparison using OpenEye: Convert each OFFMol to an OEMol, and then write them to canonical, isomeric, explicit hydrogen SMILES(3). Test for string equality.
  3. SMILES comparison using RDKit: Convert each OFFMol to an RDMol, and then write them to canonical, isomeric, explicit hydrogen SMILES. Test for string equality.

_As a note for future efforts, it's possible that, since each SMILES is also a valid SMIRKS, we could test for equality between OFFMol_1 and OFFMol_2 by seeing if OFFMol_1's SMILES is found in a SMIRKS substructure search of OFFMol2, and vice versa.

Each of these has some weaknesses.

  • Graph isomorphism may be too strict -- eg. a toolkit may mess with the kekule structure around aromatic systems, selecting different resonance forms of bond networks. This would result in a chemically equivalent structure that fails an isomorphism test involving bond orders.
  • SMILES mismatches are hard to debug. Are the original OFFMols really meaningfully different? Or is our to_rdkit. or to_openeye function not implemented correctly? Or is the mismatch due to some strange toolkit quirk or quiet sanitization that goes on?

What if RDK and OE get different representations when loading from 3D?

This isn't the end of the world. We'd just remove SDF from RDKit's ALLOWED_READ_FORMATS list(4). RDKit's main input method would then just be SMILES. But actually we should ensure that...

At a minimum, RDKit and OpenEye should be able to get the same OFFMol from the same SMILES

My initial thought here is that we could have both toolkits load molecules from a SMILES database. "I converted MiniDrugBank_tripos.mol2 to SDF so RDKit could read it, why not convert it to SMILES too?".

But conversion from 3D to SMILES requires either OpenEye or RDKit, which relies on their 3D molecule interpretation being correct... which is what we're testing here in the first place!

So, I couldn't think of a clear resolution to this problem.

For background, Shuzhe Wang did some really good foundational work on toolkit molecule perception equivalence earlier in this project. Shuzhe's successful test of forcefield parameter application equivalence (5) used OE to load a test databse, convert the molecules to SMILES, and then had RDKit load those SMILES. It then tested whether the OE-loaded molecule and the RDKit-loaded molecule got identical forcefield parameters assigned. But this didn't test whether RDKit can load from 3D, just from SMILES. And not just any SMILES -- SMILES that came from OpenEye (including any sneaky sanitizaton that it could have done when loading from 3D). And since we didn't have a toolkit-independent molecule representation then (no OFFMols), he tested whether both molecules got the same forcefield parameters applied, which suffers from the "our current SMIRKS may not be fine-grained enough to catch all meaningful molecule differences" problem.

Shuzhe's approach does highlight one way to get a test database of SMILES -- Load a 3D database using OpenEyeToolkitWrapper, then write all the resulting OFFMols to SMILES, and then have RDKitToolkitWrapper read all the SMILES and create OFFMols, then test whether both sets of OFFMols are identical.

And, in case OE's molecule santiization is making us end up with unusually clean SMILES, we can do the same test in reverse -- that is, use RDKitToolkitWrapper to load a 3D SDF, write the resulting OFFMols to SMILES, and then have OpenEyeToolkitWrapper read the SMILES, and compare the resulting OFFMols.

What this notebook does

This notebook creates four OFFMol test sets:

  1. oe_mols_from 3d: The results of calling OFFMol.from_file('MiniDrugBank_tripos.mol2', toolkit_registry=an_oe_toolkit_wrapper).

  2. rdk_mols_from 3d: The results of calling OFFMol.from_file('MiniDrugBank.sdf', toolkit_registry=a_rdk_toolkit_wrapper).

  3. rdk_mols_from_smiles_from_oe_mols_from_3d: The result of taking oe_mols_from_3d, converting them all to SMILES using OpenEyeToolkitWrapper, then reading SMILES to OFFMols using RDKitToolkitWrapper.

  4. oe_mols_from_smiles_from_rdk_mols_from_3d: The result of taking rdk_mols_from_3d, converting them all to SMILES using RDKitToolkitWrapper, then reading SMILES to OFFMols using OpenEyeToolkitWrapper.

This notebook then compares the OFFMol test sets using three methods (same as above):

  1. Graph isomorphism

  2. OFFMol.to_smiles() using OpenEyeToolkitWrapper

  3. OFFMol.to_smiles using RDKitToolkitWrapper

Results

When the notebook is run using the current codebase, the important outputs are as follows:

In comparing oe_mols_from_3d to rdk_mols_from_3d, I found 284 graph matches, 293 OE SMILES matches (26 comparisons errored), and 293 RDKit SMILES matches (7 comparisons errored) out of 346 molecules

This is the core test that we wanted to do (loading the same molecules from 3D), and these are pretty good results. We pass the strictest test (graph isomorphism) for 284 out of 346 molecules. We see the the SMILES tests are a little more forgiving, both giving 293/346 successes.

In comparing oe_mols_from_3d to rdk_mols_from_smiles_from_oe_mols_from_3d, I found 293 graph matches, 293 OE SMILES matches (26 comparisons errored), and 332 RDKit SMILES matches (0 comparisons errored) out of 365 molecules
In comparing oe_mols_from_smiles_from_rdk_mols_from_3d to rdk_mols_from_3d, I found 310 graph matches, 219 OE SMILES matches (0 comparisons errored), and 304 RDKit SMILES matches (7 comparisons errored) out of 319 molecules

Conclusions

Based on this test, we can expect identical parameterization between RDKit and OpenEye ToolkitWrappers at least 82% (284/346) of the time. The real number is likely to be higher, as the SMILES comparisons are more realistic measures of identity. Also, this limit assumes infinitely specific SMIRKS.

Passing molecules FROM rdkit TO OE is troublesome largely because rdkit won't recognize stereo around trivalent nitrogens, whereas OE will. See discussion about "how much stereochemistry info do we want specified" here: https://github.com/openforcefield/openforcefield/issues/146

Further work

There is low-hanging fruit that will improve reliability.

To continue debugging differences between OE and RDK, consider setting the "verbose" argument to compare_mols_using_smiles to True in the final cell of this workbook. Or, look at the molecule loading errors during dataset construction and add logic to catch those cases.

Footnotes

(1) Some molecule perception differences will be covered up by the fact that our SMIRKSes are somewhat coarse-grained currently (one SMIRKS can cover a lot of chemical space). But, this won't always be the case, as we may add finer-grained SMIRKS later. So, taking care of toolkit perception differences now can keep us from getting "surprised" in the future.

(2)conda list --> openbabel 1!2.4.0 py37_2 omnia

(3) OFFMol.to_smiles() actually operates by converting the OFFMol to an OEMol or RDMol (depending on which toolkit is available), and then uses the respective toolkit to create a SMILES. You can control which toolkit is used by instantialing an openforcefield.utils.toolkits.OpenEyeToolkitWrapper or RDKitToolkitWrapper, and passing it as the toolkit_registry argument to to_smiles (eg. OFFMol.to_smiles(toolkit_registry=my_toolkit_wrapper))

(4) If they really wanted to load from SDF using RDKit, a determined OFF toolkit user could just make an RDMol manually from SDK (Chem.MolSupplierFromFile() or something like that), and then run the molecules through RDKitToolkitWrapper.from_rdkit() to get OFFMol. However, this would put the burden of correct chemical perception on them.

(5) https://github.com/openforcefield/openforcefield/blob/swang/examples/forcefield_modification/RDKitComparison.ipynb, search for molecules =


In [1]:
from openforcefield.utils.toolkits import RDKitToolkitWrapper, OpenEyeToolkitWrapper, UndefinedStereochemistryError
from openforcefield.utils import get_data_file_path
import difflib

OETKW = OpenEyeToolkitWrapper()
RDKTKW = RDKitToolkitWrapper()

Load molecules from SDF using RDKit


In [2]:
from openforcefield.topology.molecule import Molecule
from rdkit import Chem


rdk_mols_from_3d = Molecule.from_file(get_data_file_path('molecules/MiniDrugBank.sdf'), 
                                      toolkit_registry = RDKTKW,
                                      file_format='sdf',
                                      allow_undefined_stereo=True)

# Known loading problems (numbers mean X: "DrugBank_X"):
# Pre-condition violations ("Stereo atoms should be specified...") for 3787 and 2684 are due to R-C=C=C-R motifs
# Sanitization errors: 
#     Nitro : 2799, 5415, 472, 794, 3739, 1570, 1802, 4865, 2465
#     C#N : 3046, 3655, 1594, 6947, 2467
#     C-N(=O)(C) : 6353
#     Complicated aromatic situation with nitrogen?: 1659, 1661(?), 7049
#     Unknown: 4346


DrugBank_2799 Sanitization error: Explicit valence for atom # 21 N, 4, is greater than permitted
DrugBank_5415 Sanitization error: Explicit valence for atom # 12 N, 4, is greater than permitted
DrugBank_3046 Sanitization error: Explicit valence for atom # 3 N, 4, is greater than permitted
DrugBank_472 Sanitization error: Explicit valence for atom # 16 N, 4, is greater than permitted
WARNING: RDMol has unspecified stereochemistry
RDMol name: DrugBank_5847
DrugBank_794 Sanitization error: Explicit valence for atom # 10 N, 4, is greater than permitted
DrugBank_3655 Sanitization error: Explicit valence for atom # 4 N, 4, is greater than permitted
DrugBank_3739 Sanitization error: Explicit valence for atom # 16 N, 4, is greater than permitted
Unable to check stereochemistry for DrugBank_3787. Original error:
Pre-condition Violation
	Stereo atoms should be specified before specifying CIS/TRANS bond stereochemistry
	Violation occurred on line 285 in file Code/GraphMol/Bond.h
	Failed Expression: what <= STEREOE || getStereoAtoms().size() == 2
	RDKIT: 2018.09.1
	BOOST: 1_65_1

WARNING: RDMol has unspecified stereochemistry
RDMol name: DrugBank_3787
DrugBank_6353 Sanitization error: Explicit valence for atom # 18 N, 4, is greater than permitted
DrugBank_1570 Sanitization error: Explicit valence for atom # 6 N, 4, is greater than permitted
DrugBank_1594 Sanitization error: Explicit valence for atom # 5 N, 4, is greater than permitted
DrugBank_1659 Sanitization error: Explicit valence for atom # 28 N, 4, is greater than permitted
DrugBank_1661 Sanitization error: Explicit valence for atom # 7 N, 4, is greater than permitted
DrugBank_4346 Sanitization error: Explicit valence for atom # 35 O, 3, is greater than permitted
DrugBank_1802 Sanitization error: Explicit valence for atom # 36 N, 4, is greater than permitted
DrugBank_6947 Sanitization error: Explicit valence for atom # 6 N, 4, is greater than permitted
WARNING: RDMol has unspecified stereochemistry
RDMol name: DrugBank_4662
DrugBank_7049 Sanitization error: Explicit valence for atom # 11 N, 4, is greater than permitted
DrugBank_4865 Sanitization error: Explicit valence for atom # 10 N, 4, is greater than permitted
DrugBank_2465 Sanitization error: Explicit valence for atom # 7 N, 4, is greater than permitted
DrugBank_2467 Sanitization error: Explicit valence for atom # 19 N, 4, is greater than permitted
Unable to check stereochemistry for DrugBank_2684. Original error:
Pre-condition Violation
	Stereo atoms should be specified before specifying CIS/TRANS bond stereochemistry
	Violation occurred on line 285 in file Code/GraphMol/Bond.h
	Failed Expression: what <= STEREOE || getStereoAtoms().size() == 2
	RDKIT: 2018.09.1
	BOOST: 1_65_1

WARNING: RDMol has unspecified stereochemistry
RDMol name: DrugBank_2684

Load 3D molecules from Mol2 using OpenEye


In [3]:
# NBVAL_SKIP
from openforcefield.topology import Molecule

#molecules = Molecule.from_file(get_data_file_path('molecules/MiniDrugBank.sdf'), 
oe_mols_from_3d = Molecule.from_file(get_data_file_path('molecules/MiniDrugBank_tripos.mol2'), 
                                     toolkit_registry = OETKW,
                                     file_format='sdf',
                                     allow_undefined_stereo=False)


Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): DrugBank_2987
Problematic atoms are:
Atom atomic num: 7, name: N4, idx: 20, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C5, idx: 4, aromatic: True, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C17, idx: 16, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 8, name: O8, idx: 28, aromatic: False, chiral: False

Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): DrugBank_3502
Problematic atoms are:
Atom atomic num: 7, name: N4, idx: 20, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C5, idx: 4, aromatic: True, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C17, idx: 16, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 8, name: O8, idx: 28, aromatic: False, chiral: False

Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): DrugBank_4161
Problematic atoms are:
Atom atomic num: 7, name: N6, idx: 21, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C13, idx: 12, aromatic: True, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C16, idx: 15, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 7, name: N4, idx: 19, aromatic: False, chiral: False

Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): DrugBank_4162
Problematic atoms are:
Atom atomic num: 7, name: N6, idx: 21, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C14, idx: 13, aromatic: True, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C16, idx: 15, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 7, name: N4, idx: 19, aromatic: False, chiral: False

Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): DrugBank_6531
Problematic atoms are:
Atom atomic num: 7, name: N5, idx: 23, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C3, idx: 2, aromatic: True, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C15, idx: 14, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C19, idx: 18, aromatic: False, chiral: False

Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): DrugBank_1700
Problematic atoms are:
Atom atomic num: 7, name: N1, idx: 9, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C2, idx: 1, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C4, idx: 3, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C8, idx: 7, aromatic: False, chiral: True

Convert each OE-derived OFFMol to SMILES, and then use RDKitToolkitWrapper to read the SMILES.


In [4]:
# NBVAL_SKIP
rdk_mols_from_smiles_from_oe_mols_from_3d = []
for oe_mol in oe_mols_from_3d:
    if oe_mol is None:
        continue
    smiles = oe_mol.to_smiles(toolkit_registry=OETKW)
    try:
        new_mol = Molecule.from_smiles(smiles, 
                                       toolkit_registry=RDKTKW,
                                       hydrogens_are_explicit=True
                                      )
        new_mol.name = oe_mol.name
        rdk_mols_from_smiles_from_oe_mols_from_3d.append(new_mol)
    except Exception as e:
        print(smiles)
        print(e)
        print()
        pass

Convert each RDKit-derived OFFMol to SMILES, and then use OpeneyeToolkitWrapper to read the SMILES.


In [5]:
oe_mols_from_smiles_from_rdk_mols_from_3d = []
for rdk_mol in rdk_mols_from_3d:
    try:
        smiles = rdk_mol.to_smiles(toolkit_registry=RDKTKW)
        new_mol = Molecule.from_smiles(smiles,
                                       toolkit_registry=OETKW,
                                       hydrogens_are_explicit=True
                                      )
        new_mol.name = rdk_mol.name
        oe_mols_from_smiles_from_rdk_mols_from_3d.append(new_mol)
    except UndefinedStereochemistryError as e:
        print(smiles)
        print(e)
        print()
        pass


[H][O][C](=[O])[C]([H])([H])[N]([H])[C](=[O])[C@]([H])([N]([H])[C](=[O])[C]([H])([H])[C]([H])([H])[C@@]([H])([C](=[O])[O][H])[N]([H])[H])[C]([H])([H])[S][C@]([H])([O][H])[N]([O][H])[C]1=[C]([H])[C]([H])=[C]([I])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 40, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 36, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 8, name: , idx: 41, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 43, aromatic: True, chiral: False\n')

[H][C]1=[C]([H])[C]2=[C]([C]([H])=[C]1[Cl])[C]([H])=[C]([C](=[O])[N]([H])[C@@]1([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C@@]1([H])[N]([H])[C](=[O])[C]1=[N][C]3=[C]([S]1)[C]([H])([H])[N]([C]([H])([H])[H])[C]([H])([H])[C]3([H])[H])[N]2[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 45, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 42, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 46, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 50, aromatic: False, chiral: False\n')

[H][O][C](=[O])[C]([H])([H])[N]([H])[C](=[O])[C@]([H])([N]([H])[C](=[O])[C]([H])([H])[C]([H])([H])[C@@]([H])([C](=[O])[O][H])[N]([H])[H])[C]([H])([H])[S][C@]([H])([O][H])[N]([O][H])[C]1=[C]([H])[C]([H])=[C]([Br])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 40, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 36, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 8, name: , idx: 41, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 43, aromatic: True, chiral: False\n')

[H][O][C](=[O])[C]([H])([H])[C@]([H])([O][H])[C]([H])([H])[C@]([H])([O][H])/[C]([H])=[C](\[H])[C]1=[C]([C]2=[C]([H])[C]([H])=[C]([F])[C]([H])=[C]2[H])[N]=[C]([N]([C]([H])([H])[H])[S](=[O])(=[O])[C]([H])([H])[H])[N]=[C]1[C]([H])([C]([H])([H])[H])[C]([H])([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 37, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 36, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 38, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 42, aromatic: False, chiral: False\n')

[H][O][C@]1([H])[C]2=[C]([S][C]([S](=[O])(=[O])[N]([H])[H])=[C]2[H])[S](=[O])(=[O])[N]([C]([H])([H])[C]2=[C]([H])[C]([H])=[C]([H])[S]2)[C]1([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 19, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 16, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 20, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 31, aromatic: False, chiral: False\n')

[H][C]1=[C]([H])[C]([S](=[O])(=[O])[N]([H])[H])=[C]([H])[C]([H])=[C]1[N]1[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[S]1(=[O])=[O]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 16, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 15, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 29, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 17, aromatic: False, chiral: False\n')

[H][O][C]1=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]1[O][C]1=[N][C]([H])=[C]([H])[C]([C]2=[C]([C]3=[C]([H])[C]([H])=[C]([F])[C]([H])=[C]3[H])[C](=[O])[O][N]2[C]([H])([H])[O][C]([H])([H])[C]([H])([H])[H])=[N]1
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 36, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 8, name: , idx: 35, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 20, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 37, aromatic: False, chiral: False\n')

[H][C]1=[C]([H])[C]([C](=[O])[C]2=[C]([H])[C]([H])=[C]([O][C]([H])([H])/[C]([H])=[C](\[H])[C]([H])([H])[N]([C]([H])([H])[H])[C]3([H])[C]([H])([H])[C]3([H])[H])[C]([H])=[C]2[H])=[C]([H])[C]([H])=[C]1[Br]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 24, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 21, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 25, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 29, aromatic: False, chiral: False\n')

[H][O][N]([H])[C](=[O])[C@]([H])([N]([C]([H])([H])[C]1=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]1[H])[P@](=[O])([C]1=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]1[H])[C]([H])([H])[H])[C]([H])([H])[C]([H])([C]([H])([H])[H])[C]([H])([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 8, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 6, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 9, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 15, name: , idx: 23, aromatic: False, chiral: True\n')

[H][C]1=[N][N]=[C]([H])[N]1[N]([C]1=[C]([H])[C]([H])=[C]([C]#[N])[C]([H])=[C]1[H])[C]([H])([H])[C]1=[C]([H])[C]([Br])=[C]([O][S](=[O])(=[O])[N]([H])[H])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 7, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 7, name: , idx: 6, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 8, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 20, aromatic: False, chiral: False\n')

[H][C]1=[N][N]=[C]([H])[N]1[N]([C]1=[C]([H])[C]([H])=[C]([C]#[N])[C]([H])=[C]1[H])[C]([H])([H])[C]1=[C]([H])[C]([H])=[C]([O][S](=[O])(=[O])[N]([H])[H])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 7, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 7, name: , idx: 6, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 8, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 20, aromatic: False, chiral: False\n')

[H][O][P@](=[O])([O][C@@]1([H])[C]([H])([H])[C@@]2([H])[N]([C]([H])([H])[H])[C@]([H])([C]([H])([H])[C]2([H])[H])[C@@]1([H])[C](=[O])[O][C]([H])([H])[H])[C]1=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 12, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 10, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 13, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 17, aromatic: False, chiral: True\n')

[H][C]1=[C]([O][C]([H])([H])[H])[C]([O][C]([H])([H])[H])=[C]([O][C]([H])([H])[H])[C]([H])=[C]1[N]([C]([H])([H])[H])[C]([H])([H])[C@]1([H])[C]([H])([H])[C]2=[C]([N]([H])[C@@]([H])([N]([H])[H])[N]=[C]2[N]([H])[H])[C]([H])([H])[C]1([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 23, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 22, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 24, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 28, aromatic: False, chiral: False\n')

[H][O][C@@]1([H])[C@]([H])([C]([H])([H])[O][P](=[O])([OH])[OH])[O][C@@]([H])([N]2[C]([OH])=[C]([C](=[O])[N]([H])[H])[N+]([H])([H])[C]2([H])[H])[C@]1([H])[O][H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 17, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 15, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 29, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 18, aromatic: False, chiral: False\n')

[H][C]1=[C]([H])[C]([H])=[C]2[C](=[C]1[H])[C]1=[C]([N]2[C]([H])([H])[C]([H])([H])[C]([H])([H])[N]([H])[S](=[O])(=[O])[C]([H])([C]([H])([H])[H])[C]([H])([H])[H])[C@@]2([H])[N]([C]([H])([H])[C]1=[O])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]2([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 39, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 37, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 40, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 45, aromatic: False, chiral: False\n')

[H][C]1=[N][C]([H])=[C]([H])[C]([C]2=[C]([H])[C]([N]3[C]([H])([H])[C]([H])([H])[N]([H])[C@@]([H])([C]([H])([H])[C]4=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]4[H])[C]3([H])[H])=[N][N]=[C]2[C]2=[C]([H])[C]3=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]3[C]([H])=[C]2[H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 12, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 11, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 37, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 13, aromatic: False, chiral: False\n')

[H][O][C]1=[C]([C](=[O])/[N]=[C]2\[C]([H])=[C]([C]([H])([H])[H])[O][N]2[H])[N]([C]([H])([H])[H])[S](=[O])(=[O])[C]2=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]12
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 18, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 3, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 19, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 23, aromatic: False, chiral: False\n')

[H][O][C]1([H])[N]([C]([H])([H])[C]2=[C]([H])[C]([N]([H])[H])=[C]([H])[C]([H])=[C]2[H])[C@@]([H])([C]([H])([H])[C]2=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]2[H])[C@]([H])([O][H])[C@@]([H])([O][H])[C@]([H])([C]([H])([H])[C]2=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]2[H])[N]1[C]([H])([H])[C]1=[C]([H])[C]([N]([H])[H])=[C]([H])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 4, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 2, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 5, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 21, aromatic: False, chiral: True\nAtom atomic num: 7, name: , idx: 61, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 45, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 2, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 62, aromatic: False, chiral: False\n')

[H][C]1=[C]([H])[C]([H])=[C]([C]([H])([H])[N]2[C](=[O])[C]([H])([H])[O][C@@]([H])([C]3=[C]([H])[C]4=[C]([C]([H])=[C]3[H])[N]([C]([H])([H])[H])[S](=[O])(=[O])[C]4([H])[H])[C]2([H])[H])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 28, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 23, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 29, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 33, aromatic: False, chiral: False\n')

[H][C]1=[C]([Cl])[C]([H])=[C]([H])[C]([N]([H])[C](=[O])[C](=[O])[N]([H])[C@@]2([H])[C]([H])([H])[C]([H])([H])[C@]([H])([C](=[O])[N]([C]([H])([H])[H])[C]([H])([H])[H])[C]([H])([H])[C@@]2([H])[N]([H])[C](=[O])[C]2=[N][C]3=[C]([S]2)[C]([H])([H])[N]([C]([H])([H])[H])[C]([H])([H])[C]3([H])[H])=[N]1
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 55, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 52, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 56, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 60, aromatic: False, chiral: False\n')

[H][N]([C](=[O])[N]([N]([C]([H])([H])[C]([H])([H])[Cl])[S](=[O])(=[O])[C]([H])([H])[H])[S](=[O])(=[O])[C]([H])([H])[H])[C]([H])([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 5, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 7, name: , idx: 4, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 6, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 13, aromatic: False, chiral: False\n')

[H][O][P](=[O])([O][H])[N]([H])[C]1=[N][C](/[C](=[N]/[O][C]([H])([H])[C]([H])([H])[H])[C](=[O])[N]([H])[C@]2([H])[C](=[O])[N]3[C]([C](=[O])[OH])=[C]([S][C]4=[N][C]([C]5=[C]([H])[CH]([H])[N]([C]([H])([H])[H])[C]([H])=[C]5[H])=[C]([H])[S]4)[C]([H])([H])[S][C@@]32[H])=[N][S]1
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 44, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 42, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 45, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 49, aromatic: False, chiral: False\n')

[H][O][P](=[O])([O][H])[N]1[C](=[O])[N]=[C]([C]([H])([H])[N]2[C]([H])([H])[C]([H])([H])[O][C@]([H])([O][C@@]([H])([C]3=[C]([H])[C]([C]([F])([F])[F])=[C]([H])[C]([C]([F])([F])[F])=[C]3[H])[C]([H])([H])[H])[C@]2([H])[C]2=[C]([H])[C]([H])=[C]([F])[C]([H])=[C]2[H])[N]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 14, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 11, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 48, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 15, aromatic: False, chiral: False\n')

[H][O][N]([H])[C](=[O])[C@]1([H])[N]([S](=[O])(=[O])[C]2=[C]([H])[C]([H])=[C]([O][C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[H])[C]([H])=[C]2[H])[C]([H])([H])[C@@]([H])([N]([H])[S](=[O])(=[O])[C]([H])([H])[H])[C]1([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 8, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 6, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 9, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 36, aromatic: False, chiral: False\n')

[H][O][C]([H])([H])[C@]1([H])[O][C@@]([H])([N]2[C]3=[C]([C]([N]([H])[H])=[N][C]([H])=[N]3)[C@]([H])([I])[N]2[H])[C@]([H])([O][H])[C@@]1([H])[O][H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 10, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 8, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 7, name: , idx: 24, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 11, aromatic: True, chiral: False\n')

[H][O][C](=[O])[C]([H])([H])[N]([C]([H])([H])[H])[C]([H])([N]([H])[H])[N]([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 7, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 4, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 8, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 12, aromatic: False, chiral: False\n')

[H][O][C]([H])([H])[C@]([H])([O][S]([OH])([OH])[OH])[C@@]([H])([O][H])[C]([H])([H])[N]1[C]([H])([H])[C@@]([H])([O][H])[C@]([H])([O][H])[C@@]1([H])[C]([H])([H])[O][H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 19, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 16, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 31, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 20, aromatic: False, chiral: False\n')

[H][N]([H])[C@]1([H])[N]([H])[C@@]2([H])[C]([H])([H])[C]([H])([H])[C@]([H])([C]([H])([H])[N]3[C]([H])([H])[C]([H])([H])[C@@]4([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C@]34[H])[C]([H])([H])[C@@]2([H])[C@]([H])([N]([H])[H])[N]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 20, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 17, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 41, aromatic: False, chiral: True\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 21, aromatic: False, chiral: False\n')

[H][C]1=[C]([H])[C]2=[C]([H])[C](=[C]1[H])[C]([H])([H])[C@]([N]([H])[H])([C]([H])([H])[H])[C](=[O])[O][C]([H])([H])[C]1=[C]([H])[C](=[N][C]([N]([C]([H])([H])[H])[S](=[O])(=[O])[C]([H])([H])[H])=[C]1[H])[C]([H])([H])[C]([H])([H])[C@]([H])([C]([H])([H])[H])[C]([H])([H])[C]2([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 33, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 32, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 34, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 38, aromatic: False, chiral: False\n')

[H][O][C]1=[C]([C]2=[C]([H])[C]([O][H])=[C]([O][H])[C]([H])=[C]2[H])[O][C]2=[C]([N]3[C]([H])([H])[C]([H])([H])[C]([H])([H])[S]3(=[O])=[O])[C]([H])=[C]([C]([H])([H])[H])[C]([H])=[C]2[C]1=[O]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 20, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 19, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 30, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 21, aromatic: False, chiral: False\n')

[H][O]/[N]=[C](\[N]([H])[H])[N]1[C]([H])([H])[C]([H])([H])[C]([H])([C]([H])([H])[N]([H])[C](=[O])[C@@]2([H])[N]3[C](=[O])[C]([H])([H])[N]([S](=[O])(=[O])[C]([H])([H])[C]4=[C]([H])[C]([H])=[C]([H])[C]([H])=[C]4[H])[C]([H])([H])[C@]3([H])[C]([H])([H])[C]2([H])[H])[C]([H])([H])[C]1([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 31, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 28, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 32, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 49, aromatic: False, chiral: False\n')

[H][C]1=[C]([H])[C]([N]2[C]([H])([H])[C]([H])([H])[C]([H])([H])[S]2(=[O])=[O])=[C]([H])[C]2=[C]1[N]([H])[N]=[C]2[N]([H])[C](=[O])[C]([H])([H])[C]1=[C]([H])[C]([H])=[C]([N]2[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]2([H])[H])[C]([H])=[C]1[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 5, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 4, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 15, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 6, aromatic: False, chiral: False\n')

[H][O][C]([H])([H])[C]([H])([H])[N]([C]1=[C]([C]2([H])[C]([H])([H])[C]2([H])[H])[C]([H])=[C]2[C](=[C]1[H])[O][C]([C]1=[C]([H])[C]([H])=[C]([F])[C]([H])=[C]1[H])=[C]2[C](=[O])[N]([H])[C]([H])([H])[H])[S](=[O])(=[O])[C]([H])([H])[H]
(UndefinedStereochemistryError(...), 'Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): \nProblematic atoms are:\nAtom atomic num: 7, name: , idx: 8, aromatic: False, chiral: True with bonds:\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 5, aromatic: False, chiral: False\nbond order: 1, chiral: False to atom atomic num: 6, name: , idx: 9, aromatic: True, chiral: False\nbond order: 1, chiral: False to atom atomic num: 16, name: , idx: 47, aromatic: False, chiral: False\n')

Define a few helper functions for molecule comparison


In [6]:
def print_smi_difference(mol_name, 
                         smi_1, smi_2, 
                         smi_1_label='OE: ',
                         smi_2_label='RDK:'):
    print(mol_name)
    print(smi_1_label, smi_1)
    print(smi_2_label, smi_2)
    differences = list(difflib.ndiff(smi_1, smi_2))
    msg = ''
    for i,s in enumerate(differences):
        if s[0]==' ': 
            continue
        elif s[0]=='-':
            msg += u'Delete "{}" from position {}\n'.format(s[-1],i)
        elif s[0]=='+':
            msg += u'Add "{}" to position {}\n'.format(s[-1],i)  
    # Sometimes the diffs get really big. Skip printing in those cases
    if msg.count('\n') < 5:
        print(msg)
    print()
    
    
def compare_mols_using_smiles(mol_1, mol_2, 
                              toolkit_wrapper, 
                              mol_1_label, 
                              mol_2_label, 
                              verbose=False):
    mol_1_smi = mol_1.to_smiles(toolkit_registry=toolkit_wrapper)
    mol_2_smi = mol_2.to_smiles(toolkit_registry=toolkit_wrapper)
    if mol_1_smi == mol_2_smi:
        return True
    else:
        if verbose:
            print_smi_difference(mol_1.name,
                                 mol_1_smi, mol_2_smi, 
                                 smi_1_label=mol_1_label, 
                                 smi_2_label=mol_2_label)
        return False

    
def compare_mols_using_nx(mol_1, mol_2):
    return mol_1 == mol_2

And perform the actual comparisons


In [7]:
# NBVAL_SKIP
mol_sets_to_compare = ((('oe_mols_from_3d', oe_mols_from_3d), 
                        ('rdk_mols_from_3d', rdk_mols_from_3d)),
                       #(('oe_mols_from_smiles', oe_mols_from_smiles), 
                       # ('rdk_mols_from_smiles', rdk_mols_from_smiles)),
                       (('oe_mols_from_3d', oe_mols_from_3d), 
                        ('rdk_mols_from_smiles_from_oe_mols_from_3d', rdk_mols_from_smiles_from_oe_mols_from_3d)),
                       (('oe_mols_from_smiles_from_rdk_mols_from_3d', oe_mols_from_smiles_from_rdk_mols_from_3d),
                        ('rdk_mols_from_3d', rdk_mols_from_3d)),
                      )


for (set_1_name, mol_set_1), (set_2_name, mol_set_2) in mol_sets_to_compare:
    set_1_name_to_mol = {mol.name:mol for mol in mol_set_1 if not(mol is None)}
    set_2_name_to_mol = {mol.name:mol for mol in mol_set_2 if not(mol is None)}
    names_in_common = set(set_1_name_to_mol.keys()) & set(set_2_name_to_mol.keys())
    print()
    print()
    print()
    print('There are {} molecules in the {} set'.format(len(mol_set_1), set_1_name))
    print('There are {} molecules in the {} set'.format(len(mol_set_2), set_2_name))
    print('These sets have {} molecules in common'.format(len(names_in_common)))
    
    graph_matches = 0
    rdk_smiles_matches = 0
    oe_smiles_matches = 0
    
    errored_graph_comparisons = 0
    errored_rdk_smiles_comparisons = 0
    errored_oe_smiles_comparisons = 0
    
    for name in names_in_common:
        set_1_mol = set_1_name_to_mol[name]
        set_2_mol = set_2_name_to_mol[name]
        nx_match = compare_mols_using_nx(set_1_mol, set_2_mol)
        if nx_match:
            graph_matches += 1
            
        try:
            rdk_smi_match = compare_mols_using_smiles(set_1_mol, set_2_mol, 
                                                      RDKTKW, 
                                                      'OE--(RDKTKW)-->SMILES: ','RDK--(RDKTKW)-->SMILES:',
                                                      verbose = False)
            if rdk_smi_match:
                rdk_smiles_matches += 1
        except Exception as e:
            errored_rdk_smiles_comparisons += 1
            print(e)
        try:
        #if 1:
            oe_smi_match = compare_mols_using_smiles(set_1_mol, set_2_mol, 
                                                      OETKW, 
                                                      'OE--(OETKW)-->SMILES: ','RDK--(OETKW)-->SMILES:',
                                                      verbose = False)
            if oe_smi_match:
                oe_smiles_matches += 1
        except:
            errored_oe_smiles_comparisons += 1
                
    print("In comparing {} to {}, I found {} graph matches, {} OE SMILES matches ({} comparisons errored), and " \
          "{} RDKit SMILES matches ({} comparisons errored) out of {} molecules".format(set_1_name, 
                                                                                        set_2_name, 
                                                                                        graph_matches, 
                                                                                        rdk_smiles_matches, 
                                                                                        errored_rdk_smiles_comparisons,
                                                                                        oe_smiles_matches, 
                                                                                        errored_oe_smiles_comparisons,
                                                                                        len(names_in_common)))




There are 371 molecules in the oe_mols_from_3d set
There are 352 molecules in the rdk_mols_from_3d set
These sets have 346 molecules in common
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
In comparing oe_mols_from_3d to rdk_mols_from_3d, I found 284 graph matches, 293 OE SMILES matches (26 comparisons errored), and 293 RDKit SMILES matches (7 comparisons errored) out of 346 molecules



There are 371 molecules in the oe_mols_from_3d set
There are 365 molecules in the rdk_mols_from_smiles_from_oe_mols_from_3d set
These sets have 365 molecules in common
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
'_CIPCode'
In comparing oe_mols_from_3d to rdk_mols_from_smiles_from_oe_mols_from_3d, I found 293 graph matches, 293 OE SMILES matches (26 comparisons errored), and 332 RDKit SMILES matches (0 comparisons errored) out of 365 molecules



There are 319 molecules in the oe_mols_from_smiles_from_rdk_mols_from_3d set
There are 352 molecules in the rdk_mols_from_3d set
These sets have 319 molecules in common
In comparing oe_mols_from_smiles_from_rdk_mols_from_3d to rdk_mols_from_3d, I found 310 graph matches, 219 OE SMILES matches (0 comparisons errored), and 304 RDKit SMILES matches (7 comparisons errored) out of 319 molecules