In [1]:
from rdkit.Chem.Draw import IPythonConsole
import logging
logger = logging.getLogger('molvs')
logger.setLevel(logging.INFO)
In [2]:
from molvs import standardize_smiles
standardize_smiles('C[n+]1c([N-](C))cccc1')
Out[2]:
While standardize_smiles is convenient for one-off cases, it is inefficient when processing many molecules and does not allow any customization of the standardization process.
The Standardizer class provides flexibility to specify custom standardization stages and efficiently standardize multiple molecules.
In [3]:
from rdkit import Chem
import molvs
from molvs import Standardizer
In [4]:
mol = Chem.MolFromSmiles('[Na]OC(=O)c1ccc(C[S+2]([O-])([O-]))cc1')
mol
Out[4]:
In [5]:
s = Standardizer()
smol = s.standardize(mol)
smol
Out[5]:
In [6]:
Chem.MolToSmiles(smol)
Out[6]:
The Standardizer class takes a number of initialization parameters to customize its behaviour:
In [7]:
from molvs.normalize import Normalization
norms = (
Normalization('Nitro to N+(O-)=O', '[*:1][N,P,As,Sb:2](=[O,S,Se,Te:3])=[O,S,Se,Te:4]>>[*:1][*+1:2]([*-1:3])=[*:4]'),
Normalization('Pyridine oxide to n+O-', '[n:1]=[O:2]>>[n+:1][O-:2]'),
)
my_s = Standardizer(normalizations=norms)
smol = my_s.standardize(mol)
smol
Out[7]:
Notice that the sulfone group wasn't normalized in this case, because when initializing the Standardizer we only specified two Normalizations.
The default list of normalizations is molvs.normalize.NORMALIZATIONS.
Once a Standardizer instance has been initialized with some parameters, it can be reused on many molecules:
In [8]:
my_s.standardize(Chem.MolFromSmiles('C1=C(C=C(C(=C1)O)C(=O)[O-])[S](O)(=O)=O.[Na+]'))
Out[8]:
In [9]:
my_s.standardize(Chem.MolFromSmiles('[Ag]OC(=O)O[Ag]'))
Out[9]: