In [15]:
import skchem
import pandas as pd
pd.options.display.max_rows = pd.options.display.max_columns = 10
scikit-chem expands on the scikit-learn Pipeline object to support filtering. It is initialized using a list of Transformer objects.
In [10]:
pipeline = skchem.pipeline.Pipeline([
    skchem.standardizers.ChemAxonStandardizer(keep_failed=True),
    skchem.forcefields.UFF(),
    skchem.filters.OrganicFilter(),
    skchem.descriptors.MorganFeaturizer()])
The pipeline applies each step in turn to the objects passed in, using the highest-priority method that the step implements, in the order transform_filter > filter > transform.
For example, our pipeline can transform sodium acetate all the way to fingerprints:
In [11]:
mol = skchem.Mol.from_smiles('CC(=O)[O-].[Na+]')
In [4]:
pipeline.transform_filter(mol)
Out[4]:
It also works on collections of molecules:
In [12]:
mols = skchem.read_smiles('https://archive.org/download/scikit-chem_example_files/example.smi', name_column=1); mols
Out[12]:
In [16]:
pipeline.transform_filter(mols)
Out[16]:
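The priority rule described earlier (transform_filter > filter > transform) can be illustrated without scikit-chem. The following is a minimal sketch in plain Python, not the library's actual implementation: the names ToyPipeline, SqrtOrDrop, Doubler and KeepAbove are hypothetical, and the steps operate on lists of numbers rather than molecules.

```python
import math


class SqrtOrDrop:
    """Hypothetical step with both transform and transform_filter."""

    def transform(self, items):
        # Would raise ValueError on negative inputs.
        return [math.sqrt(x) for x in items]

    def transform_filter(self, items):
        # Drops inputs that cannot be transformed instead of failing.
        return [math.sqrt(x) for x in items if x >= 0]


class Doubler:
    """Hypothetical step that only implements transform."""

    def transform(self, items):
        return [x * 2 for x in items]


class KeepAbove:
    """Hypothetical step that only implements filter."""

    def __init__(self, threshold):
        self.threshold = threshold

    def filter(self, items):
        return [x for x in items if x > self.threshold]


class ToyPipeline:
    """Applies each step using the highest-priority method it implements."""

    PRIORITY = ('transform_filter', 'filter', 'transform')

    def __init__(self, steps):
        self.steps = steps

    def transform_filter(self, items):
        for step in self.steps:
            for name in self.PRIORITY:
                method = getattr(step, name, None)
                if callable(method):
                    items = method(items)
                    break
            else:
                # No method from PRIORITY was found on this step.
                raise TypeError(
                    '{} implements none of {}'.format(
                        type(step).__name__, self.PRIORITY))
        return items


toy = ToyPipeline([SqrtOrDrop(), Doubler(), KeepAbove(3)])
result = toy.transform_filter([-4, 1, 4, 9])
print(result)
```

Here SqrtOrDrop is called through transform_filter rather than transform, so the negative input is dropped instead of raising an error; Doubler and KeepAbove fall back to the only method each one implements.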