In scikit-chem, pandas objects are the main data structures used to hold collections of molecules. scikit-chem provides convenience functions that load common cheminformatics file formats into pandas.DataFrame objects.
The scikit-chem functionality is modelled after the pandas API. To load a CSV file using pandas you would call:
In [1]:
import pandas as pd
df = pd.read_csv('https://archive.org/download/scikit-chem_example_files/iris.csv',
                 header=None)  # the file has no header row
df
Out[1]:
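To see what `header=None` does without downloading anything, the following minimal sketch parses a small in-memory table. The two rows are illustrative values laid out like the iris data; they are an assumption made for this example and are not read from the example file.

```python
from io import StringIO

import pandas as pd

# Illustrative rows only (an assumption, not the contents of iris.csv).
raw = StringIO(
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
)

# header=None tells pandas that the first row is data, not column names,
# so the columns are labelled with integers starting at 0.
sample = pd.read_csv(raw, header=None)

print(sample.shape)
print(list(sample.columns))
```

Without `header=None`, pandas would instead consume the first row as column names, and the table would have one fewer data row.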
Analogously with scikit-chem:
In [2]:
import skchem
smi = skchem.read_smiles('https://archive.org/download/scikit-chem_example_files/example.smi')
The readers currently available can be listed with:
In [3]:
[method for method in skchem.io.__dict__ if method.startswith('read_')]
Out[3]:
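The same kind of introspection works on any module, so it can be tried without scikit-chem installed. As a sketch, the hypothetical helper `list_readers` below (not part of pandas or scikit-chem) collects the callables whose names start with `read_`, and is applied here to pandas itself.

```python
import pandas as pd


def list_readers(module, prefix="read_"):
    """Return the sorted names of callables in `module` starting with `prefix`.

    Hypothetical helper for illustration; not part of pandas or scikit-chem.
    """
    return sorted(
        name
        for name in dir(module)
        if name.startswith(prefix) and callable(getattr(module, name))
    )


readers = list_readers(pd)
print(readers)
```

Filtering on `callable` skips any non-function attribute that happens to share the prefix, which a plain scan of `__dict__` would include.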
scikit-chem also adds convenience methods onto pandas.DataFrame objects.
In [4]:
pd.DataFrame.from_smiles('https://archive.org/download/scikit-chem_example_files/example.smi')
Out[4]:
Writing files is, again, analogous to pandas:
In [5]:
from io import StringIO
sio = StringIO()
df.to_csv(sio)
sio.seek(0)
print(sio.read())
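Note that `to_csv` also writes the index as an unnamed first column. The following minimal sketch, which uses made-up data (an assumption, not taken from the example files), writes a small frame to an in-memory buffer and reads it back. It uses `getvalue()` as an alternative to `seek(0)` followed by `read()`.

```python
from io import StringIO

import pandas as pd

# Illustrative data (an assumption, not taken from the example files).
frame = pd.DataFrame(
    {"sepal_length": [5.1, 4.9], "species": ["setosa", "setosa"]}
)

buffer = StringIO()
frame.to_csv(buffer)      # the index is written as an unnamed first column
text = buffer.getvalue()  # whole buffer contents, regardless of position
print(text)

# Read the text back, telling pandas that column 0 holds the index.
restored = pd.read_csv(StringIO(text), index_col=0)
print(restored)
```

Passing `index=False` to `to_csv` omits the index column when it carries no information.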
In [6]:
sio = StringIO()
smi.iloc[:2].to_sdf(sio) # don't write too many!
sio.seek(0)
print(sio.read())