Creating the DataPot object.
In [1]:
import datapot as dp
In [2]:
datapot = dp.DataPot()
In [3]:
from datapot.utils import csv_to_jsonlines
csv_to_jsonlines('../data/transactions.csv', '../data/transactions.jsonlines')
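For readers unfamiliar with the JSON Lines format, the sketch below shows roughly what such a conversion produces: one JSON object per line, keyed by the CSV header. It is not datapot's implementation; `csv_to_jsonlines_sketch` is a hypothetical helper built on the standard library only, and the `amount` column and sample values are made up for illustration.

```python
import csv
import io
import json


def csv_to_jsonlines_sketch(csv_file, jsonl_file):
    """Write each CSV row as one JSON object per line (hypothetical helper)."""
    reader = csv.DictReader(csv_file)  # the first row is treated as the header
    count = 0
    for row in reader:
        jsonl_file.write(json.dumps(row) + "\n")
        count += 1
    return count


# Tiny in-memory example with made-up data instead of a real file
src = io.StringIO("merchant_id,amount\n17,9.99\n42,120.00\n")
dst = io.StringIO()
n_rows = csv_to_jsonlines_sketch(src, dst)
lines = dst.getvalue().splitlines()
```

Note that `csv.DictReader` yields every value as a string, so in this sketch any type inference is left to whatever consumes the file.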
In [4]:
ftr = open('../data/transactions.jsonlines')
Let's call the detect method. It automatically finds appropriate transformers for the fields of the jsonlines file. The parameter 'limit' sets how many objects are used to detect the right transformers. After that, the fit method fits the detected transformers on the data.
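To make the role of 'limit' concrete, here is a minimal sketch of sampling only the first few objects of a JSON Lines file. It is independent of datapot and does not claim to mirror its internals; `sample_objects` is a hypothetical helper, and the rewind with `seek(0)` is an assumption about what a later full pass over the same file object would need.

```python
import io
import itertools
import json


def sample_objects(jsonl_file, limit):
    """Parse at most `limit` JSON objects from the start of a JSON Lines file."""
    jsonl_file.seek(0)
    sample = [json.loads(line) for line in itertools.islice(jsonl_file, limit)]
    jsonl_file.seek(0)  # rewind so a later pass can read the whole file
    return sample


# In-memory stand-in for a jsonlines file with five made-up objects
f = io.StringIO("".join(json.dumps({"merchant_id": i}) + "\n" for i in range(5)))
sample = sample_objects(f, limit=3)
```

A small sample keeps detection fast on large files, at the cost of possibly missing fields or value types that first appear later in the data.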
In [6]:
datapot.detect(ftr, limit=100)
Out[6]:
In [7]:
datapot.fit(ftr)
Out[7]:
In [8]:
datapot
Out[8]:
Let's remove the SVDOneHotTransformer, which is the transformer at index 0 for the 'merchant_id' field.
In [9]:
datapot.remove_transformer('merchant_id', 0)
Out[9]:
In [ ]:
data = datapot.transform(ftr)
In [9]:
data.head()
Out[9]:
In [10]:
data.columns
Out[10]: