In [1]:
from netflowpandasmodel import dataFactory

In [2]:
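# Fail fast if any table in the csv_data directory has duplicated primary keys.
assert not dataFactory.csv.find_duplicates("csv_data")
# Load the CSV files into a frozen (read-only) TicDat object.
dat = dataFactory.csv.create_tic_dat("csv_data", freeze_it=True)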

dat now holds the data in TicDat format, i.e. each table is just a dict of dicts keyed by its primary key. For example:


In [3]:
dat.cost


Out[3]:
{('Pencils', 'Denver', 'Boston'): _td:{'cost': 40.0},
 ('Pencils', 'Denver', 'New York'): _td:{'cost': 40.0},
 ('Pencils', 'Denver', 'Seattle'): _td:{'cost': 30.0},
 ('Pencils', 'Detroit', 'Boston'): _td:{'cost': 10.0},
 ('Pencils', 'Detroit', 'New York'): _td:{'cost': 20.0},
 ('Pencils', 'Detroit', 'Seattle'): _td:{'cost': 60.0},
 ('Pens', 'Denver', 'Boston'): _td:{'cost': 60.0},
 ('Pens', 'Denver', 'New York'): _td:{'cost': 70.0},
 ('Pens', 'Denver', 'Seattle'): _td:{'cost': 30.0},
 ('Pens', 'Detroit', 'Boston'): _td:{'cost': 20.0},
 ('Pens', 'Detroit', 'New York'): _td:{'cost': 20.0},
 ('Pens', 'Detroit', 'Seattle'): _td:{'cost': 80.0}}
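To make the dict-of-dicts shape concrete, here is a minimal sketch that uses plain Python dicts as a stand-in for the TicDat table above. The variable names (cost, detroit_boston, cheapest_key) are illustrative assumptions, and only three of the twelve rows are reproduced.

```python
# Plain-dict stand-in for a TicDat table: the outer key is the primary key
# tuple (commodity, source, destination), the inner dict holds the data fields.
cost = {
    ("Pencils", "Denver", "Boston"): {"cost": 40.0},
    ("Pencils", "Detroit", "Boston"): {"cost": 10.0},
    ("Pens", "Denver", "Seattle"): {"cost": 30.0},
}

# Look up one row by its full primary key, then read a data field.
detroit_boston = cost[("Pencils", "Detroit", "Boston")]["cost"]

# Scan the table for the primary key of the cheapest row.
cheapest_key = min(cost, key=lambda pk: cost[pk]["cost"])
```

Lookups by full primary key are ordinary dict lookups, which is why this layout is convenient for row-at-a-time logic.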

We can easily create a copy where each table is a pandas.DataFrame.


In [4]:
pandat = dataFactory.copy_to_pandas(dat)

By default, the primary key fields are represented in the index of the tables and not the columns of the tables.


In [5]:
pandat.cost


Out[5]:
                               cost
commodity source  destination
Pencils   Denver  Boston         40
                  New York       40
                  Seattle        30
          Detroit Boston         10
                  New York       20
                  Seattle        60
Pens      Denver  Boston         60
                  New York       70
                  Seattle        30
          Detroit Boston         20
                  New York       20
                  Seattle        80
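With the primary key fields in the index, rows are addressed through the pandas MultiIndex rather than through columns. The following is a rough sketch in plain pandas, assuming a hand-built frame (cost_df, a hypothetical name) with the same index levels as the table above; it does not call ticdat.

```python
import pandas as pd

# Hand-built stand-in for the cost table: primary key fields live in the index.
keys = [
    ("Pencils", "Denver", "Boston"),
    ("Pencils", "Detroit", "Boston"),
    ("Pens", "Denver", "Seattle"),
]
index = pd.MultiIndex.from_tuples(
    keys, names=["commodity", "source", "destination"]
)
cost_df = pd.DataFrame({"cost": [40.0, 10.0, 30.0]}, index=index)

# A full primary key tuple selects a single cell.
single = cost_df.loc[("Pencils", "Detroit", "Boston"), "cost"]

# A cross-section on one index level selects every row for that commodity
# and drops that level from the result.
pens = cost_df.xs("Pens", level="commodity")
```

Because the key fields are index levels here, cost_df.columns contains only the data field.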

If you would rather keep the primary key fields as columns too, pass drop_pk_columns=False.


In [6]:
pandat = dataFactory.copy_to_pandas(dat, drop_pk_columns=False)
pandat.cost


Out[6]:
                               commodity   source  destination  cost
commodity source  destination
Pencils   Denver  Boston         Pencils   Denver       Boston    40
                  New York       Pencils   Denver     New York    40
                  Seattle        Pencils   Denver      Seattle    30
          Detroit Boston         Pencils  Detroit       Boston    10
                  New York       Pencils  Detroit     New York    20
                  Seattle        Pencils  Detroit      Seattle    60
Pens      Denver  Boston            Pens   Denver       Boston    60
                  New York          Pens   Denver     New York    70
                  Seattle           Pens   Denver      Seattle    30
          Detroit Boston            Pens  Detroit       Boston    20
                  New York          Pens  Detroit     New York    20
                  Seattle           Pens  Detroit      Seattle    80
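A similar layout, with the key fields present both in the index and as columns, can be approximated in plain pandas. This is only a sketch under the assumption of a hand-built frame; cost_df, pk_columns, and with_pk are hypothetical names, and this is not necessarily how copy_to_pandas builds its result.

```python
import pandas as pd

# Hand-built stand-in for the cost table with primary key fields in the index.
keys = [
    ("Pencils", "Denver", "Boston"),
    ("Pencils", "Detroit", "Boston"),
    ("Pens", "Denver", "Seattle"),
]
index = pd.MultiIndex.from_tuples(
    keys, names=["commodity", "source", "destination"]
)
cost_df = pd.DataFrame({"cost": [40.0, 10.0, 30.0]}, index=index)

# Copy each index level into a same-named column, keeping the index intact,
# then put the key columns ahead of the data column.
pk_columns = {
    name: cost_df.index.get_level_values(name) for name in cost_df.index.names
}
with_pk = cost_df.assign(**pk_columns)[list(cost_df.index.names) + ["cost"]]

# Key fields can now be used in ordinary column-based boolean filters.
denver = with_pk[with_pk["source"] == "Denver"]
```

One caveat with this layout: when a name is both an index level and a column, pandas treats by-name operations such as groupby("source") as ambiguous, so the boolean-mask style shown here is the safer way to use the duplicated fields.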
