Demonstration of how to make a test-suite-compatible .sql file from a real-world .xlsx file

ticdat makes it very easy to translate data between different file formats. This facilitates creating a unittest-based test suite.

Here, we assume that we have a file named simple_data.xlsx. We want to convert this Excel file to a text file that can be stored in a version control system like git (e.g., on GitHub).

The first step is to import input_schema from the netflow package. input_schema describes the input data schema, and is thus the right object for converting the Excel data into an .sql file. We could also store our test suite files in .json format. (In theory, we could store Excel files directly, but in general we should store human-readable files in our repository.)
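The .json alternative works for the same reason .sql does: both are plain text, so they diff cleanly under version control. As a library-free sketch of that idea (the arcs rows below are made up, and the flattened string keys are purely illustrative, since JSON object keys must be strings), the stdlib json module gives a stable, human-readable round trip:

```python
import json

# Hypothetical rows from a small "arcs" table. Composite primary keys like
# (source, destination) are flattened into strings here for illustration only.
arcs = {"warehouse,store": {"capacity": 100.0},
        "plant,warehouse": {"capacity": 75.0}}

# indent and sort_keys make the output stable line-by-line, so diffs are small.
text = json.dumps(arcs, indent=2, sort_keys=True)

assert json.loads(text) == arcs  # lossless round trip
```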


In [1]:
from netflow import input_schema

The next step is to read the simple_data.xlsx file into a TicDat object.

To be safe, we first check for duplicate records in the .xlsx file. (The other data integrity checks are performed on the TicDat object itself, when the model is solved.)


In [2]:
input_schema.xls.find_duplicates("simple_data.xlsx")


Out[2]:
{}
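I believe find_duplicates returns a dict mapping each table that contains duplicates to its offending primary keys and their occurrence counts, with an empty dict meaning the file is clean; that structure is an assumption worth verifying against the ticdat docs. A small guard one might wrap around such a result (the dups argument below is hypothetical data, not read from any file):

```python
# Assumed structure: {table_name: {primary_key: occurrence_count}}.
# An empty dict means no duplicates were found.
def assert_no_duplicates(dups):
    if dups:
        details = "; ".join(f"{table}: {len(keys)} duplicated key(s)"
                            for table, keys in dups.items())
        raise ValueError(f"Duplicate records found -- {details}")

assert_no_duplicates({})  # clean file: no exception raised
try:
    assert_no_duplicates({"arcs": {("a", "b"): 2}})  # hypothetical dirty file
except ValueError as e:
    print(e)
```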

As there are no duplicate records, we can go ahead and read the data from the .xlsx file without worry.


In [3]:
dat = input_schema.xls.create_tic_dat("simple_data.xlsx")

I always like to check the lengths of the tables that were read, as a sanity check.


In [4]:
{t:len(getattr(dat, t)) for t in input_schema.all_tables}


Out[4]:
{'arcs': 6, 'commodities': 2, 'cost': 12, 'inflow': 10, 'nodes': 5}
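This comprehension works because a TicDat object exposes each table as an attribute that behaves like a dict keyed by primary key, so len counts rows. A toy stand-in illustrating the same pattern (the table contents below are made up, not read from simple_data.xlsx):

```python
from types import SimpleNamespace

# Toy stand-in for a TicDat object: each table is a dict keyed by primary key.
# Table names mirror part of the netflow schema; the rows are illustrative.
dat = SimpleNamespace(
    commodities={"Pencils": {}, "Pens": {}},
    nodes={n: {} for n in ("Detroit", "Denver", "Boston", "New York", "Seattle")},
)
all_tables = ("commodities", "nodes")

lengths = {t: len(getattr(dat, t)) for t in all_tables}
print(lengths)  # {'commodities': 2, 'nodes': 5}
```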

Now we can write out the simple_data.sql file.


In [5]:
input_schema.sql.write_sql_file(dat, "simple_data.sql")
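The resulting file is just SQL text, which is why it belongs in a repository. As a rough, library-free sketch of the kind of statements such a file contains (the table and column names below are illustrative, and the exact format ticdat emits may differ), the stdlib sqlite3 dump facility produces the same flavor of output:

```python
import sqlite3

# Build a tiny in-memory database and dump it as SQL text: plain CREATE TABLE
# and INSERT statements, one INSERT per row.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE arcs (source TEXT, destination TEXT, capacity REAL)")
con.executemany("INSERT INTO arcs VALUES (?, ?, ?)",
                [("Detroit", "Boston", 100.0), ("Denver", "Seattle", 120.0)])
con.commit()

script = "\n".join(con.iterdump())  # diff-friendly, human-readable SQL text
print(script)
```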

You can now copy this file into the data directory in the example-netflow repository.