ticdat makes it very easy to translate data between different file formats. This facilitates creating a unittest-based test suite.
Here, we assume that we have a file named simple_data.xlsx. We want to convert this Excel file to a text file that can be stored in a source code control system like GitHub.
The first step is to import input_schema from the netflow package. input_schema describes the input schema, and is thus the right object for converting Excel data into an .sql file. We could also store our test suite files in .json format. (In theory, we could store Excel files directly, but in general we should keep human-readable files in our repository.)
In [1]:
from netflow import input_schema
The next step is to read the simple_data.xlsx file into a TicDat object.
To be safe, we first check for duplicate records in the .xlsx file. (The other data integrity checks are performed on the TicDat object itself, when it is solved.)
In [2]:
input_schema.xls.find_duplicates("simple_data.xlsx")
Out[2]:
As there are no duplicate records, we can go ahead and read the data from the .xlsx file without worry.
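If find_duplicates had returned a non-empty result, we would want the test suite to fail loudly rather than read a corrupted fixture. A minimal sketch of such a guard (a hypothetical helper, not part of ticdat) in plain Python:

```python
def ensure_no_duplicates(dups):
    """Fail fast if find_duplicates reported any duplicated rows.

    `dups` is the dict returned by find_duplicates: table name ->
    duplicated primary-key values (empty when the file is clean).
    """
    if dups:
        details = ", ".join(f"{table}: {keys}" for table, keys in dups.items())
        raise ValueError(f"duplicate records found - {details}")

# A clean file yields an empty dict, so this passes silently.
ensure_no_duplicates({})
```

In a real test you would pass the result of input_schema.xls.find_duplicates("simple_data.xlsx") straight into this guard.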
In [3]:
dat = input_schema.xls.create_tic_dat("simple_data.xlsx")
I always like to check the lengths of the tables that were read, just as a sanity check.
In [4]:
{t:len(getattr(dat, t)) for t in input_schema.all_tables}
Out[4]:
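In a unittest-style suite, this same sanity check can become an assertion. The helper and the expected counts below are hypothetical placeholders, not the actual contents of simple_data.xlsx:

```python
def check_table_lengths(dat, schema, expected):
    """Assert that every table in the schema has the expected row count."""
    actual = {t: len(getattr(dat, t)) for t in schema.all_tables}
    assert actual == expected, f"table length mismatch: {actual} != {expected}"

# Placeholder counts - substitute the real lengths for your fixture, then call
# e.g. check_table_lengths(dat, input_schema, expected_lengths)
expected_lengths = {"commodities": 2, "nodes": 4, "arcs": 5}
```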
Now we can write out the simple_data.sql file.
In [5]:
input_schema.sql.write_sql_file(dat, "simple_data.sql")
You can now copy this file into the data directory of the example-netflow repository.
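As noted above, .json is another text format that diffs well in source control; assuming your ticdat version exposes a json factory on the schema, something like input_schema.json.write_file(dat, "simple_data.json") would produce it. The round-trip idea behind committing such a file can be sketched with the standard library alone (the table dict here is a toy stand-in, not the real netflow data):

```python
import json

# Toy stand-in for TicDat table data (the real dict would come from ticdat).
tables = {"nodes": {"a": {}, "b": {}}, "arcs": {"a -> b": {"capacity": 10}}}

# Serialize deterministically so diffs in source control stay small.
text = json.dumps(tables, indent=2, sort_keys=True)

# Round-trip check before committing: what we wrote is what we will read back.
assert json.loads(text) == tables
```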