nbconvert

nbconvert is a Jupyter command-line tool that converts Jupyter notebooks to other output formats and can also execute them before converting. This makes it very useful for logging "blind" (unattended) processing of a notebook.

You can use it like:

jupyter nbconvert --to html --execute my_notebook.ipynb

which will execute the input cells in my_notebook.ipynb and save the entire output as HTML.
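If you want to launch the same command from a script, a minimal sketch using only the standard library might look like the following. The helper names build_nbconvert_command and run_nbconvert are made up for this example and are not part of nbconvert or Databaker; the flags are the ones from the command above.

```python
import subprocess


def build_nbconvert_command(notebook, output_format="html", execute=True):
    """Return the argument list for a `jupyter nbconvert` call (hypothetical helper)."""
    command = ["jupyter", "nbconvert", "--to", output_format]
    if execute:
        # Run the notebook's input cells before converting.
        command.append("--execute")
    command.append(notebook)
    return command


def run_nbconvert(notebook, output_format="html", execute=True):
    """Run nbconvert; raises subprocess.CalledProcessError if the command fails."""
    subprocess.run(build_nbconvert_command(notebook, output_format, execute), check=True)


# Only build and show the command here; call run_nbconvert(...) to actually run it.
print(build_nbconvert_command("my_notebook.ipynb"))
```

Passing the arguments as a list rather than one string avoids shell quoting problems with filenames that contain spaces.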

Full documentation is available from the developers.

Input filename wrangling

Unfortunately, nbconvert on its own is too limited to reproduce the bake.py usage we used to have in Databaker, where you could specify filenames on the command line: it does not support passing arguments in to the notebook, e.g. so that you can change a variable such as a filename.

So, we've written a wrapper around it, databaker_nbconvert, that lets you specify a notebook filename and an input filename. The notebook and the input file should be in the same directory. The notebook filename you specify can be an absolute path, but the input file should be just the filename, without any path. The simplest approach is to put everything in one directory and run databaker_nbconvert from there; it should work as a standalone command.
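To make the path rules concrete, here is a small sketch of how the two arguments could be checked and split. The function split_wrapper_arguments is hypothetical and written only for illustration; it is not taken from databaker_nbconvert, which may handle its arguments differently.

```python
import os


def split_wrapper_arguments(notebook_path, input_filename):
    """Hypothetical helper: return (working_directory, notebook_name, input_filename).

    The notebook may be given with a path; the input file must be a bare filename
    that lives in the same directory as the notebook.
    """
    if os.path.basename(input_filename) != input_filename:
        raise ValueError("input file should be just a filename, without any path")
    working_directory = os.path.dirname(os.path.abspath(notebook_path))
    return working_directory, os.path.basename(notebook_path), input_filename


print(split_wrapper_arguments("nbconvert_demo.ipynb", "ott.xls"))
```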

The rest of this notebook is a very simple demo of the wrapper in action. We're not doing any processing of the spreadsheets here; it is only designed to show how you can switch the filename at the command line while still being able to specify a filename within the notebook for development.


In [ ]:
import databaker.framework

databaker.framework.DATABAKER_INPUT_FILE is just a string holding the filename to use. Here we set it to the input filename that we're using within this notebook. By default, this is the file that will get used.


In [ ]:
databaker.framework.DATABAKER_INPUT_FILE = 'example1.xls'

getinputfilename() is a function that gives you back the spreadsheet filename that was passed to databaker_nbconvert or, if none was passed, the DATABAKER_INPUT_FILE value specified above.

This way, we don't have to hard-code f, which allows us to do the following:

  • if we process the notebook here, then we will process example1.xls.

  • if we process with databaker_nbconvert with a specified spreadsheet filename, then we override the example1.xls here with whichever filename we specified to databaker_nbconvert.

(This is actually a little bit of a hack: it uses operating system environment variables to pass the values in. We wrap this in another Python script so that it is transparent to the user, which also simplifies how this works across Windows and Linux.)
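The environment-variable approach can be sketched roughly as follows. This is not the actual Databaker code: the variable name EXAMPLE_DATABAKER_INPUT_FILE and both function names are hypothetical, chosen only to illustrate the fallback logic described above.

```python
import os

# Hypothetical environment variable name, used for this sketch only.
ENV_VAR = "EXAMPLE_DATABAKER_INPUT_FILE"
# Plays the role of DATABAKER_INPUT_FILE set inside the notebook.
DEFAULT_INPUT_FILE = "example1.xls"


def get_input_filename_sketch(environ=os.environ):
    """Notebook side: return the override if one was passed in, else the default."""
    return environ.get(ENV_VAR) or DEFAULT_INPUT_FILE


def environment_for_wrapper(input_filename, base_environ=os.environ):
    """Wrapper side: return a copy of the environment with the override added."""
    child_environ = dict(base_environ)
    child_environ[ENV_VAR] = input_filename
    return child_environ


# No override set: the notebook's own default is used.
print(get_input_filename_sketch({}))  # example1.xls
# Override set by the wrapper: it wins over the default.
print(get_input_filename_sketch(environment_for_wrapper("ott.xls", {})))  # ott.xls
```

In a real wrapper, the environment built this way would be handed to the child process that runs nbconvert, for example through the env argument of subprocess.run.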


In [ ]:
f = databaker.framework.getinputfilename()
print(f)

Below, you'll see the loaded XLS details. If you process this notebook with databaker_nbconvert and enter ott.xls as a spreadsheet filename, e.g.

databaker_nbconvert "nbconvert_demo.ipynb" "ott.xls"

you'll see that ott.xls is what gets loaded; the example1.xls we specified above is ignored.


In [ ]:
databaker.framework.loadxlstabs(f)