It is a never-ending story: every time I want to write an article to publish on my website, I change my blog engine instead. At some point I even created my own static website generator. All this fuss just to avoid actually writing things to ... you know ... put on a website :). This might be excused by the fact that I really enjoy spending my leisure time tinkering with things aimlessly rather than actually producing something that might be useful, but I finally started seeing through my evil self-sabotage mechanisms and was determined to put a stop to it! So I did the natural thing: I went to my lab1 and tinkered with the engine.
I played with a lot of blog engines over the years - while never really blogging anything. I mostly work in backend development, but there is something that fascinates me about web design and web development. It's one of these things I guess :).
I am particularly fond of static website generators that provide a workflow that is similar to developing software. I played with pelican, jekyll, hugo, nikola, flask + frozen-flask and the lot. As already mentioned: I even wrote my own sphinx based generator ... while still never really blogging anything.
At the beginning of 2017 I made a deal with a colleague that I would finally write a blog article about the pytest development sprint and a bit about my involvement. It would have been boring though if I had used the site I already had online (last incarnation was a simple mkdocs driven thing). It would also have been boring to use one of the engines I already knew. Using something utterly profane like medium or wordpress was obviously completely out of the question! I mean, I could have just written the article then and been done with it. Who wants that? Right. Not me. So I started looking around for the next thing that could keep me from writing that article and I stumbled over Lektor. Now this was something that could keep me busy for a while as it is not simply a static website generator, but rather something that you can use to build a static website generator with - a website generator generator! Long story short: I set that up from scratch with a simple sass style, wrote a little plugin to integrate that into Lektor's development server, and finally actually wrote and published that article. Nobody ever made a deal with me again that forced me to write another article, so that was it. I had unlocked the "i-have-a-blog-but-i-never-blog"-achievement once again - only on a higher level. Until very recently.
Because very recently I realized that I had produced a lot of material while trying to teach Python and test automation to all kinds of folks over the last years. I finally wanted to start sharing some of these materials on my website. As making strange deals seems to work with my contorted psyche, I made a deal with myself to publish at least one article a month for at least a year.
This time I was determined to resist the temptation to start from scratch and resolved to adjust the existing setup to fit my new needs. The new needs arose from the fact that I work mostly in Jupyter Notebooks nowadays when creating learning materials, and I like it, so I want to write articles like that and have them integrate into my website.
If you don't know Lektor at all yet, here is the minimum amount of knowledge necessary to follow this article:
To build a page, Lektor takes a folder, a data model defined in an .ini file and a Jinja2 template. The folder for the page needs at least a contents.lr file - a simple Lektor-specific text file format. This file contains data that fits the user-defined data model. Usually this is some metadata about the content and the main content of the page formatted in markdown.
Lektor is extensible via a plugin system that involves creating an installable package and implementing some methods in a class inheriting from lektor.pluginsystem.Plugin. Methods in that plugin class will be called when Lektor emits events in different phases of the build process.
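To make the event mechanism a bit more tangible, here is a minimal sketch that does not import Lektor at all. It assumes the naming convention that an event like before-build ends up calling a method named on_before_build on the plugin class. FakePlugin and emit are hypothetical stand-ins written for illustration, not Lektor's actual classes.

```python
class FakePlugin:
    """Hypothetical stand-in for a class inheriting from lektor.pluginsystem.Plugin."""

    def __init__(self):
        self.calls = []

    def on_setup_env(self, **extra):
        # Would run once when the environment is set up.
        self.calls.append("setup-env")

    def on_before_build(self, source=None, **extra):
        # Would run before a source is built.
        self.calls.append(("before-build", source))


def emit(plugins, event, **kwargs):
    """Simplified dispatcher: the event 'before-build' maps to 'on_before_build'."""
    method_name = "on_" + event.replace("-", "_")
    for plugin in plugins:
        handler = getattr(plugin, method_name, None)
        if handler is not None:
            handler(**kwargs)


plugin = FakePlugin()
emit([plugin], "setup-env")
emit([plugin], "before-build", source="my-page")
emit([plugin], "after-build")  # no handler defined, so nothing happens
print(plugin.calls)
```

A plugin only needs to implement the hooks it cares about; events without a matching method are simply skipped.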
To integrate Jupyter notebooks, there is already a plugin that hasn't been worked on for a while and is more of a proof of concept. It was a good starting point though to see what it does and to decide what I want:
- ... contents.lr file to be changed to trigger a new page build. I want this to work correctly.
- ... .ipynb files directly to HTML using nbconvert with the basic template.2 I don't want to struggle with HTML and CSS to make the rendered notebook fit the style of whatever theme the web page has. I also want all the markdown features available that Lektor and a whole range of plugins offer. I also want the style of the rendered notebook to automatically fit the rest of the web page.
- ... notebook-file name without extension == page name).
- ... Markdown class which is used to render markdown content in contents.lr, so I suspect that this is actually defeating the purpose instead of helping it. I want code that works and is easier to understand.
lektor-jupyter-preprocess is a Lektor plugin that does the following:
- ... my-page and the page folder contains a my-page.ipynb, the contents.lr is generated from that notebook (e.g. see the page folder for this article).
- ... %load cell magic before executing to update dependent code from other files.
- ... contents.lr. The rendered notebook is markdown. Lektor can then treat this as if it was a hand-written contents.lr.
!!!!! 99.9%: existing ecosystem, 0.1% wrapping it to integrate it into Lektor.
Generating markdown from a notebook comes out of the box via nbconvert - so if you take a notebook that looks like this in the browser:
... and convert it with jupyter-nbconvert --to markdown example-notebook.ipynb, out drops an example-notebook-markdown.md that contains this:
This is already close to what I want, but I want the output to be marked properly and I don't want the whole traceback - just the name and message of the error is enough. So there is a little bit of massaging to be done. The question is: when should that happen? I could try to massage the generated output to my liking, but I'd rather poke a finger into my eye, so this has to happen when I can still work with the data.
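To illustrate why working on the data is more pleasant than patching generated text, here is a minimal sketch on plain dictionaries. It assumes outputs shaped like the nbformat version 4 structure (an error output carries ename, evalue and traceback). The function name shorten_error_outputs and the choice to turn the error into a stderr stream are made up for this example and are not taken from the plugin.

```python
def shorten_error_outputs(outputs):
    """Return a new list of outputs where tracebacks are collapsed to one line."""
    shortened = []
    for output in outputs:
        if output.get("output_type") == "error":
            # Keep only "ErrorName: message" and drop the full traceback.
            output = {
                "output_type": "stream",
                "name": "stderr",
                "text": f"{output['ename']}: {output['evalue']}\n",
            }
        shortened.append(output)
    return shortened


# Example data in the shape a code cell's "outputs" list is assumed to have.
example_outputs = [
    {"output_type": "stream", "name": "stdout", "text": "hello\n"},
    {
        "output_type": "error",
        "ename": "ZeroDivisionError",
        "evalue": "division by zero",
        "traceback": ["Traceback (most recent call last):", "..."],
    },
]

print(shorten_error_outputs(example_outputs)[1]["text"])
```

Because the error name and message are separate fields at this stage, no parsing of rendered text is needed.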
Thanks to the friendly Jupyter Development Team, nbconvert is written in a way that it is not too hard to make this possible by inheriting from ExecutePreprocessor. This lets you hook into the execution of individual code cells and massage the contents there. So this is what I came up with:
Can anyone tell me from which film this gif is? I found it on tenor and would like to give proper credit↩
This seems to be the usual approach though, when looking at other blog engines. For Nikola there is a theme with inbuilt Jupyter support. Same for pelican.↩
In [ ]:
%load -s ArticleExecutePreprocessor ../../../packages/lektor-jupyter-preprocess/lektor_jupyter_preprocess.py
{"metadata.execute": False}
In [ ]:
%load -s pre_process ../../../packages/lektor-jupyter-preprocess/lektor_jupyter_preprocess.py
{"metadata.execute": False}
In [ ]:
%load -s post_process ../../../packages/lektor-jupyter-preprocess/lektor_jupyter_preprocess.py
{"metadata.execute": False}
This is how it is configurable at the moment:
In [ ]:
%load -r 23-50 ../../../packages/lektor-jupyter-preprocess/lektor_jupyter_preprocess.py
{"metadata.execute": False}
The idea is to hook into the part of the conversion process where the notebook is preprocessed before the actual conversion to markdown.
In my case the necessary preprocessing means:
- ... %load magic: execute it to load the code from the file into the cell (always - commented out or not)
The complete plugin code is in lektor_jupyter_preprocess.py.
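The "commented out or not" part matters because Jupyter turns the magic into a comment (# %load ...) once it has been executed. A minimal sketch of how such a cell could be detected, assuming the magic sits on the first line of the cell; find_load_magic is a hypothetical helper and not the plugin's code:

```python
import re

# Matches "%load ..." as well as the "# %load ..." form that is left
# behind after the magic has been executed once.
LOAD_MAGIC = re.compile(r"^\s*(?:#\s*)?(%load\s+.+?)\s*$")


def find_load_magic(cell_source):
    """Return the %load command on the first line of a cell, or None."""
    lines = cell_source.splitlines()
    if not lines:
        return None
    match = LOAD_MAGIC.match(lines[0])
    return match.group(1) if match else None
```

The returned command could then be executed again so that the cell always reflects the current state of the referenced file.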
The first incarnation of this used the before-build event that is called indiscriminately before a source is built. This caused an eternal build loop when generating the contents.lr from a notebook. I prevented this by adding caching to detect if the notebook had changed since the last build. This worked but was ugly. I had set out to only make this work, so I decided I was finished. The next weekend though I couldn't help but have another look.
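For illustration, change detection of the kind described for the first incarnation could look roughly like this. It is a sketch under the assumption that comparing a content hash per notebook path is good enough; the names notebook_changed and _seen_digests are invented here, and the actual first version of the plugin may have worked differently.

```python
import hashlib
from pathlib import Path

# Hypothetical in-memory cache: notebook path -> digest seen at the last check.
_seen_digests = {}


def notebook_changed(path):
    """Return True if the file content differs from the previous call for this path."""
    digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    key = str(path)
    if _seen_digests.get(key) == digest:
        return False
    _seen_digests[key] = digest
    return True
```

Regenerating contents.lr only when notebook_changed returns True would break the loop, because the regenerated file itself does not alter the notebook.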
The current incarnation adds a new build program that slightly modifies the attachment build behaviour for "notebook powered" pages to preprocess the notebook as part of the normal build process. Still not knowing much about Lektor this might be less wrong, but more importantly: it works reliably without needing extra caching, and is easier to understand in the context of a build. I also like it more, because this motivated me to look a bit into how Lektor works. Which was very interesting.
These events are used in the plugin:
- setup-env is used to prepare the system on startup. This is when config values of the plugin can be populated for later use by the templates and when specialized build programs can be added.
- before-build is called for all sources before it is decided if a build of that source actually needs to happen. This is why it is not very useful to do any actual preprocessing there (that then might not be necessary and create a build loop as mentioned), but it can be used to provide more context for the build templates - in the case of jupyter-preprocess, whether the page that is about to be built is generated from a notebook or not.
- before-build-all initializes an in-process cache to prevent duplicate builds if several artifacts are updated at once (e.g. after a lektor clean) - there might be a better way to do all this via Lektor's dependency tracking on the context, but I haven't had the chance yet to look into this more closely. I also have a hunch that turning contents.lr into a build artifact itself, like the plugin is doing, breaks a lot of assumptions and needs some special handling anyway.
This is what the plugin class looks like:
In [2]:
%load -s JupyterPreprocessPlugin ../../../packages/lektor-jupyter-preprocess/lektor_jupyter_preprocess.py
{"metadata.execute": False}
In the simplest case:
- ... <your project>/content/my-notebook-test-page/my-notebook-test-page.ipynb
- ... lektor serve running! Keep in mind that the complete contents.lr gets rendered from the notebook, so you need to create the same structure that a normal contents.lr would have (for my use case this is just right atm, but it would not be too hard to extend this so that the generated markdown does not clobber the file if it finds a special marker indicating where to put the generated markdown inside the contents.lr).1
In addition to rendering the notebook, the plugin injects the global variable JUPYTER_PREPROCESS into the Jinja template, which at the moment contains:
- ... configs/jupyter-preprocess.ini) or a direct link to the Jupyter notebook attachment
Here is what a footer with a link could look like using that data:
<footer>
{# accessing information for lektor-jupyter-preprocess plugin #}
{% if this.path in JUPYTER_PREPROCESS.paths %}
<div>
Page generated from a Jupyter notebook —
{% if JUPYTER_PREPROCESS.url_source is defined %}
<a href="{{ JUPYTER_PREPROCESS.url_source }}/{{ this.path }}">
view sources
</a>
{% else %}
<a href="{{ this.path }}">
download notebook
</a>
{% endif %}
</div>
<br>
{% endif %}
</footer>
!!! See this small example project to see how it works in the simplest case
To cap it all off: if a project hasn't got a tox.ini that wraps all important activities of the project into a neat package, it doesn't feel like a real project. So this is what tox -av tells me about the workflow of my Lektor project (in case I don't write another article for the next three years and I come back to it and have no idea how stuff works 😉):
$ tox -av
default environments:
serve -> run custom wrapper around the lektor server
additional environments:
clean -> tidy up to start from a clean slate
build -> build the website at ../build
serve-build -> serve ../build at http://localhost:7777
serve-notebooks -> serve jupyter notebooks
deploy -> build and push master (website build) to github
test -> run tests for lebut
Now I have a reasonably pleasant workflow to turn my notebooks into website articles. I am pretty confident that I will be able to keep that deal with myself :).