Jupyter Tutorial (part 3)

In this part of the Jupyter tutorial, we will describe how to customize and deploy Jupyter notebooks for use as course notes. The topics covered are:

  • Deploying Jupyter on a server.
  • Customizing the default Python startup code for Jupyter notebooks.
  • Customizing the look-and-feel of the Jupyter notebook via CSS.
  • Converting Jupyter notebooks to $\LaTeX$/PDF notes.

Deploying Jupyter for a class

The easiest way to run Jupyter is on one's local machine. If you are using Jupyter for a class, this means requiring each student to download and install a Python environment along with the Jupyter software stack. A simple, foolproof method is to install the Anaconda distribution, a widely used Python distribution containing Jupyter, SciPy, and Matplotlib, which is freely available for GNU/Linux, Mac OS, and Microsoft Windows.

Sometimes, you might not want to ask every student to install Jupyter. For instance, in my complex analysis class, I want to offer a few Jupyter notebook plots for students to play around with. But the class isn't focused on programming, and it seems overkill to ask each of the 100+ students to install the 455 MB Anaconda distribution just to interact with a few plots. For such cases, you can install Jupyter on a web server, and let the students access your notebooks over the Internet.

The main concern regarding server deployment is security: a Jupyter notebook can run arbitrary code on the server. It is very important to understand that notebook code runs on the server hosting the notebook, not the client running the web browser! So it's a very bad idea to have an unsecured Jupyter notebook that's publicly accessible over the Internet. A visitor can, in principle, use a code cell in your Jupyter notebook to run a program that exploits a privilege escalation bug to seize control of your server.
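
To see concretely where notebook code runs, you can execute a cell like the following sketch in a hosted notebook. It uses only the Python standard library, and the function name where_am_i is just an illustrative choice. The hostname and working directory it reports belong to the machine running the kernel (i.e., the server), not to the computer running the web browser.

```python
import os
import socket


def where_am_i():
    """Describe the machine that is actually executing this code."""
    return {
        "hostname": socket.gethostname(),  # name of the machine running the kernel
        "cwd": os.getcwd(),                # working directory on that machine
    }


print(where_am_i())
```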

To deploy Jupyter securely on a server, you can either use access control (e.g. passwords) or use containers.

"Single-user" style access control

Jupyter has elementary support for "single user" access control. This involves having the Jupyter notebook server ask for a password before granting access. You can let your students know the password, and tell them to visit https://HOSTNAME:8888/notebooks/, where HOSTNAME is the IP address or Internet hostname of your server. Anonymous members of the public can go to that address, but without the password they won't be allowed to open a notebook to run code. But be warned---those with the password will be able to run arbitrary code, including possibly seizing control of the server. So your students need to be trusted not to do that.

To deploy the Jupyter server this way, you have to specify a password and set up an SSL certificate for encrypted communication. For details, follow the instructions here. Also, you will need to reconfigure your firewall to open port 8888 (the port for the Jupyter notebook), and get your server a static IP address and hostname; these steps are outside the scope of the present discussion.
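
As a rough sketch of what the resulting configuration looks like, the following helper writes out the text of a jupyter_notebook_config.py file. It assumes the classic Notebook server, whose settings live under c.NotebookApp (newer Jupyter Server releases use different setting names). The function name render_notebook_config, the placeholder hash, and the file paths are all hypothetical. The hashed password is assumed to have been produced separately, for example with the passwd() helper in notebook.auth if your Notebook version provides it.

```python
def render_notebook_config(password_hash, certfile, keyfile, port=8888):
    """Return the text of a jupyter_notebook_config.py for a password-protected server."""
    lines = [
        "c = get_config()",
        "c.NotebookApp.ip = '0.0.0.0'          # listen on all network interfaces",
        f"c.NotebookApp.port = {port}",
        "c.NotebookApp.open_browser = False    # there is no browser on the server",
        f"c.NotebookApp.password = {password_hash!r}",
        f"c.NotebookApp.certfile = {certfile!r}",
        f"c.NotebookApp.keyfile = {keyfile!r}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values; substitute your own hash and certificate paths.
print(render_notebook_config("sha1:<salt>:<hash>", "/path/to/mycert.pem", "/path/to/mykey.key"))
```

The printed text can be saved as $HOME/.jupyter/jupyter_notebook_config.py on the server.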

JupyterHub deployment

Container deployment

The idea of container deployment is to spawn a temporary virtual machine, or "container", for each visitor to the server. The container runs a "fresh" copy of a GNU/Linux operating system whose filesystem contains just your Jupyter notebooks, along with a minimal set of software (Python, Jupyter, etc.) necessary for running the notebooks. Even if a user pwns or messes up the container, the real operating system hosting the containers is unaffected. The containers are destroyed after some time (say, a few hours after creation).

The advantage of this approach is that it is pretty tamper-proof. Even if a user (maliciously or accidentally) screws up the container, no real harm is done. A simple browser re-visit creates a new container with a fresh copy of the hosted notebooks. The disadvantage is that users won't be able to "save" anything to the server. If they want to retain any changes they make, they need to download a copy of the notebook to their own computer (via the menu option File → Download as → Notebook).

To accomplish this, we use the Docker container software, along with the code written up by the tmpnb project. For the full documentation, see the tmpnb GitHub page. Here are the basic steps:

  1. Install Docker.

  2. Create a directory containing the notebooks that you want to host (along with any images needed by the notebooks, etc.).

  3. Create a file named Dockerfile, in that directory. This file specifies the contents of the container image that you want to host. Here is a typical Dockerfile:

## Start with a premade container that includes a Scipy installation.
FROM jupyter/scipy-notebook

## Add some more software that's not in the premade scipy-notebook container
USER root
RUN echo deb http://ftp.debian.org/debian jessie-backports main >> /etc/apt/sources.list
RUN apt-get update && apt-get -y install ffmpeg && apt-get clean
USER $NB_USER

## Uncomment the following lines to add Python customizations to the container
## COPY nbconf.py $HOME/.ipython/profile_default/startup/
## RUN mkdir $HOME/.jupyter/custom
## COPY custom.css $HOME/.jupyter/custom/
## COPY custom.js $HOME/.jupyter/custom/

## Copy the notebooks from this directory to the container (where they will be
## stored in the $HOME/work/ directory).
COPY *.ipynb $HOME/work/

## Copy image files from the directory to the container.
RUN mkdir $HOME/work/images
COPY images/* $HOME/work/images/

USER $NB_USER
  4. In the same directory as `Dockerfile`, run the command `docker build -t MY_CONTAINER .`, replacing `MY_CONTAINER` with whatever you want your container image to be called. This builds the container image, using the instructions written in `Dockerfile`.
  5. To deploy the container, run the following three commands (you might want to group them into a shell script). In the last line, replace `MY_CONTAINER` with whatever name you chose in step 4.
export TOKEN=$( head -c 30 /dev/urandom | xxd -p )
docker run --net=host -d -e CONFIGPROXY_AUTH_TOKEN=$TOKEN --name=proxy jupyter/configurable-http-proxy --default-target http://127.0.0.1:9999
docker run --net=host -d -e CONFIGPROXY_AUTH_TOKEN=$TOKEN --name=tmpnb -v /var/run/docker.sock:/docker.sock jupyter/tmpnb python orchestrate.py --image=MY_CONTAINER --cull-timeout=5400
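
If you prefer a Python script to a shell script for grouping these commands, here is a minimal sketch. It only re-assembles the same docker arguments shown above; the function names tmpnb_commands and deploy are my own, and secrets.token_hex(30) stands in for the head/xxd pipeline (both yield 30 random bytes written as hexadecimal).

```python
import secrets
import subprocess


def tmpnb_commands(image, token, cull_timeout=5400):
    """Build the two `docker run` commands shown above, as argument lists."""
    proxy = [
        "docker", "run", "--net=host", "-d",
        "-e", f"CONFIGPROXY_AUTH_TOKEN={token}",
        "--name=proxy", "jupyter/configurable-http-proxy",
        "--default-target", "http://127.0.0.1:9999",
    ]
    tmpnb = [
        "docker", "run", "--net=host", "-d",
        "-e", f"CONFIGPROXY_AUTH_TOKEN={token}",
        "--name=tmpnb",
        "-v", "/var/run/docker.sock:/docker.sock",
        "jupyter/tmpnb", "python", "orchestrate.py",
        f"--image={image}", f"--cull-timeout={cull_timeout}",
    ]
    return [proxy, tmpnb]


def deploy(image):
    """Generate a fresh token, then start the proxy and tmpnb containers."""
    token = secrets.token_hex(30)  # shared secret between the proxy and tmpnb
    for command in tmpnb_commands(image, token):
        subprocess.run(command, check=True)  # raise if docker reports an error


# To launch, replace MY_CONTAINER with your image name and uncomment:
# deploy("MY_CONTAINER")
```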

Now you will be able to access the containerized Jupyter deployment at http://127.0.0.1:8000/notebooks/. To access it over the Internet, replace 127.0.0.1 with the IP address or hostname of your computer. You may need to reconfigure your firewall to open port 8000 for outside access. To complete the deployment, you will need to get your server a static IP address and hostname (which are topics beyond the scope of this discussion).

To halt the service, run these commands:

docker stop `docker ps -aq --filter name=tmpnb --filter name=proxy --filter name=MY_CONTAINER`
docker rm   `docker ps -aq --filter name=tmpnb --filter name=proxy --filter name=MY_CONTAINER`

Customization

Custom start-up code

You might want to specify some "start-up" code for the Jupyter notebook, to be executed before any other code in the notebook. This is done by adding Python files to the directory $HOME/.ipython/profile_default/startup/. (Here, $HOME stands for the home directory of the user running the Jupyter notebook; on Windows, it's C:\Users\USERNAME.) Any Python files in this directory are run automatically, in lexicographical order, each time the Python kernel is started (e.g., when opening a Jupyter notebook for the first time, or when using the menu option Kernel → Restart).

For server deployments, you have to make sure these start-up files are in the right locations on the server (e.g., if you're using containers, you need to explicitly copy them into the filesystem of the container, as indicated in the above discussion).

For instance, I like to apply some nice Matplotlib settings by having a file $HOME/.ipython/profile_default/startup/10-matplotlib-settings.py with the following contents:


import matplotlib.pyplot as plt

plt.rcParams['savefig.dpi'] = 75
plt.rcParams['figure.autolayout'] = False
plt.rcParams['figure.figsize'] = 10, 5
plt.rcParams['axes.labelsize'] = 18
plt.rcParams['axes.titlesize'] = 16
plt.rcParams['axes.linewidth'] = 2.0
plt.rcParams['ytick.major.width'] = 1.4
plt.rcParams['ytick.minor.width'] = 1.4
plt.rcParams['ytick.labelsize'] = 15
plt.rcParams['xtick.major.width'] = 1.4
plt.rcParams['xtick.minor.width'] = 1.4
plt.rcParams['xtick.labelsize'] = 15
plt.rcParams['font.size'] = 14
plt.rcParams['lines.linewidth'] = 2.0
plt.rcParams['lines.markersize'] = 8
plt.rcParams['legend.fontsize'] = 14
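# One package per line: recent Matplotlib versions treat the preamble as a plain
# string, so a comma between the packages would end up inside the LaTeX preamble.
plt.rcParams['text.latex.preamble'] = "\\usepackage{subdepth}\n\\usepackage{type1cm}"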
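
To check which start-up files will be picked up, and in what order, something like the following sketch can help. It simply lists the .py files in the startup directory sorted by name, which approximates the lexicographical order described above; the function name startup_scripts is an illustrative choice, and the default location assumes the standard profile_default profile.

```python
from pathlib import Path


def startup_scripts(profile_dir=None):
    """Return the start-up .py files of an IPython profile, sorted by file name."""
    if profile_dir is None:
        profile_dir = Path.home() / ".ipython" / "profile_default"
    startup_dir = Path(profile_dir) / "startup"
    if not startup_dir.is_dir():
        return []  # no startup directory means nothing is run
    return sorted(p.name for p in startup_dir.glob("*.py"))


for name in startup_scripts():
    print(name)
```

Numeric prefixes such as 10- and 20- are a convenient way to control the order.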

Customizing the notebook interface

You can customize the Jupyter notebook interface using Javascript and CSS. The custom Javascript goes into the file $HOME/.jupyter/custom/custom.js, and the custom CSS goes into $HOME/.jupyter/custom/custom.css. (Here, $HOME stands for the home directory of the user running the Jupyter notebook; on Windows, it's C:\Users\USERNAME.)

For server deployments, you have to make sure these customization files are in the right locations on the server (e.g., if you're using containers, you need to explicitly copy them into the filesystem of the container, as indicated in the above discussion).

As an example, you can use custom Javascript to add a toggle for showing/hiding the source code in code cells. This is done with the following custom.js file:

code_show = false;
function code_toggle() {
    if (code_show) {
        $('div.input').hide();
    } else {
        $('div.input').show();
    }
    code_show = !code_show;
}

$([IPython.events]).on('notebook_loaded.Notebook', function() {
    $("#view_menu").append("<li id=\"toggle_input\" title=\"Show/Hide Code\"><a href=\"javascript:code_toggle()\">Show/Hide Code</a></li>")
    $('div#ipython_notebook').hide()
    $('span#save_widget').hide()
    $('span#kernel_logo_widget').hide()
    $('div.input').hide()
    // $('#notebook_panel').append(copyright)
});

Likewise, here is an example of a custom.css file for tweaking the look-and-feel of the notebook:

.output_png {
    display: table-cell;
    text-align: center;
    vertical-align: middle;
}

.container {
    width: 95% !important;
}
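
For a non-container setup, the two files have to be copied into place by hand. A small helper along these lines could do it; this is only a sketch, with install_custom_files being a made-up name and the target defaulting to the $HOME/.jupyter/custom/ directory mentioned above.

```python
import shutil
from pathlib import Path


def install_custom_files(source_dir, jupyter_dir=None):
    """Copy custom.js and custom.css from source_dir into <jupyter_dir>/custom/."""
    if jupyter_dir is None:
        jupyter_dir = Path.home() / ".jupyter"
    target = Path(jupyter_dir) / "custom"
    target.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    copied = []
    for name in ("custom.js", "custom.css"):
        source = Path(source_dir) / name
        if source.is_file():  # skip a file that is not provided
            shutil.copy(source, target / name)
            copied.append(name)
    return copied


# Example (copies from the current directory into $HOME/.jupyter/custom/):
# install_custom_files(".")
```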

LaTeX conversion

Jupyter notebooks can be converted into $\LaTeX$ documents, and subsequently to pretty PDF notes. The basic command to use is jupyter nbconvert. For example, to convert jupyter_tutorial_02.ipynb (the previous notebook in this tutorial series) into $\LaTeX$, we run this command:

jupyter nbconvert --to latex jupyter_tutorial_02.ipynb

This creates the $\LaTeX$ file jupyter_tutorial_02.tex, along with the subdirectory jupyter_tutorial_02_files containing the image files required for the $\LaTeX$ compilation. These images include cached images of the interactive plots; obviously, the interactive plots themselves can't be directly converted.

If you have a working $\LaTeX$ installation, you can then compile to PDF in the usual way, by running

pdflatex jupyter_tutorial_02.tex
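
If you have many notebooks to convert, the two commands can be scripted. The following is a minimal sketch that just wraps the same two command lines with subprocess; the function names conversion_commands and convert_to_pdf are hypothetical, and it assumes that jupyter and pdflatex are both on the PATH.

```python
import subprocess
from pathlib import Path


def conversion_commands(notebook):
    """Return the nbconvert and pdflatex commands for one notebook, as argument lists."""
    tex_file = Path(notebook).with_suffix(".tex").name
    return [
        ["jupyter", "nbconvert", "--to", "latex", str(notebook)],
        ["pdflatex", tex_file],
    ]


def convert_to_pdf(notebook):
    """Run both steps in the notebook's own directory, stopping at the first failure."""
    notebook = Path(notebook).resolve()
    for command in conversion_commands(notebook.name):
        subprocess.run(command, cwd=notebook.parent, check=True)


# Example:
# convert_to_pdf("jupyter_tutorial_02.ipynb")
```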

Typically, the automatic conversion will have some type-setting issues. Don't hesitate to make further manual edits to the .tex file to prettify the result. Some typical issues:

  • Lines of code often run off the end of the page. You may need to edit the code examples to make them fit on the printed page. Alternatively, you can reduce the text size in the code sample. To do this, find the code sample, which should begin with a Verbatim call:
    \begin{Verbatim}[commandchars=\\\{\}]
    Then change this as follows (replace \small with \footnotesize if you want even smaller text):
    \begin{Verbatim}[fontsize=\small,commandchars=\\\{\}]
  • HTML tables in the Jupyter notebooks don't get converted properly; you will have to re-do them using the $\LaTeX$ tabular or framed environments.
  • Hyperlinks to other Jupyter notebooks (.ipynb files) probably don't make sense anymore in a PDF document. You should manually remove them, possibly re-writing the surrounding text.
  • Equations written in the form $$...$$ get converted to \[...\], i.e. stand-alone un-numbered $\LaTeX$ equations. Instead, you probably want \begin{equation} ... \end{equation} blocks. Currently, I use Emacs find-and-replace macros to do this conversion. Doing it manually will probably be super-tedious, but I don't know a better solution.
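
Two of the fixes in the list above (shrinking the code samples and numbering the equations) are mechanical enough to script. Here is a rough sketch using plain string replacement and a regular expression; the function names are my own, and it assumes the generated .tex file uses exactly the Verbatim opening line quoted above. It does not understand $\LaTeX$, so a \[ appearing inside a code sample would also be rewritten; check the result before compiling.

```python
import re

# Opening line that the converter writes for code samples, as quoted in the list above.
VERBATIM_OPEN = r"\begin{Verbatim}[commandchars=\\\{\}]"

# Matches \[ ... \] display math; the lookbehinds skip row breaks such as \\[2pt].
DISPLAY_MATH = re.compile(r"(?<!\\)\\\[(.*?)(?<!\\)\\\]", re.DOTALL)


def shrink_code_samples(tex, size=r"\small"):
    """Add a fontsize option to every Verbatim block that has the default options."""
    replacement = VERBATIM_OPEN.replace("[", "[fontsize=" + size + ",", 1)
    return tex.replace(VERBATIM_OPEN, replacement)


def number_equations(tex):
    """Rewrite each unnumbered display-math block as an equation environment."""
    return DISPLAY_MATH.sub(
        lambda m: r"\begin{equation}" + m.group(1) + r"\end{equation}", tex
    )


sample = "Text\n\\[\nE = mc^2\n\\]\nMore text"
print(number_equations(sample))
```

To process a file, read it into a string, apply both functions, and write the result back out before running pdflatex.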