Special DocOnce features for Jupyter notebooks

Hans Petter Langtangen, Simula and University of Oslo

Date: May 25, 2015

DocOnce enables turning book chapters, manuals, research papers, and in fact any type of document into Jupyter Notebooks (formerly knowns as IPython Notebooks). This note outlines some special features that one should be aware of and that can be used to tune the notebook typesetting.

Interactive sessions

By default, interactive Python sessions in !bc pyshell and !bc ipy environments are (for the ipynb format) split such that the output is removed and each input part is a separate cell. This means that when executing all the cells, one recreates the entire interactive session with all the output. Below is an example.

Solving the world's simplest differential equation

Let us explore SymPy to solve

$$ y' = y,\quad y(0)=y_0 = 2\thinspace . $$



In [1]:

    
from sympy import *
init_printing(use_latex=True)  # find the best method to print
t = symbols('t', real=True, positive=True)
y = symbols('y', cls=Function)
# Solve differential equation using dsolve
eq = diff(y(t), t) - y(t)
print eq



In [2]:

    
sol = dsolve(eq)
print sol



In [3]:

    
y = sol.rhs          # grab right-hand side of equation
# Determine integration constant C1 from initial condition
C1 = symbols('C1')
y0 = 2
eq = y.subs(t, 0) - y0  # equation for initial condition
print eq



In [4]:

    
sol = solve(eq, C1)     # solve wrt C1
print sol



In [5]:

    
y = y.subs(C1, sol[0])  # insert C1=2 in solution
print y



In [6]:

    
print latex(y)

The DocOnce input syntax of the first part of the above session looks like this

    !bc pyshell
    >>> from sympy import *
    >>> t = symbols('t', real=True, positive=True)
    >>> y = symbols('y', cls=Function)
    >>> # Solve differential equation using dsolve
    >>> eq = diff(y(t), t) - y(t)
    >>> print eq
    -y(t) + Derivative(y(t), t)
    >>> sol = dsolve(eq)
    >>> print sol
    y(t) == C1*exp(t)
    ...
    ...
    !ec

That is, the interactive session looks exactly as it does in the terminal window with the primitive Python shell.

We can, alternatively, use IPython syntax:

    !bc ipy
    In [1]: from sympy import *
    In [2]: t = symbols('t', real=True, positive=True)
    In [3]: y = symbols('y', cls=Function)
    In [4]: # Solve differential equation using dsolve
    In [5]: eq = diff(y(t), t) - y(t)
    In [6]: print eq
    Out[6]: -y(t) + Derivative(y(t), t)
    In [7]: sol = dsolve(eq)
    In [8]: print sol
    Out[8]: y(t) == C1*exp(t)
    !ec

This last ipy environment results in exactly the same interactive session in all formats except ipynb where the output is removed and the input is split over two cells. In format ipynb the above block is rendered as



In [7]:

    
from sympy import *
t = symbols('t', real=True, positive=True)
y = symbols('y', cls=Function)
# Solve differential equation using dsolve
eq = diff(y(t), t) - y(t)
print eq



In [8]:

    
sol = dsolve(eq)
print sol

There is an option --ipynb_split_pyshell=off that can be given to doconce format ipynb when compiling documents and that turns off the behaviour that interactive sessions are split into multiple cells. The result is then one single cell, and if we have "printing" as in >>> eq, there will be no output, except from the last one, in the output field in the notebook. This is usually not the behavior you want.

Showing an interactive session as pure text

Sometimes one wants to show an interactive session exactly as it looks like, with the input and the output. This can be done in the notebook by using the pyshell-t or ipy-t environments (-t for plain text display). With this DocOnce input,

    !bc pyshell-t
    >>> a = 1
    >>> b = 2
    >>> a + b
    3
    !ec

we get the plain text

>>> a = 1
        >>> b = 2
        >>> a + b
        3

The -t postfix can be used for any code that you want to display as text rather than as an executable cell. For example, with !bc pycod-t we create verbatim text and not a cell:

import MySpecialModule as m
        print m.main()

Ordinary code blocks

For ordinary book and manual writing, interactive sessions are used when the results of statements are important. Otherwise one applies standard code blocks. A standard code block can for example be

    !bc pycod
    x = np.linspace(0, 4*np.pi, 501)
    y = np.exp(-0.5*x)*np.sin(np.pi*x)
    plt.plot(x, y)
    !ec
    The result of this code segment appears in Figure ref{myfig}.

    FIGURE: [fig/myfig, width=500 frac=0.8] Plot. label{myfig}

This snippet turns out fine in all formats, except the notebook. The problem with notebooks is two-fold:

The snippet does not run without import of numpy and matplotlib.
The snippet results in a plot automatically, and with the figure in addittion, we get two plots.

The remedy for problem 1 is to use hidden code blocks, notified as !bc pyhid for Python code and !bc Xhid in general for language X:

    !bc pyhid
    import numpy as np
    import matplotlib.pyplot as plt
    !ec

Such code blocks are invisible in all formats except for ipynb. Books, manuals, and research papers will very often contain code snippets that do not run without (extensive) extra code. This exatra code must be provided in hidden code blocks for successful conversion to notebooks. If these cells take too much attention (that is probably why they were left out of the text...), one can insert some comments to explain that fact.

Problem 2 is solved using one of the preprocessors and an if test on the format, e.g., with Preprocess:

    # #if FORMAT != 'ipynb'
    The result of this code segment appears in Figure ref{myfig}.

    FIGURE: [fig/myfig, width=500 frac=0.8] Plot. label{myfig}
    # #endif

Now we can write the complete code segment with a preceding hidden block for import and the if test. Below is the rendering of this in the format ipynb.



In [9]:

    
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt



In [10]:

    
x = np.linspace(0, 4*np.pi, 501)
y = np.exp(-0.5*x)*np.sin(np.pi*x)
plt.plot(x, y)

Note that %matplotlib inline is automatically inserted before the first import of matplotlib in a DocOnce-generated notebook such that all plots are inlined.

(Hidden code blocks are also relevant for RunestoneInteractive books, which are made from Sphinx output.)

Other special options for notebooks

Figures and movies

Figures and movies can be implemented in several ways in notebooks, depending on the value of the options --ipynb_figure= and ipynb_movie=. For the former we have the values

md: plain Markdown syntax for a figure, with no possibility to adjust the size (default)
imgtag: <img ...> tag in HTML taking the specified width into account
Image: Python notebook cell with Image object

Below is an example with --ipynb_figure=imgtag.

For the movies we have the values

md: raw HTML code with iframe tag - not relevant for the notebook
HTML: raw HTML code with iframe tag embedded in the HTML object from the notebook (default)
HTML-YouTube: as HTML but use an IPython.display.YouTubeVideo object to display YouTube videos
ipynb: use IPython.display.YouTubeVideo object for YouTube videos, and use an HTML object with video tag for local movies

Below is an example with --ipynb_movie=ipynb. Execute the cell to create the YouTube video object.



In [11]:

    
from IPython.display import YouTubeVideo
YouTubeVideo("PtJrPEIHNJw")

Admonitions

Markdown has no support for admonitions while DocOnce has extensive support. Some methods for simulating admonitions in notebooks have therefore been implemented. These are specified by the --ipynb_admon= command-line option.

quote: typeset admon as Markdown quote (special font and gray vertical bar on the left)
paragraph: typeset admon as a plain paragraph with a heading if any (default)
hrule: use a horozontal rule to surround the heading and the text

Note that quotes in !bc quote environments are always typeset as Markdown quotes.

Here are two examples typeset with --ipynb_admon=hrule.

Notebooks have limited support for typesetting admons.

Admon environments must be simulated using Markdown quote environment, a plain paragraph, or decorations with HTML <hr> hrules.

Splitting documents.

Formats like html and sphinx support splitting documents into multiple web pages. The !split specification has no support in notebooks. This means that long documents must be long notebooks.

Equation references

Markdown (and thereby the Jupyter Notebook) does not support references to equations.

$$ \begin{equation} a = 1, \label{eq1} \tag{1} \end{equation} $$

$$ \begin{equation} b =2 \label{eq2} \tag{2} \end{equation} $$

Can we refer to (1) and (2) in format ipynb? Yes, in DocOnce-extended ipynb format, but not when writing notebooks interactively in the browser and using Markdown with LaTeX.

Translating notebooks to DocOnce

The `ipynb2doconce` tool

There is a program doconce ipynb2doconce for translating a notebook file to DocOnce format:

    Terminal> doconce ipynb2doconce notebook.ipynb

The output file notebook.do.txt should be an ordinary DocOnce file. With an optional argument --cell_delimiter, comment lines a la # ---------- code cell act in the as delimiter between the markdown and code cells in the .do.txt file.

Special comment annotations in the `.ipynb` file

In order to translate a DocOnce document to the Jupyter Notebook format and back again, some information needs to be coded as comments in the notebook and such that it can be brought back to DocOnce syntax. For example, index workds (idx) and labels are coded in comments, so are the original DocOnce figure and movie commands (which contain more information than what the notebook normally can make use of). Comments in the .ipynb file starting with dom: (DocOnce Metadata) contain coded information that can be translated back to DocOnce (see below for examples).

It is possible to write ordinary DocOnce documents with index words, labels, cross references, etc.; translate the document to a notebook; edit and expand the notebook interactively; and convert the notebook to a DocOnce document again - with preservation of index words, labels, etc., despite the fact that the notebook does not know about such syntax and construction. If you want to insert title, authors, date, index words, labels, figures, or movies in the notebook, use these types of constructions while writing a markdown cell in the notebook:

    <!-- dom:TITLE: Here goes the title -->
    # Here goes the title
    <!-- dom:AUTHOR: HPL Email:hpl@simula.no at Simula & UiO -->
    <!-- Author: --> **HPL** (email: `hpl@simula.no`), Simula and UiO

    <!-- dom:AUTHOR: Kaare Dump at Segfault, Cyberspace -->
    <!-- Author: --> **Kaare Dump**, Segfault, Cyberspace

    Date: **May 25, 2015**

    # Section heading
    <!-- dom:\label{mysubsec} -->
    <!-- dom:idx{headigs} -->
    <!-- dom:idx{section} -->

    Some text...

    <!-- dom:FIGURE: [fig/myfig, width=500 frac=0.6] This is a caption. -->
    <!-- begin figure -->
    <p>This is a caption.</p>
    <img src="fig/myfig.png" width=500>
    <!-- end figure -->

It is important to respect whitespace exactly as shown in this example. The dom: syntax must be the valid DocOnce syntax used when translating the document back to DocOnce format, while the Markdown lines coming after are the ones that can been seen in the present version of the notebook (these are removed forever when converting the notebook to DocOnce format).

Mako code is lost!

Any Preprocess or Mako code one has in a DocOnce document is lost forever when converting to a Jupyter Notebook.

As a specific example on generating a notebook and converting it back, consider the example.do.txt file:

    Terminal> cp example.do.txt tmp1.do.txt
    Terminal> doconce format ipynb tmp1
    Terminal> cp tmp1.ipynb tmp2.ipynb
    Terminal> doconce ipynb2doconce tmp2.ipynb
    Terminal> doconce diff tmp1.do.txt tmp2.do.txt
    Termonal> google-chrome tmp_diff_tmp2.do.txt.html