The Jupyter Notebook as Document:
from structure to application

M Pacer
Damián Avila
Jess Hamrick

JupyterLab User Testing!

What: 5–30 mins of making JupyterLab better
When: Friday from 8am to 5:30pm
Where: 2nd floor, Gramercy
How: Walk in or sign up at: https://bit.ly/jupytercon-usertesting

Introduction: *.ipynb and nbformat

*.ipynb

  • JSON on-disk representation
  • json schema defines required structure
  • current schema version: 4

nbformat

Python library for simple notebook operations.

Straightforward questions:

  1. Minimal structure needed to meet the schema?
  2. Validate a notebook against the schema?

Minimal structure


In [ ]:
import nbformat
from nbformat.v4 import new_notebook
nb = new_notebook()
display(nb)
  • cells: list
  • metadata: dict
  • nbformat, nbformat_minor: int, int
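As a rough sketch of what those four keys look like on disk, the same structure can be written as a plain dict and serialized with the standard library. The minor version used here (2) is an assumption for illustration; the value nbformat writes depends on the installed release.

```python
import json

# The four top-level keys listed above. The minor version is an assumed
# value for illustration, not necessarily what new_notebook() produces.
minimal_nb = {
    "cells": [],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 2,
}

# This string is the kind of JSON that would be stored in a *.ipynb file.
on_disk = json.dumps(minimal_nb, indent=1, sort_keys=True)
print(on_disk)
```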

Validate


In [ ]:
nbformat.validate(nb)

What happens if it's invalid?


In [ ]:
nb.pizza = True
nbformat.validate(nb)
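To give a feel for why the extra key is rejected, here is a hand-rolled check of the top-level keys only. It is a sketch, not how nbformat validates (nbformat checks the whole notebook against the JSON schema); `check_top_level` is a hypothetical helper, and it assumes the v4 schema requires exactly these four keys and allows no others at the top level.

```python
# Assumption: the v4 schema requires these keys and forbids any others
# at the top level of the notebook.
REQUIRED_KEYS = {"cells", "metadata", "nbformat", "nbformat_minor"}


def check_top_level(nb):
    """Return a list of problems with the notebook's top-level keys."""
    present = set(nb)
    problems = []
    for key in sorted(REQUIRED_KEYS - present):
        problems.append(f"missing required key: {key}")
    for key in sorted(present - REQUIRED_KEYS):
        problems.append(f"unexpected key: {key}")
    return problems


nb_dict = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}
print(check_top_level(nb_dict))

nb_dict["pizza"] = True  # same mistake as above, on a plain dict
print(check_top_level(nb_dict))
```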

Cells and their sources


In [ ]:
nb = new_notebook() # get rid of pizza
from nbformat.v4 import new_code_cell, new_markdown_cell, new_raw_cell
  • Three types of cells:
    • code_cell
    • markdown_cell
    • raw_cell

Markdown cells


In [ ]:
md = new_markdown_cell("First argument is the source string.")
display(md)
nb.cells.append(md)
  • cell_type: str, "markdown"
  • metadata: dict
  • source: str or list of strings

Raw cells


In [ ]:
raw = new_raw_cell(["Sources can also be a ","list of strings."])
display(raw)
nb.cells.append(raw)
  • cell_type: str, "raw"
  • metadata: dict
  • source: str or list of strings

Code cells


In [ ]:
code = new_code_cell(["#Either way, you need newlines\n", 
                      "print('like this')"])
display(code)
nb.cells.append(code)
  • cell_type: str, "code"
  • execution_count: None or int
  • metadata: dict
  • outputs: list
  • source: str or list of strings
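Because `source` may be either a string or a list of strings, code that reads cells usually has to normalize it. A minimal sketch on plain dicts follows; `source_as_text` is a hypothetical helper, not part of nbformat.

```python
def source_as_text(cell):
    """Return a cell's source as one string, whichever form it is stored in."""
    source = cell["source"]
    if isinstance(source, str):
        return source
    # A list of strings is joined without adding separators, which is why
    # every line except the last must carry its own "\n".
    return "".join(source)


as_list = {
    "cell_type": "code",
    "source": ["# Either way, you need newlines\n", "print('like this')"],
}
as_str = {
    "cell_type": "code",
    "source": "# Either way, you need newlines\nprint('like this')",
}

print(source_as_text(as_list) == source_as_text(as_str))
```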

Creating outputs

Output types:

  • stream
  • display_data
  • execute_result
  • error

display_data and execute_result can have multiple mimetypes.

For more on messages and output types:
Matthias Bussonnier and Paul Ivanov's Jupyter: Kernels, protocols, and the IPython reference implementation
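For orientation, the four output types might look roughly like this as JSON-style dicts. The field names follow my reading of the v4 schema; the values are made up for illustration.

```python
# Text written to stdout or stderr.
stream = {"output_type": "stream", "name": "stdout", "text": "hello\n"}

# Rich output: one entry per mimetype, so a frontend can pick the best one.
display_data = {
    "output_type": "display_data",
    "data": {"text/plain": "<Figure>", "text/html": "<b>Figure</b>"},
    "metadata": {},
}

# Like display_data, but tied to the execution count of the cell.
execute_result = {
    "output_type": "execute_result",
    "execution_count": 1,
    "data": {"text/plain": "2"},
    "metadata": {},
}

# An exception raised while running the cell.
error = {
    "output_type": "error",
    "ename": "ZeroDivisionError",
    "evalue": "division by zero",
    "traceback": ["ZeroDivisionError: division by zero"],
}

outputs = [stream, display_data, execute_result, error]
for output in outputs:
    print(output["output_type"], sorted(output))
```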

Metadata

  • notebook-level, nb.metadata
  • cell-level, nb.cells[0].metadata
  • output-level (for display_data and execute_result types), nb.cells[0].outputs[0].metadata

Arbitrary content, with some reserved Jupyter-specific fields.

Reserved notebook metadata fields:

  • kernelspec
  • language_info
  • authors
  • title

Reserved cell metadata fields (all are optional):

  • deletable
  • collapsed
  • autoscroll
  • jupyter (jupyter metadata namespace, for internal use)
  • tags (useful for semantic customization)
  • name (should be unique)
  • for raw cells: format (content type)

Reserved output metadata

  • isolated

display_data and execute_result metadata are keyed with mimetypes.
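The three metadata levels can be pictured on a plain dict as below. This is a sketch: the tag name, the image dimensions, and the placeholder image payload are illustrative assumptions.

```python
nb_dict = {
    "nbformat": 4,
    "nbformat_minor": 2,
    # Notebook-level metadata.
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": None,
            "source": "plot()",
            # Cell-level metadata; "hide-input" is a made-up tag.
            "metadata": {"tags": ["hide-input"]},
            "outputs": [
                {
                    "output_type": "display_data",
                    # Placeholder payload, not real image data.
                    "data": {"image/png": "<base64-encoded bytes>"},
                    # Output-level metadata, keyed by mimetype.
                    "metadata": {"image/png": {"width": 640, "height": 480}},
                }
            ],
        }
    ],
}

print(nb_dict["metadata"]["kernelspec"]["name"])
print(nb_dict["cells"][0]["metadata"]["tags"])
print(nb_dict["cells"][0]["outputs"][0]["metadata"]["image/png"])
```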

Reading & writing notebooks to disk


In [ ]:
nbformat.write(nb, "my_demo_notebook.ipynb")
!ls my_*

In [ ]:
nb2 = nbformat.read("my_demo_notebook.ipynb", as_version=4)
print(nb2)

Application #1: nbconvert

Distinction between use cases for nbformat and nbconvert

nbformat: creating and validating notebooks

nbconvert: manipulating existing notebooks

Nbconvert use cases

  • converting notebooks into other formats
    • web-display: html, slides (with reveal.js)
    • publishable documents: LaTeX/PDF, AsciiDoc
    • plain-text: rst, markdown
    • executable scripts: e.g., *.py

Nbconvert use cases (cont.)

  • manipulating notebook content
    • cell magic (%%R) code highlighting
    • removing content
    • extracting image references for plain-text formats (LaTeX, markdown)
  • executing notebooks from the command-line

Command Line Interface


In [ ]:
!jupyter nbconvert my_demo_notebook.ipynb --to markdown
!cat my_demo_notebook.md

NbConvertApp

  • manages the CLI.
  • manages the configuration as established both on the command line and via traitlet config files.
  • wraps base class functionality.

Nbconvert configuration: traitlets

Configuration is specified using traitlets

  • as on-disk config file: jupyter_nbconvert_config.(py|json)
  • command line arguments: jupyter nbconvert --template=basic
  • passed to instance: Exporter(config=Config())
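As a sketch of the on-disk JSON form, the snippet below builds the text one might place in jupyter_nbconvert_config.json. The class and trait names (`NbConvertApp.export_format`, `TagRemovePreprocessor.enabled`, `TagRemovePreprocessor.remove_cell_tags`) are given from memory, and the "hide" tag is made up, so check them against the nbconvert version in use.

```python
import json

# Config is keyed by class name, then by trait name.
config = {
    "NbConvertApp": {"export_format": "html"},
    "TagRemovePreprocessor": {
        "enabled": True,
        "remove_cell_tags": ["hide"],  # "hide" is an illustrative tag
    },
}

# Text for a jupyter_nbconvert_config.json file (not written to disk here).
config_text = json.dumps(config, indent=2)
print(config_text)
```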

Exporters

Orchestration layer.

Keyed to the --to <exporter_name> command line argument.

Exporters specify many aspects of conversion pipeline. E.g.:

  • which preprocessors
  • whether & which template
  • output format

The resources dictionary

In addition to the NotebookNode instance, exporters create a resources dictionary.

This is useful for passing information that may not be in the notebook itself.

  • Notebook styling (for html export)
  • metadata for populating a jekyll header
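A toy version of that orchestration, with the notebook and the resources dictionary travelling together through the pipeline, could look like the following. All function names here are hypothetical and the notebook is a plain dict; this is not nbconvert's Exporter class.

```python
def add_title(nb, resources):
    """Preprocessor-style step: record a title in resources, not in the notebook."""
    resources["title"] = nb["metadata"].get("title", "Untitled")
    return nb, resources


def render_markdown(nb, resources):
    """Template-style step: turn the notebook into a single output string."""
    parts = [f"# {resources['title']}"]
    for cell in nb["cells"]:
        parts.append(cell["source"])
    return "\n\n".join(parts)


def export(nb, preprocessors, render):
    """Run each preprocessor in order, then render; return (body, resources)."""
    resources = {}
    for preprocess in preprocessors:
        nb, resources = preprocess(nb, resources)
    return render(nb, resources), resources


demo_nb = {
    "metadata": {"title": "Demo"},
    "cells": [{"cell_type": "markdown", "source": "Hello"}],
}
body, resources = export(demo_nb, [add_title], render_markdown)
print(body)
print(resources)
```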

Preprocessors

Notebook to notebook transformations.

ExecutePreprocessor: This enables CLI execution.
TagRemovePreprocessor: Removes cells tagged with particular tags specified as traitlets.
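The idea behind tag-based removal can be sketched on plain dicts. `remove_tagged_cells` is a hypothetical function and the "hide" tag is made up; the real TagRemovePreprocessor is configured through traitlets and has more options.

```python
import copy


def remove_tagged_cells(nb, resources, remove_tags=frozenset({"hide"})):
    """Return a copy of nb without cells carrying any tag in remove_tags."""
    nb = copy.deepcopy(nb)  # leave the caller's notebook untouched
    nb["cells"] = [
        cell
        for cell in nb["cells"]
        if not remove_tags & set(cell.get("metadata", {}).get("tags", []))
    ]
    return nb, resources


tagged_nb = {
    "metadata": {},
    "cells": [
        {"cell_type": "code", "source": "secret()", "metadata": {"tags": ["hide"]}},
        {"cell_type": "markdown", "source": "Visible", "metadata": {}},
    ],
}
cleaned, _ = remove_tagged_cells(tagged_nb, {})
print([cell["source"] for cell in cleaned["cells"]])
```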

Templates (and filters)

Nbconvert uses Jinja2 templates.

Templates inherit from one another.

Templates can access filters that can transform the content passed through them.

One of the most common filters passes the plaintext representation of a cell's source to pandoc for conversion.

pandoc & pandocfilters

pandoc converts between formats

pandocfilters manipulate pandoc's intermediate JSON representation.

Writers & Postprocessors

Writers handle the Exporter's final output.

FilesWriter writes to disk.

Postprocessors manipulate the file after the Writer is finished.

ServePostProcessor serves the HTML file.

Entrypoints

In addition to being highly configurable, we have a mechanism for 3rd party libraries to register Exporters using entrypoints.

  1. Define JekyllExporter
  2. package it in my_jekyll_exporter
from setuptools import setup

setup(
    name='my_jekyll_exporter',
    # ... other setup() arguments ...
    entry_points={
        'nbconvert.exporters':
            ['jekyll = my_jekyll_exporter:JekyllExporter']
    },
)

Entrypoints (contd.)

And with pip install my_jekyll_exporter

Anyone can use your exporter with
jupyter nbconvert --to jekyll
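To see which exporters are registered in the current environment, one can query the same entry point group with the standard library. This sketch only lists names and targets; it does not show how nbconvert itself resolves `--to`, and `registered_exporters` is a hypothetical helper.

```python
from importlib.metadata import entry_points


def registered_exporters(group="nbconvert.exporters"):
    """Map entry point name -> "module:object" for the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Pythons: selectable entry points.
        found = eps.select(group=group)
    else:
        # Older Pythons: a dict of group name -> entry points.
        found = eps.get(group, [])
    return {ep.name: ep.value for ep in found}


# Prints an empty dict if no package in the environment registers exporters.
print(registered_exporters())
```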

Multi-notebook workflows

Nbconvert works on single notebooks.

Multiple notebook conversion is still in early days.

Examples:

  • bookbook (html or pdf books from multiple notebooks)
  • multi_rise (one slideshow from many notebooks)

Execute for next speaker


In [ ]:
%reset -f

Application #2: Nikola

Static site generator

Features

  • It’s just a bunch of HTML files and assets.
  • Incremental builds/rebuilds using doit, so Nikola is fast.
  • Multilingual
  • Extensible
  • Friendly CLI
  • Multiple input formats such as reStructuredText, Markdown, HTML and Jupyter Notebooks (out of the box as part of the core!!)

The core of the Nikola / Jupyter integration


In [ ]:
from nbconvert.exporters import HTMLExporter

...

def _compile_string(self, nb_json):
    """Export notebooks as HTML strings."""
    self._req_missing_ipynb()
    c = Config(self.site.config['IPYNB_CONFIG'])
    c.update(get_default_jupyter_config())
    exportHtml = HTMLExporter(config=c)
    body, _ = exportHtml.from_notebook_node(nb_json)
    return body

Some other gems


In [ ]:
def read_metadata(self, post, lang=None):
    """Read metadata directly from ipynb file.
    As ipynb files support arbitrary metadata as json, the metadata used by Nikola
    will be assumed to be in the 'nikola' subfield.
    """
    self._req_missing_ipynb()
    if lang is None:
        lang = LocaleBorg().current_lang
    source = post.translated_source_path(lang)
    with io.open(source, "r", encoding="utf8") as in_file:
        nb_json = nbformat.read(in_file, current_nbformat)
    # Metadata might not exist in two-file posts or in hand-crafted
    # .ipynb files.
    return nb_json.get('metadata', {}).get('nikola', {})
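The final lookup is what makes hand-crafted notebooks safe to read. Here is a small sketch of that behaviour on plain dicts; `nikola_metadata` is a hypothetical stand-in, and the field names inside the 'nikola' subfield are illustrative assumptions.

```python
def nikola_metadata(nb_json):
    """Return the 'nikola' metadata subfield, or {} if any level is missing."""
    return nb_json.get("metadata", {}).get("nikola", {})


# A notebook carrying Nikola metadata (field names are assumptions).
with_meta = {
    "metadata": {"nikola": {"title": "We are above 1000 stars!", "tags": "Jupyter, python"}},
    "cells": [],
}
# A hand-crafted notebook with no metadata at all.
hand_crafted = {"cells": []}

print(nikola_metadata(with_meta)["title"])
print(nikola_metadata(hand_crafted))
```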

In [ ]:
def create_post(self, path, **kw):
    """Create a new post."""
    ...

    if content.startswith("{"):
        # imported .ipynb file, guaranteed to start with "{" because it’s JSON.
        nb = nbformat.reads(content, current_nbformat)
    else:
        nb = nbformat.v4.new_notebook()
        nb["cells"] = [nbformat.v4.new_markdown_cell(content)]

Let's see it in action!


In [7]:
cd /media/data/devel/damian_blog/


/media/data/devel/damian_blog

In [8]:
!ls


cache		  galleries    plugins	    Start.ipynb      toggle.tpl
conf.py		  Guardfile    posts	    state_data.json  yes
Customization.md  old_conf.py  __pycache__  stories
files		  output       README.md    themes

In [ ]:
title = "We are above 1000 stars!"

In [ ]:
tags_list = ['Jupyter', 'python', 'reveal', 'RISE', 'slideshow']

In [ ]:
tags = ', '.join(tags_list)

In [ ]:
!nikola new_post -f ipynb -t "{title}" --tags="{tags}"
Creating New Post
-----------------

Title: We are above 1000 stars!
Scanning posts......done!
[2017-07-12T16:45:00Z] NOTICE: compile_ipynb: No kernel specified, assuming "python3".
[2017-07-12T16:45:01Z] INFO: new_post: Your post's text is at: posts/we-are-above-1000-stars.ipynb

In [ ]:
!nikola build

In [ ]:
!nikola deploy

In [9]:
from IPython.display import IFrame
IFrame("http://www.damian.oquanta.info/", 980, 600)


Out[9]:

Execute for next speaker


In [ ]:
%reset -f

Application #3: RISE

Reveal.js - Jupyter/IPython Slideshow Extension

Previously, we developed a "converter" for the nbconvert library to export an .ipynb file to a STATIC HTML slideshow based on the Reveal.js library.

But with RISE, you don't have a STATIC version anymore, you have a LIVE version! A notebook rendered as a Reveal.js-based slideshow, where you can execute code or show to the audience whatever you can show/do inside the notebook itself (but in a "slidy" way).

RISE at its core...

Reveal.js is a framework to create beautiful HTML-based presentations

  • Reveal.js understands a specific HTML structure

So, to make RISE work, we need to:

  • "signal" the Slide Type for each cell
  • rearrange the cells according to the structure Reveal.js understands
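Those two steps can be sketched in Python on plain dicts (RISE itself does this in the browser). The sketch assumes the slide type is stored under the cell metadata key `slideshow.slide_type`, with "-" meaning "continue the current slide". `group_into_slides` is a hypothetical helper, and fragments and notes are simply treated as continuations here.

```python
def group_into_slides(cells):
    """Group cells as slides -> subslides -> cells, mirroring nested sections."""
    slides = []
    for cell in cells:
        slide_type = (
            cell.get("metadata", {}).get("slideshow", {}).get("slide_type", "-")
        )
        if slide_type == "skip":
            continue  # skipped cells never reach the slideshow
        if slide_type == "slide" or not slides:
            slides.append([[cell]])    # new horizontal slide
        elif slide_type == "subslide":
            slides[-1].append([cell])  # new vertical subslide
        else:
            slides[-1][-1].append(cell)  # continue the current (sub)slide
    return slides


def make_cell(source, slide_type):
    """Build a minimal markdown cell with the given slide type."""
    return {
        "cell_type": "markdown",
        "source": source,
        "metadata": {"slideshow": {"slide_type": slide_type}},
    }


demo_cells = [
    make_cell("A", "slide"),
    make_cell("B", "-"),
    make_cell("C", "subslide"),
    make_cell("D", "skip"),
    make_cell("E", "slide"),
]
layout = [
    [[cell["source"] for cell in sub] for sub in slide]
    for slide in group_into_slides(demo_cells)
]
print(layout)
```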

RISE is just another "view" of the Notebook

Anything that works in the notebook should work in RISE.

If you can execute code in the Notebook, then you can execute code in RISE.

With RISE we provide a simple way to get a presentation view of our content, with the many features of Reveal.js and all the power of the Jupyter notebook machinery.

Execute for next speaker


In [ ]:
%reset -f

Application #4: nbgrader

A tool for creating and grading assignments in Jupyter notebooks

Notebooks are perfect for weaving together instructions, coding exercises, plotting exercises, written responses, etc.

But challenging to use:

  • Maintaining separate instructor and student versions
  • Autograding & manual grading
  • Adding comments & feedback

nbgrader: a tool for creating and grading assignments in Jupyter notebooks

{
  "nbgrader": {
    "grade": false,          # this is not a test cell
    "locked": false,         # this cell should be editable
    "solution": true,        # this cell will contain a student's solution
    "grade_id": "squares",   # the name of the cell
    "schema_version": 1      # the metadata version
  },
  "collapsed": true,
  "trusted": true
}

nbgrader assign

  • Clear the instructor solutions (ClearSolutions preprocessor)
  • Make test cells uneditable (LockCells preprocessor)
  • Clear cell output (ClearOutput preprocessor)
  • ... etc. ...

Instructor Version


In [ ]:
def squares(n):
    """Compute the squares of numbers from 1 to n, such that the 
    ith element of the returned list equals i^2.
    
    """
    ### BEGIN SOLUTION
    if n < 1:
        raise ValueError("n must be greater than or equal to 1")
    return [i ** 2 for i in range(1, n + 1)]
    ### END SOLUTION

Student Version


In [ ]:
def squares(n):
    """Compute the squares of numbers from 1 to n, such that the 
    ith element of the returned list equals i^2.
    
    """
    # YOUR CODE HERE
    raise NotImplementedError()
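The step from the instructor version to the student version can be approximated with a regular expression that swaps each solution region for a stub. This is a sketch of the idea only; `clear_solutions` is a hypothetical function, and nbgrader's ClearSolutions preprocessor has configurable delimiters and stub text.

```python
import re

# Match from a "### BEGIN SOLUTION" line through the matching
# "### END SOLUTION" line, remembering the indentation of the first.
SOLUTION_BLOCK = re.compile(
    r"^(?P<indent>[ \t]*)### BEGIN SOLUTION\n.*?^[ \t]*### END SOLUTION[ \t]*$",
    flags=re.DOTALL | re.MULTILINE,
)


def clear_solutions(source):
    """Replace each solution region with a stub at the same indentation."""

    def stub(match):
        indent = match.group("indent")
        return f"{indent}# YOUR CODE HERE\n{indent}raise NotImplementedError()"

    return SOLUTION_BLOCK.sub(stub, source)


instructor = (
    "def squares(n):\n"
    "    ### BEGIN SOLUTION\n"
    "    if n < 1:\n"
    "        raise ValueError('n must be greater than or equal to 1')\n"
    "    return [i ** 2 for i in range(1, n + 1)]\n"
    "    ### END SOLUTION\n"
)
student = clear_solutions(instructor)
print(student)
```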

In [4]:
%%bash

cd course101
nbgrader assign ps1 --force --create --debug


[AssignApp | DEBUG] Searching ['/Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101', '/Users/jhamrick/.jupyter', '/Users/jhamrick/miniconda3/envs/jupytercon/etc/jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[AssignApp | DEBUG] Looking for jupyter_config in /etc/jupyter
[AssignApp | DEBUG] Looking for jupyter_config in /usr/local/etc/jupyter
[AssignApp | DEBUG] Looking for jupyter_config in /Users/jhamrick/miniconda3/envs/jupytercon/etc/jupyter
[AssignApp | DEBUG] Looking for jupyter_config in /Users/jhamrick/.jupyter
[AssignApp | DEBUG] Looking for jupyter_config in /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101
[AssignApp | DEBUG] Looking for nbgrader_config in /etc/jupyter
[AssignApp | DEBUG] Looking for nbgrader_config in /usr/local/etc/jupyter
[AssignApp | DEBUG] Looking for nbgrader_config in /Users/jhamrick/miniconda3/envs/jupytercon/etc/jupyter
[AssignApp | DEBUG] Looking for nbgrader_config in /Users/jhamrick/.jupyter
[AssignApp | DEBUG] Looking for nbgrader_config in /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101
[AssignApp | DEBUG] Loaded config file: /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101/nbgrader_config.py
[AssignApp | DEBUG] Looking for nbgrader_config in /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101
[AssignApp | DEBUG] Loaded config file: /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101/nbgrader_config.py
[AssignApp | WARNING] Removing existing assignment: /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101/release/ps1
[AssignApp | INFO] Updating/creating assignment 'ps1': {}
[AssignApp | INFO] Converting notebook /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101/source/./ps1/problem1.ipynb
[AssignApp | DEBUG] Student: .
[AssignApp | DEBUG] Assignment: ps1
[AssignApp | DEBUG] Notebook: problem1
[AssignApp | DEBUG] Applying preprocessor: IncludeHeaderFooter
[AssignApp | DEBUG] Applying preprocessor: LockCells
[AssignApp | DEBUG] Applying preprocessor: ClearSolutions
[AssignApp | DEBUG] Applying preprocessor: ClearOutput
[AssignApp | DEBUG] Applying preprocessor: CheckCellMetadata
[AssignApp | DEBUG] Applying preprocessor: ComputeChecksums
[AssignApp | DEBUG] Checksum for 'match' is 19f7bfd22d168d3ca40f43b29937860b
[AssignApp | DEBUG] Checksum for 'test_match' is 32177cdb2bd9bcbf40ae05256c0d55fc
[AssignApp | DEBUG] Checksum for 'forward_chain' is ecba59778d96bec84de551d0cf471f82
[AssignApp | DEBUG] Checksum for 'forward_chain_uses_match' is f2ee5a4cf02dd6ae2a9a6ac8f9102671
[AssignApp | DEBUG] Checksum for 'forward_chain_immutable_belief' is 93d43b09acc10499d9a0be24fc434272
[AssignApp | DEBUG] Checksum for 'test_forward_chain' is 82941fc7253fec4c6be9352de7eee17b
[AssignApp | DEBUG] Checksum for 'explain_rules_1' is fad8963e2894e8732648dbe5d399a00e
[AssignApp | DEBUG] Checksum for 'explain_rules_2' is 3fc18f7153981514905b5c6800277ad5
[AssignApp | DEBUG] Checksum for 'explain_rules_3' is 6d5ec2c0951d009f57a984741a457f2b
[AssignApp | DEBUG] Checksum for 'explain_limits' is f9854aba01986aefcc8c82d34b9e49db
[AssignApp | DEBUG] Applying preprocessor: SaveCells
[AssignApp | DEBUG] Removing existing notebook 'problem1' from the database
[AssignApp | DEBUG] Creating notebook 'problem1' in the database
[AssignApp | DEBUG] Notebook kernelspec: {'display_name': 'Python 3', 'language': 'python', 'name': 'python3'}
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/test_match> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/forward_chain_uses_match> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/forward_chain_immutable_belief> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/test_forward_chain> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/explain_rules_1> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/explain_rules_2> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/explain_rules_3> into the gradebook
[AssignApp | DEBUG] Recorded grade cell GradeCell<ps1/problem1/explain_limits> into the gradebook
[AssignApp | DEBUG] Recorded solution cell Notebook<ps1/problem1>/match into the gradebook
[AssignApp | DEBUG] Recorded solution cell Notebook<ps1/problem1>/forward_chain into the gradebook
[AssignApp | DEBUG] Recorded solution cell Notebook<ps1/problem1>/explain_rules_1 into the gradebook
[AssignApp | DEBUG] Recorded solution cell Notebook<ps1/problem1>/explain_rules_2 into the gradebook
[AssignApp | DEBUG] Recorded solution cell Notebook<ps1/problem1>/explain_rules_3 into the gradebook
[AssignApp | DEBUG] Recorded solution cell Notebook<ps1/problem1>/explain_limits into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/print_rules> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/match> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/test_match> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/forward_chain> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/forward_chain_uses_match> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/forward_chain_immutable_belief> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/test_forward_chain> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/rules_1> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/belief_1> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/explain_rules_1> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/rules_2> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/belief_2> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/explain_rules_2> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/belief_3> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/explain_rules_3> into the gradebook
[AssignApp | DEBUG] Recorded source cell SolutionCell<ps1/problem1/explain_limits> into the gradebook
[AssignApp | DEBUG] Applying preprocessor: ClearHiddenTests
[AssignApp | DEBUG] Applying preprocessor: ComputeChecksums
[AssignApp | DEBUG] Checksum for 'match' is 19f7bfd22d168d3ca40f43b29937860b
[AssignApp | DEBUG] Checksum for 'test_match' is 32177cdb2bd9bcbf40ae05256c0d55fc
[AssignApp | DEBUG] Checksum for 'forward_chain' is ecba59778d96bec84de551d0cf471f82
[AssignApp | DEBUG] Checksum for 'forward_chain_uses_match' is f2ee5a4cf02dd6ae2a9a6ac8f9102671
[AssignApp | DEBUG] Checksum for 'forward_chain_immutable_belief' is 93d43b09acc10499d9a0be24fc434272
[AssignApp | DEBUG] Checksum for 'test_forward_chain' is 82941fc7253fec4c6be9352de7eee17b
[AssignApp | DEBUG] Checksum for 'explain_rules_1' is fad8963e2894e8732648dbe5d399a00e
[AssignApp | DEBUG] Checksum for 'explain_rules_2' is 3fc18f7153981514905b5c6800277ad5
[AssignApp | DEBUG] Checksum for 'explain_rules_3' is 6d5ec2c0951d009f57a984741a457f2b
[AssignApp | DEBUG] Checksum for 'explain_limits' is f9854aba01986aefcc8c82d34b9e49db
[AssignApp | DEBUG] Applying preprocessor: CheckCellMetadata
[AssignApp | INFO] Writing 23725 bytes to /Users/jhamrick/Dropbox/presentations/2017.08.25-jupytercon/jess/course101/release/./ps1/problem1.ipynb
[AssignApp | INFO] Setting destination file permissions to 644

Other nbgrader functionality

  • nbgrader autograde
  • Formgrader extension
  • nbgrader feedback

Thanks! Any questions?

M Pacer (@mdpacer)

Damián Avila (@damian_avila)

Jess Hamrick (@jhamrick)