Introduction: *.ipynb and nbformat

*.ipynb

  • JSON on-disk representation
  • JSON schema defines the required structure
  • current schema version: 4

nbformat

Python library for simple notebook operations.

Straightforward questions:

  1. Minimal structure needed to meet the schema?
  2. Validate a notebook against the schema?

Minimal structure


In [ ]:
import nbformat
from nbformat.v4 import new_notebook
nb = new_notebook()
display(nb)
  • cells: list
  • metadata: dict
  • nbformat, nbformat_minor: int, int
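
The same fields can also be written by hand as a plain dict, which makes the JSON shape explicit. A minimal sketch, assuming nbformat is installed; the name `minimal` and the choice of minor version 4 are illustrative, not something the schema mandates:

```python
import json

import nbformat

# Hand-written counterpart of new_notebook(); the minor version is an assumption.
minimal = {
    "cells": [],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# validate() accepts a plain dict and raises on a schema violation.
nbformat.validate(minimal)

# This JSON text is what a *.ipynb file holds on disk.
text = json.dumps(minimal, indent=1)
print(text)
```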

Validate


In [ ]:
nbformat.validate(nb)

What happens if it's invalid?


In [ ]:
nb.pizza = True  # "pizza" is not a key the schema allows
nbformat.validate(nb)  # raises a ValidationError
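
An invalid notebook makes `validate` raise rather than return a flag, so calling code usually wraps it. A minimal sketch of that pattern; the names `bad` and `is_valid` are illustrative:

```python
import nbformat
from nbformat import ValidationError
from nbformat.v4 import new_notebook

bad = new_notebook()
bad.pizza = True  # extra top-level key that the v4 schema does not allow

try:
    nbformat.validate(bad)
    is_valid = True
except ValidationError as err:
    is_valid = False
    # The error text names the offending property.
    print("Invalid notebook:", err)
```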

Cells and their sources


In [ ]:
nb = new_notebook() # get rid of pizza
from nbformat.v4 import new_code_cell, new_markdown_cell, new_raw_cell
  • Three types of cells:
    • code_cell
    • markdown_cell
    • raw_cell

Markdown cells


In [ ]:
md = new_markdown_cell("First argument is the source string.")
display(md)
nb.cells.append(md)
  • cell_type: str, "markdown"
  • metadata: dict
  • source: str or list of strings

Raw cells


In [ ]:
raw = new_raw_cell(["Sources can also be a ","list of strings."])
display(raw)
nb.cells.append(raw)
  • cell_type: str, "raw"
  • metadata: dict
  • source: str or list of strings

Code cells


In [ ]:
code = new_code_cell(["#Either way, you need newlines\n", 
                      "print('like this')"])
display(code)
nb.cells.append(code)
  • cell_type: str, "code"
  • execution_count: None or int
  • metadata: dict
  • outputs: list
  • source: str or list of strings
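
Both source forms validate, but they do not stay distinct across a write/read cycle. The sketch below serializes to a string and parses it back; as far as I understand nbformat's v4 reader, a list source comes back joined into a single string. The names `nb_src` and `round_tripped` are illustrative:

```python
import nbformat
from nbformat.v4 import new_code_cell, new_notebook

nb_src = new_notebook()
nb_src.cells.append(new_code_cell(["x = 1\n", "print(x)"]))

# Serialize to JSON text and parse it back, as a write/read cycle would.
text = nbformat.writes(nb_src)
round_tripped = nbformat.reads(text, as_version=4)

# After reading, the source is one string rather than a list of lines.
source = round_tripped.cells[0].source
print(repr(source))
```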

Creating outputs

Output types:

  • stream
  • display_data
  • execute_result
  • error

display_data and execute_result can have multiple mimetypes.
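
The four output types can be built with `new_output` from `nbformat.v4` and attached to a code cell. A minimal sketch with made-up output values, not captured from a real kernel:

```python
from nbformat.v4 import new_code_cell, new_output

# stream: plain text written to stdout or stderr
stream = new_output("stream", name="stdout", text="hello\n")

# display_data: one object offered under several mimetypes
rich = new_output(
    "display_data",
    data={"text/plain": "hi", "text/html": "<b>hi</b>"},
)

# execute_result: the value of the cell, tied to an execution count
result = new_output("execute_result", data={"text/plain": "2"}, execution_count=1)

# error: exception name, value, and traceback lines
error = new_output(
    "error",
    ename="ValueError",
    evalue="bad value",
    traceback=["ValueError: bad value"],
)

cell = new_code_cell(
    "1 + 1",
    outputs=[stream, rich, result, error],
    execution_count=1,
)
print([out.output_type for out in cell.outputs])
```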

For more on messages and output types:
Matthias Bussonnier and Paul Ivanov's Jupyter: Kernels, protocols, and the IPython reference implementation

Metadata

  • notebook-level, nb.metadata
  • cell-level, nb.cells[0].metadata
  • output-level (for display_data and execute_result types), nb.cells[0].outputs[0].metadata

Arbitrary content, with some reserved Jupyter-specific fields.

Reserved notebook metadata fields:

  • kernelspec
  • language_info
  • authors
  • title

Reserved cell metadata fields (all are optional):

  • deletable
  • collapsed
  • autoscroll
  • jupyter (jupyter metadata namespace, for internal use)
  • tags (useful for semantic customization)
  • name (should be unique)
  • for raw cells: format (content type)

Reserved output metadata fields:

  • isolated

display_data and execute_result metadata are keyed with mimetypes.
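
Notebook-level and cell-level metadata are ordinary dict-like objects, so reserved and custom keys are set the same way. A minimal sketch; the kernelspec values, the tag names, and the `my_tool` key are made up for illustration:

```python
import nbformat
from nbformat.v4 import new_code_cell, new_notebook

nb_meta = new_notebook()

# Reserved notebook-level fields
nb_meta.metadata["kernelspec"] = {"name": "python3", "display_name": "Python 3"}
nb_meta.metadata["title"] = "Metadata demo"

cell = new_code_cell("print('tagged')")
# Reserved cell-level fields
cell.metadata["tags"] = ["demo", "skip-me"]
cell.metadata["collapsed"] = False
# Arbitrary content under a hypothetical tool-specific key
cell.metadata["my_tool"] = {"anything": "goes"}
nb_meta.cells.append(cell)

# Raises if any reserved field has the wrong shape.
nbformat.validate(nb_meta)
print(nb_meta.cells[0].metadata)
```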

Reading & writing notebooks to disk


In [ ]:
nbformat.write(nb, "my_demo_notebook.ipynb")
!ls my_*

In [ ]:
nb2 = nbformat.read("my_demo_notebook.ipynb", as_version=4)
print(nb2)
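
The same round trip can be checked without leaving files behind. A minimal sketch using a temporary directory; the file name `round_trip.ipynb` and the variable names are illustrative:

```python
import os
import tempfile

import nbformat
from nbformat.v4 import new_markdown_cell, new_notebook

nb_out = new_notebook()
nb_out.cells.append(new_markdown_cell("Round-trip check."))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "round_trip.ipynb")
    nbformat.write(nb_out, path)               # serialize to disk
    nb_in = nbformat.read(path, as_version=4)  # parse and convert to v4 if needed

same_source = nb_in.cells[0].source == nb_out.cells[0].source
print(nb_in.nbformat, same_source)
```

`as_version` is required by `read`. Passing 4 asks for the notebook in the current major version regardless of what is stored on disk.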