In [1]:
name = '2015-12-18-meeting-summary'
title = 'Running Jupyter Notebook with remote kernels'
tags = 'jupyter, hpc, anaconda'
author = 'Denis Sergeev'
In [2]:
from nb_tools import connect_notebook_to_post
from IPython.core.display import HTML
html = connect_notebook_to_post(name, title, tags, author)
This week one of the group's members asked how to use Jupyter on Grace, UEA's supercomputer cluster, so today we walked through that. We also discussed other Jupyter capabilities and advantages.
If you are still sceptical or have no idea what Jupyter is, go and try it on the official server.
<div class="alert alert-success", style="font-size: 150%"> This post is outdated! Update is here. </div>
One of the greatest advantages of the Jupyter project is its modular ecosystem: it has a user interface that you see in a browser, but all computations are done by a kernel of your choice. Although Jupyter was initially made for Python (the IPython kernel), it has since become language-agnostic, and you can use a kernel that calls R, Julia or Matlab. In fact, here is a full list of languages that you can use with Jupyter.
As the Jupyter documentation states, kernels are programming language specific processes that run independently and interact with the Jupyter Applications and their user interfaces.
But what is even more important for us here is that Jupyter and a kernel do not have to be on the same machine. Say, for example, you have a large dataset on a remote server and you don't want to transfer it all to a local PC. The solution here is to use a remote kernel.
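To make the split between interface and kernel a bit more concrete: a frontend finds a kernel through a small JSON connection file that lists the address and ports of the kernel's sockets. The sketch below builds such a file by hand. The field names follow the Jupyter connection-file format, but the port numbers, the key and the helper name `make_connection_info` are made up for illustration; a real file is written by Jupyter itself.

```python
import json


def make_connection_info(ip, first_port, key):
    """Build a dict shaped like a Jupyter kernel connection file.

    The values passed in are purely illustrative, not real settings.
    """
    channels = ["shell_port", "iopub_port", "stdin_port", "control_port", "hb_port"]
    # Give each channel its own port, counting up from first_port
    info = {name: first_port + offset for offset, name in enumerate(channels)}
    info.update(
        {
            "ip": ip,
            "key": key,
            "transport": "tcp",
            "signature_scheme": "hmac-sha256",
        }
    )
    return info


info = make_connection_info("127.0.0.1", 50000, "not-a-real-key")
print(json.dumps(info, indent=2, sort_keys=True))
```

Nothing in such a file ties the kernel to the machine running the browser: the ports only have to be reachable, for example through ssh tunnels, which is the idea behind remote kernels.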
In this post we will look at the simplest case: connecting the Jupyter interface to a Python kernel on Grace via ssh.
Make sure you have a working Python distribution on Grace. Luckily, Grace has an Anaconda module, so you can load it and create a local environment following the instructions here.
To make sure that you use the correct Python environment every time you log in to Grace, create a bash script that looks like this:
#!/bin/bash
module load python/anaconda/2.3.0
env_name=~/.conda/envs/myenv
. $env_name/bin/activate myenv
unset PYTHONHOME
Call it load_anaconda_env.sh, for example. Then add this line to your ~/.bashrc file (before export LOGIN_INVOKE=0):
. load_anaconda_env.sh
The next time you ssh to Grace, you'll get output similar to:
discarding /gpfs/grace/anaconda/2.3.0/bin from PATH
prepending /gpfs/home/abc12xyz/.conda/envs/myenv/bin to PATH
(myenv)[abc12xyz@login00 ~]$
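If you want to double-check from within Python that the interpreter really comes from your environment, a small test like the one below can help. This is only a sketch: `is_inside_env` is a helper name invented here, and `~/.conda/envs/myenv` is the environment path used in the script above, so adjust it to your own setup.

```python
import os
import sys


def is_inside_env(executable, env_dir):
    """Return True if `executable` lives somewhere under `env_dir`."""
    env_dir = os.path.realpath(os.path.expanduser(env_dir))
    executable = os.path.realpath(executable)
    # Append the separator so that "myenv2" is not mistaken for "myenv"
    return executable.startswith(env_dir + os.sep)


# Path of the running interpreter, then whether it belongs to the environment
print(sys.executable)
print(is_inside_env(sys.executable, "~/.conda/envs/myenv"))
```

Run it with the `python` you get right after logging in to Grace; if the second line is `False`, the activation script was probably not sourced.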
We will use one of the simplest add-ons for managing Jupyter kernels: rk.
<div class="alert alert-warning", style="white-space: pre">Windows users should instead use remote_ikernel utility
remote_ikernel manage --add --name='Python on Grace' --kernel_cmd='ipython kernel -f {connection_file}' --interface=ssh --verbose --host='abc12xyz@grace.uea.ac.uk'</div>
The GitHub page already has very clear instructions and even a short YouTube demonstration. One thing that can be unclear to an inexperienced pythonista is how to use the utility without root access. So if you are on a university PC with Linux and no admin rights, follow my steps.
pip install git+git://github.com/korniichuk/rk#egg=rk
ls ~/anaconda/lib/python2.7/site-packages/rk
The contents of the configuration file, rk.ini, should look like this:
In [3]:
!cat ~/anaconda/lib/python2.7/site-packages/rk/config/rk.ini
You have to change the default kernels_location to
kernels_location = "~/.ipython/kernels"
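If you would rather not edit the file by hand, the same change can be scripted. The sketch below works on the text of the file and only touches the `kernels_location` line, so it does not assume anything about the rest of rk.ini. The function name `set_kernels_location` and the sample default value are hypothetical; your actual default may differ.

```python
import re


def set_kernels_location(ini_text, new_location):
    """Return ini_text with the kernels_location entry set to new_location."""
    pattern = re.compile(r"^([ \t]*kernels_location[ \t]*=[ \t]*).*$", flags=re.MULTILINE)
    # A function is used as the replacement so the new path is inserted literally
    new_text, count = pattern.subn(
        lambda match: match.group(1) + '"{}"'.format(new_location), ini_text
    )
    if count == 0:
        raise ValueError("no kernels_location entry found")
    return new_text


# Hypothetical stand-in for the relevant line of rk.ini
sample = 'kernels_location = "/usr/local/share/jupyter/kernels"\n'
print(set_kernels_location(sample, "~/.ipython/kernels"))
```

To apply it for real, read rk.ini into a string, pass it through the function and write the result back, keeping a copy of the original file just in case.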
Make sure that you can log in to Grace without entering a password. How? On your local machine run ssh-keygen and then ssh-copy-id abc12xyz@grace.uea.ac.uk.
Just follow these 2 simple steps: http://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id/
You have to be able to run this command (with your login obviously) without entering a password:
ssh abc12xyz@grace.uea.ac.uk
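The same check can be done non-interactively, which is handy because a remote kernel cannot answer a password prompt. The sketch below relies on ssh's `BatchMode=yes` option, which makes ssh fail instead of asking for a password. The helper names are made up for this example, and the 10 second timeout is an arbitrary choice.

```python
import subprocess


def passwordless_ssh_command(target):
    """Build an ssh command that fails instead of prompting for a password."""
    # BatchMode=yes disables password prompts; "true" is a no-op remote command
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", target, "true"]


def can_login_without_password(target):
    """Return True if the ssh command above exits successfully."""
    return subprocess.call(passwordless_ssh_command(target)) == 0


# Show the command that would be run (replace the login with your own)
print(" ".join(passwordless_ssh_command("abc12xyz@grace.uea.ac.uk")))
```

Calling `can_login_without_password("abc12xyz@grace.uea.ac.uk")` from your local machine should return `True` once the key has been copied over.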
To install a template of a remote Jupyter kernel into the kernels location, run this command:
rk install-template
Check that Jupyter sees your kernels:
In [ ]:
!jupyter kernelspec list
Available kernels:
template /local/abc12xyz/.ipython/kernels/template
python2 /local/abc12xyz/anaconda/lib/python2.7/site-packages/ipykernel/resources
The latter is the default kernel, while the new kernel is installed in a template directory.
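If you need the kernel locations in a script, for instance to find the kernel.json to edit, the listing is easy to parse. This is a minimal sketch that assumes the plain "name path" layout shown above; `parse_kernelspec_list` is a name invented for this example.

```python
def parse_kernelspec_list(output):
    """Map kernel names to directories from `jupyter kernelspec list` output."""
    kernels = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        # Skip the header and blank lines; kernel lines look like "name path"
        if len(parts) != 2 or line.startswith("Available kernels"):
            continue
        name, path = parts
        kernels[name] = path.strip()
    return kernels


# Output copied from the listing above
listing = """Available kernels:
template /local/abc12xyz/.ipython/kernels/template
python2 /local/abc12xyz/anaconda/lib/python2.7/site-packages/ipykernel/resources
"""
kernels = parse_kernelspec_list(listing)
print(kernels["template"])
```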
gedit /local/abc12xyz/.ipython/kernels/template/kernel.json
The .json file should look like this:
{
"argv": [
"rkscript",
"python",
"{connection_file}",
"abc12xyz@grace.uea.ac.uk"
],
"display_name": "Python 2 on Grace",
"language": "python"
}
Note that I replaced the host and login line with my details and changed the kernel's display_name.
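A stray comma or quote in kernel.json is enough to make the kernel fail to load, so it can be safer to generate the file than to type it. The sketch below builds the same structure as the template above with the `json` module; `remote_kernel_spec` is a hypothetical helper, and the login and display name are the placeholders used in this post.

```python
import json


def remote_kernel_spec(login, display_name):
    """Return a kernel spec dict in the shape of the rk template above."""
    return {
        "argv": ["rkscript", "python", "{connection_file}", login],
        "display_name": display_name,
        "language": "python",
    }


spec = remote_kernel_spec("abc12xyz@grace.uea.ac.uk", "Python 2 on Grace")
text = json.dumps(spec, indent=1)
# Round-trip through the parser to confirm the text is valid JSON
assert json.loads(text) == spec
print(text)
```

Writing `text` to the kernel.json inside the template directory should have the same effect as editing it in gedit.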
On your local PC, in your command line (or however you launch Jupyter) run
jupyter notebook
And then select the new kernel in the drop-down menu:
Wait a couple of seconds; once it says that the kernel is connected (a little blue box in the upper right corner), you can execute cells.
Check the command line from which you launched Jupyter for any disconnection errors. You might have to refresh the page or restart Jupyter if this does not work.
In [4]:
HTML(html)
Out[4]: