In [ ]:
__author__ = "Lucy Li"
__version__ = "CS224u, Stanford, Spring 2020"
This tutorial assumes that you have followed the course setup instructions. This means Jupyter is installed using Conda.
1. In a terminal, go to the directory that you'd like to use as your `Home`, e.g., where your cloned `cs224u` GitHub repo resides.
2. Type `jupyter notebook` and press Enter. After a few moments, a new browser window should open, listing the contents of your `Home` directory.
   - The terminal prints a line like `[I 17:23:47.479 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/`. This tells you the address of your notebook server. So if you were to accidentally close the window, you can open it again while your server is running. For this example, navigating to `http://localhost:8888/` in your favorite web browser should open it up again.
   - You can also specify a port number, e.g. `jupyter notebook --port 5656`. In this case, `http://localhost:5656/` is where your directory resides.
3. Click on a file with the `.ipynb` extension to open it. If you want to create a new notebook, in the top right corner, click on `New` and under `Notebooks`, click on `Python`. If you have multiple environments, you should choose the one you want, e.g. `Python [nlu]`.
   - Files ending in `.ipynb` are formatted as JSON, so if you open them in vim, emacs, or a code editor, they are much harder to read and edit.

Jupyter Notebooks allow for interactive computing.
Cells help you organize your work into manageable chunks.
The top of your notebook contains a row of buttons. If you hover over them, the tooltips explain what each one is for: saving, inserting a new cell, cut/copy/paste cells, moving cells up/down, running/stopping a cell, choosing cell types, etc. Under Edit, Insert, and Cell in the toolbar, there are more cell-related options.
Notice how the bar on the left of the cell changes color depending on whether you're in edit mode or command mode. This is useful for knowing when certain keyboard shortcuts apply (discussed later).
There are three main types of cells: code, markdown, and raw.
Raw cells are less common than the other two, and you don't need to understand them to get going for this course. If you put anything in this type of cell, you can't run it. They are used for situations where you might want to convert your notebook to HTML or LaTeX using the `nbconvert` tool or File -> Download as a format that isn't `.ipynb`. Read more about raw cells here if you're curious.
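To make the three cell types concrete, here is a minimal sketch of how they might appear inside an `.ipynb` file, which is stored as JSON. The notebook below is hand-written for illustration and assumes the version 4 notebook format; real notebooks carry more metadata.

```python
import json

# A hand-written, minimal notebook (an assumption for illustration only).
notebook = {
    "cells": [
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["print('cats')"]},
        {"cell_type": "markdown", "metadata": {}, "source": ["### Cells"]},
        {"cell_type": "raw", "metadata": {}, "source": ["left as-is"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# This string is roughly what a plain text editor would show for the file.
text = json.dumps(notebook, indent=1)

# Reading it back and listing the type of each cell.
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)  # ['code', 'markdown', 'raw']
```

Each cell is a dictionary whose `cell_type` field tells Jupyter how to treat its `source`.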
Use the following code cells to explore various operations.
Typically it's good practice to put import statements in the first cell or at least in their own cell.
The square brackets next to the cell indicate the order in which you run cells. If there is an asterisk, it means the cell is currently running.
The output of a cell is usually any print statements in the cell and the value of the last line in the cell.
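The difference between printed text and the value of the last line can be sketched in plain Python. The helper `run_cell` below is hypothetical and only mimics the behavior described above; it is not how IPython is actually implemented.

```python
import ast
import contextlib
import io


def run_cell(source, namespace):
    """Run `source` like a simplified notebook cell.

    Returns (printed_text, last_value). last_value is None when the
    final statement is not a bare expression.
    """
    tree = ast.parse(source)
    last_expr = None
    # If the cell ends with a bare expression, set it aside to evaluate it.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last_expr = ast.Expression(tree.body.pop().value)
    stdout = io.StringIO()
    value = None
    with contextlib.redirect_stdout(stdout):
        exec(compile(tree, "<cell>", "exec"), namespace)
        if last_expr is not None:
            value = eval(compile(last_expr, "<cell>", "eval"), namespace)
    return stdout.getvalue(), value


printed, value = run_cell('print("cats")\n"cheese"', {})
print(repr(printed))  # 'cats\n'
print(repr(value))    # 'cheese'
```

The printed text and the final value are two separate things, which is why a cell can show both.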
In [ ]:
import time
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
In [ ]:
print("cats")
# run this cell and notice how both strings appear as outputs
"cheese"
In [ ]:
# cut/copy and paste this cell
# move this cell up and down
# run this cell
# toggle the output
# toggle scrolling to make long output smaller
# clear the output
for i in range(50):
    print("cats")
In [ ]:
# run this cell and stop before it finishes
# stop acts like a KeyboardInterrupt
for i in range(50):
    time.sleep(1) # make loop run slowly
    print("cats")
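Because stopping a cell behaves like a `KeyboardInterrupt`, a long loop can catch it and finish cleanly. The sketch below is illustrative: `slow_loop` and its `interrupt_at` parameter are hypothetical, and the interrupt is simulated with `raise` so the example runs without anyone pressing the stop button.

```python
import time


def slow_loop(n, delay=0.0, interrupt_at=None):
    """Loop n times, reporting progress if interrupted partway through."""
    completed = 0
    try:
        for i in range(n):
            if i == interrupt_at:
                # Stand-in for pressing the stop button in the toolbar.
                raise KeyboardInterrupt
            time.sleep(delay)
            completed += 1
    except KeyboardInterrupt:
        print(f"stopped early after {completed} iterations")
    return completed


done = slow_loop(5, interrupt_at=3)
print(done)  # 3
```

Work finished before the interrupt is kept, which is handy when a long-running cell is stopped midway.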
In [ ]:
# running this cell leads to no output
def function1():
    print("dogs")
# put cursor in front of this comment and split and merge this cell.
def function2():
    print("cheese")
In [ ]:
function1()
function2()
One difference between coding a Python script and a notebook is that a notebook lets you run code "out of order". This means you should be careful about variable reuse. It is good practice to arrange cells in the order in which you expect someone to use the notebook, and to organize code in ways that prevent problems from happening.
Clearing the output doesn't remove the old variable value. In the example below, we need to rerun cell A to start with a new `a`. If we don't keep track of how many times we've run cell B or cell C, we might encounter unexpected bugs.
In [ ]:
# Cell A
a = []
In [ ]:
# Cell B
# try running this cell multiple times to add more pineapple
a.append('pineapple')
In [ ]:
# Cell C
# try running this cell multiple times to add more cake
a.append('cake')
In [ ]:
# depending on the number of times you ran
# cells B and C, the output of this cell will
# be different.
a
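The effect of run order on cells A, B, and C can be sketched outside the notebook by treating each cell as a string of code run against one shared namespace. The `cells` dictionary and `run` helper are hypothetical, used only to simulate pressing "run" on cells in a given order.

```python
cells = {
    "A": "a = []",
    "B": "a.append('pineapple')",
    "C": "a.append('cake')",
}


def run(order):
    """Run the named cells in the given order and return the final list."""
    namespace = {}  # plays the role of the kernel's memory
    for name in order:
        exec(cells[name], namespace)
    return namespace["a"]


print(run("ABC"))    # ['pineapple', 'cake']
print(run("ABBC"))   # ['pineapple', 'pineapple', 'cake']
print(run("ACBAB"))  # ['pineapple']
```

The same three cells give three different results, depending only on which were run and how often.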
Even deleting cell D's code after running it doesn't remove list `b` from this notebook. This means that if you are modifying code, variables defined by the old code may still linger in the background of your notebook.
In [ ]:
# Cell D
# run this cell, delete/erase it, and run the empty cell
b = ['apple pie']
In [ ]:
# b still exists after cell D is gone
b
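If you want to drop a single leftover variable without restarting anything, Python's `del` statement removes the name. A minimal sketch, reusing the name `b` from the cell above:

```python
b = ['apple pie']

# Deleting the cell's text does nothing to b, but del removes the name.
del b

try:
    b
    still_defined = True
except NameError:
    # Raised because the name b no longer exists.
    still_defined = False

print(still_defined)  # False
```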
Restart the kernel (Kernel -> Restart & Clear Output) to start anew. To check that things run okay in the intended order, restart and run everything (Kernel -> Restart & Run All). This is especially good to do before sharing your notebook with someone else.
Jupyter notebooks are handy for telling stories using your code. You can view Pandas DataFrames and plots directly under each code cell.
In [ ]:
# dataframe example
d = {'ingredient': ['flour', 'sugar'], '# of cups': [3, 4], 'purchase date': ['April 1', 'April 4']}
df = pd.DataFrame(data=d)
df
In [ ]:
# plot example
plt.title("pineapple locations")
plt.ylabel('latitude')
plt.xlabel('longitude')
_ = plt.scatter(np.random.randn(5), np.random.randn(5))
The other type of cell is Markdown, which allows you to write blocks of text in your notebook. Double click on any Markdown cell to view/edit it. Don't worry if you don't remember all of these things right away. You'll write more code than Markdown essays for this course, but the following are handy things to be aware of.
You may notice that this cell's header is prefixed with `###`. The fewer hash marks (`#`), the larger the header. You can go up to six hash marks for the smallest level header.
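As a small illustration of the header rule, the sketch below counts the leading `#` characters of a line to get its header level. Markdown headers use one to six `#` characters followed by a space. The helper `heading_level` is hypothetical and deliberately simplified; it is not a full Markdown parser.

```python
import re


def heading_level(line):
    """Return the header level (1-6) of a Markdown line, or None."""
    # One to six '#' characters, then whitespace, then some header text.
    match = re.match(r"(#{1,6})\s+\S", line)
    return len(match.group(1)) if match else None


print(heading_level("# Biggest header"))   # 1
print(heading_level("### Cells"))          # 3
print(heading_level("####### Too many"))   # None
print(heading_level("#hashtag"))           # None
```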
Here is a table. You can emphasize text using underscores or asterisks. You can also include links.
Markdown | Outcome |
---|---|
_italics_ or *italics* | italics or italics |
__bold__ or **bold** | bold or bold |
[link](http://web.stanford.edu/class/cs224u/) | link |
[jump to Cells section](#cells) | jump to Cells section |
There are three different ways to write a bullet list: start each item with an asterisk, a dash, or a plus sign.
For a numbered list, start each item with a number followed by a period.
A kernel executes code in a notebook.
You may have multiple conda environments on your computer. You can change which environment your notebook is using by going to Kernel -> Change kernel.
When you open a notebook, you may get a message that looks something like "Kernel not found. I couldn't find a kernel matching __. Please select a kernel." This just means you need to choose the version of Python or environment that you want to have for your notebook.
If you have difficulty getting your conda environment to show up as a kernel, this may help.
In our class we will be using IPython notebooks, which means the code cells run Python.
Fun fact: there are also kernels for other languages, e.g., Julia. This means you can create notebooks in these other languages as well, if you have them on your computer.
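To check which environment a Python kernel is actually running in, one option is to ask the interpreter itself from a code cell. This is a minimal sketch that uses only the standard library; the paths it prints depend on your own installation.

```python
import sys

# Full path of the Python interpreter running this kernel.
interpreter = sys.executable

# Root directory of the environment that interpreter belongs to.
env_prefix = sys.prefix

# Version of Python the kernel is running, e.g. "3.7.6".
version = "%d.%d.%d" % sys.version_info[:3]

print(interpreter)
print(env_prefix)
print(version)
```

If the printed paths do not point inside the conda environment you expected, switch kernels with Kernel -> Change kernel.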
Go to Help -> Keyboard Shortcuts to view the shortcuts you may use in Jupyter Notebook.
Here are a few that I find useful on a regular basis:
In [ ]:
# play around with this cell with shortcuts
# delete this cell
# Edit -> Undo Delete Cells
for i in range(10):
    print("jelly beans")
Notice that when you are done working and exit out of this notebook's window, the notebook icon in the home directory listing next to this notebook is green. This means your kernel is still running. If you want to shut it down, check the box next to your notebook in the directory and click "Shutdown."
To shut down the Jupyter Notebook app as a whole, use Control-C in Terminal to stop the server and shut down all kernels.
These are some extra things that aren't top priority to know but may be interesting.
When you create a notebook, a checkpoint file is also saved in a hidden directory called `.ipynb_checkpoints`. Every time you manually save the notebook, the checkpoint file updates. Jupyter autosaves your work on occasion, which only updates the `.ipynb` file but not the checkpoint. You can revert to the latest checkpoint using File -> Revert to Checkpoint.
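The sketch below shows where a checkpoint file would typically sit next to a notebook. It assumes the default layout, in which the checkpoint is named after the notebook with `-checkpoint` added before the extension; the helper `checkpoint_path` and the file names are hypothetical, and your setup may differ.

```python
from pathlib import Path


def checkpoint_path(notebook):
    """Return the assumed checkpoint location for a notebook file."""
    notebook = Path(notebook)
    # Assumed naming: <name>-checkpoint.ipynb inside .ipynb_checkpoints
    name = f"{notebook.stem}-checkpoint{notebook.suffix}"
    return notebook.parent / ".ipynb_checkpoints" / name


path = checkpoint_path("cs224u/tutorial.ipynb")
print(path.as_posix())  # cs224u/.ipynb_checkpoints/tutorial-checkpoint.ipynb
```

Because the directory name starts with a dot, it is hidden in most file browsers.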
We use nbviewer in our class for viewing Jupyter notebooks from our course website. It allows you to render notebooks on the Internet. Check it out here.
View -> Cell toolbar
If you click on "Help" in the toolbar, there is a list of references for common Python tools, e.g. numpy, pandas.
Jupyter Notebook Documentation