Welcome to the Jupyter Notebook

I might slip and call it the "IPython Notebook" sometimes, because it was originally just for interactive Python sessions. It does much more now, including support for other languages such as R and Julia.

What is it?

This "notebook" is my go-to version of the Lab Book you might have seen in bench science. It captures a fluid mix of text (in "markdown" format, with decent support for $\LaTeX$) and computation, in this case in Python.


In [ ]:
# this is a python comment
# this cell contains python code

# executing the cell yields the results of the python command
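
As a small illustration of what a code cell does, here is a sketch with made-up numbers (the names `digits` and `total` are mine, not part of the tutorial data). In a notebook, the value of the last expression in a cell is displayed beneath it.

```python
# a short list of numbers to work with (made up for illustration)
digits = [3, 1, 4, 1, 5]

# ordinary python: add them up
total = sum(digits)

# in a notebook, a bare expression on the last line of a cell
# is displayed as the cell's output
total
```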

Why does it work so well for me?

Donald Knuth's dream of "literate programming" captivated me years ago, but I could never really do it until I had the notebook technology.

The notebook works really well for including plots and other graphics as well.


In [ ]:
# live code some graphics here
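
Here is one way the graphics cell might look, as a minimal sketch. I am assuming `matplotlib` is installed, and the choice to plot the first ten digits of pi is mine. The `matplotlib.use("Agg")` line is only there so the sketch also runs outside a notebook; inside a notebook you can leave it out and the figure appears under the cell.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend, so this also runs outside a notebook
import matplotlib.pyplot as plt

# the first ten digits of pi
digits = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

# one figure with one set of axes
fig, ax = plt.subplots()

# plot each digit against its position (1, 2, 3, ...)
ax.plot(range(1, len(digits) + 1), digits, marker="o")
ax.set_xlabel("position")
ax.set_ylabel("digit of pi")
```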

In [ ]:
# your turn: plot some additional digits of pi

We will use two "packages" for the hands-on portion of this tutorial:

Pandas

This is a panel data package with a cute name.


In [ ]:
# live code an example of loading the va data csv with pandas here
import pandas as pd

In [ ]:
df = pd.read_csv('../3-data/
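
If you want to try `pd.read_csv` without the tutorial file, here is a sketch that reads a tiny made-up CSV from a string instead. Everything in it is hypothetical: the values, the name `toy_df`, and every column name except `gs_text34`, which I borrowed from later in this notebook.

```python
import io
import pandas as pd

# a tiny, made-up CSV standing in for a real file
# (all values are invented; only the name gs_text34 comes from the notebook)
csv_text = """site,age,gs_text34
A,34,Stroke
B,61,Pneumonia
A,47,Stroke
"""

# pd.read_csv accepts a path or any file-like object
toy_df = pd.read_csv(io.StringIO(csv_text))

# (number of rows, number of columns)
toy_df.shape
```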

In [ ]:
# DataFrame.iloc method selects row and columns by "integer location"

df.iloc[5:10,
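
To see "integer location" on something small enough to check by eye, here is a sketch on a made-up table (`toy_df` and its column names are hypothetical, not the tutorial data). The first slice picks rows, the second picks columns, and the end of each slice is excluded, just as with Python lists.

```python
import numpy as np
import pandas as pd

# a made-up 6x4 table holding the numbers 0..23
toy_df = pd.DataFrame(np.arange(24).reshape(6, 4),
                      columns=["a", "b", "c", "d"])

# rows at positions 1 and 2, all columns
rows = toy_df.iloc[1:3, :]

# the same rows, but only the first two columns
block = toy_df.iloc[1:3, :2]

block
```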

In [ ]:
# If you are new to this sort of thing, what do you think this does?

df.iloc[5:10, :10]

In [ ]:
# I don't have time to show you the details now, but I find that
# pandas DataFrames have really done things well.  For example:

df.gs_text34

In [ ]:
df.gs_text34.value_counts()
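
Here is a sketch of those two steps on made-up data, so you can check the counts by hand. The values in the column are invented for illustration; only the column name is taken from the cells above.

```python
import pandas as pd

# a made-up column of labels (values are hypothetical)
toy_df = pd.DataFrame({
    "gs_text34": ["Stroke", "Pneumonia", "Stroke",
                  "Stroke", "Pneumonia", "Malaria"],
})

# attribute access returns the column as a Series,
# the same as toy_df["gs_text34"]
labels = toy_df.gs_text34

# number of rows for each distinct value, most frequent first
counts = labels.value_counts()

counts
```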

Scikit-Learn

This is a python-based machine learning library that has a lot of great methods in a common framework.


In [ ]:
# you can guess what the next line does, 
# even if you have never used python before:

import sklearn.neighbors

In [ ]:
# here is how sklearn creates a "classifier":

clf =
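
As a sketch of what creating a classifier can look like: since the import above was `sklearn.neighbors`, I am assuming a k-nearest-neighbors classifier here, and `n_neighbors=3` is an arbitrary choice of mine.

```python
import sklearn.neighbors

# KNeighborsClassifier and n_neighbors=3 are assumptions for illustration
clf = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

# nothing has been learned yet; the object only holds its settings
clf.get_params()["n_neighbors"]
```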

In [ ]:
# I didn't mention `numpy` before, but this is "the fundamental
# package for scientific computing with Python"

import numpy as np

In [ ]:
# sklearn gets mixed up with Pandas DataFrames and Series,
# so you need to turn things into np.arrays:

X = 
y =
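
Here is a sketch of that conversion on a made-up table (the column names `age`, `weight`, and `label` are hypothetical). As far as I know, newer sklearn releases are more forgiving about DataFrames, but explicit arrays are a safe habit.

```python
import numpy as np
import pandas as pd

# a made-up table: two numeric feature columns and one label column
toy_df = pd.DataFrame({
    "age": [34, 61, 47, 52],
    "weight": [70.5, 82.0, 64.3, 90.1],
    "label": ["a", "b", "a", "b"],
})

# features: a 2-d array with one row per example
X = np.array(toy_df[["age", "weight"]])

# labels: a 1-d array with one entry per example
y = np.array(toy_df["label"])

X.shape, y.shape
```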

In [ ]:
# one nice thing about sklearn is that it has all different
# fancy machine learning methods, but they all follow a
# common pattern:

# fit

In [ ]:
# predict
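
Here is the fit and predict pattern end to end, as a minimal sketch on made-up data. The two clusters, the labels "low" and "high", and the choice of `KNeighborsClassifier` with `n_neighbors=3` are all my assumptions for illustration.

```python
import numpy as np
import sklearn.neighbors

# made-up training data: two well-separated clusters of points
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y_train = np.array(["low", "low", "low", "high", "high", "high"])

clf = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)

# fit: learn from examples with known labels
clf.fit(X_train, y_train)

# predict: label new examples, one near each cluster
X_new = np.array([[0.05, 0.1], [5.05, 5.0]])
y_pred = clf.predict(X_new)

y_pred
```

Other sklearn classifiers can be swapped in for `clf` without changing the `fit` and `predict` lines, which is the common pattern I mean.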

We will see plenty more of sklearn so I'll leave things here for now.