In [1]:
from IPython.display import display, Image, HTML
from talktools import website, nbviewer
Scientific computing and data science are complex activities that involve a wide range of contexts:
There are a large number of software tools that we use across these different contexts:
Python, Ruby, Perl, C, C++, Fortran, Numba, Cython, MPI, Hadoop, Excel, LaTeX, Powerpoint, Word, Keynote, Vim, Emacs, Make, JavaScript, Matlab, Mathematica
This places a massive cognitive burden on users. This burden has nothing to do with the challenging technical problems users are trying to solve. This burden pulls them away from solving their actual problems.
We are working really hard to make sure that IPython is useful in the following contexts.
First and foremost, IPython is an interactive environment for writing and running code. We provide features to make this as pleasant as possible.
Tab completion:
In [2]:
import math
In [21]:
math.
Out[21]:
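Outside an interactive session, the candidates that completion on `math.` draws from can be approximated with `dir`. The sketch below is only an illustration, not IPython's actual completer, and the helper name `complete_attr` is hypothetical:

```python
import math


def complete_attr(obj, prefix=""):
    """Return public attribute names of obj that start with prefix.

    Hypothetical helper for illustration only; IPython's completer
    does considerably more than this.
    """
    return sorted(
        name
        for name in dir(obj)
        if name.startswith(prefix) and not name.startswith("_")
    )


# Names that would be plausible candidates after typing "math.co"
candidates = complete_attr(math, "co")
print(candidates)
```

The exact list depends on the Python version, since the `math` module has gained functions over time.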
Interactive help:
In [4]:
math.cos?
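The `?` suffix is IPython syntax and is not valid in a plain Python script. As a rough equivalent, assuming only the docstring is of interest, the standard library's `inspect.getdoc` retrieves the text that the help display is built from:

```python
import inspect
import math

# Fetch the cleaned-up docstring for math.cos.
doc = inspect.getdoc(math.cos)
print(doc)
```

The built-in `help(math.cos)` prints a similar description in a pager-style layout.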
Inline plotting:
In [5]:
%pylab inline
In [6]:
plot(rand(50))
Out[6]:
Seamless access to the system shell:
In [7]:
ls
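For comparison, here is a minimal sketch of the same directory listing from plain Python using `subprocess`. It assumes a POSIX-like system where `ls` is on the `PATH`:

```python
import subprocess

# Run `ls` in the current directory and capture its output as text.
# check=True raises CalledProcessError if the command fails.
result = subprocess.run(["ls"], capture_output=True, text=True, check=True)

# One entry per line of output.
filenames = result.stdout.splitlines()
print(filenames)
```

Within IPython itself, `files = !ls` captures the same output into a Python list without the extra boilerplate.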
IPython was used for interactive, exploratory data science at the first White House Hackathon.
In [8]:
from IPython.display import YouTubeVideo
YouTubeVideo('sjfsUzECqK0')
Out[8]:
IPython Notebooks contain everything related to a computation and its results: code, narrative text, equations, plots, images, videos, HTML, and JavaScript. We have developed tools for "publishing" these Notebook documents in different contexts:
Let's generate a static PDF of this talk's introduction:
In [9]:
!ipython nbconvert --to latex --post pdf "IPython Project.ipynb"
Here is the nbviewer website:
In [10]:
website('nbviewer.ipython.org')
Out[10]:
We also maintain a curated gallery of interesting IPython Notebooks on various topics.
Cam Davidson-Pilon has written an entire book on Bayesian statistics as a set of IPython Notebooks that are hosted on GitHub and viewed on http://nbviewer.ipython.org.
In [11]:
website('http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/')
Out[11]:
Matthew Russell has written a book, published by O'Reilly, that includes IPython Notebooks for all examples.
In [12]:
website('http://shop.oreilly.com/product/0636920030195.do')
Out[12]:
Jose Unpingco has written a series of blog posts on signal processing using the IPython Notebook. These blog posts were the basis of a full-length book, Python for Signal Processing (Springer, 2013).
In [13]:
website('http://python-for-signal-processing.blogspot.com/')
Out[13]:
Jake Vanderplas and others publish technical blogs that are authored as IPython Notebooks.
In [14]:
website('http://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/')
Out[14]:
People are now using nbviewer with Twitter to share and discuss a wide range of technical work.
In [15]:
Image('images/twitter_post.png')
Out[15]:
The IPython Notebook is being used extensively (PyCon, PyData, Strata, Supercomputing, SciPy, SIAM) for presentations on technical topics across a wide range of fields. The Notebook has a cell toolbar for adding slide-related metadata to cells. However, we are working on improving this use case:
On November 14, 2013, IBM announced that it was making its Jeopardy-playing supercomputer, Watson, available to developers on the internet as a service. In the summer of 2013, researchers on the Watson team revealed that they were using the IPython Notebook and IPython.parallel to improve Watson's performance and capabilities.
Before:
After:
The IPython Notebook is being used for lecture materials and student work in a number of university and high school courses on scientific computing and data science. Most of these courses are being developed publicly on GitHub. Here is a short list:
In [16]:
%%file courses.csv
"Course","University","Instructor"
"Data Science (CS 109)","Harvard University","Pfister and Blitzstein"
"Practical Data Science","NYU","Josh Attenberg"
"Scientific Computing (ASTR 599)","University of Washington","Jake Vanderplas"
"Working with Open Data","UC Berkeley","Raymond Yee"
"Computational Physics","Cal Poly","Jennifer Klay"
In [17]:
import pandas
In [18]:
df = pandas.read_csv('courses.csv'); df
Out[18]:
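Where pandas is not available, the same table can be read with the standard library's `csv` module. This is a minimal sketch rather than a replacement for the DataFrame; the rows of `courses.csv` are inlined through `io.StringIO` so the example runs on its own:

```python
import csv
import io

# Same rows as courses.csv above, inlined so the example is self-contained.
data = '''"Course","University","Instructor"
"Data Science (CS 109)","Harvard University","Pfister and Blitzstein"
"Practical Data Science","NYU","Josh Attenberg"
"Scientific Computing (ASTR 599)","University of Washington","Jake Vanderplas"
"Working with Open Data","UC Berkeley","Raymond Yee"
"Computational Physics","Cal Poly","Jennifer Klay"
'''

# DictReader uses the header row as keys for each record.
rows = list(csv.DictReader(io.StringIO(data)))

for row in rows:
    print(row["Course"], "|", row["University"], "|", row["Instructor"])
```

To read the file written by the `%%file` cell instead, replace `io.StringIO(data)` with `open('courses.csv', newline='')`.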
In [19]:
%load_ext load_style
In [20]:
%load_style talk.css
In [ ]: