Please check out the full git repository for the presentation notebook as well as other example notebooks.
https://github.com/jbarratt/ipython_notebook_presentation/tree/linuxcon
Terminal(Editor <-> Renderer) -> Email
vs Cyclic: ipynb
The other attributes will be clear, but a word on Literate (apologies to Knuth for the oversimplification)
This is a big part of where the title comes from: it's about the story more than the software. (Because of inline output, Notebook may even be 'SuperLiterate'.)
Update: IPython's founder, @fperez_org, kindly pointed me to a blog post of his; they prefer the term Literate Computing.
Clarification: As @fperez_org pointed out, the kernel actually only speaks ZeroMQ; the notebook process handles browser communications.
In [1]:
import json
import twitter
# Read the saved credentials, closing the file when done
with open('/Users/jbarratt/.twitter.json') as f:
    creds = json.load(f)
auth = twitter.oauth.OAuth(creds['access_token'],
                           creds['access_token_secret'],
                           creds['api_key'],
                           creds['api_secret'])
twitter_api = twitter.Twitter(auth=auth)
In [2]:
search_results = twitter_api.search.tweets(q='#linuxcon', count=5000)
statuses = search_results['statuses']
print(len(statuses))
# Original source for this loop:
# http://nbviewer.ipython.org/github/ptwobrussell/Mining-the-Social-Web-2nd-Edition/blob/master/ipynb/Chapter%201%20-%20Mining%20Twitter.ipynb
for _ in range(5):
    try:
        next_results = search_results['search_metadata']['next_results']
    except KeyError:  # No more results when next_results doesn't exist
        break
    kwargs = dict([kv.split('=') for kv in next_results[1:].split("&")])
    search_results = twitter_api.search.tweets(**kwargs)
    statuses += search_results['statuses']
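One fragile spot in that loop is the hand-rolled query string parsing: splitting on '=' leaves values percent-encoded and trips over any value that itself contains an '='. Here is a minimal sketch of the same step using the standard library's parse_qsl instead. It assumes next_results is an ordinary percent-encoded query string; the helper name next_page_kwargs and the example value are made up for illustration.

```python
try:
    from urllib.parse import parse_qsl  # Python 3
except ImportError:
    from urlparse import parse_qsl      # Python 2


def next_page_kwargs(next_results):
    """Turn a '?a=1&b=2' style query string into keyword arguments."""
    # parse_qsl splits on '&' and '=' and also undoes the percent-encoding.
    return dict(parse_qsl(next_results.lstrip('?')))


# Made-up value, in the shape of search_metadata['next_results'].
example = "?max_id=123&q=%23linuxcon&count=100&include_entities=1"
kwargs = next_page_kwargs(example)
print(kwargs["q"])
```

The resulting dict can be passed along the same way as before, as twitter_api.search.tweets(**kwargs).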
In [3]:
print(len(statuses))
In [6]:
# We know the results are in 'statuses', let's peek at one.
print(json.dumps(statuses[1], indent=1))
In [7]:
import re # important note; this is common practice in notebooks, but violates PEP8
# "Imports are always put at the top of the file, just after any
# module comments and docstrings, and before module globals and constants."
# Yes, this is a terrible way to find URL-like strings.
re.findall(r'(https?://\S*)', statuses[0]['text'])
Out[7]:
That looks plausible. Let's try applying that to all our results.
In [8]:
urls = []
for status in statuses:
    urls += re.findall(r'(https?://\S*)', status['text'])
urls[0:10]
Out[8]:
Huh, not great. If we use the text, it looks like things get truncated. (\u2026 is … in Unicode, what you see when a tweet trails off.)
Looking at the JSON again, it looks like a lot of these have ['entities']['urls'][(list)]['expanded_url'], so let's try for those.
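To see the truncation problem in isolation, here is a small sketch with a made-up status. The dict only mimics the shape of the JSON above; the text and URLs are invented, not real search results.

```python
import re

# Hypothetical status, shaped like one entry of `statuses`; all values are made up.
status = {
    "text": u"RT @example: Slides from my #linuxcon talk are at http://t.co/abc\u2026",
    "entities": {
        "urls": [
            {"url": "http://t.co/abc123",
             "expanded_url": "http://example.com/slides"},
        ],
    },
}

# The regex swallows the trailing ellipsis along with the cut-off link.
from_text = re.findall(r'(https?://\S*)', status["text"])

# The entities carry the full link, no matter how the text was truncated.
from_entities = [u["expanded_url"]
                 for u in status.get("entities", {}).get("urls", [])]

print(from_text)
print(from_entities)
```

Using .get() with defaults is one way to cope with statuses that have no URL entities at all; a try/except works just as well.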
In [9]:
urls = []
for status in statuses:
    try:
        urls += [x['expanded_url'] for x in status['entities']['urls']]
    except KeyError:
        # skip statuses that carry no URL entities
        pass
urls[0:5]
Out[9]:
Great! No more unicode weirdness. But, those shortened links are still redirects. Can we resolve them?
In [10]:
import requests
rv = requests.get('http://bit.ly/1vjbwEV')
rv.url
Out[10]:
Handy. Turns out if you get something with requests, you can just access the .url property and find where it ended up after following all the redirects.
In [11]:
from collections import Counter
import requests
# collections.Counter is a handy way to find 'Top N'
popular = Counter()
# Don't need to look up the same short link twice.
cache = {}
for url in urls:
    if url in cache:
        popular[cache[url]] += 1
    else:
        try:
            rv = requests.get(url)
            # resolve the original URL
            cache[url] = rv.url
            popular[rv.url] += 1
        except Exception:
            # ignore anything bad that happens (but still allow Ctrl-C)
            pass
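The cache-and-count pattern in that cell is easier to poke at when the network part is swappable. Below is a rough sketch of the same idea with the resolver passed in as a function, so it can be tried offline. The name count_resolved and the fake URLs are mine, not from the notebook, and unlike the cell above this version also remembers failed lookups so a dead link is only tried once.

```python
from collections import Counter


def count_resolved(urls, resolve):
    """Count final URLs, calling resolve() at most once per distinct input."""
    popular = Counter()
    cache = {}
    for url in urls:
        if url not in cache:
            # resolve() returns the final URL, or None if the lookup failed.
            cache[url] = resolve(url)
        final = cache[url]
        if final is not None:
            popular[final] += 1
    return popular


# Stand-in resolver backed by a dict; every URL here is invented.
fake_redirects = {
    "http://short.example/a": "http://example.com/talk",
    "http://short.example/b": "http://example.com/talk",
    "http://short.example/c": "http://example.com/other",
}
sample_urls = ["http://short.example/a", "http://short.example/b",
               "http://short.example/a", "http://short.example/broken"]
sample_popular = count_resolved(sample_urls, fake_redirects.get)
print(sample_popular.most_common(1))
```

In the notebook, resolve would be a small function wrapping requests.get(url).url in a try/except that returns None on failure.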
In [12]:
popular.most_common(10)
Out[12]:
$ ipython
Python 2.7.6 (default, Jan 28 2014, 10:24:42)
Type "copyright", "credits" or "license" for more information.
IPython 3.0.0-dev -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: import webbrowser
In [2]: webbrowser.
webbrowser.BackgroundBrowser webbrowser.MacOSX webbrowser.open_new_tab
webbrowser.BaseBrowser webbrowser.MacOSXOSAScript webbrowser.os
webbrowser.Chrome webbrowser.Mozilla webbrowser.register
webbrowser.Chromium webbrowser.Netscape webbrowser.register_X_browsers
webbrowser.Elinks webbrowser.Opera webbrowser.shlex
webbrowser.Error webbrowser.UnixBrowser webbrowser.stat
webbrowser.Galeon webbrowser.get webbrowser.subprocess
webbrowser.GenericBrowser webbrowser.main webbrowser.sys
webbrowser.Grail webbrowser.open webbrowser.time
webbrowser.Konqueror webbrowser.open_new
In [17]:
x = 5
x
Out[17]:
In [24]:
# I ran this cell a few times
x += 1
x
Out[24]:
In [14]:
import random, string
# make a big list of random strings
words = [''.join(random.choice(string.ascii_uppercase) for _ in range(6)) for _ in range(1000)]
# Plan A: turn them all into lowercase with a list comprehension
def listcomp_lower(words):
    return [w.lower() for w in words]
# Plan B: Start with a list, and word by word append the lowercase versions
def append_lower(words):
    new = []
    for w in words:
        new.append(w.lower())
    return new
# %timeit is IPython Magic to do a quick benchmark
%timeit append_lower(words)
In [15]:
%timeit listcomp_lower(words)
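%timeit only exists inside IPython. For a plain script, the standard library's timeit module gives a comparable (if less polished) measurement. This is a sketch of the same benchmark under that assumption; the repeat and number values are arbitrary picks, and the timings will vary by machine.

```python
import random
import string
import timeit

# Same setup as the cell above: 1000 random six-letter uppercase strings.
words = [''.join(random.choice(string.ascii_uppercase) for _ in range(6))
         for _ in range(1000)]


def listcomp_lower(words):
    return [w.lower() for w in words]


def append_lower(words):
    new = []
    for w in words:
        new.append(w.lower())
    return new


# Call each function 200 times per run, do 3 runs, and keep the fastest run.
for func in (append_lower, listcomp_lower):
    best = min(timeit.repeat(lambda: func(words), number=200, repeat=3))
    print("{}: {:.1f} us per call".format(func.__name__, best / 200 * 1e6))
```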
In [25]:
%lsmagic
Out[25]:
There are many use cases where the notebook makes a lot of sense. Here are a few illustrated examples:
We won't go into them all for lack of time, but a few highlights:
This is the gateway drug that gets many people into IPython Notebook. It's the real sweet spot between what makes Python great (pandas, scikit*, numpy, matplotlib, etc.) and what makes IPython Notebook great (Literate, Visual, Interactive, Iterative).
Did I permanently ruin your ability to hear the term 'big data' without thinking of this? You're welcome.
When the guy who wrote my AI Textbook uses it, you know it's good software!
Lots of the slides had more code than we might want in a report. There are several approaches to trimming it down. It's on the notebook roadmap to add an 'official' way to do this.
The %run magic runs another notebook, pulling its variables in.
Export to a script (ipython nbconvert --to python & refactor) or build a real module (Tip: %load_ext autoreload; %autoreload 2 or %aimport mymodule).
Hide it with custom.css. (Annoying caveat!)
/* Boss Mode */
div.input {
    display: none;
}
div.output_prompt {
    display: none;
}
div.output_text {
    display: none;
}
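On the 'build a real module' route: the autoreload tip matters because Python normally keeps the first version of a module it imported. Here is a rough sketch of doing the reload by hand with importlib.reload, which is the manual equivalent of what autoreload saves you from typing. It assumes Python 3 (on Python 2 the builtin reload() plays this role), and the module name mymodule_demo is made up and written to a temp directory just for the demo.

```python
import importlib
import os
import sys
import tempfile

# Throwaway module in a temp directory; the name mymodule_demo is made up.
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "mymodule_demo.py")


def write_module(value):
    with open(module_path, "w") as handle:
        handle.write("ANSWER = {}\n".format(value))


sys.dont_write_bytecode = True  # always re-read the source, never a stale .pyc
sys.path.insert(0, workdir)

write_module(1)
import mymodule_demo
before = mymodule_demo.ANSWER

# "Edit" the module on disk, as you would in your editor.
write_module(42)
importlib.reload(mymodule_demo)
after = mymodule_demo.ANSWER

print(before, after)
```

Without the reload call, mymodule_demo.ANSWER would still be the old value after the edit.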
Hey, we have HTML to play with! There are many ways to display prettier things inline.
Simple Custom HTML:
from IPython.core.display import HTML
def foo():
    raw_html = "<h1>Yah, rendered HTML</h1>"
    return HTML(raw_html)
In [20]:
class FancyText(object):
    def __init__(self, text):
        self.text = text
    def _repr_html_(self):
        """ Use some fancy CSS3 styling when we return this """
        style=("text-shadow: 0 1px 0 #ccc,0 2px 0 #c9c9c9,0 3px 0 #bbb,"
               "0 4px 0 #b9b9b9,0 5px 0 #aaa,0 6px 1px rgba(0,0,0,.1)")
        return '<h1 style="{}">{}</h1>'.format(style, self.text)
FancyText("Hello #linuxcon!")
Out[20]:
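The same _repr_html_ hook works for anything you can turn into markup. As a sketch, here is a hypothetical SimpleTable class (my name, not part of IPython) that renders rows as an HTML table and escapes the cell contents. It assumes Python 3 for the html.escape import. Calling _repr_html_() directly shows the string the notebook would render.

```python
from html import escape


class SimpleTable(object):
    """Hypothetical helper: renders rows of values as an HTML table."""

    def __init__(self, rows):
        self.rows = rows

    def _repr_html_(self):
        # Escape every cell so characters like & and < can't break the markup.
        body = "".join(
            "<tr>{}</tr>".format(
                "".join("<td>{}</td>".format(escape(str(cell))) for cell in row))
            for row in self.rows)
        return "<table>{}</table>".format(body)


table = SimpleTable([("url", "count"), ("http://example.com/?a=1&b=2", 3)])
print(table._repr_html_())
```

In a notebook, leaving table as the last expression in a cell should display the rendered table rather than the raw string.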
$ ./checkipnb.py xkcd1313.ipynb
running xkcd1313.ipynb
.........
FAILURE:
def test1():
    assert subparts('^it$') == {'^', 'i', 't', '$', '^i', 'it', 't$', '^it', 'it$', '^it$'}
test1()
-----
raised:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-11-a4492b0ec0d5> in <module>()
---> 26 test1()
<ipython-input-11-a4492b0ec0d5> in test1()
22 assert words('This is a TEST this is') == {'this', 'is', 'a', 'test'}
---> 23 assert lines('Testing / 1 2 3 / Testing over') == {'TESTING', '1 2 3', 'TESTING OVER'}
NameError: global name 'lines' is not defined
.............
ran notebook
ran 22 cells
1 cells raised exceptions
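To make the idea concrete, here is a deliberately naive sketch of a notebook checker. This is not how checkipnb.py works: it just exec()s each code cell in a shared namespace, so it can't handle magics or ! commands, and it assumes the notebook JSON has a top-level "cells" list with "cell_type" and "source" keys. The function name and the tiny in-memory notebook are invented for the example.

```python
import traceback


def run_code_cells(notebook):
    """Execute each code cell in order; return (cells_run, failures)."""
    namespace = {}
    cells_run = 0
    failures = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell["source"]
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        cells_run += 1
        try:
            exec(source, namespace)
        except Exception:
            # Keep the cell text and its traceback for the report.
            failures.append((source, traceback.format_exc()))
    return cells_run, failures


# Tiny made-up notebook: the last cell calls a function that was never defined.
demo_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes"]},
        {"cell_type": "code", "source": ["x = 5\n", "x += 1"]},
        {"cell_type": "code", "source": "assert lines('a / b') == {'A', 'B'}"},
    ]
}
cells_run, failures = run_code_cells(demo_notebook)
print("ran {} cells".format(cells_run))
print("{} cells raised exceptions".format(len(failures)))
```

For a notebook on disk, json.load() on the .ipynb file gives a dict that can be passed in the same way, as long as it uses that layout.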
See more on Profiles, Javascript Extensions, IPython Extensions, and nbconvert Templates
In [11]:
profile = !ipython locate profile
print(profile)
custom_js = profile[0] + "/static/custom/custom.js"
print(custom_js)
!head $custom_js
Thankfully you can organize them in separate files, and just require them in custom.js
$([IPython.events]).on('app_initialized.NotebookApp', function(){
    require(['/static/custom/clean_start.js']);
    require(['/static/custom/styling/css-selector/main.js']);
})
IPython.notebook.kernel.execute("!rm -rf /")
Demo Of a less scary example
One useful thing about having lots of notebooks around is high-context sample code for solving future problems.
I wrote a simple tool (only works on OSX for now, yikes): nbgrep
In [19]:
!nbgrep seaborn
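Since nbgrep is OSX-only for now, here is a rough, portable sketch of the core idea in plain Python: search only the code cells of a notebook for a pattern. It is not the nbgrep implementation. It assumes the notebook JSON has a top-level "cells" list, and grep_notebook and the sample notebook are made up for illustration.

```python
import re


def grep_notebook(notebook, pattern):
    """Return (cell_index, line) pairs for code-cell lines matching pattern."""
    regex = re.compile(pattern)
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell["source"]
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        for line in source.splitlines():
            if regex.search(line):
                hits.append((index, line))
    return hits


# Made-up notebook: the markdown mention is skipped, the import is found.
sample_notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "seaborn makes nice plots"},
        {"cell_type": "code", "source": ["import seaborn as sns\n", "sns.set()"]},
    ]
}
print(grep_notebook(sample_notebook, "seaborn"))
```

To search a directory, walk it with os.walk, json.load() each .ipynb file, and pass the result to grep_notebook.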
In [1]:
!ipython nbconvert Presentation.ipynb --to slides
$ pip install ipython[all]
(brew install python) OR
jbarratt@serialized.net