In [1]:
%run talktools
%matplotlib inline
The Need for Openness in Data Journalism
Thesis: journalists should subject themselves to the same reproducibility and openness standards as scientists.
Keegan: "I have found this “new” brand of data journalism disappointing foremost because it wants to perform science without abiding by scientific norms."
In [10]:
from IPython.display import IFrame
IFrame("http://nbviewer.ipython.org/github/brianckeegan/Bechdel/blob/master/Bechdel_test.ipynb", 800, 600)
Out[10]:
To their credit, FiveThirtyEight responded and put their data on GitHub.
In [3]:
def fibonacci():
    """Yield the Fibonacci numbers indefinitely, starting from 0."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

for i, f in enumerate(fibonacci()):
    print(f, end=" ")
    if i > 35:
        break
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
In [4]:
import matplotlib.pyplot as plt
import numpy as np
x, y = np.random.normal(size=(2, 100))
s, c = np.random.random(size=(2, 100))
plt.scatter(x, y, c=c, s=1000 * s, alpha=0.3);
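In keeping with the reproducibility theme, the random data behind a plot like this can be made repeatable by drawing it from a seeded generator. The following is a minimal sketch, not part of the original talk: the helper name `sample_scatter_data` and the seed value `0` are assumptions chosen for illustration.

```python
import numpy as np

def sample_scatter_data(seed, n=100):
    """Return (x, y, s, c) arrays for a scatter plot, drawn from a seeded generator.

    Hypothetical helper: the same seed always produces the same arrays.
    """
    rng = np.random.RandomState(seed)      # seed value is an arbitrary assumption
    x, y = rng.normal(size=(2, n))         # point positions
    s, c = rng.random_sample(size=(2, n))  # relative sizes and colors in [0, 1)
    return x, y, s, c

# Two calls with the same seed give identical data, so the plot can be regenerated exactly.
x, y, s, c = sample_scatter_data(seed=0)
x2, y2, s2, c2 = sample_scatter_data(seed=0)
print(np.array_equal(x, x2) and np.array_equal(c, c2))
```

The returned arrays can be passed to the same `plt.scatter(x, y, c=c, s=1000 * s, alpha=0.3)` call used above, so anyone rerunning the notebook sees the same figure.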
Many other visualization types are available; see, for example, the matplotlib gallery.
In [5]:
# %load http://matplotlib.org/mpl_examples/mplot3d/bars3d_demo.py
In [6]:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
for c, z in zip(['r', 'g', 'b', 'y'], [30, 20, 10, 0]):
    xs = np.arange(20)
    ys = np.random.rand(20)
    # You can provide either a single color or an array. To demonstrate this,
    # the first bar of each set will be colored cyan.
    cs = [c] * len(xs)
    cs[0] = 'c'
    ax.bar(xs, ys, zs=z, zdir='y', color=cs, alpha=0.8)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.show()
IPython's "nbviewer" website: http://nbviewer.ipython.org
In [7]:
from IPython.display import IFrame
IFrame("http://nbviewer.ipython.org", 800, 600)
Out[7]:
In [8]:
IFrame("http://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/", 800, 600)
Out[8]:
From exploration to collaboration to publication to reproduction of results.
Code + Description + Data + Visualization in one place = True Openness and Reproducibility!
In [9]:
from intfact import factorizer
factorizer()
(thanks to Brian Granger and Jon Frederic)
We'll see more of this later...