In [1]:
%matplotlib inline
import pandas as pd

In [2]:
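from IPython.display import HTML
# Read both stylesheets, closing the files afterwards, and inject them into the notebook.
with open('style-table.css') as table_css, open('style-notebook.css') as notebook_css:
    css = table_css.read() + notebook_css.read()
HTML('<style>{}</style>'.format(css))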


Out[2]:

In [3]:
titles = pd.read_csv('data/titles.csv')
titles.head()


Out[3]:
                          title  year
0                 A Lélek órása  1923
1  Aizaugusa gravi viegli krist  1986
2                    Agliyorum  1988
3                         0_1_0  2008
4              97 fung lau mung  1994

In [4]:
cast = pd.read_csv('data/cast.csv')
cast.head()


Out[4]:
                           title  year               name   type       character    n
0                       The Core  2003  Alejandro Abellan  actor  U.S.S. Soldier  NaN
1         Il momento di uccidere  1968    Remo De Angelis  actor            Dago    9
2              Across the Divide  1921      Thomas Delmar  actor            Dago    4
3                          Revan  2012        Diego James  actor            Dago  NaN
4  Un homme marche dans la ville  1950       Fabien Loris  actor            Dago   12

In [ ]:

Define a "Superman year" as a year whose films feature more Superman characters than Batman characters. How many years in film history have been Superman years?
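One possible approach, sketched on a small made-up stand-in for `cast` (the rows below are invented for illustration and are not taken from data/cast.csv). It assumes a role counts only when `character` is exactly 'Superman' or 'Batman'.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1978, 1978, 1989, 1989, 1989, 2016, 2016],
    'character': ['Superman', 'Lois Lane', 'Batman', 'Batman',
                  'Superman', 'Batman', 'Superman'],
})

# Keep only the two characters, then count roles per year and character.
c = cast[cast.character.isin(['Superman', 'Batman'])]
counts = c.groupby(['year', 'character']).size().unstack(fill_value=0)

# A "Superman year" has strictly more Superman roles than Batman roles.
superman_years = counts[counts.Superman > counts.Batman]
print(len(superman_years))
```

With the real data, the same three steps should apply after `cast = pd.read_csv('data/cast.csv')`; ties count for neither character.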


In [ ]:


In [ ]:

How many years have been "Batman years", with more Batman characters than Superman characters?
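A minimal sketch of one way to count Batman years, here using `pd.crosstab` to build the per-year table. The toy rows are invented for illustration, and exact matching on the `character` value is an assumption.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1978, 1989, 1989, 1989, 2016, 2016],
    'character': ['Superman', 'Batman', 'Batman',
                  'Superman', 'Batman', 'Superman'],
})

# Cross-tabulate year against character; missing combinations become 0.
c = cast[cast.character.isin(['Superman', 'Batman'])]
counts = pd.crosstab(c.year, c.character)

# A "Batman year" has strictly more Batman roles than Superman roles.
batman_years = counts[counts.Batman > counts.Superman]
print(len(batman_years))
```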


In [ ]:


In [ ]:

Plot the number of actor roles each year and the number of actress roles each year over the history of film.
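A sketch of one way to get both lines on a single plot: count rows per `year` and `type`, then unstack `type` into columns so that each column becomes a line. The toy rows are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1950, 1950, 1950, 1960, 1960, 1970],
    'type': ['actor', 'actor', 'actress', 'actor', 'actress', 'actress'],
})

# One row per year, one column per role type; years with no roles of a type get 0.
counts = cast.groupby(['year', 'type']).size().unstack(fill_value=0)

# Each column is drawn as its own line, with the year on the x-axis.
ax = counts.plot()
```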


In [ ]:


In [ ]:

Plot the number of actor roles each year and the number of actress roles each year, but this time as a kind='area' plot.
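The same per-year table can be passed to `plot(kind='area')`, as sketched below on invented toy rows. Area plots in pandas are stacked by default, so the top edge shows the total number of roles.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1950, 1950, 1950, 1960, 1960, 1970],
    'type': ['actor', 'actor', 'actress', 'actor', 'actress', 'actress'],
})

# Filling gaps with 0 keeps the table free of NaN values.
counts = cast.groupby(['year', 'type']).size().unstack(fill_value=0)

# Stacked area plot: actor roles at the bottom, actress roles on top.
ax = counts.plot(kind='area')
```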


In [ ]:


In [ ]:

Plot the difference between the number of actor roles each year and the number of actress roles each year over the history of film.
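Once the counts are in two columns, the difference is a plain column subtraction. A minimal sketch on made-up rows follows; the sign convention (actor minus actress) is an assumption about what the exercise intends.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1950, 1950, 1950, 1960, 1960, 1970],
    'type': ['actor', 'actor', 'actress', 'actor', 'actress', 'actress'],
})

counts = cast.groupby(['year', 'type']).size().unstack(fill_value=0)

# Positive values mean more actor roles than actress roles in that year.
difference = counts.actor - counts.actress
ax = difference.plot()
```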


In [ ]:


In [ ]:

Plot the fraction of roles that have been 'actor' roles each year in the history of film.
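A sketch of the fraction computed as actor roles divided by all roles per year, on invented toy rows. It assumes `type` only takes the values 'actor' and 'actress', as in the sample output shown earlier.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1950, 1950, 1950, 1960, 1960, 1970],
    'type': ['actor', 'actor', 'actress', 'actor', 'actress', 'actress'],
})

counts = cast.groupby(['year', 'type']).size().unstack(fill_value=0)

# Share of each year's roles that are 'actor' roles.
fraction = counts.actor / (counts.actor + counts.actress)

# Pinning the y-axis to [0, 1] makes the share easier to read.
ax = fraction.plot(ylim=[0, 1])
```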


In [ ]:


In [ ]:

Plot the fraction of supporting (n=2) roles that have been 'actor' roles each year in the history of film.
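The same calculation can be restricted to supporting roles by filtering on `n == 2` first. This is a sketch on made-up rows; rows where `n` is missing (NaN) drop out of the comparison automatically.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1950, 1950, 1950, 1960, 1960, 1960],
    'type': ['actor', 'actress', 'actor', 'actress', 'actress', 'actor'],
    'n': [1, 2, 2, 2, 2, float('nan')],
})

# Keep only second-billed roles before counting.
supporting = cast[cast.n == 2]
counts = supporting.groupby(['year', 'type']).size().unstack(fill_value=0)

fraction = counts.actor / (counts.actor + counts.actress)
ax = fraction.plot(ylim=[0, 1])
```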


In [ ]:


In [ ]:

Build a plot with a line for each rank n=1 through n=3, where the line shows what fraction of that rank's roles were 'actor' roles for each year in the history of film.
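One way to get a line per rank is to group by `year`, `type` and `n`, compute the fraction, and then unstack `n` into columns. The sketch below uses invented toy rows and assumes ranks start at 1, so `n <= 3` selects ranks 1 through 3.

```python
import pandas as pd

# Toy stand-in for the real cast table (hypothetical rows).
cast = pd.DataFrame({
    'year': [1950, 1950, 1950, 1950, 1960, 1960, 1960, 1960],
    'type': ['actor', 'actress', 'actor', 'actor',
             'actress', 'actor', 'actress', 'actor'],
    'n': [1, 2, 3, 1, 1, 2, 3, 4],
})

# Keep ranks 1-3, then count roles per (year, n) with one column per type.
top = cast[cast.n <= 3]
counts = top.groupby(['year', 'type', 'n']).size().unstack('type', fill_value=0)

# Actor share per (year, n), then one column (and so one line) per rank.
fraction = (counts.actor / (counts.actor + counts.actress)).unstack('n')
ax = fraction.plot(ylim=[0, 1])
```

Year and rank combinations with no roles at all show up as NaN after the final unstack and appear as gaps in the corresponding line.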


In [ ]:


In [ ]: