In [1]:
%matplotlib inline
import pandas as pd
In [2]:
wiki_df = pd.read_html("https://en.wikipedia.org/w/index.php?title=List_of_James_Bond_films&oldid=688916363", header=0)
pandas read_html returns every table on the web page as a list of DataFrames.
In [3]:
type(wiki_df)
Out[3]:
The table we want is the second one (the first is a revision message). A Python slice then keeps only the rows we want: positions 1 through 23.
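A minimal sketch of that two-step selection, assuming a hand-built list of DataFrames in place of the real read_html result. The names tables, second and rows, and all the values, are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the list that read_html returns (values are made up).
tables = [
    pd.DataFrame({"note": ["revision message"]}),
    pd.DataFrame({"Title": ["sub-header", "Film A", "Film B", "footer"]}),
]

second = tables[1]   # list indexing picks the second table
rows = second[1:3]   # a slice on a DataFrame selects rows by position (1 and 2 here)

print(rows)
```

Slicing with [1:3] skips position 0 and stops before position 3, the same way [1:24] keeps positions 1 through 23.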
In [4]:
df = wiki_df[1][1:24]
In [5]:
df[['Title','Box office.1']]
Out[5]:
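The .1 in 'Box office.1' is most likely the suffix pandas adds when a header name appears more than once. A small sketch of that behaviour, using read_csv on an inline string as a stand-in for the Wikipedia table (the data and the names csv_text, toy and subset are hypothetical):

```python
from io import StringIO

import pandas as pd

# Made-up data with a repeated header name, for illustration only.
csv_text = "Title,Box office,Box office\nFilm A,1.0,2.0\nFilm B,3.0,4.0\n"

toy = pd.read_csv(StringIO(csv_text))
print(list(toy.columns))  # the second "Box office" gets a ".1" suffix

# Selecting with a list of names returns a DataFrame holding just those columns.
subset = toy[["Title", "Box office.1"]]
print(subset)
```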
It's hard to spot the trend quickly in a table. How 'bout a pretty graph? The pandas plot method might be all you need. Usually dataframe.plot() is enough, but we'll add a title, a data table below the chart, and dashed lines marking the averages.
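Here is a rough sketch of the same idea on a tiny made-up DataFrame, so the moving parts are easier to see. The numbers, the name toy, and the colour choices are assumptions for illustration, not real box-office data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; drop this line inside a notebook
import pandas as pd

# Made-up values, for illustration only.
toy = pd.DataFrame(
    {"Box office": [100.0, 150.0, 125.0], "Budget": [10.0, 20.0, 30.0]},
    index=["Film A", "Film B", "Film C"],
)

ax = toy.plot(title="Toy example (made-up values)")

# One dashed horizontal line per column, drawn at that column's mean.
means = toy.mean(numeric_only=True)
for column, colour in zip(means.index, ["b", "g"]):
    ax.hlines(y=means[column], xmin=0, xmax=len(toy) - 1, color=colour,
              alpha=0.5, linestyle="dashed", label=column + " average")

ax.legend()  # rebuild the legend so the average lines are listed too
```

Looking the means up by column name rather than by position keeps each line tied to the right label even if the column order changes.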
In [6]:
ax = df.plot(table=True, xticks=[], title="Bond movies in 2005 dollars (million)", figsize=(17,11))
means = df.mean(numeric_only=True)  # column averages; non-numeric columns are skipped
ax.hlines(y=means.iloc[0], xmin=0, xmax=23, color='b', alpha=0.5, linestyle='dashed', label='Box office average')
ax.hlines(y=means.iloc[1], xmin=0, xmax=23, color='g', alpha=0.5, linestyle='dashed', label='Budget average')
ax.legend()  # redraw the legend so the average lines are listed
Out[6]: