In [2]:
%matplotlib inline
import pandas as pd

In [3]:
from IPython.display import HTML
css = open('style-table.css').read() + open('style-notebook.css').read()
HTML('<style>{}</style>'.format(css))


Out[3]:

In [4]:
# DataFrame.from_csv was removed from pandas; read_csv is the supported loader.
cast = pd.read_csv('data/cast.csv')
cast.head()


Out[4]:
   title                              year  name       type   character  n
0  Suuri illusioni                    1985  Homo $     actor  Guests     22
1  Gangsta Rap: The Glockumentary     2007  Too $hort  actor  Himself    NaN
2  Menace II Society                  1993  Too $hort  actor  Lew-Loc    27
3  Porndogs: The Adventures of Sadie  2009  Too $hort  actor  Bosco      3
4  Stop Pepper Palmer                 2014  Too $hort  actor  Himself    NaN

In [12]:
# DataFrame.from_csv was removed from pandas; read_csv parses the date column directly.
release_dates = pd.read_csv('data/release_dates.csv', parse_dates=['date'])
release_dates.head()


Out[12]:
   title                   year  country      date
0  #73, Shaanthi Nivaasa   2007  India        2007-06-15
1  #Beings                 2015  Romania      2015-01-29
2  #Ewankosau saranghaeyo  2015  Philippines  2015-01-21
3  #Horror                 2015  USA          2015-11-20
4  #Lucky Number           2015  USA          2015-09-01

In [ ]:

Make a bar plot of the months in which movies with "Christmas" in their title tend to be released in the USA.


In [23]:
r_d = release_dates[(release_dates.title.str.contains("Christmas")) & (release_dates.country == "USA")]

r_d.date.dt.month.value_counts().sort_index().plot(kind="bar")


Out[23]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb2dd0a7b00>
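
One limitation of `value_counts().sort_index()` is that months with no matching releases simply disappear from the bar plot. The sketch below shows one possible way to keep all twelve months on the axis. The `toy` frame and the `month_counts` helper are hypothetical stand-ins invented for illustration, not part of the exercise data.

```python
import pandas as pd

# Hypothetical stand-in for release_dates; these rows are invented.
toy = pd.DataFrame({
    "title": ["A Christmas Tale", "Christmas Eve", "Summer Days", "White Christmas"],
    "country": ["USA", "USA", "USA", "France"],
    "date": pd.to_datetime(["2001-12-01", "2003-11-20", "2005-07-04", "2006-12-24"]),
})

def month_counts(df, word, country="USA"):
    """Count releases per calendar month for titles containing `word`."""
    mask = df.title.str.contains(word, regex=False) & (df.country == country)
    counts = df.loc[mask, "date"].dt.month.value_counts()
    # Reindex so months with no releases show up as 0 instead of vanishing.
    return counts.reindex(range(1, 13), fill_value=0)

counts = month_counts(toy, "Christmas")
```

With the real data, `month_counts(release_dates, "Christmas").plot(kind="bar")` should give the same plot as above, with empty months drawn as zero-height bars.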

In [ ]:

Make a bar plot of the months in which movies whose titles start with "The Hobbit" are released in the USA.


In [27]:
# The task asks for titles that start with "The Hobbit", so use startswith rather than contains.
r_d = release_dates[(release_dates.title.str.startswith("The Hobbit")) & (release_dates.country == "USA")]
r_d


Out[27]:
        title                                      year  country  date
340821  The Hobbit: An Unexpected Journey          2012  USA      2012-12-14
340886  The Hobbit: The Battle of the Five Armies  2014  USA      2014-12-17
340955  The Hobbit: The Desolation of Smaug        2013  USA      2013-12-13

In [25]:
r_d.date.dt.month.value_counts().sort_index().plot(kind="bar")


Out[25]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb2f1e8f438>
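
The exercise asks for titles that start with "The Hobbit", which is a narrower test than a substring match. A small sketch of the difference, using a made-up series (only the first title comes from the data above; the other two are invented for illustration):

```python
import pandas as pd

# Only the first title appears in the dataset; the others are invented.
titles = pd.Series([
    "The Hobbit: An Unexpected Journey",
    "Making The Hobbit",
    "The Hobbits",
])

# Substring match: True wherever the text occurs anywhere in the title.
contains = titles.str.contains("The Hobbit", regex=False)

# Prefix match: True only when the title begins with the text.
starts = titles.str.startswith("The Hobbit")
```

Here `contains` flags all three titles, while `starts` rejects "Making The Hobbit". Note that a prefix match still accepts "The Hobbits", so a stricter rule would need a different pattern.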

Make a bar plot of the day of the week on which movies with "Romance" in their title tend to be released in the USA.


In [30]:
r_d = release_dates[(release_dates.title.str.contains("Romance")) ]
r_d


Out[30]:
title year country date
789 100% OFF: A Recession-Era Romance 2012 USA 2012-07-04
5566 A Blue Gum Romance 1913 Australia 1913-09-20
5739 A California Romance 1922 USA 1922-12-24
5967 A Circus Romance 1916 USA 1916-01-24
6097 A Crooked Romance 1917 USA 1917-09-30
6108 A Cumberland Romance 1920 USA 1920-08-06
6109 A Cumberland Romance 1920 Portugal 1924-02-15
6110 A Cumberland Romance 1920 France 1924-07-11
6111 A Cumberland Romance 1920 UK 1926-10-11
7183 A Hoosier Romance 1918 USA 1918-08-18
7601 A Little Romance 1979 USA 1979-04-27
7602 A Little Romance 1979 Japan 1979-07-14
7603 A Little Romance 1979 France 1979-08-22
7604 A Little Romance 1979 Denmark 1979-09-03
7605 A Little Romance 1979 Colombia 1979-09-20
7606 A Little Romance 1979 Portugal 1979-09-28
7607 A Little Romance 1979 West Germany 1980-06-20
7608 A Little Romance 1979 Hungary 1982-04-15
7983 A Midnight Romance 1919 USA 1919-03-10
7984 A Midnight Romance 1919 France 1920-01-23
7985 A Midnight Romance 1919 Finland 1920-11-01
8567 A Novel Romance 2011 USA 2011-11-11
8589 A Parisian Romance 1916 USA 1916-01-09
8590 A Parisian Romance 1932 USA 1932-10-01
8969 A Rogue's Romance 1919 USA 1919-06-09
8970 A Romance of Billy Goat Hill 1916 USA 1916-10-09
8971 A Romance of Burke and Wills Expedition of 1860 1918 Australia 1918-09-07
8972 A Romance of Happy Valley 1919 USA 1919-01-26
8973 A Romance of Wastdale 1921 France 1924-02-22
8974 A Romance of the Redwoods 1917 USA 1917-05-14
... ... ... ... ...
389704 True Romance 1993 France 1993-11-03
389705 True Romance 1993 Ireland 1993-11-05
389706 True Romance 1993 Mexico 1993-11-05
389707 True Romance 1993 Spain 1993-11-12
389708 True Romance 1993 Argentina 1993-11-18
389709 True Romance 1993 Italy 1993-11-26
389710 True Romance 1993 Sweden 1993-12-03
389711 True Romance 1993 South Korea 1993-12-31
389712 True Romance 1993 Japan 1994-01-22
389713 True Romance 1993 Germany 1994-01-27
389714 True Romance 1993 Greece 1994-01-27
389715 True Romance 1993 Australia 1994-02-10
389716 True Romance 1993 Denmark 1994-04-08
389717 True Romance 1993 Netherlands 1994-06-09
389718 True Romance 1993 Finland 1994-06-10
389719 True Romance 1993 Brazil 1994-07-22
389720 True Romance 1993 Portugal 1995-02-03
389721 True Romance 1993 Turkey 1995-02-03
389722 True Romance 1993 Uruguay 1995-10-20
389723 True Romance 1993 Hungary 1996-06-06
389724 True Romance 1993 Singapore 2002-04-18
394071 Unashamed: A Romance 1938 Denmark 1953-04-20
396656 Up Romance Road 1918 USA 1918-06-24
407550 When Romance Rides 1922 USA 1922-04-09
407551 When Romance Rides 1922 Portugal 1924-11-07
407953 Where Romance Rides 1925 USA 1925-04-28
410060 Wild Romance 2006 Netherlands 2006-11-09
410061 Wild Romance 2006 Belgium 2006-11-15
410149 Wild West Romance 1928 USA 1928-06-10
417591 Young Romance 1915 USA 1915-01-21

326 rows × 4 columns


In [29]:
# The task asks for USA releases only, so filter by country before counting.
r_d[r_d.country == "USA"].date.dt.dayofweek.value_counts().sort_index().plot(kind="bar")


Out[29]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb2f1e8fda0>
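
The `dayofweek` accessor returns integers (Monday is 0, Sunday is 6), which makes the bar labels hard to read. As a sketch, assuming a pandas version that provides `Series.dt.day_name()`, the counts can be labeled with weekday names instead. The three dates are taken from the USA rows in the table above; `DAY_ORDER` is a helper name introduced here.

```python
import pandas as pd

# Weekday names in calendar order, used to order the bars.
DAY_ORDER = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Three USA release dates copied from the "Romance" table above.
dates = pd.Series(pd.to_datetime(["2012-07-04", "2011-11-11", "1979-04-27"]))

# Integer encoding: Monday=0 ... Sunday=6.
codes = dates.dt.dayofweek

# Named encoding, reindexed so every weekday appears, in order.
counts = dates.dt.day_name().value_counts().reindex(DAY_ORDER, fill_value=0)
```

Calling `counts.plot(kind="bar")` then draws one bar per weekday, Monday through Sunday, including days with no releases.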

Make a bar plot of the day of the week on which movies with "Action" in their title tend to be released in the USA.


In [32]:
r_d = release_dates[(release_dates.title.str.contains("Action")) ]
r_d


Out[32]:
title year country date
5970 A Civil Action 1998 USA 1999-01-08
5971 A Civil Action 1998 Austria 1999-03-05
5972 A Civil Action 1998 Czech Republic 1999-03-11
5973 A Civil Action 1998 Iceland 1999-04-09
5974 A Civil Action 1998 Ireland 1999-04-09
5975 A Civil Action 1998 Italy 1999-04-09
5976 A Civil Action 1998 Spain 1999-04-09
5977 A Civil Action 1998 UK 1999-04-09
5978 A Civil Action 1998 Argentina 1999-04-15
5979 A Civil Action 1998 Peru 1999-04-15
5980 A Civil Action 1998 Singapore 1999-04-15
5981 A Civil Action 1998 Brazil 1999-04-16
5982 A Civil Action 1998 Norway 1999-04-16
5983 A Civil Action 1998 Portugal 1999-04-16
5984 A Civil Action 1998 Germany 1999-04-22
5985 A Civil Action 1998 Mexico 1999-04-23
5986 A Civil Action 1998 Belgium 1999-04-28
5987 A Civil Action 1998 France 1999-04-28
5988 A Civil Action 1998 Greece 1999-04-28
5989 A Civil Action 1998 Australia 1999-04-29
5990 A Civil Action 1998 Netherlands 1999-04-29
5991 A Civil Action 1998 Denmark 1999-04-30
5992 A Civil Action 1998 Taiwan 1999-05-01
5993 A Civil Action 1998 Thailand 1999-05-14
5994 A Civil Action 1998 New Zealand 1999-05-20
5995 A Civil Action 1998 Poland 1999-05-28
5996 A Civil Action 1998 Kuwait 1999-06-02
5997 A Civil Action 1998 Malaysia 1999-06-17
5998 A Civil Action 1998 Hungary 1999-07-01
5999 A Civil Action 1998 Turkey 1999-09-03
... ... ... ... ...
216190 Missing in Action 2: The Beginning 1985 USA 1985-03-01
216191 Missing in Action 2: The Beginning 1985 West Germany 1985-08-22
216192 Missing in Action 2: The Beginning 1985 Australia 1985-09-26
216193 Missing in Action 2: The Beginning 1985 France 1985-11-06
216194 Missing in Action 2: The Beginning 1985 Brazil 1986-01-10
216195 Missing in Action 2: The Beginning 1985 Denmark 1986-03-07
216196 Missing in Action 2: The Beginning 1985 Portugal 1986-07-25
246109 Partners in Action 2002 USA 2002-10-31
290407 Single Action 1998 Mexico 1998-10-24
300207 Stand by for Action 1942 Sweden 1944-02-14
300208 Stand by for Action 1942 Portugal 1944-03-01
300209 Stand by for Action 1942 Finland 1945-10-14
357817 The Oscar Nominated Short Films 2012: Live Action 2012 USA 2012-02-09
357819 The Oscar Nominated Short Films 2013: Live Action 2013 USA 2013-02-01
357820 The Oscar Nominated Short Films 2014: Live Action 2014 USA 2014-01-31
357822 The Oscar Nominated Short Films 2015: Live Action 2015 USA 2015-01-30
388755 Triple Action 1925 USA 1925-12-20
396384 Untitled Disney Live-Action Fairy Tale 2017 USA 2017-12-22
396385 Untitled Disney Live-Action Fairy Tale 2018 USA 2018-11-02
396386 Untitled Disney Live-Action Fairy Tale 2019 USA 2019-03-29
396387 Untitled Disney Live-Action Fairy Tale (II) 2019 USA 2019-11-08
396408 Untitled Mother-Daughter/Action Comedy Project 2017 Netherlands 2017-05-11
396409 Untitled Mother-Daughter/Action Comedy Project 2017 USA 2017-05-12
409309 Who's Got the Action? 1962 West Germany 1963-03-01
409310 Who's Got the Action? 1962 Sweden 1963-03-04
409311 Who's Got the Action? 1962 Finland 1963-04-19
409312 Who's Got the Action? 1962 Denmark 1963-05-01
409313 Who's Got the Action? 1962 Ireland 1963-05-03
409314 Who's Got the Action? 1962 France 1963-05-24
409315 Who's Got the Action? 1962 Mexico 1963-12-05

243 rows × 4 columns


In [33]:
# The task asks for USA releases only, so filter by country before counting.
r_d[r_d.country == "USA"].date.dt.dayofweek.value_counts().sort_index().plot(kind="bar")


Out[33]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb2f2c0a5c0>

On which date was each Judi Dench movie from the 1990s released in the USA?


In [34]:
usa = release_dates[release_dates.country == 'USA']

c = cast
c = c[c.name == 'Judi Dench']
c = c[c.year // 10 * 10 == 1990]
c.merge(usa).sort_values('date')


Out[34]:
   title                    year  name        type     character        n   country  date
0  GoldenEye                1995  Judi Dench  actress  M                6   USA      1995-11-17
2  Jack & Sarah             1995  Judi Dench  actress  Margaret         3   USA      1996-03-22
1  Hamlet                   1996  Judi Dench  actress  Hecuba           12  USA      1996-12-25
3  Mrs Brown                1997  Judi Dench  actress  Queen Victoria   1   USA      1997-10-03
7  Tomorrow Never Dies      1997  Judi Dench  actress  M                9   USA      1997-12-19
4  Shakespeare in Love      1998  Judi Dench  actress  Queen Elizabeth  12  USA      1999-01-08
5  Tea with Mussolini       1999  Judi Dench  actress  Arabella         2   USA      1999-05-14
6  The World Is Not Enough  1999  Judi Dench  actress  M                6   USA      1999-11-19
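
The merge above passes no `on=` argument, so pandas joins on every column the two frames share, here `title` and `year`, using an inner join. The sketch below spells that out on toy frames. `cast_toy` and `usa_toy` are hypothetical names, and the 1990 "Hamlet" row is an invented placeholder added only to show why `year` needs to be part of the key.

```python
import pandas as pd

# Two rows taken from the output above.
cast_toy = pd.DataFrame({
    "title": ["GoldenEye", "Hamlet"],
    "year": [1995, 1996],
    "name": ["Judi Dench", "Judi Dench"],
})

# The third row is an invented placeholder: same title, different year.
usa_toy = pd.DataFrame({
    "title": ["GoldenEye", "Hamlet", "Hamlet"],
    "year": [1995, 1996, 1990],
    "country": ["USA", "USA", "USA"],
    "date": pd.to_datetime(["1995-11-17", "1996-12-25", "1991-01-18"]),
})

# Implicit form: joins on all shared column names (title and year).
implicit = cast_toy.merge(usa_toy)

# Explicit form of the same join.
explicit = cast_toy.merge(usa_toy, on=["title", "year"], how="inner")

by_date = implicit.sort_values("date")
```

Joining on `title` alone would pair the 1996 cast row with both "Hamlet" releases, so keeping `year` in the key avoids mixing up remakes that share a title.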

In [ ]:

In which months do films with Judi Dench tend to be released in the USA?


In [35]:
c = cast
c = c[c.name == 'Judi Dench']
m = c.merge(usa).sort_values('date')
m.date.dt.month.value_counts().sort_index().plot(kind='bar')


Out[35]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb2f2bfdda0>

In which months do films with Tom Cruise tend to be released in the USA?


In [36]:
c = cast
c = c[c.name == 'Tom Cruise']
m = c.merge(usa).sort_values('date')
m.date.dt.month.value_counts().sort_index().plot(kind='bar')


Out[36]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fb2f1e45400>

