In [1]:
%matplotlib inline
import pandas as pd

In [2]:
titles = pd.read_csv('../data/titles.csv', index_col=None)
titles.head()


Out[2]:
   title                    year
0  Johan nyt on markkinat!  1966
1  Tus Feromonas Me Matan   2016
2  Hanna Amon               1951
3  Orgazmo                  1997
4  Pyaar To Hona Hi Tha     1998

In [3]:
cast = pd.read_csv('../data/cast.csv', index_col=None)
cast.head()


Out[3]:
   title                              year  name       type   character  n
0  Suuri illusioni                    1985  Homo $     actor  Guests     22
1  Gangsta Rap: The Glockumentary     2007  Too $hort  actor  Himself    NaN
2  Menace II Society                  1993  Too $hort  actor  Lew-Loc    27
3  Porndogs: The Adventures of Sadie  2009  Too $hort  actor  Bosco      3
4  Stop Pepper Palmer                 2014  Too $hort  actor  Himself    NaN
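
The loading step can be tried without the data files. The following is a minimal sketch, assuming a small in-memory CSV stands in for `../data/titles.csv`; the two rows are copied from the `titles.head()` output above, and `sample_csv` and `demo_titles` are illustrative names that do not appear in the notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for ../data/titles.csv (two rows from the output above).
sample_csv = io.StringIO(
    "title,year\n"
    "Hanna Amon,1951\n"
    "Orgazmo,1997\n"
)

# With index_col=None, pandas builds a default 0..n-1 integer index
# instead of using one of the CSV columns as the index.
demo_titles = pd.read_csv(sample_csv, index_col=None)
print(demo_titles)
```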

How many movies are listed in the titles dataframe?


In [4]:
len(titles)


Out[4]:
221104

What are the earliest two films listed in the titles dataframe?


In [5]:
titles.sort_values(by='year').head(2)


Out[5]:
       title                                           year
37922  Miss Jerry                                      1894
88048  Reproduction of the Corbett and Jeffries Fight  1899

How many movies have the title "Hamlet"?


In [6]:
len(titles[titles.title == "Hamlet"])


Out[6]:
19
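
Counting rows that match a condition can also be done by summing the boolean mask, since each `True` counts as 1. A small sketch on a made-up three-row frame (the `demo` data is illustrative, not taken from `titles.csv`):

```python
import pandas as pd

# Illustrative data only; not rows from the real dataset.
demo = pd.DataFrame({
    "title": ["Hamlet", "Hamlet", "Batman"],
    "year": [1910, 1996, 1989],
})

is_hamlet = demo.title == "Hamlet"   # boolean Series, one entry per row

count_by_len = len(demo[is_hamlet])  # filter first, then count the rows
count_by_sum = int(is_hamlet.sum())  # count the True values directly

print(count_by_len, count_by_sum)
```

Both approaches should give the same number; summing the mask avoids building the intermediate filtered frame.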

How many movies are titled "North by Northwest"?


In [7]:
len(titles[titles.title == "North by Northwest"])


Out[7]:
1

When was the first movie titled "Hamlet" made?


In [8]:
titles[titles.title == "Hamlet"].year.min()


Out[8]:
1910

List all of the "Treasure Island" movies from earliest to most recent.


In [9]:
titles[titles.title == "Treasure Island"].sort_values(by='year')


Out[9]:
        title            year
123681  Treasure Island  1918
77186   Treasure Island  1920
129574  Treasure Island  1934
148090  Treasure Island  1950
142592  Treasure Island  1972
140787  Treasure Island  1973
152161  Treasure Island  1985
63873   Treasure Island  1999

How many movies were made in the year 1950?


In [10]:
len(titles[titles.year == 1950])


Out[10]:
1037

How many movies were made in the year 1960?


In [11]:
len(titles[titles.year == 1960])


Out[11]:
1481

How many movies were made from 1950 through 1959?


In [12]:
len(titles[(titles.year >= 1950) & (titles.year <= 1959)])


Out[12]:
12235
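
An inclusive range test like the one above can also be written with `Series.between`, which includes both endpoints by default. A minimal sketch on invented years that straddle the decade boundaries:

```python
import pandas as pd

# Illustrative data only: one year on each side of both boundaries.
demo = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "year": [1949, 1950, 1959, 1960],
})

# Explicit pair of comparisons, as in the cell above.
explicit = demo[(demo.year >= 1950) & (demo.year <= 1959)]

# Equivalent inclusive range test.
with_between = demo[demo.year.between(1950, 1959)]

print(with_between)
```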

In what years has a movie titled "Batman" been released?


In [13]:
titles[titles.title == "Batman"]


Out[13]:
       title   year
25735  Batman  1943
98750  Batman  1989

How many roles were there in the movie "Inception"?


In [14]:
len(cast[cast.title == "Inception"])


Out[14]:
76

How many roles in the movie "Inception" are NOT ranked by an "n" value?


In [15]:
len(cast[(cast.title == "Inception") & (cast.n.isnull())])


Out[15]:
25

But how many roles in the movie "Inception" did receive an "n" value?


In [16]:
len(cast[(cast.title == "Inception") & (cast.n.notnull())])


Out[16]:
51
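
The last three answers are consistent with each other: 25 unranked plus 51 ranked roles gives the 76 roles counted earlier, because `isnull()` and `notnull()` split the rows into two non-overlapping groups. A sketch of the same check on a made-up frame (`demo_cast` is illustrative):

```python
import numpy as np
import pandas as pd

# Illustrative data only: three rows for one film, one of them unranked.
demo_cast = pd.DataFrame({
    "title": ["Inception", "Inception", "Inception", "Sleuth"],
    "n": [1.0, np.nan, 2.0, 1.0],
})

in_film = demo_cast.title == "Inception"

unranked = int((in_film & demo_cast.n.isnull()).sum())
ranked = int((in_film & demo_cast.n.notnull()).sum())
total = int(in_film.sum())

# Every row is either null or not null, so the two counts add up to the total.
print(unranked, ranked, total)
```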

Display the cast of "North by Northwest" in their correct "n"-value order, ignoring roles that did not earn a numeric "n" value.


In [17]:
cast[(cast.title == "North by Northwest") & (cast.n.notnull())].sort_values(by='n')


Out[17]:
         title               year  name                  type     character                    n
803392   North by Northwest  1959  Cary Grant            actor    Roger O. Thornhill           1
3208097  North by Northwest  1959  Eva Marie Saint       actress  Eve Kendall                  2
1343001  North by Northwest  1959  James Mason           actor    Phillip Vandamm              3
2887119  North by Northwest  1959  Jessie Royce Landis   actress  Clara Thornhill              4
328060   North by Northwest  1959  Leo G. Carroll        actor    The Professor                5
2791418  North by Northwest  1959  Josephine Hutchinson  actress  Mrs. Townsend                6
1562972  North by Northwest  1959  Philip Ober           actor    Lester Townsend              7
1174789  North by Northwest  1959  Martin Landau         actor    Leonard                      8
2253447  North by Northwest  1959  Adam Williams         actor    Valerian                     9
1669504  North by Northwest  1959  Edward Platt          actor    Victor Larrabee              10
613651   North by Northwest  1959  Robert Ellenstein     actor    Licht                        11
2113983  North by Northwest  1959  Les Tremayne          actor    Auctioneer                   12
427685   North by Northwest  1959  Philip Coolidge       actor    Dr. Cross                    13
1390700  North by Northwest  1959  Patrick McVey         actor    Sergeant Flamm               14
188864   North by Northwest  1959  Edward Binns          actor    Captain Junket               15
1276075  North by Northwest  1959  Ken Lynch             actor    Charley - Chicago Policeman  16
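
Filtering with `notnull()` before sorting is one way to drop unranked roles; `dropna(subset=['n'])` is another. If the unranked rows should be kept, `sort_values` places NaN values last by default. A short sketch on invented data (`demo_cast` and its names are illustrative):

```python
import numpy as np
import pandas as pd

# Illustrative data only: four performers, one without an "n" rank.
demo_cast = pd.DataFrame({
    "name": ["C", "A", "D", "B"],
    "n": [3.0, 1.0, np.nan, 2.0],
})

# Drop rows whose "n" is missing, then sort by rank.
ranked = demo_cast.dropna(subset=["n"]).sort_values(by="n")

# Keep every row; NaN ranks sort to the end (na_position defaults to 'last').
all_rows = demo_cast.sort_values(by="n")

print(ranked.name.tolist())
print(all_rows.name.tolist())
```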

Display the entire cast, in "n"-order, of the 1972 film "Sleuth".


In [18]:
cast[(cast.title == "Sleuth") & (cast.year == 1972) & (cast.n.notnull())].sort_values(by='n')


Out[18]:
         title   year  name                type     character                   n
1572154  Sleuth  1972  Laurence Olivier    actor    Andrew Wyke                 1
300287   Sleuth  1972  Michael Caine       actor    Milo Tindle                 2
344006   Sleuth  1972  Alec Cawthorne      actor    Inspector Doppler           3
1350564  Sleuth  1972  John (II) Matthews  actor    Detective Sergeant Tarrant  4
2502400  Sleuth  1972  Eve (III) Channing  actress  Marguerite Wyke             5
1335298  Sleuth  1972  Teddy Martin        actor    Police Constable Higgs      6

Now display the entire cast, in "n"-order, of the 2007 version of "Sleuth".


In [19]:
cast[(cast.title == "Sleuth") & (cast.year == 2007) & (cast.n.notnull())].sort_values(by='n')


Out[19]:
         title   year  name           type   character    n
300288   Sleuth  2007  Michael Caine  actor  Andrew       1
1191875  Sleuth  2007  Jude Law       actor  Milo         2
1664503  Sleuth  2007  Harold Pinter  actor  Man on T.V.  3

How many roles were credited in the silent 1921 version of Hamlet?


In [20]:
len(cast[(cast.title == "Hamlet") & (cast.year == 1921)])


Out[20]:
9

How many roles were credited in Branagh’s 1996 Hamlet?


In [21]:
len(cast[(cast.title == "Hamlet") & (cast.year == 1996)])


Out[21]:
54

How many "Hamlet" roles have been listed in all film credits through history?


In [22]:
len(cast[cast.character == "Hamlet"])


Out[22]:
87

How many people have played an "Ophelia"?


In [23]:
len(cast[cast.character == "Ophelia"])


Out[23]:
101
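
Note that `len(...)` here counts rows, that is, credited roles. If one performer played the character in more than one film, they would be counted once per film. To count distinct performers instead, `nunique()` on the `name` column is one option, assuming each person has a single spelling in `name`. A sketch with invented rows (film titles and performer names are placeholders):

```python
import pandas as pd

# Illustrative data only: "Performer A" plays the same character in two films.
demo_cast = pd.DataFrame({
    "title": ["Film X", "Film Y", "Film Z"],
    "name": ["Performer A", "Performer B", "Performer A"],
    "character": ["Ophelia", "Ophelia", "Ophelia"],
})

ophelias = demo_cast[demo_cast.character == "Ophelia"]

role_count = len(ophelias)               # one per credited role
person_count = ophelias.name.nunique()   # one per distinct performer name

print(role_count, person_count)
```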

How many people have played a role called "The Dude"?


In [24]:
len(cast[cast.character == "The Dude"])


Out[24]:
17

How many people have played a role called "The Stranger"?


In [25]:
len(cast[cast.character == "The Stranger"])


Out[25]:
200

How many roles has Sidney Poitier played throughout his career?


In [26]:
len(cast[cast.name == "Sidney Poitier"])


Out[26]:
43

How many roles has Judi Dench played?


In [27]:
len(cast[cast.name == "Judi Dench"])


Out[27]:
54

List the supporting roles (having n=2) played by Cary Grant in the 1940s, in order by year.


In [28]:
cast[
    (cast.name == 'Cary Grant') & 
    (cast.year // 10 == 194) &
    (cast.n == 2)
].sort_values(by='year')


Out[28]:
        title             year  name        type   character    n
803389  My Favorite Wife  1940  Cary Grant  actor  Nick         2
803399  Penny Serenade    1941  Cary Grant  actor  Roger Adams  2

List the leading roles that Cary Grant played in the 1940s in order by year.


In [29]:
cast[
    (cast.name == 'Cary Grant') & 
    (cast.year // 10 == 194) &
    (cast.n == 1)
].sort_values(by='year')


Out[29]:
        title                                 year  name        type   character              n
803414  The Howards of Virginia               1940  Cary Grant  actor  Matt Howard            1
803371  His Girl Friday                       1940  Cary Grant  actor  Walter Burns           1
803416  The Philadelphia Story                1940  Cary Grant  actor  C. K. Dexter Haven     1
803404  Suspicion                             1941  Cary Grant  actor  Johnnie                1
803418  The Talk of the Town                  1942  Cary Grant  actor  Leopold Dilg           1
803395  Once Upon a Honeymoon                 1942  Cary Grant  actor  Patrick 'Pat' O'Toole  1
803362  Destination Tokyo                     1943  Cary Grant  actor  Capt. Cassidy          1
803387  Mr. Lucky                             1943  Cary Grant  actor  Joe Adams              1
803388  Mr. Lucky                             1943  Cary Grant  actor  Joe Bascopolous        1
803396  Once Upon a Time                      1944  Cary Grant  actor  Jerry Flynn            1
803354  Arsenic and Old Lace                  1944  Cary Grant  actor  Mortimer Brewster      1
803391  None But the Lonely Heart             1944  Cary Grant  actor  Ernie Mott             1
803390  Night and Day                         1946  Cary Grant  actor  Cole Porter            1
803393  Notorious                             1946  Cary Grant  actor  Devlin                 1
803410  The Bachelor and the Bobby-Soxer      1947  Cary Grant  actor  Dick Nugent            1
803411  The Bishop's Wife                     1947  Cary Grant  actor  Dudley                 1
803386  Mr. Blandings Builds His Dream House  1948  Cary Grant  actor  Jim Blandings          1
803366  Every Girl Should Be Married          1948  Cary Grant  actor  Dr. Madison Brown      1
803375  I Was a Male War Bride                1949  Cary Grant  actor  Capt. Henri Rochard    1
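
The decade filter in the two cells above relies on integer floor division: `year // 10` drops the last digit, so every year from 1940 through 1949 maps to 194. Multiplying back by 10 gives a decade label. A minimal sketch on a few boundary years (the `years` Series is illustrative):

```python
import pandas as pd

# Illustrative years on both sides of the decade boundaries.
years = pd.Series([1939, 1940, 1949, 1950])

# True only for 1940..1949, since those all floor-divide to 194.
in_1940s = years // 10 == 194

# Decade label for each year, e.g. 1949 -> 1940.
decade = years // 10 * 10

print(in_1940s.tolist())
print(decade.tolist())
```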

How many roles were available for actors in the 1950s?


In [30]:
len(cast[
    (cast.year // 10 == 195) &
    (cast.type == "actor")
])


Out[30]:
152055

How many roles were available for actresses in the 1950s?


In [31]:
len(cast[
    (cast.year // 10 == 195) &
    (cast.type == "actress")
])


Out[31]:
55220
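
The actor and actress counts can also be obtained in a single pass by filtering to the decade once and calling `value_counts()` on the `type` column. A sketch on invented rows (`demo_cast` is illustrative, not the real cast data):

```python
import pandas as pd

# Illustrative data only: three roles in the 1950s, one outside the decade.
demo_cast = pd.DataFrame({
    "year": [1950, 1955, 1959, 1961],
    "type": ["actor", "actress", "actor", "actor"],
})

fifties = demo_cast[demo_cast.year // 10 == 195]

# One count per distinct value of "type" within the decade.
counts = fifties.type.value_counts()

print(counts)
```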

How many leading roles (n=1) were available from the beginning of film history through 1980?


In [32]:
len(cast[
    (cast.year <= 1980) &
    (cast.n == 1)
])


Out[32]:
62721

How many non-leading roles were available from the beginning of film history through 1980?


In [33]:
len(cast[
    (cast.year <= 1980) &
    (cast.n != 1)
])


Out[33]:
1074772

How many roles through 1980 were minor enough that they did not warrant a numeric "n" rank?


In [34]:
len(cast[
    (cast.year <= 1980) &
    (cast.n.isnull())
])


Out[34]:
429322
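
One subtlety in the last two answers: a comparison such as `cast.n != 1` evaluates to `True` for rows where `n` is NaN, so the "non-leading" count in Out[33] also includes the unranked roles counted in Out[34]. On those figures, 1074772 - 429322 = 645450 roles through 1980 would carry a numeric rank other than 1. A minimal sketch of the behavior on invented ranks (`demo_cast` is illustrative):

```python
import numpy as np
import pandas as pd

# Illustrative data only: one lead, two ranked supporting roles, two unranked.
demo_cast = pd.DataFrame({"n": [1.0, 2.0, np.nan, np.nan, 5.0]})

# NaN != 1 is True, so unranked rows are included here.
not_one = int((demo_cast.n != 1).sum())

# Restrict to rows that actually have a rank.
ranked_not_one = int(((demo_cast.n != 1) & demo_cast.n.notnull()).sum())

unranked = int(demo_cast.n.isnull().sum())

print(not_one, ranked_not_one, unranked)
```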