Using the OMDb API to find data about TV shows


In [78]:
import json
import urllib.request as request

We're using two query parameters: t (for the title) and Season. Change the values as you wish :)


In [163]:
url = 'http://www.omdbapi.com/?t=Scandal&Season=3'
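
If you want to change the show or the season often, building the query string by hand gets error-prone (titles with spaces need escaping). Here is a minimal sketch that assembles the same URL with urllib.parse.urlencode. The helper name build_url and the optional api_key argument are assumptions of this sketch, not part of the original notebook; the service may require a key passed as apikey.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.omdbapi.com/'


def build_url(title, season, api_key=None):
    """Return a query URL for one season of a show (hypothetical helper)."""
    params = {'t': title, 'Season': season}
    if api_key is not None:
        # Assumption: the key, if needed, goes in an `apikey` parameter.
        params['apikey'] = api_key
    # urlencode escapes spaces and special characters in the values.
    return BASE_URL + '?' + urlencode(params)


url = build_url('Scandal', 3)
print(url)  # http://www.omdbapi.com/?t=Scandal&Season=3
```

Without a key, the result is the same string as the hand-written url above.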

In [164]:
content = request.urlopen(url).read()

In [165]:
data = json.loads(content.decode('UTF-8'))
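
Before digging into the data, it can help to check that the lookup actually worked. The response printed later in this notebook carries a 'Response' field set to the string 'True'. The sketch below wraps the decoding step and raises if that field has any other value. The function name parse_season is made up for this example, and treating every other value as a failure is an assumption.

```python
import json


def parse_season(content):
    """Decode a response body (bytes) and return its list of episodes."""
    data = json.loads(content.decode('UTF-8'))
    # 'Response' is the string 'True' on success in the output shown here.
    # Assumption: anything else means the lookup failed.
    if data.get('Response') != 'True':
        raise ValueError('lookup failed: {!r}'.format(data))
    return data['Episodes']


# A tiny hand-written body, so the example runs without network access.
sample = b'{"Episodes": [{"Episode": "1", "imdbRating": "8.6"}], "Response": "True"}'
print(parse_season(sample))
```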

Now let's take a look at what data we have


In [166]:
print(data)


{'Episodes': [{'Episode': '1', 'imdbRating': '8.6', 'Released': '2013-10-03', 'Title': "It's Handled", 'imdbID': 'tt3134028'}, {'Episode': '2', 'imdbRating': '8.4', 'Released': '2013-10-10', 'Title': "Guess Who's Coming to Dinner", 'imdbID': 'tt3120752'}, {'Episode': '3', 'imdbRating': '8.4', 'Released': '2013-10-17', 'Title': 'Mrs. Smith Goes to Washington', 'imdbID': 'tt3172768'}, {'Episode': '4', 'imdbRating': '7.8', 'Released': '2013-10-24', 'Title': 'Say Hello to My Little Friend', 'imdbID': 'tt3172772'}, {'Episode': '5', 'imdbRating': '8.4', 'Released': '2013-10-31', 'Title': 'More Cattle, Less Bull', 'imdbID': 'tt3172774'}, {'Episode': '6', 'imdbRating': '8.0', 'Released': '2013-11-07', 'Title': 'Icarus', 'imdbID': 'tt3255848'}, {'Episode': '7', 'imdbRating': '8.3', 'Released': '2013-11-14', 'Title': "Everything's Coming Up Mellie", 'imdbID': 'tt3276282'}, {'Episode': '8', 'imdbRating': '8.4', 'Released': '2013-11-21', 'Title': 'Vermont Is for Lovers, Too', 'imdbID': 'tt3276280'}, {'Episode': '9', 'imdbRating': '8.0', 'Released': '2013-12-05', 'Title': 'YOLO', 'imdbID': 'tt3276284'}, {'Episode': '10', 'imdbRating': '8.1', 'Released': '2013-12-12', 'Title': 'A Door Marked Exit', 'imdbID': 'tt3276286'}, {'Episode': '11', 'imdbRating': '8.1', 'Released': '2014-02-27', 'Title': 'Ride, Sally, Ride', 'imdbID': 'tt3288594'}, {'Episode': '12', 'imdbRating': '8.2', 'Released': '2014-03-06', 'Title': 'We Do Not Touch the First Ladies', 'imdbID': 'tt3288596'}, {'Episode': '13', 'imdbRating': '8.4', 'Released': '2014-03-13', 'Title': 'No Sun on the Horizon', 'imdbID': 'tt3288600'}, {'Episode': '14', 'imdbRating': '8.5', 'Released': '2014-03-20', 'Title': 'Kiss Kiss Bang Bang', 'imdbID': 'tt3288608'}, {'Episode': '15', 'imdbRating': '8.0', 'Released': '2014-03-27', 'Title': 'Mama Said Knock You Out', 'imdbID': 'tt3288614'}, {'Episode': '16', 'imdbRating': '8.3', 'Released': '2014-04-03', 'Title': 'The Fluffer', 'imdbID': 'tt3288612'}, {'Episode': '17', 'imdbRating': 
'8.4', 'Released': '2014-04-10', 'Title': 'Flesh and Blood', 'imdbID': 'tt3288628'}, {'Episode': '18', 'imdbRating': '8.7', 'Released': '2014-04-17', 'Title': 'The Price of Free and Fair Elections', 'imdbID': 'tt3288620'}], 'Response': 'True', 'Season': '3', 'Title': 'Scandal', 'totalSeasons': '6'}

Okay! As we can see, the episodes are stored as a list under the 'Episodes' key. What do we have about the first episode?


In [167]:
print(data['Episodes'][0])


{'Episode': '1', 'imdbRating': '8.6', 'Released': '2013-10-03', 'Title': "It's Handled", 'imdbID': 'tt3134028'}

So now we know which attributes each episode has. imdbRating is quite interesting, so let's work on that. If we want the rating for the seventh episode, all we have to do is use index 6 (lists are zero-based):


In [170]:
print(data['Episodes'][6]['imdbRating'])


8.3
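
Indexing by position works as long as the list has no gaps. If an episode were missing from the response, position 6 would no longer be episode 7. A small sketch that looks the episode up by its 'Episode' field instead (rating_for is a hypothetical helper, and the two-episode list is a cut-down copy of the data above):

```python
def rating_for(episodes, number):
    """Return the imdbRating string for the given episode number, or None."""
    for episode in episodes:
        # The 'Episode' field is a string in the response, so compare as text.
        if episode['Episode'] == str(number):
            return episode['imdbRating']
    return None


episodes = [
    {'Episode': '6', 'imdbRating': '8.0', 'Title': 'Icarus'},
    {'Episode': '7', 'imdbRating': '8.3', 'Title': "Everything's Coming Up Mellie"},
]
print(rating_for(episodes, 7))  # 8.3
```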

Since we want to analyze the data statistically, let's collect all the ratings in a list. Note that the API returns each rating as a string.


In [141]:
# Collect the rating of every episode (still strings at this point)
ratings = [episode['imdbRating'] for episode in data['Episodes']]

In [171]:
print(ratings)


['8.6', '8.4', '8.4', '7.8', '8.4', '8.0', '8.3', '8.4', '8.0', '8.1', '8.1', '8.2', '8.4', '8.5', '8.0', '8.3', '8.4', '8.7']
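
The ratings above are strings, which is fine for counting but not for arithmetic. Here is a minimal sketch that converts them to floats and computes the mean and median with the standard library. The helper to_floats is made up for this example, and skipping values that do not parse (for instance a placeholder for unrated episodes) is an assumption about how you would want to treat them.

```python
import statistics


def to_floats(raw_ratings):
    """Convert rating strings to floats, skipping values that do not parse."""
    values = []
    for raw in raw_ratings:
        try:
            values.append(float(raw))
        except ValueError:
            # Assumption: non-numeric entries (e.g. 'N/A') are simply ignored.
            continue
    return values


raw = ['8.6', '8.4', '8.4', '7.8', '8.4', '8.0', '8.3', '8.4', '8.0',
       '8.1', '8.1', '8.2', '8.4', '8.5', '8.0', '8.3', '8.4', '8.7']
numeric = to_floats(raw)
print(round(statistics.mean(numeric), 2))  # 8.28
print(statistics.median(numeric))
```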

In [142]:
from scipy import stats

Which rating is the mode (the most frequent value) for that season?


In [143]:
mode = stats.mode(ratings)
print(mode)


ModeResult(mode=array(['8.4'], 
      dtype='<U3'), count=array([6]))
/home/lmayra/.pyenv/versions/3.4.3/lib/python3.4/site-packages/scipy/stats/stats.py:257: RuntimeWarning: The input array could not be properly checked for nan values. nan values will be ignored.
  "values. nan values will be ignored.", RuntimeWarning)

In [144]:
print(mode[0][0])


8.4

How many episodes received that rating?


In [139]:
print(mode[1][0])


6
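
The RuntimeWarning above shows up because the ratings are strings. As far as I know, newer SciPy releases are stricter about non-numeric input to stats.mode and return plain scalars for a 1-D list, so mode[0][0] may not work there. As an alternative sketch (not what this notebook originally used), collections.Counter from the standard library gives the same two numbers and is happy with strings:

```python
from collections import Counter

ratings = ['8.6', '8.4', '8.4', '7.8', '8.4', '8.0', '8.3', '8.4', '8.0',
           '8.1', '8.1', '8.2', '8.4', '8.5', '8.0', '8.3', '8.4', '8.7']

# Count how many times each rating appears.
counts = Counter(ratings)

# most_common(1) returns a list with the single (value, count) pair
# that has the highest count.
top_rating, top_count = counts.most_common(1)[0]
print(top_rating, top_count)  # 8.4 6
```

If two ratings were tied for first place, most_common(1) would report only one of them, so check counts directly when ties matter.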