Using the recommendation library



In [1]:

    
import os
os.chdir('..')



In [2]:

    
# Import all the packages we need to generate recommendations
import numpy as np
import pandas as pd
import src.utils as utils
import src.recommenders as recommenders
import src.similarity as similarity

# imports necessary for plotting
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline  

# Enable logging on Jupyter notebook
import logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)



In [3]:

    
# loads dataset 
dataset_folder = os.path.join(os.getcwd(), 'data')
dataset_folder_ready = utils.load_dataset(dataset_folder)

# adds personal ratings to original dataset ratings file.
ratings_file = os.path.join(dataset_folder, 'ml-latest-small','ratings-merged.csv')
ratings, my_customer_number = utils.merge_datasets(dataset_folder_ready, ratings_file)









    



INFO:root:dataset was already downloaded
INFO:root:dataset stored in: /Users/hcorona/github/recsys-101-workshop/data/ml-latest-small
INFO:root:loaded 44 personal ratings
INFO:root:loaded 9125 movies
INFO:root:loaded 100048 ratings in total



In [4]:

    
# the data is stored in a long pandas dataframe (one row per rating)
# pivot it into a [customer x movie] matrix, filling missing ratings with 0
ratings_matrix = ratings.pivot_table(index='customer', columns='movie', values='rating', fill_value=0)
# transpose to [movie x customer] so that each row is one movie's rating vector
ratings_matrix = ratings_matrix.transpose()
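
To see what the pivot and transpose do, here is a minimal sketch on a toy dataframe. The toy data is made up for illustration; only the column names ('customer', 'movie', 'rating') are taken from the cell above.

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (customer, movie) rating
toy_ratings = pd.DataFrame({
    'customer': [1, 1, 2, 3],
    'movie': ['A', 'B', 'A', 'C'],
    'rating': [4.0, 3.0, 5.0, 2.0],
})

# One row per customer, one column per movie; unrated pairs become 0
toy_matrix = toy_ratings.pivot_table(
    index='customer', columns='movie', values='rating', fill_value=0)

# Transpose so that each row is a movie and each column is a customer
toy_matrix = toy_matrix.transpose()
print(toy_matrix)
```

Note that filling with 0 makes "not rated" look the same as a very low rating, which affects any similarity computed on the matrix.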

Understanding Movie Similarity

Try different movies
Try different similarity metrics (see /src/similarity.py)
Which similarity metric works best?
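
As a reference point for the two metrics used in the next cells, here is a minimal sketch of cosine and Pearson similarity between two rating vectors. It is not necessarily how /src/similarity.py implements them; the function names below are hypothetical and the sketch assumes unrated entries are stored as 0, as in the ratings matrix above.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors (0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def pearson_similarity(a, b):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return cosine_similarity(a - a.mean(), b - b.mean())

# Two hypothetical movies rated by the same three customers
movie_x = [1, 2, 3]
movie_y = [3, 2, 1]
print(cosine_similarity(movie_x, movie_y))
print(pearson_similarity(movie_x, movie_y))
```

Cosine only looks at the direction of the raw vectors, while Pearson first subtracts each vector's mean, so two movies rated in opposite order can still have a high cosine similarity but a negative Pearson correlation.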



In [5]:

    
# find similar movies 
# try with different movie titles and see what happens 
movie_title = 'Star Wars: Episode VI - Return of the Jedi (1983)'
similarity_type = "cosine"
logger.info('top-10 movies similar to %s, using %s similarity', movie_title, similarity_type)
print(similarity.compute_nearest_neighbours(movie_title, ratings_matrix, similarity_type)[0:10])









    



INFO:root:top-10 movies similar to Star Wars: Episode VI - Return of the Jedi (1983), using cosine similarity






    



                                                   item  similarity
7490  Star Wars: Episode VI - Return of the Jedi (1983)    1.000000
7489  Star Wars: Episode V - The Empire Strikes Back...    0.785080
7488          Star Wars: Episode IV - A New Hope (1977)    0.762233
6460  Raiders of the Lost Ark (Indiana Jones and the...    0.656852
4030          Indiana Jones and the Last Crusade (1989)    0.647390
657                           Back to the Future (1985)    0.631344
5164                   Men in Black (a.k.a. MIB) (1997)    0.627694
7860                             Terminator, The (1984)    0.609394
5102                                 Matrix, The (1999)    0.607405
7485   Star Wars: Episode I - The Phantom Menace (1999)    0.598890



In [6]:

    
# find similar movies 
# try with different movie titles and see what happens 
movie_title = 'All About My Mother (Todo sobre mi madre) (1999)'
similarity_type = "pearson"
logger.info('top-10 movies similar to: %s, using %s similarity', movie_title, similarity_type)
print(similarity.compute_nearest_neighbours(movie_title, ratings_matrix, similarity_type)[0:10])









    



INFO:root:top-10 movies similar to: All About My Mother (Todo sobre mi madre) (1999), using pearson similarity






    



                                                   item  similarity
307    All About My Mother (Todo sobre mi madre) (1999)    1.000000
672            Bad Education (La mala educación) (2004)    0.504208
7791                Talk to Her (Hable con Ella) (2002)    0.464245
2368  Dreamlife of Angels, The (Vie rêvée des anges,...    0.449805
1511         Central Station (Central do Brasil) (1998)    0.448065
5761                                     Nowhere (1997)    0.441745
1224                          Breaking the Waves (1996)    0.438522
3941                     Idiots, The (Idioterne) (1998)    0.436944
772   Battle of Algiers, The (La battaglia di Algeri...    0.432585
9015                  Your Friends and Neighbors (1998)    0.425468

Creating recommendations for your personal ratings

Try different similarity metrics (see /src/similarity.py)
Try different values of K (K is the number of neighbours to consider when generating the recommendations)
Which combination of K and similarity metric works best? Discuss it with others.
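
To make the role of K concrete, here is a minimal sketch of user-based nearest-neighbour recommendation on a small array. It is an illustration under stated assumptions, not the implementation behind recommenders.recommend_uknn: the function name recommend_user_knn is hypothetical, similarity is plain cosine, and unrated entries are assumed to be 0.

```python
import numpy as np

def recommend_user_knn(matrix, user, k=2, n=3):
    """Return up to n (item index, score) pairs for `user`.

    matrix: [user x item] array of ratings, 0 meaning "not rated".
    """
    matrix = np.asarray(matrix, dtype=float)
    norms = np.linalg.norm(matrix, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows

    # Cosine similarity between the target user and every user
    sims = (matrix @ matrix[user]) / (norms * norms[user])
    sims[user] = -np.inf  # never pick the user as their own neighbour

    # Keep the K most similar users
    k = min(k, matrix.shape[0] - 1)
    neighbours = np.argsort(sims)[::-1][:k]
    weights = sims[neighbours]
    if weights.sum() <= 0:
        return []

    # Similarity-weighted average of the neighbours' ratings
    # (an unrated item counts as 0, which pulls scores toward zero)
    scores = weights @ matrix[neighbours] / weights.sum()
    scores[matrix[user] > 0] = -np.inf  # skip items the user already rated

    top = np.argsort(scores)[::-1][:n]
    return [(int(i), float(scores[i])) for i in top if np.isfinite(scores[i])]

# Hypothetical ratings: 4 users x 4 items
toy = [[5, 0, 0, 0],
       [5, 4, 0, 0],
       [4, 5, 1, 0],
       [0, 0, 5, 5]]
recs = recommend_user_knn(toy, user=0, k=2, n=2)
print(recs)
```

With a small K only the closest users contribute, so scores depend on a few opinions; with a large K the scores move toward what most users rated, which tends to favour popular items.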



In [7]:

    
# get recommendations for a single user
recommendations = recommenders.recommend_uknn(ratings, my_customer_number, K=200, similarity_metric='cosine', N=10)
recommendations









    



INFO:root:computed nearest neighbours using cosine






    Out[7]:






  
    
      
   rating    movie
0  3.231480  Inception (2010)
1  2.727677  Matrix, The (1999)
2  2.606161  Fight Club (1999)
3  2.596321  Dark Knight, The (2008)
4  2.595773  Forrest Gump (1994)
5  2.547936  Shawshank Redemption, The (1994)
6  2.525448  Pulp Fiction (1994)
7  2.367299  Lord of the Rings: The Fellowship of the Ring,...
8  2.112106  Lord of the Rings: The Return of the King, The...
9  2.088432  Back to the Future (1985)



In [8]:

    
# get recommendations for a single user
recommendations = recommenders.recommend_iknn(ratings, my_customer_number, K=100, similarity_metric='cosine')
recommendations









    Out[8]:






  
    
      
   rating    movie
0  4.682608  Grand Budapest Hotel, The (2014)
1  4.641638  Dallas Buyers Club (2013)
2  4.583980  Hugo (2011)
3  4.570574  Harry Potter and the Deathly Hallows: Part 1 (...
4  4.563340  The Imitation Game (2014)
5  4.561054  Gravity (2013)
6  4.560607  Way, Way Back, The (2013)
7  4.559606  Star Trek (2009)
8  4.555776  WALL·E (2008)
9  4.553129  28 Weeks Later (2007)
