Using GraphLab Create we can take an SFrame containing user ratings for movies, and quickly create a recommender. We'll see how to tune parameters for this recommender and how to get a sense of what its performance will be in practice. Finally (mostly for fun) we'll create two stereotypical users, and see what they get recommended.
To start, let's import a few modules.
In [1]:
import graphlab as gl
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
data_url = 'https://static.turi.com/datasets/movie_ratings/sample.small'
sf = gl.SFrame.read_csv(data_url, delimiter='\t', column_type_hints={'rating': int})
Using the same data to train and evaluate a model is problematic: it leads to overfitting, where the model looks deceptively accurate on data it has already seen. We'll follow the common approach of holding out a randomly selected 20% of our data to use later for evaluation.
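To see what such a split involves, here is a plain-Python sketch of a Bernoulli 80/20 split. This is only an illustration of the idea, not GraphLab's implementation, and the helper name is made up; the SFrame's random_split method does this for us.

```python
import random

def random_split_rows(rows, fraction=0.8, seed=42):
    """Assign each row independently to train (with probability `fraction`) or test."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        (train if rng.random() < fraction else test).append(row)
    return train, test

rows = [('alice', 'GoldenEye', 4), ('bob', 'Titanic', 5), ('carol', 'Ghost', 3)]
train_rows, test_rows = random_split_rows(rows)
```

Each row lands in exactly one of the two sets, and the split is reproducible given the seed.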
Now we use the SFrame's random_split method to get our train and test sets.
In [3]:
(train_set, test_set) = sf.random_split(0.8)
Now that we have a train and test set, let's come up with a very simple way of predicting ratings. That way when we try more complicated things, we'll have some baseline for comparison.
GraphLab's PopularityRecommender provides this functionality. It simply stores the mean rating per item. When asked to predict a user's rating for a particular item, it predicts the mean of all ratings for that item; it pays no attention to user information.
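In plain Python, this idea amounts to nothing more than a per-item average. The following is a sketch with made-up helper names, not the PopularityRecommender's actual code:

```python
from collections import defaultdict

def fit_item_means(ratings):
    """Compute the mean rating per item from (user, item, rating) triples."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _, item, r in ratings:
        sums[item] += r
        counts[item] += 1
    return {item: sums[item] / counts[item] for item in sums}

train = [('u1', 'GoldenEye', 5), ('u2', 'GoldenEye', 3), ('u1', 'Ghost', 2)]
item_means = fit_item_means(train)
```

Every user gets the same predicted rating for a given movie, which is what makes this a baseline rather than a personalized recommender.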
In order to use the PopularityRecommender, all we need to do is pass its create function the data and tell it the pertinent column names.
In [4]:
m = gl.popularity_recommender.create(train_set, 'user', 'movie', 'rating')
Now that we have a (simple) model, we need a way to measure the accuracy of its predictions so that we can compare the performance of different models. Root Mean Squared Error (RMSE) is one of the most common such measures.
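RMSE is the square root of the average squared difference between predicted and actual ratings; lower is better, and 0 means perfect predictions. A minimal plain-Python version (just to make the formula concrete; GraphLab's gl.evaluation.rmse computes this for us):

```python
import math

def rmse(targets, predictions):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(targets, predictions))
                     / len(targets))

# One prediction off by a single star out of three:
# rmse([4, 2, 5], [3, 2, 5]) == sqrt(1/3), about 0.577
```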
In [5]:
baseline_rmse = gl.evaluation.rmse(test_set['rating'], m.predict(test_set))
print baseline_rmse
The type of model that turned out to be best in the famous Netflix Prize competition is called Matrix Factorization. This is a form of Collaborative Filtering, where recommendations are generated using ratings from users who are somehow similar.
Whenever you use a particular type of model there are almost inevitably some parameters you must specify, and Matrix Factorization is no exception. Properly tuning these parameters can have a huge effect on how well your model works. The two most important here are the number of latent factors (num_factors) and the regularization coefficient.
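Conceptually, matrix factorization learns a latent vector for each user and each item so that their dot product approximates the observed rating. As a rough illustration of the mechanics (a toy SGD sketch with hypothetical names, not GraphLab's actual solver):

```python
import random

def factorize(ratings, num_factors=5, regularization=1e-5, lr=0.01, iters=500, seed=0):
    """Toy SGD matrix factorization: learn P[u], Q[i] so that dot(P[u], Q[i]) ~ rating."""
    rng = random.Random(seed)
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    P = {u: [rng.gauss(0, 0.1) for _ in range(num_factors)] for u in users}
    Q = {i: [rng.gauss(0, 0.1) for _ in range(num_factors)] for i in items}
    for _ in range(iters):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(num_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error, with L2 regularization
                P[u][f] += lr * (err * qi - regularization * pu)
                Q[i][f] += lr * (err * pu - regularization * qi)
    return P, Q

# Tiny demonstration on four ratings
toy = [('Alice', 'The Matrix', 5), ('Alice', 'Titanic', 1),
       ('Bob', 'The Matrix', 4), ('Bob', 'Titanic', 2)]
P, Q = factorize(toy)
```

A larger num_factors gives the model more capacity, while the regularization term penalizes large factor values to combat overfitting; that trade-off is exactly what the grid search over regularization values explores.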
In [6]:
regularization_vals = [0.001, 0.0001, 0.00001, 0.000001]
models = [gl.factorization_recommender.create(train_set, 'user', 'movie', 'rating',
                                              max_iterations=50, num_factors=5, regularization=r)
          for r in regularization_vals]
In [7]:
# Save the train and test RMSE, for each model
(rmse_train, rmse_test) = ([], [])
for m in models:
    rmse_train.append(m['training_rmse'])
    rmse_test.append(gl.evaluation.rmse(test_set['rating'], m.predict(test_set)))
Let's create a plot to show the RMSE for these different regularization values.
In [8]:
(fig, ax) = plt.subplots(figsize=(10, 8))
[p1, p2, p3] = ax.semilogx(regularization_vals, rmse_train,
                           regularization_vals, rmse_test,
                           regularization_vals, len(regularization_vals) * [baseline_rmse])
ax.set_ylim([0.7, 1.1])
ax.set_xlabel('Regularization', fontsize=20)
ax.set_ylabel('RMSE', fontsize=20)
ax.legend([p1, p2, p3], ["Train", "Test", "Baseline"])
Out[8]: (plot: train, test, and baseline RMSE versus regularization)
Looks like we get the best RMSE (on the test set) with regularization = 0.00001, keeping num_factors = 5. Let's use those parameters!
In [9]:
data_url = 'https://static.turi.com/datasets/movie_ratings/sample.large'
data = gl.SFrame.read_csv(data_url, delimiter='\t', column_type_hints={'rating': int})
In [10]:
action_movies = ['GoldenEye', 'Casino Royale', 'Independence Day', 'Con Air', 'The Rock',
                 'The Bourne Identity', 'Ocean\'s Eleven', 'Lethal Weapon 4', 'Gladiator',
                 'Indiana Jones and the Last Crusade', 'The Matrix', 'Kill Bill: Vol. 1',
                 'Air Force One', 'Braveheart', 'The Man with the Golden Gun',
                 'The Bourne Supremacy', 'Saving Private Ryan']
romance_movies = ['Sleepless in Seattle', 'An Affair to Remember', 'Ghost', 'Love Actually',
                  'You\'ve Got Mail', 'Notting Hill', 'Titanic', 'Miss Congeniality',
                  'Some Like It Hot', 'Pretty Woman', 'How to Lose a Guy in 10 Days']
# Boring helper function to create ratings
def ratings(movie_list, user, rating):
    num = len(movie_list)
    records = {'user': [user] * num, 'movie': movie_list, 'rating': [rating] * num}
    return gl.SFrame(records)
# Loves action movies, hates romance movies
action_user = 'Archie the Action Lover'
action_user_ratings = ratings(action_movies, action_user, 5)
action_user_ratings = action_user_ratings.append(ratings(romance_movies, action_user, 1))
# Loves romance movies, hates action movies
romantic_user = 'Rebecca the Romance Lover'
romantic_user_ratings = ratings(action_movies, romantic_user, 1)
romantic_user_ratings = romantic_user_ratings.append(ratings(romance_movies, romantic_user, 5))
data = data.append(action_user_ratings)
data = data.append(romantic_user_ratings)
Now, let's create a factorization model using the larger data sample. Here we use GraphLab's ranking factorization recommender, which fits the same kind of factorization while also optimizing how items rank for each user.
In [11]:
# Create a new model, using the larger dataset, with the tuned parameters
m = gl.ranking_factorization_recommender.create(data, 'user', 'movie', 'rating',
                                                max_iterations=50, num_factors=5,
                                                regularization=0.00001)
Let's see what recommendations we get for our action lover.
In [12]:
# Show recommendations for the action lover.
recommendations = m.recommend(gl.SArray([action_user]), k=40)
print recommendations['movie']
Let's see what recommendations we get for our romance lover.
In [13]:
# Show recommendations for the romance lover.
recommendations = m.recommend(gl.SArray([romantic_user]), k=40)
print recommendations['movie']
Looks good to me, especially considering these users haven't rated many movies!
(Looking for more details about the modules and functions? Check out the API docs.)