Demo of svdRec, a Python3 module for Recommenders

Download the movielens dataset here

Before loading in the data, I highly recommend cutting out some of the "ratings.csv" file. It's 20M rows long and can take a long time to process. In bash you can do something like:

cat ratings.csv | head -100000 > ratings_small.csv

We'll also load in the movies.csv file to a DataFrame - this will act as a dictionary to translate between MovieID and Movie Titles.


In [1]:
from svdRec import svdRec
import pandas as pd 

svd = svdRec.svdRec()
svd.load_csv_sparse('data/ml-20m/ratings_small.csv', delimiter=',', skiprows=1)
svd.SVD(num_dim=100)


Note: load_csv_sparse expects a csv in the format of: rowID, colID, Value, ...
Created matrix of shape:  (702, 128594)

In [2]:
movies = pd.read_table('data/ml-20m/movies.csv', sep=',',names = ['movieId',"Title","genres"])
movie_dict = {}
for i, row in movies.iterrows():
    movie_dict.update({row['movieId']: row['Title']})
svd.load_item_encoder(movie_dict)

In [3]:
MOVIE_ID = 3114 # Toy Story 2
for item in svd.get_similar_items(MOVIE_ID,show_similarity=True):
    print(item)
    print(svd.get_item_name(item[0]),'\n')


(3114, 0.12452773823527524)
Toy Story 2 (1999) 

(1, 0.096984857294616089)
Toy Story (1995) 

(2355, 0.043104443630875205)
Bug's Life, A (1998) 

(2987, 0.041949127538017023)
Who Framed Roger Rabbit? (1988) 

(6377, 0.040522854363774369)
Finding Nemo (2003) 


In [6]:
USERID=25
for item in svd.recommends_for_user(USERID, num_recom=5, show_similarity=False):
    print("ID: ", item)
    print("Actual Rating: ", svd.mat.toarray()[USERID][item])
    print("Title: ",svd.get_item_name(item),'\n')


ID:  356
Actual Rating:  5.0
Title:  Forrest Gump (1994) 

ID:  1961
Actual Rating:  0.0
Title:  Rain Man (1988) 

ID:  1270
Actual Rating:  0.0
Title:  Back to the Future (1985) 

ID:  1097
Actual Rating:  0.0
Title:  E.T. the Extra-Terrestrial (1982) 

ID:  1307
Actual Rating:  0.0
Title:  When Harry Met Sally... (1989) 


In [9]:
user_to_rec = 3
print("Items for User %s to check out based on similar user:\n"% user_to_rec, svd.recs_from_closest_user(user_to_rec,num_users=2))


User #3's most similar user is User #[282, 488] 
Items for User 3 to check out based on similar user:
 [0, 1, 5, 4104, 9, 2057, 6153, 6154, 15, 16, 18, 20, 24, 28, 31, 33, 2081, 35, 38, 46, 49, 57, 2107, 61, 4160, 4161, 2114, 6217, 86, 2141, 94, 2151, 30824, 109, 110, 2158, 116, 6265, 2173, 4224, 4231, 6280, 2187, 140, 139, 4239, 149, 4245, 4247, 152, 6293, 6296, 156, 160, 164, 8359, 8360, 171, 172, 8372, 2230, 184, 6332, 8382, 193, 197, 4298, 207, 4305, 4307, 2268, 6364, 6366, 2272, 4320, 6372, 230, 231, 4328, 4326, 234, 6376, 240, 241, 2290, 2288, 4339, 2293, 246, 4342, 4343, 6385, 6391, 2299, 2301, 255, 2305, 259, 261, 2312, 4360, 6409, 2317, 4366, 271, 2320, 4367, 4368, 2323, 276, 4369, 6415, 2328, 6428, 2334, 287, 2336, 289, 4385, 291, 294, 295, 299, 2352, 306, 307, 2354, 4405, 315, 317, 318, 328, 2383, 336, 338, 2389, 343, 2392, 347, 348, 349, 2395, 4445, 4446, 2401, 354, 355, 356, 2405, 6497, 6502, 360, 361, 363, 366, 369, 373, 376, 2426, 379, 6533, 4488, 6536, 6538, 6549, 2454, 412, 419, 431, 433, 434, 439, 2489, 6592, 453, 2501, 456, 2504, 467, 2516, 470, 479, 480, 2532, 484, 2538, 491, 2541, 499, 506, 2561, 6657, 2566, 519, 2570, 526, 2579, 4627, 538, 540, 4637, 4638, 4640, 4642, 550, 2599, 6710, 579, 2627, 584, 586, 588, 589, 591, 592, 2639, 596, 598, 4700, 607, 2670, 2671, 4718, 2676, 4727, 6775, 2682, 4731, 4740, 647, 2698, 2699, 2705, 4756, 2709, 2711, 664, 2715, 2717, 2719, 2721, 8865, 2723, 4798, 2760, 713, 2762, 719, 4815, 2769, 8911, 4822, 6872, 732, 735, 2787, 2790, 742, 2792, 746, 2796, 749, 4847, 8948, 8960, 4866, 8968, 777, 4873, 779, 2828, 4877, 783, 784, 785, 787, 4885, 8982, 8983, 4888, 4889, 2857, 817, 2870, 2881, 834, 836, 2889, 2890, 4946, 857, 4957, 4962, 2915, 2917, 4972, 4973, 4974, 4978, 4989, 4990, 4991, 4992, 4993, 2946, 2947, 2948, 2950, 903, 2958, 911, 912, 5008, 5013, 2967, 922, 923, 2975, 5026, 2984, 2992, 2994, 2996, 5047, 952, 5050, 3004, 5059, 31684, 967, 3015, 3016, 968, 5063, 31693, 5091, 5095, 7146, 3051, 7148, 3059, 1019, 3074, 7172, 3081, 1035, 1036, 5134, 1041, 1046, 1058, 3112, 3113, 5170, 1078, 1079, 27705, 1088, 1089, 1093, 1096, 1100, 3151, 3156, 7254, 3159, 1119, 5217, 5219, 3173, 1126, 3175, 1128, 1125, 3174, 5224, 1134, 1135, 1146, 1147, 7292, 3197, 7302, 3207, 7310, 1172, 38037, 1174, 1175, 7322, 7324, 27807, 1185, 5281, 1187, 1188, 5283, 1192, 1195, 1197, 1198, 1199, 1200, 7345, 5298, 3252, 1205, 3253, 1207, 3256, 1209, 5308, 1213, 1214, 1215, 7360, 1218, 3266, 1220, 1221, 1222, 3270, 1224, 3268, 7365, 5323, 5324, 1229, 1233, 1239, 1240, 1244, 1245, 1246, 1249, 3300, 5348, 1255, 1257, 1258, 1259, 1260, 3309, 1262, 1261, 1264, 1269, 1270, 1274, 1275, 3327, 1280, 5375, 5376, 5379, 5381, 1287, 1290, 1297, 7443, 1303, 5400, 3358, 3362, 5414, 1319, 5417, 3385, 1345, 5442, 1347, 5443, 1349, 5444, 5448, 1353, 5451, 3407, 5457, 5458, 1366, 5463, 1369, 3420, 3423, 5478, 5480, 1386, 1392, 1393, 34161, 1395, 3447, 5501, 1408, 3467, 3469, 3470, 3480, 1436, 3497, 3498, 1465, 3523, 5571, 3526, 5576, 1484, 3537, 3542, 1499, 5601, 5602, 1512, 1516, 3568, 5617, 5619, 5620, 1526, 3577, 3580, 5635, 1541, 1543, 1561, 1563, 3614, 3616, 1572, 5668, 3622, 5672, 1579, 3628, 1583, 3632, 1586, 1587, 3638, 26171, 1609, 1613, 1616, 3675, 3687, 1640, 3696, 3701, 3702, 5751, 3716, 1672, 1675, 3725, 1679, 1681, 1689, 1692, 1700, 1701, 3750, 1703, 3751, 3752, 5799, 1712, 5809, 3763, 1720, 1721, 1725, 1728, 1730, 1731, 3792, 1746, 1747, 1759, 3820, 5871, 1776, 3825, 1778, 3824, 3830, 1783, 5880, 1800, 5899, 5901, 1808, 3861, 3868, 1830, 3881, 1834, 3892, 5942, 1847, 3896, 5944, 5956, 3910, 1875, 1883, 1887, 5988, 5990, 3947, 1906, 6002, 6005, 1910, 1918, 3966, 3967, 3968, 1922, 1923, 3976, 3983, 3985, 3987, 3989, 3993, 3995, 4001, 4004, 1960, 4010, 6059, 4013, 4014, 4015, 4017, 4021, 4022, 4024, 4026, 1981, 1982, 4033, 1996, 1999, 2001, 4060, 4068, 2027, 6124]

In [ ]: