Clustering Collaborative Filtering Recommender System Using Eculidean Distance

In this notebook we are going to explain how the Recommender system called Clu_CF_Euc that appears in the paper 'Multi-Criteria Service Recommendation Based on User Criteria Preferences' is implemented.

To implement this recommender system we have created a class called CluCFEuc which contains several methods, the most important of all which is predict_rating(user_id, item_id).

In the following sections we are going to describe the most important parts of the CluCFEuc class

Initializer

In the firsts versions of this class the recommendations took a long time to be computed and that because a lot of calculations had to be done each time a rating of an item was to be predicted for a user. To solve this, many of the calculations are done before engaging recommendations and are stored as attributes of the object. The information that is pre-calculated is:

  • Users: A dictionary containing all the users in the train data, where the key is the user ID and the value is an user object. Each user object contains a user ID, the average overall rating that this user has given to hotels, the criteria weights of this user, the cluster this user belongs to, and the overall ratings that this user has given to the items he/she has reviewed.
  • User clusters: A dictionary containing all the clusters, where the key is the name of the cluster and the value is a list of all the IDs of the users that belong to this cluster.
  • User reviews: A dictionary containing all the reviews the users have made, where the key is the user ID and the value is a list of the reviews this user has made.
  • Users' similarities matrix: A dictionary that contains the similaries between each pair of users, where the key is the user ID and the value is another dictionary that contains the numerical values of the similarity between this user and the other ones.

Additionally, the user IDs and the item IDs that are present in the training data are also stored in attributes for easy access.

The following is the code of the initializer method:


In [ ]:
def __init__(self, reviews):
    self.reviews = reviews
    self.user_dictionary = extractor.initialize_users(self.reviews)
    print('Users Initialized')
    self.user_cluster_dictionary = fourcity_clusterer.build_user_clusters(self.reviews)
    print('User cluster built')
    self.user_ids = extractor.get_groupby_list(self.reviews, 'user_id')
    self.item_ids = extractor.get_groupby_list(self.reviews, 'offering_id')
    self.user_reviews_dictionary = fourcity_clusterer.build_user_reviews_dictionary(self.reviews, self.user_ids)
    print('User reviews dictionary built')
    self.user_similarity_matrix = {}
    self.build_user_similarities_matrix()
    print('Users similarities matrix built')
    
def initialize_users(reviews):
    """
    Builds a dictionary containing all the users in the reviews. Each user
    contains information about its average overall rating, the list of reviews
    that user has made, and the cluster the user belongs to

    :param reviews: the list of reviews
    :return: a dictionary with the users initialized, the keys of the
    dictionaries are the users' ID
    """
    user_ids = get_groupby_list(reviews, 'user_id')
    user_dictionary = {}

    for user_id in user_ids:
        user = User(user_id)
        user_reviews = ETLUtils.filter_records(reviews, 'user_id', [user_id])
        user.average_overall_rating = get_user_average_overall_rating(
            user_reviews, user_id, apply_filter=False)
        user.criteria_weights = get_criteria_weights(
            user_reviews, user_id, apply_filter=False)
        _, user.cluster = get_significant_criteria(user.criteria_weights)
        user_dictionary[user_id] = user
        user.item_ratings = get_user_item_ratings(user_reviews)

    print('Total users: %i' % len(user_ids))

    return user_dictionary


def build_user_clusters(reviews):
    """
    Builds a series of clusters for users according to their significant
    criteria. Users that have exactly the same significant criteria will belong
    to the same cluster.

    :param reviews: the list of reviews
    :return: a dictionary where all the keys are the cluster names and the
    values for those keys are list of users that belong to that cluster
    """

    user_list = extractor.get_groupby_list(reviews, 'user_id')
    user_cluster_dictionary = {}

    for user in user_list:
        weights = extractor.get_criteria_weights(reviews, user)
        significant_criteria, cluster_name = extractor.get_significant_criteria(weights)

        if cluster_name in user_cluster_dictionary:
            user_cluster_dictionary[cluster_name].append(user)
        else:
            user_cluster_dictionary[cluster_name] = [user]

    return user_cluster_dictionary

def build_user_reviews_dictionary(reviews, users):
    """
    Builds a dictionary that contains all the reviews the users have made where
    the key is the user ID and the value is a list of the reviews this user has
    made.

    :param reviews: a list of reviews
    :param users: the list of users to be considered
    :return: a dictionary that contains all the reviews the users have made
    where the key is the user ID and the value is a list of the reviews this
    user has made.
    """
    user_reviews_dictionary = {}

    for user in users:
        user_reviews_dictionary[user] =\
            ETLUtils.filter_records(reviews, 'user_id', user)

    return user_reviews_dictionary

def calculate_users_similarity(self, user_id1, user_id2):
    """
    Calculates the similarity between two users based on how similar are their
    ratings in the reviews

    :param user_id1: the ID of user 1
    :param user_id2: the ID of user 2
    :return: a float with the similarity between the two users. Since this
    function is based on euclidean distance to calculate the similarity, a
    similarity of 0 indicates that the users share exactly the same tastes
    """
    user_weights1 = self.user_dictionary[user_id1].criteria_weights
    user_weights2 = self.user_dictionary[user_id2].criteria_weights

    return fourcity_clusterer.calculate_euclidean_distance(user_weights1, user_weights2)

def build_user_similarities_matrix(self):
    """
    Builds a matrix that contains the similarity between every pair of users
    in the dataset of this recommender system. This is particularly useful
    to prevent repeating the same calculations in each cycle

    """
    for user1 in self.user_ids:
        self.user_similarity_matrix[user1] = {}
        for user2 in self.user_ids:
            self.user_similarity_matrix[user1][user2] = self.calculate_users_similarity(user1, user2)

Predict user rating

With all the information needed stored in the object's attributes, we can now proceed to predict the rating a user will give to a certain item. This is done in the predict_rating(user_id, item_id)


In [ ]:
def predict_rating(self, user_id, item_id):
    """
    Predicts the rating the user will give to the hotel

    :param user_id: the ID of the user
    :param item_id: the ID of the hotel
    :return: a float between 1 and 5 with the predicted rating
    """
    if user_id not in self.user_reviews_dictionary:
        return None

    cluster_name = self.user_dictionary[user_id].cluster

    # We remove the given user from the cluster in order to avoid bias
    cluster_users = list(self.user_cluster_dictionary[cluster_name])
    cluster_users.remove(user_id)

    similarities_sum = 0.
    similarities_ratings_sum = 0.
    
    for cluster_user in cluster_users:
        cluster_user_overall_rating = self.user_dictionary[cluster_user].average_overall_rating
        users_similarity = self.user_similarity_matrix[cluster_user][user_id]

        if item_id in self.user_dictionary[cluster_user].item_ratings:
            cluster_user_item_rating = self.user_dictionary[cluster_user].item_ratings[item_id]
            similarities_sum += users_similarity
            similarities_ratings_sum += users_similarity * (cluster_user_item_rating - cluster_user_overall_rating)
            
    predicted_rating = None

    if similarities_sum > 0:
        user_average_rating = self.user_dictionary[user_id].average_overall_rating
        predicted_rating = \
            user_average_rating + similarities_ratings_sum / similarities_sum

        # We cut the rating to the limits of the range
        if predicted_rating > 5:
            predicted_rating = 5
        elif predicted_rating < 1:
            predicted_rating = 1

    return predicted_rating

The above method will return a value in the range [1, 5] with the predicted value. We can now proceed to perform an evaluation of the recommender system performace.

Evaluation

To evaluate this recommender system we use two kinds of metrics: accuracy metrics and relevance metrics.

Accuracy metrics

The metrics used to evaluate the accuracy of this algorithm are the mean absolute error (MAE) and the root mean square error (RMSE)


In [ ]:
def predict_ratings_list(self, reviews):
    """
    For each one of the reviews this method predicts the rating for the
    user and item contained in the review and also returns the error
    between the predicted rating and the actual rating the user gave to the
    item

    :param reviews: a list of reviews (the test data)
    :return: a tuple with a list of the predicted ratings and the list of
    errors for those predictions
    """
    predicted_ratings = []
    errors = []

    index = 0
    print('CluCFEuc')
    print('Total reviews: %i' % len(self.reviews))

    for review in reviews:
        # print('Index: %i' % index)
        index += 1

        user_id = review['user_id']
        hotel_id = review['offering_id']
        predicted_rating = self.predict_rating(user_id, hotel_id)
        actual_rating = None
        if hotel_id in self.user_dictionary[user_id].item_ratings:
            actual_rating = self.user_dictionary[user_id].item_ratings[hotel_id]

        error = None

        if predicted_rating is not None and actual_rating is not None:
            error = abs(predicted_rating - actual_rating)

        predicted_ratings.append(predicted_rating)
        errors.append(error)

    return predicted_ratings, errors

The above method takes a list of reviews (test dataset) and predicts the rating a user would give to the item that appears in each review. It returns a pair composed of a list of all predicted ratings and a list of the errors of the predicted ratings. With this information now we can go on and proceed to calculate the MAE and RMSE.

We perform a K-Fold evaluation with the dataset splitted into 80-20 and take the average of the MAE and RMSE in 5 cycles, as shown below:


In [ ]:
def perform_clu_overall_cross_validation():

    # reviews = extractor.pre_process_reviews()
    reviews = extractor.load_json_file('/Users/fpena/tmp/filtered_reviews.json')

    for i in xrange(0, 5):
        train, test = ETLUtils.split_train_test(reviews)
        user_cluster_dictionary = fourcity_clusterer.build_user_clusters(train)
        _, errors = fourcity_clusterer.clu_overall_list(test, user_cluster_dictionary)
        mean_absolute_error = calculate_mean_average_error(errors)
        print('Mean Absolute error: %f' % mean_absolute_error)
        root_mean_square_error = calculate_root_mean_square_error(errors)
        print('Root mean square error: %f' % root_mean_square_error)

Relevance metrics

To evaluate how good is the relevance of the predicted items we use the precision and recall metrics, which are coded as follows:


In [ ]:
def calculate_user_recall(self, user, n):
    """
    Calculates the recall of this recommender system for a given user.
    The recall is defined as the number of correct hits divided by the
    total number of items that this user likes

    :param user: the user ID
    :param n: the number of items to be displayed to the user
    :return: the recall of this recommender system
    """
    num_hits = 0
    favorite_items = extractor.get_five_star_hotels_from_user(self.user_reviews_dictionary[user], 4.5)

    if not favorite_items:
        return None

    items = self.item_ids + favorite_items
    length = n + len(favorite_items)
    predicted_ratings = self.predict_user_ratings(user, items)
    sorted_ratings = sorted(predicted_ratings.iteritems(), key=operator.itemgetter(1))
    sorted_ratings.reverse()
    sorted_ratings = sorted_ratings[:length]

    for item, rating in sorted_ratings:
        if item in favorite_items:
            print('Item: %s\t Rating: %f' % (item, rating))
            num_hits += 1

    recall = float(num_hits) / float(len(favorite_items))
    return recall

def calculate_recall(self, users, n):
    """
    Calculates the recall of this recommender system for a list of users.
    The recall is defined as the number of correct hits divided by the
    total number of items that this user likes. This method returns the
    average of the recalls for each user

    :param users: a list with the IDs of the users
    :param n: the number of items to be displayed to the user
    :return: the recall of this recommender system
    """

    total_recall = 0
    num_cycles = 0
    index = 0

    for user in users:
        print('Index %i' % index)
        index += 1
        recall = self.calculate_user_recall(user, n)
        if recall is not None:
            total_recall += recall
            num_cycles += 1
            print('Recall: %f' % recall)

    average_recall = total_recall / float(num_cycles)
    print('Average recall: %f' % average_recall)

    return average_recall


def perform_clu_cf_euc_top_n_validation():
    reviews = extractor.load_json_file('/Users/fpena/tmp/filtered_reviews.json')
    clusterer = CluCFEuc(reviews)
    users = extractor.get_groupby_list(reviews, 'user_id')
    clusterer.calculate_recall(users, 10)

Results

Accuracy Metrics

Range open (one cluster)

Mean absolute error: 0.643860
Root mean square error: 0.836126
--- 329.206673861 seconds ---

Range [0.8, 1.2] or [-1.2, -0.8]

Mean Absolute error: 0.725547
Root mean square error: 0.978313
--- 330.151209831 seconds ---

Range [0.7, 1.3] or [-1.3, -0.7]

Mean absolute error: 0.703855
Root mean square error: 0.933754
--- 326.090766191 seconds ---

Range [0.5, 1.5] or [-1.5, -0.5]

Mean absolute error: 0.721087
Root mean square error: 0.962147
--- 319.189477921 seconds ---

Range [0.1, 1.9] or [-1.9, -0.1]

Mean Absolute error: 0.631229
Root mean square error: 0.839463
--- 230.939920902 seconds ---

Dummy predictor (predicts all ratings to be 3.9)

Final mean absolute error: 0.743690
Final root mean square error: 0.930624
--- 285.975922108 seconds ---

Cosine similarity

Final mean absolute error: 0.650432
Final root mean square error: 0.826167
--- 463.971400023 seconds ---

Relevance Metrics

n=10

Without trim (open range)
Average recall: 0.016286
Average precision: 0.0016286
--- 1193.08387589 seconds —

With trim (range [1,5])
Average recall: 0.015881
Average precision: 0.0015881
--- 1346.18048596 seconds —

n=500

With trim (open range) Average recall: 0.495125
Average precision: 0.00099025
--- 1083.18167305 seconds —

Without trim (range [1,5])
Average recall: 0.495721
Average precision: 0.000991442
--- 1097.08114195 seconds —

Things to do

  • Compare to other recommnder systems. I already asked Marius for some code to compare with, I still have to fit the data into it. I also read that there are several recommender systems inside the Mahout system, I still have to explore this deeply.