Testing the Recommender System on Ruihai's TripAdvisor Dataset

The original Ruihai TripAdvisor dataset contains:

  • 167430 reviews
  • Made by 150962 users
  • About 2312 items
  • It has an approximate sparsity of 0.999520291066
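
The sparsity figure is consistent with the usual definition: one minus the fraction of user-item pairs that have a review. A minimal sketch, assuming that definition (the function name `sparsity` is illustrative and not taken from the project's code):

```python
def sparsity(num_reviews, num_users, num_items):
    """Fraction of the user-item matrix that has no review.

    Assumes each review fills exactly one distinct (user, item) cell.
    """
    return 1.0 - num_reviews / (num_users * num_items)


# Counts taken from the list above.
print(sparsity(167430, 150962, 2312))
```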

Since the sparsity is very high, we filter the dataset to include only the reviews of users who have written five or more reviews. That leaves us with a dataset of:

  • 1323 reviews
  • Made by 220 users
  • About 860 items
  • It has an approximate sparsity of 0.993007399577
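
The filtering step can be sketched as follows. This is only an illustration under stated assumptions: reviews are held as a list of dicts, and the field names `user_id` and `item_id` are hypothetical, not the dataset's actual schema.

```python
from collections import Counter


def filter_by_user_activity(reviews, min_reviews=5):
    """Keep only reviews written by users with at least `min_reviews` reviews.

    `reviews` is assumed to be a list of dicts with a 'user_id' key
    (hypothetical field name).
    """
    counts = Counter(review["user_id"] for review in reviews)
    return [r for r in reviews if counts[r["user_id"]] >= min_reviews]


# Toy data: user 'a' has five reviews, user 'b' has two.
toy = [{"user_id": "a", "item_id": i} for i in range(5)]
toy += [{"user_id": "b", "item_id": i} for i in range(2)]
filtered = filter_by_user_activity(toy)
print(len(filtered))  # prints 5
```

Note that the item count drops as well, because items reviewed only by the removed users disappear from the dataset.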

The sparsity is still very high, but we will work with this filtered dataset. We did not raise the filtering threshold further because the number of remaining reviews would have been too low for our recommendation purposes.

A detailed explanation of the filtered dataset can be found here

Since the number of reviews is so low (1323), the results of the recommender vary considerably every time the data is shuffled. For that reason, the results presented below are averages over 10 runs.
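
A minimal sketch of such an averaging loop is shown here. The `evaluate` callback is hypothetical: it is assumed to take a shuffled copy of the reviews, perform its own train/test split, and return a dict mapping metric names to floats. The fixed seed is also an assumption, added only to make the runs reproducible.

```python
import random


def average_over_runs(reviews, evaluate, num_runs=10, seed=0):
    """Average the metrics returned by `evaluate` over several shuffles.

    `evaluate` is a hypothetical callback: it receives a shuffled copy of
    the reviews and returns a dict that maps metric names to floats.
    """
    rng = random.Random(seed)
    totals = {}
    for _ in range(num_runs):
        shuffled = list(reviews)  # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        for name, value in evaluate(shuffled).items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: total / num_runs for name, total in totals.items()}
```

Each run sees a different ordering of the same reviews, so the averaged metrics are less sensitive to one lucky or unlucky split.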

Adjusted Weighted Sum Recommender

  • MAE: 0.7973289
  • RMSE: 1.0342732
  • Coverage: 0.0914644

Average Recommender

  • MAE: 0.8257624
  • RMSE: 1.0267473
  • Coverage: 1.0

Dummy

  • MAE: 0.7558525
  • RMSE: 1.0537799
  • Coverage: 1.0

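As we can see, the Dummy recommender achieves a lower MAE than the Adjusted Weighted Sum Recommender (0.7559 vs. 0.7973), although its RMSE is slightly higher (1.0538 vs. 1.0343), and the Average Recommender has the lowest RMSE of the three (1.0267). The weak showing of the Adjusted Weighted Sum Recommender may be explained by the fact that the dataset is very sparse. Its Coverage is also very low (about 0.09), whereas the other two recommenders reach a Coverage of 1.0.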
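
The three metrics reported above could be computed along these lines. This is a sketch under assumptions rather than the project's actual evaluation code: a missing prediction is represented by `None`, the errors are computed only over the cases that received a prediction, and Coverage is taken to be the fraction of test cases for which a prediction was produced.

```python
import math


def evaluate_predictions(actual, predicted):
    """Return MAE, RMSE and coverage for paired ratings.

    `predicted` holds None where the recommender could not produce a
    rating. Assumption: errors are computed only over the covered cases,
    and coverage is the fraction of cases with a prediction.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    coverage = len(pairs) / len(actual) if actual else 0.0
    if not pairs:
        return {"mae": float("nan"), "rmse": float("nan"), "coverage": coverage}
    errors = [p - a for a, p in pairs]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse, "coverage": coverage}


# Toy example: four test ratings, one of which has no prediction.
print(evaluate_predictions([5, 3, 4, 2], [4, 3, None, 4]))
```

MAE weighs all errors linearly while RMSE squares them, so a recommender with a few large errors can lead on MAE and still trail on RMSE.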