Overview

The purpose of this Data Science Experience (DSX) project is to show how recommendations can be generated with Apache Spark and integrated into a web application. The project uses a randomly generated (but biased) dataset of approximately 2,000 movies and 500,000 ratings.

The overall web application architecture can be seen here:

There is a live demo web application available here: https://movie-recommend-demo.mybluemix.net

Below is a screenshot from the demo web application in which the logged-in user has searched for movies with 'harry' in the title and is rating one of the results.

Instructions

The project is split into a number of different notebooks that focus on specific steps.

Step 1 - Exploratory analysis

In this notebook, we perform some basic exploratory analysis of the ratings dataset before we jump into machine learning.
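
As a rough illustration of what such a first look might involve, the sketch below computes a few descriptive statistics in plain Python. The `ratings` tuples and the `summarize` helper are made up for this example; the notebook itself works on the full dataset with Spark and may compute different statistics.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample of (user_id, movie_id, rating) tuples; not the project's data.
ratings = [
    (1, 10, 5), (1, 11, 3), (2, 10, 4),
    (2, 12, 1), (3, 10, 5), (3, 11, 2),
]

def summarize(rows):
    """Return basic descriptive statistics for a list of rating tuples."""
    values = [rating for _, _, rating in rows]
    return {
        "n_ratings": len(rows),
        "n_users": len({user for user, _, _ in rows}),
        "n_movies": len({movie for _, movie, _ in rows}),
        "mean_rating": mean(values),
        # How often each rating value occurs, ordered by rating value.
        "histogram": dict(sorted(Counter(values).items())),
        # (movie_id, number of ratings) for the most frequently rated movie.
        "most_rated": Counter(movie for _, movie, _ in rows).most_common(1)[0],
    }

summary = summarize(ratings)
print(summary)
```

A skewed histogram or a handful of movies that attract most of the ratings is the kind of bias worth knowing about before training a model.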

Step 2 - Train model

Here we use Spark's Machine Learning Library (MLlib) to train a machine learning model on the data.
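
MLlib's collaborative-filtering support is based on alternating least squares (ALS). Assuming the notebook uses it (this overview does not say which algorithm is trained), the idea can be sketched in plain Python with a single latent factor per user and per movie. The data, the `train_rank1_als` and `rmse` helpers, and the parameter values are all illustrative; a real model uses more factors and Spark's distributed implementation, whose regularization details differ.

```python
# Rank-1 alternating least squares (ALS) on a tiny, made-up ratings dict.
# Hypothetical data: (user, movie) -> rating.
ratings = {
    ("u1", "m1"): 5.0, ("u1", "m2"): 3.0,
    ("u2", "m1"): 4.0, ("u2", "m3"): 1.0,
    ("u3", "m2"): 2.0, ("u3", "m3"): 1.0,
}

def train_rank1_als(ratings, iterations=20, reg=0.1):
    """Alternate closed-form updates of one factor per user and per movie."""
    users = {u for u, _ in ratings}
    movies = {m for _, m in ratings}
    user_f = {u: 1.0 for u in users}
    movie_f = {m: 1.0 for m in movies}
    for _ in range(iterations):
        # Hold movie factors fixed; each user's factor is a ridge-regression solution.
        for u in users:
            rated = [(m, r) for (uu, m), r in ratings.items() if uu == u]
            user_f[u] = sum(r * movie_f[m] for m, r in rated) / (
                reg + sum(movie_f[m] ** 2 for m, _ in rated))
        # Hold user factors fixed; update each movie's factor the same way.
        for m in movies:
            raters = [(u, r) for (u, mm), r in ratings.items() if mm == m]
            movie_f[m] = sum(r * user_f[u] for u, r in raters) / (
                reg + sum(user_f[u] ** 2 for u, _ in raters))
    return user_f, movie_f

def rmse(ratings, user_f, movie_f):
    """Root-mean-square error of the factor model on the given ratings."""
    errors = [(r - user_f[u] * movie_f[m]) ** 2 for (u, m), r in ratings.items()]
    return (sum(errors) / len(errors)) ** 0.5

user_f, movie_f = train_rank1_als(ratings)
print(round(rmse(ratings, user_f, movie_f), 3))
```

Each half-step solves a small least-squares problem exactly, which is why this style of training suits a batch job: every pass touches all of the ratings.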

Step 3 - Predict ratings

In this notebook, we simulate a new user's movie ratings and then use those ratings to predict which movies they would rate highly.
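
With a factor-based model, a predicted rating is typically the dot product of a user's factor vector and a movie's factor vector. The sketch below assumes such a model; the titles, factor values, and the `predict` and `recommend` helpers are hypothetical and only show how the highest-scoring unrated movies could be picked out.

```python
# Hypothetical 2-dimensional latent factors; a trained model would supply these.
movie_factors = {
    "Movie A": [1.2, 0.1],
    "Movie B": [0.9, 0.8],
    "Movie C": [0.1, 1.5],
    "Movie D": [1.0, 0.3],
}
user_vector = [3.0, 0.5]     # hypothetical factors for the simulated user
already_rated = {"Movie A"}  # movies the user has rated are not recommended again

def predict(user_vec, movie_vec):
    """Predicted rating: dot product of the user and movie factor vectors."""
    return sum(u * m for u, m in zip(user_vec, movie_vec))

def recommend(user_vec, factors, seen, n=2):
    """Return the n highest-scoring (title, score) pairs among unseen movies."""
    scored = [
        (title, predict(user_vec, vec))
        for title, vec in factors.items()
        if title not in seen
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]

top = recommend(user_vector, movie_factors, already_rated)
print(top)
```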

Step 4 - Realtime Recommendations

The model trained with Apache Spark is designed to be built as a batch process. In this notebook, we investigate how to augment the batch-generated model with ratings from new users so that we can provide recommendations without waiting for the next batch run.
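
One common way to do this with a matrix-factorization model, and not necessarily the approach this notebook takes, is to "fold in" the new user: keep the batch-trained movie factors fixed and solve a small regularized least-squares problem for the new user's factors only. The sketch below assumes two latent factors so the normal equations can be solved by hand; the factor values, the `fold_in_user` helper, and the `reg` value are illustrative.

```python
# Hypothetical movie factors produced by an earlier batch run (2 factors each).
movie_factors = {
    "Movie A": [1.0, 0.0],
    "Movie B": [0.0, 1.0],
    "Movie C": [1.0, 1.0],
}
# Ratings just submitted by a user the batch model has never seen.
new_ratings = {"Movie A": 4.0, "Movie B": 2.0}

def fold_in_user(new_ratings, movie_factors, reg=0.1):
    """Solve (V^T V + reg*I) x = V^T r for the new user's 2 factors."""
    a11 = a22 = reg
    a12 = 0.0
    b1 = b2 = 0.0
    for movie, rating in new_ratings.items():
        v1, v2 = movie_factors[movie]
        a11 += v1 * v1
        a12 += v1 * v2
        a22 += v2 * v2
        b1 += v1 * rating
        b2 += v2 * rating
    # Cramer's rule for the symmetric 2x2 system; reg > 0 keeps det positive.
    det = a11 * a22 - a12 * a12
    return [(b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

user_vector = fold_in_user(new_ratings, movie_factors)
# Score a movie the user has not rated yet, without retraining the model.
score_c = sum(u * m for u, m in zip(user_vector, movie_factors["Movie C"]))
print(user_vector, score_c)
```

Because only the new user's vector is computed, the cost depends on the number of ratings that user has submitted, not on the size of the full dataset.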

Step 5 - Setup Movie Recommendation Web App

If you haven't set up your own instance of the demo web application with Cloudant and Compose Redis, this step walks you through that process.

Step 6 - Install Spark Cloudant

In this notebook, we install the latest Spark Cloudant library so that Cloudant can be used both as the source of rating data and as the destination for the generated recommendations.

Step 7 - Setup Movie Recommendation Web Application

In this notebook, we walk through setting up a web application where users can rate movies and receive recommendations using the Cloudant Datastore Recommender.

Support

If you have any questions about this project, please contact me at chris.snow@uk.ibm.com

Credits

This site was really useful: https://github.com/jadianes/spark-movie-lens


