Recommender System / Dimensionality Reduction

Question 1

Here is a table of 1-5 star ratings for five movies (M, N, P, Q, R) by three raters (A, B, C).

  M N P Q R
A 1 2 3 4 5
B 2 3 2 5 3
C 5 5 5 3 2

Normalize the ratings by subtracting the average for each row and then subtracting the average for each column in the resulting table.


In [1]:
import numpy as np

In [2]:
# Raw ratings: one row per rater (A, B, C), one column per movie (M, N, P, Q, R)
rawRating = np.array([[1,2,3,4,5],
                      [2,3,2,5,3],
                      [5,5,5,3,2]])

# Each rater's mean rating (A: 3, B: 3, C: 4), repeated across that rater's row
userAvg = np.array([[3,3,3,3,3],
                    [3,3,3,3,3],
                    [4,4,4,4,4]])

In [3]:
# Row-normalized ratings: each rater's mean subtracted from that rater's row
normalRating = rawRating - userAvg
print("Normalized user ratings: ")
print(normalRating)


Normalized user ratings: 
[[-2 -1  0  1  2]
 [-1  0 -1  2  0]
 [ 1  1  1 -1 -2]]

In [4]:
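# Column (movie) means of the row-normalized table: M: -2/3, N: 0, P: 0, Q: 2/3, R: 0
movieAvg = np.array([[-2.0/3, 0, 0, 2.0/3, 0],
                     [-2.0/3, 0, 0, 2.0/3, 0],
                     [-2.0/3, 0, 0, 2.0/3, 0]])

# Subtract the column means to finish the normalization
normalizedScore = normalRating - movieAvg
print("Normalized movie scores: ")
print(normalizedScore)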


Normalized movie scores: 
[[-1.33333333 -1.          0.          0.33333333  2.        ]
 [-0.33333333  0.         -1.          1.33333333  0.        ]
 [ 1.66666667  1.          1.         -1.66666667 -2.        ]]
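
The averages in the cells above are typed in by hand. As a cross-check, the same two-step normalization can be sketched with NumPy computing the means itself. The names `rowCentered` and `normalized` are illustrative and do not appear in the notebook.

```python
import numpy as np

rawRating = np.array([[1, 2, 3, 4, 5],
                      [2, 3, 2, 5, 3],
                      [5, 5, 5, 3, 2]], dtype=float)

# Step 1: subtract each rater's (row) mean.
# keepdims=True keeps shape (3, 1) so the means broadcast across each row.
rowCentered = rawRating - rawRating.mean(axis=1, keepdims=True)

# Step 2: subtract each movie's (column) mean of the row-centered table.
normalized = rowCentered - rowCentered.mean(axis=0, keepdims=True)

print(normalized)
```

After both steps every row and every column of the table averages to zero, because the column means removed in step 2 themselves sum to zero.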

Question 2

This is a table giving the profile of three items:

A 1 0 1 0 1 2
B 1 1 0 0 1 6
C 0 1 0 1 0 2

The first five attributes are Boolean, and the last is an integer "rating." Assume that the scale factor for the rating is α. Compute, as a function of α, the cosine distances between each pair of profiles. For each of α = 0, 0.5, 1, and 2, determine the cosine of the angle between each pair of vectors.


In [5]:
import numpy as np
import math

In [6]:
def vectorLength(v):
    # Euclidean (L2) length of v
    return math.sqrt(sum([vi**2 for vi in v]))

def cosineSimilarity(v1,v2):
    # Cosine of the angle between v1 and v2
    return np.dot(v1,v2)/(vectorLength(v1)*vectorLength(v2))

def scaleRating(v, alpha):
    # Copy of v with its last component (the rating) multiplied by alpha
    va = list(v)
    va[-1] *= alpha
    return va

In [7]:
A = [1,0,1,0,1,2]
B = [1,1,0,0,1,6]
C = [0,1,0,1,0,2]

alphaVec = [0,0.5,1,2]

for alpha in alphaVec:
    # Profiles with the rating attribute scaled by alpha
    sA = scaleRating(A, alpha)
    sB = scaleRating(B, alpha)
    sC = scaleRating(C, alpha)

    cosAB = cosineSimilarity(sA,sB)
    cosAC = cosineSimilarity(sA,sC)
    cosBC = cosineSimilarity(sB,sC)

    print("Alpha: %s" % alpha)
    print("cos(A,B): %s" % (cosAB))
    print("cos(A,C): %s" % (cosAC))
    print("cos(B,C): %s" % (cosBC))
    print("")


Alpha: 0
cos(A,B): 0.666666666667
cos(A,C): 0.0
cos(B,C): 0.408248290464

Alpha: 0.5
cos(A,B): 0.721687836487
cos(A,C): 0.288675134595
cos(B,C): 0.666666666667

Alpha: 1
cos(A,B): 0.847318545736
cos(A,C): 0.617213399848
cos(B,C): 0.849836585599

Alpha: 2
cos(A,B): 0.946094540761
cos(A,C): 0.865180912697
cos(B,C): 0.952579344416
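
The question also asks for the cosines as a function of α, which the cells above only evaluate numerically. A sketch of the closed forms: with the rating scaled, the profiles are A = [1,0,1,0,1,2α], B = [1,1,0,0,1,6α] and C = [0,1,0,1,0,2α], so the dot products are A·B = 2 + 12α², A·C = 4α², B·C = 1 + 12α², and the squared lengths are |A|² = 3 + 4α², |B|² = 3 + 36α², |C|² = 2 + 4α².

```latex
\cos(A,B) = \frac{2 + 12\alpha^2}{\sqrt{3 + 4\alpha^2}\,\sqrt{3 + 36\alpha^2}}, \qquad
\cos(A,C) = \frac{4\alpha^2}{\sqrt{3 + 4\alpha^2}\,\sqrt{2 + 4\alpha^2}}, \qquad
\cos(B,C) = \frac{1 + 12\alpha^2}{\sqrt{3 + 36\alpha^2}\,\sqrt{2 + 4\alpha^2}}
```

For example, at α = 1 the first expression gives 14/√(7·39) ≈ 0.8473, which agrees with the printed value of cos(A,B).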

Question 3

In this question, all columns will be written in their transposed form, as rows, to make the typography simpler. Matrix M has three rows and two columns, and the columns form an orthonormal basis. One of the columns is [2/7,3/7,6/7]. There are many options for the second column [x,y,z]. Write down the constraints that x, y, and z must satisfy. Then, identify in the list below the one column that could be [x,y,z]. All components are computed to three decimal places, so the constraints may be satisfied only to a close approximation.


In [8]:
import numpy as np
from numpy import dot
from numpy.linalg import norm

base = [2./7,3./7,6./7]

Matrix = np.array([[0.702, -0.702, 0.117],
                   [-0.548, 0.401, 0.273],
                   [-0.288, -0.490, 0.772],
                   [0.975, 0.700, -0.675]])

In [9]:
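# For each candidate: its dot product with the given column (should be ~0)
# and its length (should be ~1)
for M in Matrix:
    print("%s %s %s" % (M, dot(base, M), norm(M)))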


[ 0.702 -0.702  0.117] 0.0 0.999648438202
[-0.548  0.401  0.273] 0.249285714286 0.731870207072
[-0.288 -0.49   0.772] 0.369428571429 0.958659480733
[ 0.975  0.7   -0.675] -1.11022302463e-16 1.37704393539
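
The two constraints are orthogonality to the given column, 2x + 3y + 6z = 0, and unit length, x² + y² + z² = 1. A minimal sketch that applies both to the candidates follows. The tolerance `tol` and the names `candidates` and `matchIdx` are assumptions chosen for illustration; the tolerance only needs to be loose enough to absorb the three-decimal rounding.

```python
import numpy as np

base = np.array([2.0, 3.0, 6.0]) / 7.0
candidates = np.array([[0.702, -0.702, 0.117],
                       [-0.548, 0.401, 0.273],
                       [-0.288, -0.490, 0.772],
                       [0.975, 0.700, -0.675]])

tol = 5e-3  # assumed tolerance for components rounded to three decimals

# Keep the indices of candidates that are both orthogonal to base and of unit length
matchIdx = [i for i, v in enumerate(candidates)
            if abs(np.dot(base, v)) < tol and abs(np.linalg.norm(v) - 1.0) < tol]

print(matchIdx)
```

Only the first candidate, [0.702, -0.702, 0.117], passes both checks. The last candidate is orthogonal to the given column but has length of about 1.377, so it fails the unit-length constraint.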

Question 4

Suppose we have three points in a two dimensional space: (1,1), (2,2), and (3,4). We want to perform PCA on these points, so we construct a 2-by-2 matrix whose eigenvectors are the directions that best represent these three points. Construct this matrix.


In [10]:
import numpy as np

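# One point per row
M = np.array([[1,1], [2,2], [3,4]])
MT = np.transpose(M)

# M^T M is the 2-by-2 matrix whose eigenvectors give the principal directions
MTM = np.dot(MT, M)

print("M: ")
print(M)
print("")

print("MT: ")
print(MT)
print("")

print("MTM: ")
print(MTM)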


M: 
[[1 1]
 [2 2]
 [3 4]]

MT: 
[[1 2 3]
 [1 2 4]]

MTM: 
[[14 17]
 [17 21]]
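
To show how this matrix is used, the sketch below takes its eigenvectors with `np.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order. As in the cell above, the points are not mean-centered; that follows the way the question constructs the matrix and is not a general PCA recipe. The name `principalAxis` is illustrative.

```python
import numpy as np

M = np.array([[1, 1], [2, 2], [3, 4]], dtype=float)
MTM = M.T.dot(M)

# eigh returns eigenvalues in ascending order and unit eigenvectors as columns
eigenvalues, eigenvectors = np.linalg.eigh(MTM)

# The eigenvector for the largest eigenvalue is the first principal direction
principalAxis = eigenvectors[:, -1]

print(eigenvalues)
print(principalAxis)
```

The two eigenvalues sum to the trace of the matrix (35) and multiply to its determinant (5), which gives a quick sanity check on the result.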

Question 5

Identify the vector that is orthogonal to the vector [1,2,3].


In [11]:
import numpy as np

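# Candidate vectors, one per row
M = np.array([[0, 2, -1],
              [-3, 4, -2],
              [-1, -1, 1],
              [-4, 2, -1]])

# A 1-D array needs no transpose for the matrix-vector product
Y = np.array([1, 2, 3])

# Dot product of each candidate with [1,2,3]; a zero marks an orthogonal vector
A = np.dot(M, Y)

print(A)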


[ 1 -1  0 -3]
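
Two vectors are orthogonal exactly when their dot product is zero, so the answer is the candidate whose entry in the output is 0. A short sketch that picks it out directly follows; the names `candidates`, `target` and `orthogonal` are illustrative.

```python
import numpy as np

candidates = np.array([[0, 2, -1],
                       [-3, 4, -2],
                       [-1, -1, 1],
                       [-4, 2, -1]])
target = np.array([1, 2, 3])

# Integer arithmetic, so an exact comparison with zero is safe here
dots = candidates.dot(target)
orthogonal = candidates[dots == 0]

print(orthogonal)
```

The only row selected is [-1, -1, 1], since -1·1 + -1·2 + 1·3 = 0.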

Question 6

Consider the diagonal matrix M =

1 0 0
0 2 0
0 0 0

Compute its Moore-Penrose pseudoinverse.


In [12]:
import numpy as np
M = np.array([[1,0,0],[0,2,0],[0,0,0]])
print(M)


[[1 0 0]
 [0 2 0]
 [0 0 0]]

In [13]:
# Moore-Penrose pseudoinverse
MInv = np.linalg.pinv(M)
print(MInv)


[[ 1.   0.   0. ]
 [ 0.   0.5  0. ]
 [ 0.   0.   0. ]]
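
For a diagonal matrix, the pseudoinverse can also be found by hand: replace each nonzero diagonal entry with its reciprocal and leave the zeros in place. The sketch below does this and compares the result with `np.linalg.pinv`; the names `d`, `dInv` and `MPlus` are illustrative.

```python
import numpy as np

M = np.array([[1, 0, 0],
              [0, 2, 0],
              [0, 0, 0]], dtype=float)

d = np.diag(M)            # diagonal entries: [1, 2, 0]
dInv = np.zeros_like(d)   # zeros stay zero in the pseudoinverse
nonzero = d != 0
dInv[nonzero] = 1.0 / d[nonzero]

MPlus = np.diag(dInv)

print(MPlus)
print(np.allclose(MPlus, np.linalg.pinv(M)))
```

The result also satisfies the defining identities M·M⁺·M = M and M⁺·M·M⁺ = M⁺.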
