Quiz - Week 4A

Q1.


  • Here is a table of 1-5 star ratings for five movies (M, N, P, Q, R) by three raters (A, B, C).
       M    N    P    Q    R
  A    1    2    3    4    5
  B    2    3    2    5    3
  C    5    5    5    3    2
  • Normalize the ratings by subtracting the average for each row and then subtracting the average for each column in the resulting table. Then, identify the true statement about the normalized table.

Solution 1.

Step 1. Compute the mean of each row

       M    N    P    Q    R    Mean
  A    1    2    3    4    5    3
  B    2    3    2    5    3    3
  C    5    5    5    3    2    4

Step 2. Subtract each row's mean from that row

       M    N    P    Q    R    Mean
  A    -2   -1   0    1    2    3
  B    -1   0   -1    2    0    3
  C    1    1    1   -1   -2    4

Step 3. Compute the mean of each column

       M    N    P    Q    R    Mean
  A    -2   -1   0    1    2    3
  B    -1   0   -1    2    0    3
  C    1    1    1   -1   -2    4
  Mean -2/3 0    0   2/3   0    

Step 4. Subtract each column's mean from that column

       M    N    P    Q    R    Mean
  A    -4/3 -1   0   1/3    2    3
  B    -1/3 0   -1   4/3    0    3
  C    5/3  1    1  -5/3   -2    4
  Mean -2/3 0    0   2/3   0    

In [1]:
import numpy as np

# Step 1: mean rating of each rater (row).
print(np.mean(np.array([1,2,3,4,5])))  # rater A
print(np.mean(np.array([2,3,2,5,3])))  # rater B
print(np.mean(np.array([5,5,5,3,2])))  # rater C

print("==============")
# Step 4: subtract the column mean from each column of the row-centered table.
a = np.array([-2,-1, 1])  # column M
print(a - np.mean(a))

a = np.array([-1, 0, 1])  # column N
print(a - np.mean(a))

a = np.array([ 0,-1, 1])  # column P
print(a - np.mean(a))

a = np.array([ 1, 2,-1])  # column Q
print(a - np.mean(a))

a = np.array([ 2, 0,-2])  # column R
print(a - np.mean(a))


3.0
3.0
4.0
==============
[-1.33333333 -0.33333333  1.66666667]
[-1.  0.  1.]
[ 0. -1.  1.]
[ 0.33333333  1.33333333 -1.66666667]
[ 2.  0. -2.]
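
As a cross-check on Steps 1-4, the same normalization can be sketched with NumPy broadcasting instead of one array per column. The variable names (`ratings`, `row_centered`, `normalized`) are illustrative and not part of the quiz.

```python
import numpy as np

# Rows are raters A, B, C; columns are movies M, N, P, Q, R.
ratings = np.array([[1, 2, 3, 4, 5],
                    [2, 3, 2, 5, 3],
                    [5, 5, 5, 3, 2]], dtype=float)

# Steps 1-2: subtract each row's mean (keepdims keeps a 3x1 shape for broadcasting).
row_centered = ratings - ratings.mean(axis=1, keepdims=True)

# Steps 3-4: subtract each column's mean of the row-centered table.
normalized = row_centered - row_centered.mean(axis=0, keepdims=True)

print(normalized)
```

Every column of `normalized` sums to zero by construction. The rows still sum to zero as well, because the column means of a row-centered table themselves add up to zero.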

Q2.


  • Below is a table giving the profile of three items.
A   1   0   1   0   1   2
B   1   1   0   0   1   6
C   0   1   0   1   0   2
  • The first five attributes are Boolean, and the last is an integer "rating." Assume that the scale factor for the rating is α. Compute, as a function of α, the cosine distances between each pair of profiles. For each of α = 0, 0.5, 1, and 2, determine the cosine of the angle between each pair of vectors. Which of the following is FALSE?

In [28]:
import numpy as np
from numpy.linalg import norm

alphas = [0, 0.5, 1, 2]

for alpha in alphas:
    print("==========================")
    # Scale only the last (rating) component by alpha.
    A = np.array([1,0,1,0,1,2*alpha], dtype="float32")
    B = np.array([1,1,0,0,1,6*alpha], dtype="float32")
    C = np.array([0,1,0,1,0,2*alpha], dtype="float32")

    print("Current Alpha: " + str(alpha))
    # Cosine of the angle = dot product divided by the product of the L2 norms.
    cos_a_b = float(np.dot(A, B)) / float(norm(A, ord=2) * norm(B, ord=2))
    print("Cosine(A, B) = " + str(cos_a_b))

    cos_a_c = float(np.dot(A, C)) / float(norm(A, ord=2) * norm(C, ord=2))
    print("Cosine(A, C) = " + str(cos_a_c))

    cos_b_c = float(np.dot(B, C)) / float(norm(B, ord=2) * norm(C, ord=2))
    print("Cosine(B, C) = " + str(cos_b_c))


==========================
Current Alpha: 0
Cosine(A, B) = 0.666666666667
Cosine(A, C) = 0.0
Cosine(B, C) = 0.408248315343
==========================
Current Alpha: 0.5
Cosine(A, B) = 0.72168784944
Cosine(A, C) = 0.288675139776
Cosine(B, C) = 0.666666666667
==========================
Current Alpha: 1
Cosine(A, B) = 0.8473185889
Cosine(A, C) = 0.617213414251
Cosine(B, C) = 0.849836556801
==========================
Current Alpha: 2
Cosine(A, B) = 0.946094512584
Cosine(A, C) = 0.865180900773
Cosine(B, C) = 0.952579402468
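
The question also asks for the cosines as a function of α. With the rating component scaled, the profiles are A = [1,0,1,0,1,2α], B = [1,1,0,0,1,6α], and C = [0,1,0,1,0,2α]. The closed forms below are a sketch derived from that table, not taken from an answer key:

$$\cos(A,B) = \frac{2 + 12\alpha^2}{\sqrt{3 + 4\alpha^2}\,\sqrt{3 + 36\alpha^2}}$$

$$\cos(A,C) = \frac{4\alpha^2}{\sqrt{3 + 4\alpha^2}\,\sqrt{2 + 4\alpha^2}}$$

$$\cos(B,C) = \frac{1 + 12\alpha^2}{\sqrt{3 + 36\alpha^2}\,\sqrt{2 + 4\alpha^2}}$$

At α = 1 these evaluate to 14/√273 ≈ 0.847, 4/√42 ≈ 0.617, and 13/√234 ≈ 0.850, which agrees with the printed values for that case.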

Quiz - Week 4B

Q1.


Note: In this question, all columns will be written in their transposed form, as rows, to make the typography simpler. Matrix M has three rows and two columns, and the columns form an orthonormal basis. One of the columns is [2/7,3/7,6/7]. There are many options for the second column [x,y,z]. Write down those constraints on x, y, and z. Then, identify in the list below the one column that could be [x,y,z]. All components are computed to three decimal places, so the constraints may be satisfied only to a close approximation.


In [22]:
import numpy as np

def test_orth(lst, tested_lst):
    """Return (dot product with lst, Euclidean norm) for the candidate tested_lst."""
    col_b = np.array(tested_lst)
    col_a = np.array(lst)
    # A valid second column needs a dot product near 0 and a norm near 1.
    return (np.dot(col_b, col_a), np.linalg.norm(col_b, ord=2))

a = [2/7.0, 3/7.0, 6/7.0]
print(test_orth(a, [-.702,  .117,   .702]))
print(test_orth(a, [-.288, -.490,  .772]))
print(test_orth(a, [.728,   .485, -.485]))
print(test_orth(a, [2.250, -.500, -.750]))


(0.45128571428571423, 0.9996484382021511)
(0.36942857142857144, 0.95865948073338325)
(0.00014285714285711126, 1.0002169764606077)
(-0.2142857142857143, 2.4238399287081647)
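
The two constraints on [x,y,z] are 2x + 3y + 6z = 0 (orthogonal to the first column) and x² + y² + z² = 1 (unit length). A minimal sketch that applies both with an explicit tolerance is shown here. The helper name `could_be_second_column` and the tolerance `tol=1e-2` are assumptions chosen to absorb the three-decimal rounding mentioned in the question.

```python
import numpy as np

def could_be_second_column(first, cand, tol=1e-2):
    """Return True if cand is approximately a unit vector orthogonal to first.

    tol is an assumed tolerance, not a value given by the quiz.
    """
    first = np.asarray(first, dtype=float)
    cand = np.asarray(cand, dtype=float)
    orthogonal = abs(np.dot(first, cand)) < tol          # dot product close to 0
    unit_length = abs(np.linalg.norm(cand) - 1.0) < tol  # norm close to 1
    return bool(orthogonal and unit_length)

first_col = [2/7.0, 3/7.0, 6/7.0]
candidates = [[-.702,  .117,  .702],
              [-.288, -.490,  .772],
              [ .728,  .485, -.485],
              [2.250, -.500, -.750]]

results = [could_be_second_column(first_col, c) for c in candidates]
print(results)
```

Only the third candidate passes both checks, consistent with the (dot product, norm) pairs printed above.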

Q2.


Note: In this question, all columns will be written in their transposed form, as rows, to make the typography simpler. Matrix M has three rows and three columns, and the columns form an orthonormal basis. One of the columns is [2/7,3/7,6/7], and another is [6/7, 2/7, -3/7]. Let the third column be [x,y,z]. Since the length of the vector [x,y,z] must be 1, there is a constraint that $x^2+y^2+z^2$ = 1. However, there are other constraints, and these other constraints can be used to deduce facts about the ratios among x, y, and z. Compute these ratios, and then identify one of them in the list below.

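Orthogonality to the two given columns adds two constraints:

2x + 3y + 6z = 0
6x + 2y - 3z = 0

Adding twice the second equation to the first eliminates z: 14x + 7y = 0, so y = -2x.
Substituting y = -2x into the first equation gives -4x + 6z = 0, so 3z = 2x and z = (2/3)x.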
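
One way to sanity-check these ratios numerically: in three dimensions, a vector orthogonal to both given columns is parallel to their cross product. The sketch below uses `np.cross`; the variable names are illustrative, and the sign of the result is arbitrary since the negated vector also works.

```python
import numpy as np

col1 = np.array([2, 3, 6]) / 7.0
col2 = np.array([6, 2, -3]) / 7.0

# The cross product of two orthonormal vectors is a unit vector orthogonal to both.
third = np.cross(col1, col2)

print(third)
print(third[1] / third[0])  # ratio y/x
print(third[2] / third[0])  # ratio z/x
```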

Q3.


Suppose we have three points in a two dimensional space: (1,1), (2,2), and (3,4). We want to perform PCA on these points, so we construct a 2-by-2 matrix whose eigenvectors are the directions that best represent these three points. Construct this matrix and identify, in the list below, one of its elements.


In [29]:
# Each row of M is one point; M^T M is the 2-by-2 matrix whose eigenvectors are sought.
M = np.array([[1,1],
              [2,2],
              [3,4]])

print(np.dot(M.T, M))


[[14 17]
 [17 21]]
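
To see the directions themselves, the eigenvectors of this matrix can be computed with `np.linalg.eigh`, which is intended for symmetric matrices and returns eigenvalues in ascending order. This sketch follows the uncentered M^T M construction used in the cell above; PCA treatments that subtract the mean first would give a different matrix.

```python
import numpy as np

M = np.array([[1, 1],
              [2, 2],
              [3, 4]], dtype=float)
MtM = np.dot(M.T, M)

# eigh returns ascending eigenvalues; eigenvectors are the columns of the second result.
eigenvalues, eigenvectors = np.linalg.eigh(MtM)

# The principal direction is the eigenvector paired with the largest eigenvalue.
principal = eigenvectors[:, -1]

print(eigenvalues)
print(principal)
```

The eigenvalues should sum to the trace of the matrix (14 + 21 = 35) and multiply to its determinant (14·21 - 17·17 = 5).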

Q4.


Find, in the list below, the vector that is orthogonal to the vector [1,2,3]. Note: the interesting concept regarding eigenvectors is "orthonormal," that is, unit vectors that are orthogonal. However, this question avoids using unit vectors to make the calculations simpler.


In [23]:
def test_orth(lst, tested_lst):
    """Return the dot product of the two vectors; 0 means they are orthogonal."""
    col_b = np.array(tested_lst)
    col_a = np.array(lst)

    return np.dot(col_b, col_a)

a = [1, 2, 3]
print(test_orth(a, [-1, -2,  0]))
print(test_orth(a, [-1, -2, -3]))
print(test_orth(a, [-4,  2, -1]))
print(test_orth(a, [-1, -1,  1]))


print("=================")
print(test_orth(a, [-1,1,-1]))
print(test_orth(a, [2,-3, 1]))
print(test_orth(a, [-4,2,-1]))
print(test_orth(a, [ 3,0,-1]))


-5
-14
-3
0
=================
-2
-1
-3
0
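
The note in the question distinguishes orthogonal from orthonormal. As a small sketch of that step, dividing each vector by its length turns an orthogonal pair into an orthonormal one without changing the zero dot product. The names `u_hat` and `v_hat` are illustrative.

```python
import numpy as np

u = np.array([1, 2, 3], dtype=float)
v = np.array([-1, -1, 1], dtype=float)  # orthogonal to u, per the first list above

# Normalize each vector to unit length.
u_hat = u / np.linalg.norm(u)
v_hat = v / np.linalg.norm(v)

print(np.dot(u_hat, v_hat))   # still (numerically) zero
print(np.linalg.norm(u_hat))  # unit length
print(np.linalg.norm(v_hat))  # unit length
```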
