In [3]:
import numpy as np
import math
In [19]:
## Q2 Solution.
def hash(x):
return math.fmod(3 * x + 2, 11)
for i in xrange(1,12):
print hash(i)
This question involves three different Bloom-filter-like scenarios. Each scenario involves setting to 1 certain bits of a 10-bit array, each bit of which is initially 0.
Scenario A:
Scenario B:
Scenario C:
Let a, b, and c be the expected number of bits set to 1 under scenarios A, B, and C, respectively. Which of the following correctly describes the relationships among a, b, and c?
In [14]:
## Q3 Solution.
prob = 1.0 / 10
a = (1 - prob)**4
print a
b = (1 - ( 1 - (1 - prob)**2) )**2
print b
c = (1 - (1.0 /10 * 1.0 / 9))
print c
support = 2 => {2,4,6,8, ...} & {3, 6, 9,...}
In [18]:
## Q5 Solution.
vec1 = np.array([2, 1, 1])
vec2 = np.array([10, -7, 1])
print vec1.dot(vec2) / (np.linalg.norm(vec1) * np.linalg.norm(vec2))
In [21]:
## Q6 Solution.
# probability that they agree at one particular band
p1 = 0.6**2
print (1 - p1)**3
In [26]:
## Q7 Solution.
p1 = 1 - (1 - .9)**3
p2 = 1 - (1 - .1)**3
print "new LSH is (.4, .6, {}, {})-sensitive family".format(p1, p2)
Suppose the Web consists of four pages A, B, C, and D, that form a chain A-->B-->C-->D
We wish to compute the PageRank of each of these pages, but since D is a "dead end," we will "teleport" from D with probability 1 to one of the four pages, each with equal probability. We do not teleport from pages A, B, or C. Assuming the sum of the PageRanks of the four pages is 1, what is the PageRank of page B, correct to two decimal places?
In [41]:
## Q9 Solution.
M = np.array([[0, 0, 0, .25],
[1, 0, 0, .25],
[0, 1, 0, .25],
[0, 0, 1, .25]])
r = np.array([.25, .25, .25, .25])
for i in xrange(30):
r = M.dot(r)
print r
In [44]:
## Q10 Solution.
print 1 - (1 - .3)*(1 - .4)
In [48]:
##Q12
L = np.array([[-.25, -.5, -.76, -.29, -.03, -.07, -.01],
[-.05, -.1, -.15, .20, .26, .51, .77 ]]).T
print L
V = np.array([[6.74, 0],[0, 5.44]])
print V
R = np.array([[-.57, -.11, -.57, -.11, -.57],
[-.09, 0.70, -.09, .7, -.09]])
print R
print L.dot(V).dot(R)
Recall that the power iteration does r=X·r until converging, where X is a nxn matrix and n is the number of nodes in the graph.
Using the power iteration notation above, what is matrix X value when solving topic sensitive Pagerank with teleport set {0,1} for the following graph? Use beta=0.8. (Recall that the teleport set contains the destination nodes used when teleporting).
In [53]:
X = 0.8 * np.array([[1.0/3, 0, 0],
[1.0/3, 0, 0],
[1.0/3, 1, 0]])
X += 0.2 * np.array([[.5, .5, .5],
[.5, .5, .5],
[ 0, 0, 0]])
print X