That is, the training data set consists of the positive points x1 = (5,4) and x2 = (8,3) and the negative points x3 = (7,2) and x4 = (3,3) (these appear as p1 through p4 in the code below).
Our goal is to find the maximum-margin linear classifier for this data. In easy cases, the shortest line between a positive and negative point has a perpendicular bisector that separates the points. If so, the perpendicular bisector is surely the maximum-margin separator. Alas, in this case, the closest pair of positive and negative points, x2 and x3, has a perpendicular bisector that misclassifies x1 as negative, so that won't work.
The next-best possibility is that we can find a pair of points on one side (i.e., either two positive or two negative points) such that a line parallel to the line through these points is the maximum-margin separator. In these cases, the limit to how far from the two points the parallel line can get is determined by the point on the other side that is closest to the line through the two points. For our simple data set, this situation holds.
Consider all possibilities for boundaries of this type, and express the boundary as w.x+b=0, such that w.x+b≥1 for positive points x and w.x+b≤-1 for negative points x. Assuming that w = (w1,w2), identify in the list below the true statement about one of w1, w2, and b.
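Before checking the parallel-line candidates, here is a quick sketch of my own (not part of the assignment) that verifies the claim above: it builds the perpendicular bisector of x2 = (8,3) and x3 = (7,2) and evaluates x1 = (5,4) against it.

x1, x2, x3 = (5, 4), (8, 3), (7, 2)

# The bisector consists of points equidistant from x2 and x3; taking the normal
# vector w = x2 - x3 and passing through their midpoint gives w.x + b = 0.
w = (x2[0] - x3[0], x2[1] - x3[1])                  # (1, 1)
mid = ((x2[0] + x3[0]) / 2, (x2[1] + x3[1]) / 2)    # (7.5, 2.5)
b = -(w[0] * mid[0] + w[1] * mid[1])                # -10

def side(pt):
    return w[0] * pt[0] + w[1] * pt[1] + b

print(side(x2))  #  1.0: positive side, as expected
print(side(x3))  # -1.0: negative side, as expected
print(side(x1))  # -1.0: x1 falls on the negative side, so it would be misclassified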
In [12]:
# Training points: p1 and p2 are the positive examples, p3 and p4 the negative ones.
p1 = (5, 4)
p2 = (8, 3)
p3 = (7, 2)
p4 = (3, 3)

def calc_wb(p1, p2):
    # Boundary parallel to the line through p1 and p2, scaled so that
    # w.x + b = 1 everywhere on that line (the +1 margin passes through both points).
    dx = p1[0] - p2[0]
    dy = p1[1] - p2[1]
    w = (2.0 * dy / (dy - dx), -2.0 * dx / (dy - dx))
    b = 2.0 * (dx * p2[1] - dy * p2[0]) / (dy - dx) + 1
    return w, b

def calc_margin(w, b, pt):
    # Value of w.x + b at the point pt.
    return w[0] * pt[0] + w[1] * pt[1] + b

w, b = calc_wb(p1, p2)
print("w for p1, p2: " + str(w))
print("b for p1, p2: " + str(b))
print("===========================")
print(calc_margin(w, b, p1))
print(calc_margin(w, b, p2))
print(calc_margin(w, b, p3))
print(calc_margin(w, b, p4))
print()

w, b = calc_wb(p4, p3)
print("w for p4, p3: " + str(w))
print("b for p4, p3: " + str(b))
print("===========================")
print(calc_margin(w, b, p1))
print(calc_margin(w, b, p2))
print(calc_margin(w, b, p3))
print(calc_margin(w, b, p4))
We propose to use the diagonal line with slope +1 and intercept +2 as a decision boundary, with positive examples above and negative examples below. However, like any linear boundary for this training set, some examples are misclassified. We can measure the goodness of the boundary by computing all the slack variables that exceed 0, and then using them in one of several objective functions. In this problem, we shall only concern ourselves with computing the slack variables, not an objective function.
To be specific, suppose the boundary is written in the form w.x+b=0, where w = (-1,1) and b = -2. Note that we can scale the three numbers involved as we wish, and so doing changes the margin around the boundary. However, we want to consider this specific boundary and margin.
Determine the slack for each of the 16 points. Then, identify the correct statement in the list below.
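For concreteness (this is not spelled out above), with w and b held fixed at this scaling, the slack of a point x_i with class label y_i in {+1, -1} is the standard hinge quantity:
$$ \begin{equation} \xi_i = \max\left(0,\ 1 - y_i \left( w \cdot x_i + b \right)\right) \end{equation} $$
A point on the correct side of its margin line has slack 0; a point inside the margin or on the wrong side has positive slack. The cell below only computes w.x+b for a few of the points; each point's slack follows once its class is applied.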
In [17]:
w = (-1, 1)
b = -2

def calc_margin(w, b, pt):
    # Value of w.x + b at the point pt; the slack follows once the point's class is known.
    return w[0] * pt[0] + w[1] * pt[1] + b

print(calc_margin(w, b, (7, 10)))
print(calc_margin(w, b, (7, 8)))
print(calc_margin(w, b, (3, 4)))
print(calc_margin(w, b, (3, 4)))
To be precise, the 20 points represent (Age,Salary) pairs of people who do or do not buy gold jewelry. Age (abbreviated A in the decision tree) is the x-axis, and Salary (S in the tree) is the y-axis. Those that do are represented by gold points, and those that do not by green points. The 10 points of gold-jewelry buyers are:
(28,145), (38,115), (43,83), (50,130), (50,90), (50,60), (50,30), (55,118), (63,88), and (65,140).
The 10 points of those that do not buy gold jewelry are:
(23,40), (25,125), (29,97), (33,22), (35,63), (42,57), (44, 105), (55,63), (55,20), and (64,37).
Some of these points are correctly classified by the decision tree and some are not. Determine the classification of each point, and then indicate in the list below the point that is misclassified.
In [16]:
def predict_by_tree(pt):
    # The decision tree: first split on Age (A < 45), then on Salary
    # (S < 110 on the younger branch, S < 75 on the older branch).
    if pt[0] < 45:          # Age < 45
        if pt[1] < 110:     # Salary < 110
            print("Doesn't buy")
        else:
            print("Buy")
    else:                   # Age >= 45
        if pt[1] < 75:      # Salary < 75
            print("Doesn't buy")
        else:
            print("Buy")

# Spot-check a few of the points against the tree.
predict_by_tree((43, 83))
predict_by_tree((55, 118))
predict_by_tree((65, 140))
predict_by_tree((28, 145))
print("==============")
predict_by_tree((65, 140))
predict_by_tree((25, 125))
predict_by_tree((44, 105))
predict_by_tree((35, 63))
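To sweep all 20 points instead of spot checks, here is a small sketch of my own (not in the original notebook) that re-encodes the same tree as a boolean function and compares each prediction against the labels listed in the problem statement.

# Sketch: run the decision tree over all 20 points and flag mismatches.
buyers = [(28, 145), (38, 115), (43, 83), (50, 130), (50, 90),
          (50, 60), (50, 30), (55, 118), (63, 88), (65, 140)]
non_buyers = [(23, 40), (25, 125), (29, 97), (33, 22), (35, 63),
              (42, 57), (44, 105), (55, 63), (55, 20), (64, 37)]

def tree_says_buy(pt):
    # Same splits as predict_by_tree above: Age < 45, then Salary < 110 or < 75.
    if pt[0] < 45:
        return pt[1] >= 110
    return pt[1] >= 75

for pt in buyers:
    if not tree_says_buy(pt):
        print("buyer misclassified as non-buyer:", pt)
for pt in non_buyers:
    if tree_says_buy(pt):
        print("non-buyer misclassified as buyer:", pt)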
Consider the product of the matrix $M$ and the vector $v$:
$$ \begin{equation} \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix} \times \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} \end{equation} $$
The matrix-vector product is the vector $x$ of length $n$, whose $i$th element $x_i$ is given by
$$ \begin{equation} x_i = \sum_{ j = 1}^n m_{ij} \cdot v_j \end{equation} $$
In [6]:
import numpy as np

mat = np.array([[ 1,  2,  3,  4],
                [ 5,  6,  7,  8],
                [ 9, 10, 11, 12],
                [13, 14, 15, 16]])
vec = np.array([1, 2, 3, 4])

def key_val(mat, vec):
    # For each row i (keys are 1-indexed to match the formula above), collect the
    # element-wise products m_ij * v_j.  Summing them, e.g. with np.dot(row, vec),
    # would give x_i directly.
    pair = dict()
    for idx, row in enumerate(mat):
        pair[idx + 1] = row * vec
    return pair

print(key_val(mat, vec))
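As a follow-up of my own (reusing mat, vec, and key_val from the cell above): summing the values under each key reproduces the ordinary product x = Mv from the formula above.

# Sum the per-key products to obtain each x_i.
pairs = key_val(mat, vec)
x = [int(sum(products)) for key, products in sorted(pairs.items())]
print(x)                 # [30, 70, 110, 150]
print(np.dot(mat, vec))  # the same result computed directly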
Suppose we have the following relations:
      R              S
   A    B         B    C
   0    1         0    1
   1    2         1    2
   2    3         2    3
The map phase of the natural join R(A,B) ⋈ S(B,C) keys each tuple by its B-value, producing the key-value pairs:
(1, (R, 0)), (2, (R, 1)), (3, (R, 2)), (0, (S, 1)), (1, (S, 2)), (2, (S, 3))
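A minimal sketch of that map step (my own illustration; the relation contents are taken from the tables above): each R-tuple (a, b) is emitted as (b, ('R', a)) and each S-tuple (b, c) as (b, ('S', c)); a reduce step would then join the values that share a key.

# Map phase of the MapReduce natural join of R(A, B) and S(B, C).
R = [(0, 1), (1, 2), (2, 3)]   # tuples (a, b)
S = [(0, 1), (1, 2), (2, 3)]   # tuples (b, c)

pairs = []
for a, b in R:
    pairs.append((b, ('R', a)))   # key on the join attribute B
for b, c in S:
    pairs.append((b, ('S', c)))

print(pairs)
# The reduce step would group by key and pair each (R, a) with each (S, c)
# sharing the same B-value.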