Support Vector Machines (SVM)

I ran this at the command prompt:

THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=1,allow_gc=False' jupyter notebook

In [1]:
%matplotlib inline

In [2]:
import theano


WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10).  Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
 https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: GeForce GTX 980 Ti (CNMeM is enabled with initial size: 70.0% of memory, cuDNN 5105)

In [3]:
from theano import function, config, sandbox, shared 
import theano.tensor as T

In [4]:
print( theano.config.device )
print( theano.config.lib.cnmem)  # cf. http://deeplearning.net/software/theano/library/config.html
print( theano.config.print_active_device)  # print the active device when the GPU device is initialized


gpu
0.7
True

In [5]:
print(theano.config.allow_gc)
print(theano.config.optimizer_excluding)


False


In [6]:
import numpy as np
import scipy

In [7]:
import sys
sys.path.append( './ML' )

In [8]:
from SVM import SVM, SVM_serial, SVM_parallel

In [9]:
import pandas as pd

Dataset examples

from scikit-learn (sklearn)


In [10]:
X = np.random.randn(300,2)
y = np.logical_xor(X[:,0] > 0, X[:,1] > 0)

In [10]:
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

Load and prepare the data set

dataset for the grid search


In [11]:
iris = load_iris()
X = iris.data
y = iris.target

In [12]:
# Dataset for decision function visualization: we only keep the first two
# features in X and sub-sample the dataset to keep only 2 classes and 
# make it a binary classification problem

X_2d = X[:,:2]
X_2d=X_2d[y>0]
y_2d=y[y>0]
y_2d -= 1

In [13]:
# It is usually a good idea to scale the data for SVM training.
# We are cheating a bit in this example in scaling all of the data,
# instead of fitting the transformation on the training set and 
# just applying it on the test set.  

scaler = StandardScaler()  
X= scaler.fit_transform(X)
X_2d=scaler.fit_transform(X_2d)
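
For completeness, the non-cheating pattern fits the scaler on the training portion only and reuses that fitted transformation on held-out data (a minimal sketch; X_train and X_test are hypothetical placeholders for an existing split):

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit the mean/std statistics on training data only
X_test_scaled  = scaler.transform(X_test)       # apply the same transformation to the test data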

In [14]:
print(type(X)); print(X.shape); print(type(X_2d));print(X_2d.shape);print(type(y));print(y.shape);
print(type(y_2d));print(y_2d.shape)


<type 'numpy.ndarray'>
(150, 4)
<type 'numpy.ndarray'>
(100, 2)
<type 'numpy.ndarray'>
(150,)
<type 'numpy.ndarray'>
(100,)

In [15]:
ratio_of_train_to_total = 0.6
numberofexamples = len(y_2d)
numberoftrainingexamples = int(numberofexamples*ratio_of_train_to_total)
numbertovalidate = (numberofexamples - numberoftrainingexamples)/2
numbertotest= numberofexamples - numberoftrainingexamples - numbertovalidate
print(numberofexamples);print(numbertotest);print(numberoftrainingexamples);print(numbertovalidate)


100
20
60
20

In [16]:
shuffledindices = np.random.permutation( numberofexamples)

In [17]:
# note: shuffledindices (above) is never applied here, so these splits keep the
# original class-sorted order -- cf. y_2d_valid and y_2d_test below being all one class
X_2d_train = X_2d[:numberoftrainingexamples]
y_2d_train = y_2d[:numberoftrainingexamples]
X_2d_valid = X_2d[numberoftrainingexamples:numberoftrainingexamples + numbertovalidate]
y_2d_valid = y_2d[numberoftrainingexamples:numberoftrainingexamples + numbertovalidate]
X_2d_test = X_2d[numberoftrainingexamples + numbertovalidate:]
y_2d_test = y_2d[numberoftrainingexamples + numbertovalidate:]

Clarke, Fokoue, and Zhang, Principles and Theory for Data Mining and Machine Learning (2009), and Bishop, Pattern Recognition and Machine Learning (2007), both use $y\in \lbrace -1, 1\rbrace$ for binary classification with support vector machines, as opposed to $y\in \lbrace 0,1 \rbrace$, for $K=2$ total classes that the outcome $y$ could belong to. Should this convention be made more explicit, noted more prominently, in practice?
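
An equivalent one-liner for the in-place relabeling done a few cells below (y_2d_train[y_2d_train < 1] = -1) is the affine map from $\lbrace 0,1 \rbrace$ to $\lbrace -1,1 \rbrace$:

y_pm = 2*y_2d_train - 1  # 0 -> -1, 1 -> +1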


In [39]:
y_2d_train


Out[39]:
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [18]:
y_2d_train[y_2d_train < 1] = -1

In [19]:
print(y_2d_train.shape);print(y_2d_train)


(60,)
[-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
  1  1  1  1  1  1  1  1  1  1]

In [20]:
y_2d_valid[y_2d_valid < 1] = -1

In [21]:
y_2d_test[y_2d_test < 1] = -1

from Coursera's Machine Learning Introduction by Andrew Ng, Ex. 6, i.e. Programming Exercise 6


In [19]:
import scipy.io  # scipy.io is not pulled in by a bare "import scipy"
where_ex6_is_str = './coursera_Ng/machine-learning-ex6/ex6/'
ex6data1_mat_data = scipy.io.loadmat( where_ex6_is_str + "ex6data1.mat")
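
As a quick sanity check on what was loaded (the Coursera ex6 .mat files conventionally store the arrays under 'X' and 'y' keys; treat the key names here as an assumption):

print( ex6data1_mat_data.keys() )
X_ex6 = ex6data1_mat_data['X']            # assumed key
y_ex6 = ex6data1_mat_data['y'].flatten()  # assumed key; flatten the (m,1) column to (m,)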

Using SVM


In [18]:
SVM_iris = SVM(X_2d_train,y_2d_train,len(y_2d_train),1.0,1,0.001)

In [19]:
SVM_iris.build_W();

.build_update might take a while under FAST_COMPILE (the Theano mode set via the THEANO_FLAGS command typed in before starting the notebook)


In [20]:
SVM_iris.build_update();
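
SVM.py's build_update isn't reproduced here, but since the constructor takes a learning rate alpha, a plausible reading (an assumption about the implementation, not a statement of it) is projected gradient ascent on the dual functional $W(\lambda)$ written out further below:

$$\lambda \leftarrow \mathrm{clip}\left( \lambda + \alpha \, \nabla_{\lambda} W(\lambda),\ 0,\ C \right)$$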

In [21]:
SVM_iris.train_model_full();

In [22]:
SVM_iris.build_b();
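
For reference, the standard formulas that build_b and make_predict would be built on: with the multipliers $\lambda$ fixed, the decision function is $\hat{y}(x) = \sum_{i=1}^m \lambda_i y_i K(x_i, x) + b$, classified by $\mathrm{sign}(\hat{y}(x))$, and $b$ is conventionally recovered from any support vector $x_s$ with $0 < \lambda_s < C$ via $b = y_s - \sum_i \lambda_i y_i K(x_i, x_s)$, or by averaging that expression over all such support vectors.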

In [25]:
SVM_iris.make_predict(X_2d_valid[0])


Out[25]:
(array(0.1666666693307083),
 <theano.compile.function_module.Function at 0x7fdf53e65fd0>)

In [23]:
SVM_iris.make_predictions(X_2d_valid)


Out[23]:
[CudaNdarray(0.166666671634),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666671634),
 CudaNdarray(0.166666686535)]

In [24]:
X_2d_test


Out[24]:
array([[ 1.72551842, -0.21746808],
       [ 2.4836548 ,  2.80292193],
       [ 0.20924564, -0.21746808],
       [ 0.05761837, -0.21746808],
       [-0.24563619, -0.82154608],
       [ 2.18040025,  0.38660992],
       [ 0.05761837,  1.59476592],
       [ 0.20924564,  0.68864892],
       [-0.39726347,  0.38660992],
       [ 0.96738203,  0.68864892],
       [ 0.66412748,  0.68864892],
       [ 0.96738203,  0.68864892],
       [-0.70051802, -0.51950708],
       [ 0.81575475,  0.99068792],
       [ 0.66412748,  1.29272692],
       [ 0.66412748,  0.38660992],
       [ 0.05761837, -1.12358508],
       [ 0.36087292,  0.38660992],
       [-0.09400891,  1.59476592],
       [-0.54889074,  0.38660992]])

In [25]:
y_2d_test


Out[25]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [26]:
y_test_pred= SVM_iris.make_predictions(X_2d_test)

In [30]:
np.array( [np.array(yhat) for yhat in y_test_pred] )


Out[30]:
array([ 0.16666669,  0.16666667,  0.16666667,  0.16666667,  0.16666667,
        0.16666669,  0.16666667,  0.16666667,  0.16666667,  0.16666669,
        0.16666669,  0.16666669,  0.16666667,  0.16666669,  0.16666667,
        0.16666669,  0.16666667,  0.16666669,  0.16666667,  0.16666667], dtype=float32)

In [26]:
y_valid_pred = [ SVM_iris.make_predict(X_2d_valid_ele) for X_2d_valid_ele in X_2d_valid ]

In [27]:
y_valid_pred = [y_valid_pred_ele[0] for y_valid_pred_ele in y_valid_pred]

In [28]:
y_valid_pred = np.array( y_valid_pred).flatten()

In [30]:
#y_valid_pred[ y_valid_pred>0 ] = 1
#y_valid_pred[ y_valid_pred<0 ] = -1
y_valid_pred = np.sign( y_valid_pred)

In [31]:
(y_2d_valid == y_valid_pred).astype(theano.config.floatX).sum()/len(y_valid_pred)


Out[31]:
1.0

In [29]:
y_valid_pred


Out[29]:
array([ 0.16666667,  0.16666667,  0.16666667,  0.16666667,  0.16666667,
        0.16666667,  0.16666667,  0.16666667,  0.16666667,  0.16666667,
        0.16666667,  0.16666667,  0.16666667,  0.16666667,  0.16666667,
        0.16666667,  0.16666667,  0.16666667,  0.16666667,  0.16666667])

In [30]:
y_2d_valid


Out[30]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

In [46]:
T.shared( SVM_iris.X[0])


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-46-8bc9315467ed> in <module>()
----> 1 T.shared( SVM_iris.X[0])

AttributeError: 'module' object has no attribute 'shared'

In [65]:
SVM_iris = SVM(X_2d_train,y_2d_train,len(y_2d_train),0.1,1.0,0.001)

In [66]:
SVM_iris.build_W();
SVM_iris.build_update();
SVM_iris.train_model_full();
SVM_iris.build_b();

In [67]:
y_valid_pred = np.array( [ SVM_iris.make_predict(X_2d_valid_ele)[0] for X_2d_valid_ele in X_2d_valid ] ).flatten()

In [68]:
y_valid_pred[ y_valid_pred>0 ] = 1
y_valid_pred[ y_valid_pred<0 ] = -1

In [69]:
(y_2d_valid == y_valid_pred).astype(theano.config.floatX).sum()/len(y_valid_pred)


Out[69]:
0.20000000000000001

In [70]:
SVM_iris = SVM(X_2d_train,y_2d_train,len(y_2d_train),0.1,0.1,0.001)

In [71]:
SVM_iris.build_W();
SVM_iris.build_update();
SVM_iris.train_model_full();
SVM_iris.build_b();

In [72]:
y_valid_pred = np.array( [ SVM_iris.make_predict(X_2d_valid_ele)[0] for X_2d_valid_ele in X_2d_valid ] ).flatten()

In [73]:
y_valid_pred[ y_valid_pred>0 ] = 1
y_valid_pred[ y_valid_pred<0 ] = -1

In [74]:
(y_2d_valid == y_valid_pred).astype(theano.config.floatX).sum()/len(y_valid_pred)


Out[74]:
0.25

In [75]:
SVM_iris = SVM(X_2d_train,y_2d_train,len(y_2d_train),0.01,0.1,0.001)

In [76]:
SVM_iris.build_W();
SVM_iris.build_update();
SVM_iris.train_model_full();
SVM_iris.build_b();

In [77]:
y_valid_pred = np.array( [ SVM_iris.make_predict(X_2d_valid_ele)[0] for X_2d_valid_ele in X_2d_valid ] ).flatten()

In [78]:
y_valid_pred[ y_valid_pred>0 ] = 1
y_valid_pred[ y_valid_pred<0 ] = -1

In [79]:
(y_2d_valid == y_valid_pred).astype(theano.config.floatX).sum()/len(y_valid_pred)


Out[79]:
0.25

In [ ]:


In [53]:
m_val = np.cast["int32"](X.shape[0])
Xi = theano.shared( np.zeros_like(X[0],dtype=theano.config.floatX) )
X = theano.shared( np.zeros_like(X,dtype=theano.config.floatX) )
y = theano.shared( np.random.randint(2,size=m_val))
yi = theano.shared( np.cast["int32"]( np.random.randint(2)) )
m = theano.shared( m_val )
lambda_mult = theano.shared( np.zeros(m_val).astype(theano.config.floatX) ) # lambda Lagrange multipliers

In [63]:
Xi.set_value( X[np.int32(1)] )


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-63-edc7c221ea3d> in <module>()
----> 1 Xi.set_value( X[np.int32(1)] )

/home/topolo/PropD/Theano/theano/sandbox/cuda/var.pyc in set_value(self, value, borrow)
    100                 # in case this is a cuda_ndarray, we copy it
    101                 value = copy.deepcopy(value)
--> 102         self.container.value = value  # this will copy a numpy ndarray
    103 
    104     def __getitem__(self, *args):

/home/topolo/PropD/Theano/theano/gof/link.pyc in __set__(self, value)
    475                 self.storage[0] = self.type.filter_inplace(value,
    476                                                            self.storage[0],
--> 477                                                            **kwargs)
    478             else:
    479                 self.storage[0] = self.type.filter(value, **kwargs)

/home/topolo/PropD/Theano/theano/sandbox/cuda/type.pyc in filter_inplace(self, data, old_data, strict, allow_downcast)
    127                         data)
    128             else:
--> 129                 converted_data = theano._asarray(data, self.dtype)
    130 
    131                 if (allow_downcast is None and

/home/topolo/PropD/Theano/theano/misc/safe_asarray.pyc in _asarray(a, dtype, order)
     32         dtype = theano.config.floatX
     33     dtype = np.dtype(dtype)  # Convert into dtype object.
---> 34     rval = np.asarray(a, dtype=dtype, order=order)
     35     # Note that dtype comparison must be done by comparing their `num`
     36     # attribute. One cannot assume that two identical data types are pointers

/home/topolo/Public/anaconda2/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
    480 
    481     """
--> 482     return array(a, dtype, copy=False, order=order)
    483 
    484 def asanyarray(a, dtype=None, order=None):

ValueError: ('setting an array element with a sequence.', 'Container name "None"')

In [41]:
np.random.randint(2,size=4)


Out[41]:
array([0, 1, 1, 1])

In [46]:
np.random.randint(2)


Out[46]:
1

In [67]:
X = np.random.randn(300,2)
y = np.logical_xor(X[:,0] > 0, X[:,1] > 0)

In [70]:
def rbf(Xi,Xj,sigma):  
        """ rbf - radial basis function kernel """
        kernel_result = T.exp( -( (Xi-Xj)**2).sum()/ np.float32(2.*sigma**2) )
        return kernel_result

class SVM(object):
    """ SVM - Support Vector Machines 
    """
    def __init__(self,X,y,m,C,sigma,alpha):
        assert m == X.shape[0] and m == y.shape[0]
        self.C = np.float32(C)
        self.sigma = np.float32(sigma)
        self.alpha = np.float32(alpha)
        
        self._m = theano.shared( np.int32(m))
            
#        self._Xi = theano.shared( X[0].astype(theano.config.floatX) )
        self.X = theano.shared( X.astype(theano.config.floatX) )
        self.y = theano.shared( y.astype(theano.config.floatX) )
#        self._yi = theano.shared( y[0].astype(theano.config.floatX)  )
        self.lambda_mult = theano.shared( np.random.rand(m).astype(theano.config.floatX) ) # lambda Lagrange multipliers
        
                              
    def build_W(self):
        m = self._m.get_value()
        X = self.X
        y = self.y
        lambda_mult = self.lambda_mult
                              
        def dual_step(Xj,yj,lambdaj, # input sequences we iterate over, j=0,1,...,m-1
                      cumulative_sum, # accumulated result from the previous iteration
                      prodi,Xi,sigma): # non-sequences that aren't iterated over
            prodj = prodi*lambdaj*yj*rbf(Xi,Xj,sigma)
            return cumulative_sum + prodj

        W = None
        for i in range(m):
            Xi = self.X[i]
            yi = self.y[i]
            lambdai = self.lambda_mult[i]
            prodi = lambdai*yi

            W_i0 = theano.shared( np.float32(0.) )  # accumulator initial value
            outputi, updatei = theano.reduce(fn=dual_step,
                                             sequences=[X,y,lambda_mult],
                                             outputs_info=[W_i0],
                                             non_sequences=[prodi,Xi,self.sigma])
            W = outputi if W is None else W + outputi
        self.W = W  # the assembled double sum over i,j
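
For reference, the double sum assembled in build_W is the quadratic term of the SVM dual functional, which is maximized over the Lagrange multipliers subject to $0 \le \lambda_i \le C$ and $\sum_i \lambda_i y_i = 0$:

$$W(\lambda) = \sum_{i=1}^m \lambda_i - \frac{1}{2} \sum_{i=1}^m \sum_{j=1}^m \lambda_i \lambda_j y_i y_j K(x_i, x_j), \qquad K(x_i, x_j) = \exp\left( -\frac{ \lVert x_i - x_j \rVert^2 }{ 2\sigma^2 } \right)$$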

In [69]:
y[0].astype(theano.config.floatX)


Out[69]:
1.0

In [74]:
test_SVM = SVM(X,y,len(y),1.,0.1,0.01)

In [80]:
range(test_SVM._m.get_value());

In [77]:
np.random.rand(4)


Out[77]:
array([ 0.06221329,  0.60626937,  0.40109709,  0.18349741])

In [81]:
test_SVM.X


Out[81]:
<CudaNdarrayType(float32, matrix)>

Test values


In [9]:
m=4
d=2
X_val=np.arange(2,m*d+2).reshape(m,d).astype(theano.config.floatX) 
X=theano.shared( X_val)
y_val=np.random.randint(2,size=m).astype(theano.config.floatX)
y=theano.shared( y_val )
lambda_mult_val = np.random.rand(m).astype(theano.config.floatX)
lambda_mult = theano.shared( lambda_mult_val ) # lambda Lagrange multipliers
sigma_val = 2.0
sigma = theano.shared( np.float32(sigma_val))

In [11]:
np.random.randint(2,size=4)


Out[11]:
array([0, 1, 1, 0])

In [13]:
X[1]


Out[13]:
Subtensor{int64}.0

In [14]:
np.random.rand(4)


Out[14]:
array([ 0.50587177,  0.21207985,  0.49935986,  0.96292576])

In [ ]:
#lambda_mult = theano.shared( np.zeros(m_val).astype(theano.config.floatX) ) # lambda Lagrange multipliers

In [16]:
prodi = lambda_mult[1]*y[1]

In [41]:
sigma=0.5
def step(Xj,Xi):
    rbf = T.exp(-(Xj-Xi)**2/(np.float32(2.*sigma**2)))
    return sandbox.cuda.basic_ops.gpu_from_host(rbf)

In [42]:
output,update=theano.scan(fn=step, sequences=[X,],non_sequences=[X[1],])

In [43]:
test_rbf = theano.function(inputs=[],outputs=output,updates=update )

In [44]:
print(test_rbf().shape)
test_rbf()


(4, 2)
Out[44]:
array([[  3.35462624e-04,   3.35462624e-04],
       [  1.00000000e+00,   1.00000000e+00],
       [  3.35462624e-04,   3.35462624e-04],
       [  1.26641649e-14,   1.26641649e-14]], dtype=float32)

In [45]:
#Check
prodi_val = lambda_mult_val[1]*y_val[1]

In [47]:
for j in range(4):
    print( np.exp(-((X_val[j]-X_val[1])**2).sum(0)/(np.float32(2.*sigma**2))) )


1.12535e-07
1.0
1.12535e-07
1.60381e-28

In [48]:
X_val


Out[48]:
array([[ 2.,  3.],
       [ 4.,  5.],
       [ 6.,  7.],
       [ 8.,  9.]], dtype=float32)

In [39]:
X_val[3]


Out[39]:
array([[ 20.,  21.],
       [ 22.,  23.],
       [ 24.,  25.]], dtype=float32)

In [49]:
prodi = lambda_mult[0]*y[0]

In [55]:
sigma=0.5
def step(Xj,yj,lambda_multj,Xi):
    rbf = lambda_multj*yj*T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    return sandbox.cuda.basic_ops.gpu_from_host(rbf)

In [56]:
output,update=theano.scan(fn=step, sequences=[X,y,lambda_mult],non_sequences=[X[0],])

In [57]:
test_rbf = theano.function(inputs=[],outputs=output,updates=update )

In [58]:
print(test_rbf().shape)
test_rbf()


(4,)
Out[58]:
array([  8.78704727e-01,   2.64269353e-08,   7.92035906e-30,
         0.00000000e+00], dtype=float32)

In [59]:
sigma=0.5

def rbf(Xj,Xi,sigma):
    rbf = T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    return rbf

def step(Xj,yj,lambda_multj,Xi,yi,lambda_multi):
#    W_i = lambda_multi*yi*lambda_multj*yj*T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    W_i = lambda_multi*yi*lambda_multj*yj*rbf(Xj,Xi,sigma)
    return W_i

In [60]:
output,update=theano.scan(fn=step, sequences=[X,y,lambda_mult],non_sequences=[X[0],y[0],lambda_mult[0]])

In [61]:
test_rbf = theano.function(inputs=[],outputs=output,updates=update )

In [62]:
test_rbf()


Out[62]:
array([  7.72122025e-01,   2.32214727e-08,   6.95965630e-30,
         0.00000000e+00], dtype=float32)

In [63]:
output1,update1=theano.scan(fn=step, sequences=[X,y,lambda_mult],non_sequences=[X[1],y[1],lambda_mult[1]])

In [66]:
test_rbf1 = theano.function(inputs=[],outputs=output1,updates=update1 )

In [67]:
test_rbf1()


Out[67]:
array([  2.32214727e-08,   5.51463775e-02,   1.30508415e-09,
         2.93131262e-29], dtype=float32)

In [69]:
test_rbf = theano.function(inputs=[],outputs=output+output1 )

In [70]:
test_rbf()


Out[70]:
array([  7.72122025e-01,   5.51463999e-02,   1.30508415e-09,
         2.93131262e-29], dtype=float32)

In [71]:
output,update=theano.scan(fn=step, sequences=[X,y,lambda_mult],non_sequences=[X[0],y[0],lambda_mult[0]])

In [74]:
updates=[update,]

In [75]:
for i in range(1,4):
    outputi,updatei=theano.scan(fn=step, sequences=[X,y,lambda_mult],non_sequences=[X[i],y[i],lambda_mult[i]])
    output += outputi
    updates.append(updatei)

In [76]:
test_rbf = theano.function(inputs=[],outputs=output )

In [77]:
test_rbf()


Out[77]:
array([ 0.77212203,  0.0551464 ,  0.00243885,  0.60576063], dtype=float32)

In [81]:
sigma=1.

In [82]:
for j in range(4):
    print( np.exp(-((X_val[j]-X_val[0])**2).sum()/(np.float32(2.*sigma**2))) )


1.0
0.0183156
1.12535e-07
2.31952e-16

In [83]:
X_val


Out[83]:
array([[ 2.,  3.],
       [ 4.,  5.],
       [ 6.,  7.],
       [ 8.,  9.]], dtype=float32)

In [84]:
np.sum( [ np.exp(-((X_val[j]-X_val[0])**2).sum()/(np.float32(2.*sigma**2))) for j in range(4)])


Out[84]:
1.0183158

In [85]:
def step(Xj,Xi):
    rbf = T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    return rbf

In [86]:
output,update=theano.scan(fn=step, sequences=[X,],non_sequences=[X[0],])

In [87]:
test_rbf = theano.function(inputs=[],outputs=output,updates=update )

In [88]:
test_rbf()


Out[88]:
array([  1.00000000e+00,   1.83156393e-02,   1.12535176e-07,
         2.31952270e-16], dtype=float32)

In [107]:
def step(Xj,Xi):
    rbf = T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    return rbf

In [108]:
output,update=theano.scan(fn=step, sequences=[X],outputs_info=[None,],non_sequences=[X[0]])

In [109]:
test_rbf = theano.function(inputs=[],outputs=output,updates=update )

In [110]:
test_rbf()


Out[110]:
array([  1.00000000e+00,   1.83156393e-02,   1.12535176e-07,
         2.31952270e-16], dtype=float32)

In [113]:
output,update=theano.reduce(fn=step, sequences=[X],outputs_info=[None,],non_sequences=[X[0]])
test_rbf = theano.function(inputs=[],outputs=output,updates=update )
test_rbf()


Out[113]:
array(2.3195226989972605e-16, dtype=float32)

In [114]:
def step(Xj,cumulative_sum,Xi):
    rbf = T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    return cumulative_sum + rbf

In [116]:
W_i0 = theano.shared( np.float32(0.))

In [117]:
output,update=theano.scan(fn=step, sequences=[X],outputs_info=[W_i0,],non_sequences=[X[0]])

In [118]:
test_rbf = theano.function(inputs=[],outputs=output,updates=update )

In [119]:
test_rbf()


Out[119]:
array([ 1.        ,  1.01831567,  1.01831579,  1.01831579], dtype=float32)

In [120]:
# Also this works:
output,update=theano.reduce(fn=step, sequences=[X],outputs_info=[W_i0,],non_sequences=[X[0]])
test_rbf = theano.function(inputs=[],outputs=output,updates=update )
test_rbf()


Out[120]:
array(1.0183157920837402, dtype=float32)

In [125]:
sigma=0.5

def rbf(Xj,Xi,sigma):
    rbf = T.exp(-((Xj-Xi)**2).sum()/(np.float32(2.*sigma**2)))
    return rbf

def step(Xj,yj,lambda_multj,cumulative_sum, Xi,yi,lambda_multi):
    W_i = lambda_multi*yi*lambda_multj*yj*rbf(Xj,Xi,sigma)
    return cumulative_sum + W_i

In [128]:
W_00 = theano.shared( np.float32(0.))
output,update=theano.reduce(fn=step, sequences=[X,y,lambda_mult],outputs_info=[W_00],
                          non_sequences=[X[0],y[0],lambda_mult[0]])
updates=[update,]

In [133]:
for i in range(1,m):
    W_i0 = theano.shared( np.float32(0.))
    outputi,updatei=theano.reduce(fn=step, sequences=[X,y,lambda_mult],
                                  outputs_info=[W_i0],
                                non_sequences=[X[i],y[i],lambda_mult[i]])
    output += outputi
    updates.append(updatei)

In [134]:
test_rbf = theano.function(inputs=[],outputs=output )

In [135]:
test_rbf()


Out[135]:
array(1.4354679584503174, dtype=float32)

In [138]:
#sanity check
cum_sum_val=0.
for i in range(m):
    toadd=np.sum([lambda_mult_val[i]*y_val[i]*lambda_mult_val[j]*y_val[j]*np.exp(-((X_val[j]-X_val[i])**2).sum()/(np.float32(2.*sigma**2))) for j in range(4)])
    cum_sum_val += toadd
print(cum_sum_val)


1.43546790583

In [11]:
test_SVM=SVM(X_val,y_val,m,1.0,2.0,0.01)

In [14]:
test_f= theano.function( inputs=[], outputs=T.dot( test_SVM.y, test_SVM.lambda_mult))

In [15]:
test_f()


Out[15]:
array(0.806511402130127, dtype=float32)

In [17]:
test_f= theano.function( inputs=[], outputs=T.dot( test_SVM.y, test_SVM.y ))

In [18]:
test_f()


Out[18]:
array(2.0, dtype=float32)

In [19]:
test_SVM.y.get_value()


Out[19]:
array([ 1.,  0.,  0.,  1.], dtype=float32)

In [20]:
theano.ifelse( T.lt(test_SVM.y,np.float32(0)), np.float32(0), test_SVM.y )


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-df7a9629eaea> in <module>()
----> 1 theano.ifelse( T.lt(test_SVM.y,np.float32(0)), np.float32(0), test_SVM.y )

TypeError: 'module' object is not callable

In [25]:
lower_bound = theano.shared( np.float32(0.) )
theano.ifelse.ifelse( T.lt(test_SVM.y, lower_bound), lower_bound, test_SVM.y )


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-3ae0dbcf3af8> in <module>()
      1 lower_bound = theano.shared( np.float32(0.) )
----> 2 theano.ifelse.ifelse( T.lt(test_SVM.y, lower_bound), lower_bound, test_SVM.y )

/home/topolo/PropD/Theano/theano/ifelse.pyc in ifelse(condition, then_branch, else_branch, name)
    359                     isinstance(else_branch_elem.type, TensorType)):
    360                 else_branch_elem = then_branch_elem.type.filter_variable(
--> 361                     else_branch_elem)
    362 
    363             elif (isinstance(else_branch_elem.type, TensorType) and not

/home/topolo/PropD/Theano/theano/tensor/type.pyc in filter_variable(self, other, allow_convert)
    233             dict(othertype=other.type,
    234                  other=other,
--> 235                  self=self))
    236 
    237     def value_validity_msg(self, a):

TypeError: Cannot convert Type TensorType(float32, vector) (of Variable HostFromGpu.0) into Type TensorType(float32, scalar). You can try to manually convert HostFromGpu.0 into a TensorType(float32, scalar).

In [35]:
lower_bound = theano.shared( np.float32(0.5) )
#lower_bound_check=T.switch( T.lt(test_SVM.y, lower_bound), lower_bound, test_SVM.y )
lower_bound_check=T.switch( T.lt(test_SVM.y, lower_bound), test_SVM.y, lower_bound )

test_f=theano.function(inputs=[],outputs=lower_bound_check)

In [36]:
test_f()


Out[36]:
array([ 0.5,  0. ,  0. ,  0.5], dtype=float32)
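
The same clamping idea is what the SVM box constraint $0 \le \lambda_i \le C$ needs; Theano's T.clip expresses it directly (a sketch, with a hypothetical C):

C = np.float32(1.0)  # hypothetical box-constraint upper bound
lambda_clamped = T.clip( test_SVM.lambda_mult, np.float32(0.), C )
clamp_f = theano.function(inputs=[], outputs=lambda_clamped)
clamp_f()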

In [37]:
np.ndarray(5)


Out[37]:
array([  6.95675101e-316,   6.92966431e-310,   1.38685057e-316,
         6.92963114e-310,   2.37151510e-322])

In [31]:
dir(scipy);

In [11]:
with open("./Data/train.1",'rb') as f:
    train_1_lst = f.readlines()
# the with block closes the file; strip off the trailing '\n'
train_1_lst = [x.strip() for x in train_1_lst]
print(len(train_1_lst))


3089

In [12]:
train_1_lst=[line.replace('1:','').replace('2:','').replace('3:','').replace('4:','') for line in train_1_lst]

In [13]:
train_1_lst=[line.split() for line in train_1_lst]
train_1_arr=np.array( [[float(ele) for ele in line] for line in train_1_lst] )

In [14]:
train_1_y=train_1_arr[:,0]
train_1_X=train_1_arr[:,1:]

In [15]:
print(train_1_y.shape)
print(train_1_X.shape)


(3089,)
(3089, 4)

In [69]:
with open("./Data/test.1",'rb') as f:
    test_1_lst = f.readlines()
# the with block closes the file; strip off the trailing '\n'
test_1_lst = [x.strip() for x in test_1_lst]
print(len(test_1_lst))

test_1_lst=[line.replace('1:','').replace('2:','').replace('3:','').replace('4:','') for line in test_1_lst]

test_1_lst=[line.split() for line in test_1_lst]
test_1_arr=np.array( [[float(ele) for ele in line] for line in test_1_lst] )

test_1_y=test_1_arr[:,0]
test_1_X=test_1_arr[:,1:]


4000

In [11]:
with open("./Data/train.3",'rb') as f:
    train_3_lst = f.readlines()
# the with block closes the file; strip off the trailing '\n'
train_3_lst = [x.strip() for x in train_3_lst]
print(len(train_3_lst))

train_3_lst=[line.replace('1:','').replace('2:','').replace('3:','').replace('4:','').replace('5:','').replace('6:','').replace('7:','').replace('8:','').replace('9:','').replace('10:','').replace('11:','').replace('12:','').replace('13:','').replace('14:','').replace('15:','').replace('16:','').replace('17:','').replace('18:','').replace('19:','').replace('20:','').replace('21:','').replace('22:','') for line in train_3_lst]
train_3_lst=[line.split() for line in train_3_lst]
train_3_DF=pd.DataFrame( train_3_lst)


1243
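
A caveat on the chained .replace() calls above (and in the test.3 cell further below): plain substring replacement is fragile for multi-digit feature indices, e.g. '13:0.1' has its '3:' removed and becomes '10.1', and '21:0' becomes '20', which is likely why the describe() tables below show column means near 10 and 20. A minimal, hypothetical alternative strips each 'index:' prefix in one regex pass:

import re

def parse_libsvm_dense(lines):
    # hypothetical helper: each line looks like "label 1:v1 2:v2 ... 22:v22"
    rows = []
    for line in lines:
        tokens = line.split()
        # drop the leading "index:" from each feature token; the label token has no colon
        values = [float(re.sub(r'^\d+:', '', tok)) for tok in tokens]
        rows.append(values)
    return np.array(rows)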

In [12]:
train_3_y = train_3_DF[0].as_matrix().astype(theano.config.floatX)
train_3_X = train_3_DF.ix[:,1:].as_matrix().astype(theano.config.floatX)
print(train_3_X.shape)


(1243, 22)

In [52]:
ratiotraintotot = 0.2
numberofexamples1 = len(train_1_y)
numberoftrain1 = int( numberofexamples1 * ratiotraintotot )
numberofvalid1 = numberofexamples1 - numberoftrain1

In [19]:
shuffled_idx = np.random.permutation(numberofexamples1)

In [53]:
train1_idx = shuffled_idx[:numberoftrain1]
valid1_idx = shuffled_idx[numberoftrain1:]

In [21]:
from sklearn.svm import SVC

In [22]:
clf=SVC()

In [54]:
clf.fit(train_1_X[train1_idx],train_1_y[train1_idx])


Out[54]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [55]:
(clf.predict(train_1_X[valid1_idx]) == train_1_y[valid1_idx]).astype(theano.config.floatX).sum()/len(valid1_idx)


Out[55]:
0.68042071197411003

In [76]:
(clf.predict(test_1_X) == test_1_y).astype(theano.config.floatX).sum()/float(len(test_1_y))


Out[76]:
0.5

In [25]:
pd.DataFrame(train_1_X).describe()


Out[25]:
                 0            1            2            3
count  3089.000000  3089.000000  3089.000000  3089.000000
mean     32.272548   113.256891     0.068555   115.665165
std      32.859650    95.473191     0.242481    38.173494
min       0.000000    -4.555206    -0.752439     8.157474
25%      16.539200    35.475700    -0.156715    94.214690
50%      23.466500    86.845020     0.126325   122.507700
75%      37.909000   164.437000     0.246017   145.348600
max     297.050000   581.073100     0.717061   180.000000

In [26]:
scaler = StandardScaler()  
train_1_X_scaled = scaler.fit_transform(train_1_X)

In [27]:
pd.DataFrame(train_1_X_scaled).describe()


Out[27]:
                  0             1             2             3
count  3.089000e+03  3.089000e+03  3.089000e+03  3.089000e+03
mean   3.712921e-15 -5.839723e-16 -3.595556e-16 -7.099245e-15
std    1.000162e+00  1.000162e+00  1.000162e+00  1.000162e+00
min   -9.822921e-01 -1.234181e+00 -3.386346e+00 -2.816748e+00
25%   -4.788820e-01 -8.148233e-01 -9.291670e-01 -5.620116e-01
50%   -2.680331e-01 -2.766865e-01  2.382850e-01  1.792774e-01
75%    1.715589e-01  5.361546e-01  7.319796e-01  7.777187e-01
max    8.059134e+00  4.900768e+00  2.674890e+00  1.685600e+00

In [28]:
pd.DataFrame(train_1_y).describe()


Out[28]:
                 0
count  3089.000000
mean      0.647459
std       0.477839
min       0.000000
25%       0.000000
50%       1.000000
75%       1.000000
max       1.000000

In [28]:
train_1_y[ train_1_y < 1] = -1

In [29]:
len(train1_idx)


Out[29]:
154

In [56]:
SVM_1 = SVM_parallel(train_1_X_scaled[train1_idx],train_1_y[train1_idx],len(train_1_y[train1_idx]),1.0,1.,0.001)

In [57]:
SVM_1.build_W();

In [58]:
SVM_1.build_update()


Out[58]:
(GpuFromHost.0, <theano.compile.function_module.Function at 0x7fe8fe417c10>)

In [59]:
SVM_1.train_model_full()
SVM_1.build_b()


Out[59]:
(Elemwise{mul,no_inplace}.0, OrderedUpdates())

In [35]:
valid_1_X_scaled = scaler.transform(valid_1_X)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-35-d9ade50eaa5d> in <module>()
----> 1 valid_1_X_scaled = scaler.transform(valid_1_X)

NameError: name 'valid_1_X' is not defined

In [60]:
#yhat_parallel = SVM_1.make_predictions(train_1_X_scaled[valid1_idx]) ;
yhat_parallel = SVM_1.make_predictions_parallel(train_1_X_scaled[valid1_idx[:300]]) ;

In [36]:
yhat_parallel_2 = SVM_1.make_predictions_parallel(train_1_X_scaled[valid1_idx[:100]]) ;

In [61]:
yhat_parallel[0].shape


Out[61]:
(300,)

In [37]:
yhat_parallel_2


Out[37]:
(CudaNdarray([ 1.06532967  1.64311457 -1.47852945 -1.58601379  1.05086398 -0.49782896
   1.45557761  0.4120605  -1.23792279  1.23897517 -1.38246512  0.9092896
  -1.4292047  -1.44208336 -1.43128359 -0.9819535   1.37075341 -1.57761288
  -1.33187222  1.02457297  0.77928293  1.2445482   1.57451165 -0.93099558
   0.94742841 -1.43188941  1.59254408  1.2501024   0.87218213  1.38017094
  -0.45651317  1.4014951  -1.31638932  1.07225263  0.74926066  1.78455269
   0.83058614  1.32445002 -1.66566944  1.06068075  0.76313078 -0.72306609
   0.02906038 -1.36807358  1.11386871 -1.49504721  1.61015987  1.40692878
  -0.51949513  0.82104218  1.0796752  -0.812249    0.48507643  0.38147646
  -0.3750838   0.38286799  1.62341988  0.8841731  -1.02292609  1.99942017
   0.87027806 -1.53541613  1.3216902   0.22465101 -0.96404696  0.97882861
  -1.51583326 -1.32324529 -0.88828182  1.04831994 -0.85362542  0.69190705
   1.46849263  0.72825259 -1.15763843 -0.71747416  1.07399905 -0.04256304
  -0.06824605  1.37278044  1.01930261 -1.64745283  0.51999462 -1.41425312
   0.82940471  0.33227038  0.91453397  1.03423584  0.87238002 -0.7847386
   1.70685458 -0.44013995 -1.32967114  1.62237906 -0.68824321  0.85025251
  -1.08415461  1.22743475  1.86763048 -1.00724661]),
 <theano.compile.function_module.Function at 0x7fe8ff13fbd0>)

In [62]:
yhat = np.sign( yhat_parallel[0])

In [63]:
#(yhat == train_1_y[valid1_idx[:100]]).sum()/float(len(train_1_y[valid1_idx[:100]]))
(yhat == train_1_y[valid1_idx[:300]]).sum()/float(len(train_1_y[valid1_idx[:300]]))


Out[63]:
0.95333333333333337

In [64]:
len(valid1_idx)


Out[64]:
2472

In [65]:
yhat_1000 = SVM_1.make_predictions_parallel(train_1_X_scaled[valid1_idx[:1000]]) ;

In [67]:
yhat_1000 = np.sign( yhat_1000[0])

In [68]:
(yhat_1000 == train_1_y[valid1_idx[:1000]]).sum()/float(len(train_1_y[valid1_idx[:1000]]))


Out[68]:
0.95599999999999996

In [70]:
test_1_X_scaled = scaler.transform(test_1_X)

In [71]:
yhat_test = SVM_1.make_predictions_parallel(test_1_X_scaled) ;

In [73]:
yhat_test = np.sign( yhat_test[0])

In [74]:
(yhat_test == test_1_y).sum()/float(len(test_1_y))


Out[74]:
0.46725

In [42]:
train_1_y[valid1_idx[:100]]


Out[42]:
array([ 1.,  1., -1., -1.,  1., -1.,  1.,  1., -1.,  1., -1.,  1., -1.,
       -1., -1., -1.,  1., -1., -1.,  1.,  1.,  1.,  1., -1.,  1., -1.,
        1.,  1.,  1.,  1., -1.,  1., -1.,  1.,  1.,  1.,  1.,  1., -1.,
        1.,  1.,  1.,  1., -1.,  1., -1.,  1.,  1., -1.,  1.,  1., -1.,
        1.,  1., -1.,  1.,  1.,  1., -1.,  1.,  1., -1.,  1.,  1., -1.,
        1., -1., -1., -1.,  1., -1.,  1.,  1.,  1., -1., -1.,  1.,  1.,
        1.,  1.,  1., -1.,  1., -1.,  1.,  1.,  1.,  1.,  1., -1.,  1.,
        1., -1.,  1., -1.,  1., -1.,  1.,  1., -1.])

Other people have run into this same recursion-limit problem too; it's inherent to Python: https://github.com/Theano/Theano/issues/689


In [46]:
import sys

In [34]:
sys.getrecursionlimit()


Out[34]:
5000

In [40]:
sys.setrecursionlimit(50000)

In [41]:
sys.getrecursionlimit()


Out[41]:
50000

In [ ]:
yhat_valid = SVM_1.make_predictions(train_1_X_scaled[valid1_idx])

In [79]:
SVM_1 = SVM_parallel(train_1_X_scaled,train_1_y,len(train_1_y),2.0,1.,0.01)

In [80]:
SVM_1.build_W();
SVM_1.build_update();

In [81]:
SVM_1.train_model_full(100)  # 8 hours
SVM_1.build_b()


Out[81]:
(Elemwise{mul,no_inplace}.0, OrderedUpdates())

In [82]:
yhat_test = SVM_1.make_predictions_parallel(test_1_X_scaled) ;

In [83]:
yhat_test = np.sign( yhat_test[0])

In [88]:
(yhat_test == test_1_y).sum()/float(len(test_1_y))


Out[88]:
0.69674999999999998

In [85]:
test_1_y


Out[85]:
array([ 0.,  0.,  0., ...,  1.,  1.,  1.])

In [87]:
test_1_y[ test_1_y < 1] = -1

In [86]:
yhat_test


Out[86]:
array([ 1.,  1., -1., ...,  1.,  1., -1.], dtype=float32)

In [90]:
# SVC
clf=SVC(C=2.0,gamma=2.0)
clf.fit(train_1_X_scaled,train_1_y)


Out[90]:
SVC(C=2.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=2.0, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)

In [91]:
(clf.predict(test_1_X_scaled) == test_1_y).sum()/float(len(test_1_y))


Out[91]:
0.96650000000000003

In [ ]:
SVM_1_C2 = SVM_1

In [102]:
SVM_1 = SVM_parallel(train_1_X_scaled,train_1_y,len(train_1_y),2.0,0.25,0.001)

In [103]:
SVM_1.build_W();
SVM_1.build_update();

In [104]:
%time SVM_1.train_model_full(10)  # CPU times: user 43min 45s, sys: 1min 10s, total: 44min 56s
#Wall time: 44min 54s

SVM_1.build_b()


CPU times: user 43min 45s, sys: 1min 10s, total: 44min 56s
Wall time: 44min 54s
Out[104]:
(Elemwise{mul,no_inplace}.0, OrderedUpdates())

In [105]:
yhat_test = SVM_1.make_predictions_parallel(test_1_X_scaled) ;
yhat_test = np.sign( yhat_test[0]);

In [106]:
(yhat_test == test_1_y).sum()/float(len(test_1_y))


Out[106]:
0.94950000000000001

In [107]:
SVM_1_C2 = SVM_1

In [108]:
SVM_1 = SVM_parallel(train_1_X_scaled,train_1_y,len(train_1_y),2.0,0.20,0.001)  # sigma=0.2

In [109]:
SVM_1.build_W();
SVM_1.build_update();

In [110]:
%time SVM_1.train_model_full(20)  
SVM_1.build_b()


CPU times: user 1h 28min 16s, sys: 2min 33s, total: 1h 30min 49s
Wall time: 1h 30min 44s
Out[110]:
(Elemwise{mul,no_inplace}.0, OrderedUpdates())

In [111]:
yhat_test = SVM_1.make_predictions_parallel(test_1_X_scaled) ;
yhat_test = np.sign( yhat_test[0]);

In [112]:
(yhat_test == test_1_y).sum()/float(len(test_1_y))  # sigma = 0.2


Out[112]:
0.93125000000000002

In [113]:
SVM_1 = SVM_parallel(train_1_X_scaled,train_1_y,len(train_1_y),2.0,0.30,0.001)

In [114]:
SVM_1.build_W();
SVM_1.build_update();

In [115]:
%time SVM_1.train_model_full(15)  
SVM_1.build_b()


CPU times: user 1h 5min 30s, sys: 1min 47s, total: 1h 7min 18s
Wall time: 1h 7min 14s
Out[115]:
(Elemwise{mul,no_inplace}.0, OrderedUpdates())

In [116]:
yhat_test = SVM_1.make_predictions_parallel(test_1_X_scaled) ;
yhat_test = np.sign( yhat_test[0]);

In [117]:
(yhat_test == test_1_y).sum()/float(len(test_1_y))


Out[117]:
0.96125000000000005

Get and clean/wrangle/preprocess the test data, test.3, for the vehicle data set


In [13]:
with open("./Data/test.3",'rb') as f:
    test_3_lst = f.readlines()
# the with block closes the file; strip off the trailing '\n'
test_3_lst = [x.strip() for x in test_3_lst]
print(len(test_3_lst))

test_3_lst=[line.replace('1:','').replace('2:','').replace('3:','').replace('4:','').replace('5:','').replace('6:','').replace('7:','').replace('8:','').replace('9:','').replace('10:','').replace('11:','').replace('12:','').replace('13:','').replace('14:','').replace('15:','').replace('16:','').replace('17:','').replace('18:','').replace('19:','').replace('20:','').replace('21:','').replace('22:','') for line in test_3_lst]
test_3_lst=[line.split() for line in test_3_lst]
test_3_DF=pd.DataFrame( test_3_lst)


41

In [14]:
test_3_y = test_3_DF[0].as_matrix().astype(theano.config.floatX)
test_3_X = test_3_DF.ix[:,1:].as_matrix().astype(theano.config.floatX)
print(test_3_X.shape)
print(test_3_y.shape)


(41, 22)
(41,)

Scale the train.3 Vehicle data


In [15]:
scaler = StandardScaler()  
train_3_X_scaled = scaler.fit_transform(train_3_X)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-ace7b1ce77d4> in <module>()
      1 scaler = StandardScaler()
----> 2 train_3_X_scaled = scaler.fit_transform(train_3_X)

/home/topolo/Public/anaconda2/lib/python2.7/site-packages/sklearn/base.pyc in fit_transform(self, X, y, **fit_params)
    492         if y is None:
    493             # fit method of arity 1 (unsupervised transformation)
--> 494             return self.fit(X, **fit_params).transform(X)
    495         else:
    496             # fit method of arity 2 (supervised transformation)

/home/topolo/Public/anaconda2/lib/python2.7/site-packages/sklearn/preprocessing/data.pyc in fit(self, X, y)
    558         # Reset internal state before fitting
    559         self._reset()
--> 560         return self.partial_fit(X, y)
    561 
    562     def partial_fit(self, X, y=None):

/home/topolo/Public/anaconda2/lib/python2.7/site-packages/sklearn/preprocessing/data.pyc in partial_fit(self, X, y)
    581         X = check_array(X, accept_sparse=('csr', 'csc'), copy=self.copy,
    582                         ensure_2d=False, warn_on_dtype=True,
--> 583                         estimator=self, dtype=FLOAT_DTYPES)
    584 
    585         if X.ndim == 1:

/home/topolo/Public/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.pyc in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    405                              % (array.ndim, estimator_name))
    406         if force_all_finite:
--> 407             _assert_all_finite(array)
    408 
    409     shape_repr = _shape_repr(array.shape)

/home/topolo/Public/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.pyc in _assert_all_finite(X)
     56             and not np.isfinite(X).all()):
     57         raise ValueError("Input contains NaN, infinity"
---> 58                          " or a value too large for %r." % X.dtype)
     59 
     60 

ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

In [16]:
train_3_X


Out[16]:
array([[  6.42842576e-02,  -8.84741428e-04,   7.16804789e-05, ...,
          7.43429817e-04,   2.00000000e+01,   2.00000000e+01],
       [  4.14278917e-02,  -3.72796995e-03,   1.00939104e-03, ...,
          2.52108201e-02,   2.00000000e+01,   2.00000000e+01],
       [  3.57137099e-02,  -6.94778282e-03,   3.95991607e-03, ...,
          6.08996896e-04,   2.00000000e+01,   2.00000000e+01],
       ..., 
       [  4.28567417e-02,  -5.09361811e-02,   1.21772103e-01, ...,
          4.23710007e-04,   2.00000000e+01,   2.00000000e+01],
       [  1.25713795e-01,  -6.86596781e-02,   7.49681368e-02, ...,
          1.86802004e-03,   2.00000000e+01,   2.00000000e+01],
       [  1.72854006e-01,  -4.01350111e-02,   1.98206808e-02, ...,
          1.54349406e-03,   2.00000000e+01,   2.00000000e+01]], dtype=float32)

Clean the data: I choose to fill in missing (NaN) values with the mean, given the distribution of the data


In [17]:
train_3_X_pd = pd.DataFrame(train_3_X)
train_3_X_pd_cleaned = train_3_X_pd.where( pd.notnull( train_3_X_pd ), train_3_X_pd.mean(), axis='columns')
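
Equivalently, pandas' fillna performs the same column-mean imputation in one call:

train_3_X_pd_cleaned = train_3_X_pd.fillna( train_3_X_pd.mean() )  # NaN -> column mean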

In [18]:
train_3_X_pd.describe()


Out[18]:
0 1 2 3 4 5 6 7 8 9 ... 12 13 14 15 16 17 18 19 20 21
count 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1.243000e+03 1243.000000 ... 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1105.0
mean 0.077383 -0.027725 0.065325 -0.189682 0.132142 -0.038490 0.115298 -0.039128 3.490100e-02 1.029767 ... 10.326463 10.078814 10.049204 10.620543 10.256516 10.200141 10.192832 0.008182 20.032875 20.0
std 0.129478 0.016686 0.080258 0.135888 0.180085 0.080136 0.174719 0.092178 1.102713e-01 0.361044 ... 0.288018 0.090949 0.079687 0.110412 0.122297 0.193113 0.099204 0.049838 0.177683 0.0
min 0.008572 -0.082940 0.000002 -0.711446 -0.000724 -0.711236 0.000008 -0.900021 2.221730e-10 0.000000 ... 10.049809 10.000000 10.000000 10.261244 10.010774 10.000000 10.050822 0.000007 20.000000 20.0
25% 0.032856 -0.039307 0.007915 -0.276503 0.016700 -0.036831 0.009182 -0.034672 8.048522e-05 1.000000 ... 10.106040 10.000000 10.000000 10.553076 10.166299 10.000000 10.136705 0.000349 20.000000 20.0
50% 0.045715 -0.026813 0.034872 -0.171972 0.060025 -0.007399 0.046326 -0.005454 1.616255e-03 1.000000 ... 10.188972 10.100000 10.000000 10.587928 10.243488 10.166667 10.167612 0.000789 20.000000 20.0
75% 0.075713 -0.015032 0.091868 -0.080340 0.176459 -0.000764 0.151591 -0.000681 1.746899e-02 1.000000 ... 10.470739 10.100000 10.100000 10.649028 10.328728 10.292857 10.211973 0.002019 20.000000 20.0
max 1.824250 0.000144 0.558680 0.000896 1.287944 0.000090 1.374966 0.000122 1.232450e+00 5.000000 ... 11.000000 10.500000 10.600000 11.000000 10.775702 11.000000 10.917459 0.708513 21.987804 20.0

8 rows × 22 columns


In [19]:
train_3_X_pd_cleaned.describe()


Out[19]:
0 1 2 3 4 5 6 7 8 9 ... 12 13 14 15 16 17 18 19 20 21
count 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1.243000e+03 1243.000000 ... 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.000000 1243.0
mean 0.077383 -0.027725 0.065325 -0.189682 0.132142 -0.038490 0.115298 -0.039128 3.490100e-02 1.029767 ... 10.326463 10.078814 10.049204 10.620543 10.256516 10.200141 10.192832 0.008182 20.032875 20.0
std 0.129478 0.016686 0.080258 0.135888 0.180085 0.080136 0.174719 0.092178 1.102713e-01 0.361044 ... 0.288018 0.090949 0.079687 0.110412 0.122297 0.193113 0.099204 0.049838 0.177683 0.0
min 0.008572 -0.082940 0.000002 -0.711446 -0.000724 -0.711236 0.000008 -0.900021 2.221730e-10 0.000000 ... 10.049809 10.000000 10.000000 10.261244 10.010774 10.000000 10.050822 0.000007 20.000000 20.0
25% 0.032856 -0.039307 0.007915 -0.276503 0.016700 -0.036831 0.009182 -0.034672 8.048522e-05 1.000000 ... 10.106040 10.000000 10.000000 10.553076 10.166299 10.000000 10.136705 0.000349 20.000000 20.0
50% 0.045715 -0.026813 0.034872 -0.171972 0.060025 -0.007399 0.046326 -0.005454 1.616255e-03 1.000000 ... 10.188972 10.100000 10.000000 10.587928 10.243488 10.166667 10.167612 0.000789 20.000000 20.0
75% 0.075713 -0.015032 0.091868 -0.080340 0.176459 -0.000764 0.151591 -0.000681 1.746899e-02 1.000000 ... 10.470739 10.100000 10.100000 10.649028 10.328728 10.292857 10.211973 0.002019 20.000000 20.0
max 1.824250 0.000144 0.558680 0.000896 1.287944 0.000090 1.374966 0.000122 1.232450e+00 5.000000 ... 11.000000 10.500000 10.600000 11.000000 10.775702 11.000000 10.917459 0.708513 21.987804 20.0

8 rows × 22 columns


In [20]:
train_3_X_scaled = scaler.fit_transform( train_3_X_pd_cleaned.as_matrix() )

In [21]:
train_3_y


Out[21]:
array([-1., -1., -1., ...,  1.,  1.,  1.], dtype=float32)

In [22]:
SVM_3 = SVM_parallel(train_3_X_scaled,train_3_y,len(train_3_y),128.0,2.0,0.001)  # sigma=2.0

In [23]:
SVM_3.build_W();
SVM_3.build_update();

In [24]:
%time SVM_3.train_model_full(20)  

SVM_3.build_b()


CPU times: user 14min 22s, sys: 40.6 s, total: 15min 3s
Wall time: 15min 3s
Out[24]:
(Elemwise{mul,no_inplace}.0, OrderedUpdates())

In [30]:
print(test_3_y.shape)
test_3_y


(41,)
Out[30]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [26]:
print(test_3_X.shape)
test_3_X_scaled = scaler.transform( test_3_X)


(41, 22)

In [32]:
pd.DataFrame( train_3_X_scaled).describe()


Out[32]:
0 1 2 3 4 5 6 7 8 9 ... 12 13 14 15 16 17 18 19 20 21
count 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 ... 1243.000000 1243.000000 1243.000000 1243.000000 1.243000e+03 1.243000e+03 1.243000e+03 1.243000e+03 1243.000000 1243.0
mean -1.381025e-08 1.306219e-07 -1.449117e-07 2.929882e-08 -5.078143e-08 -2.865147e-08 2.215394e-08 2.747664e-08 2.459950e-08 2.846805e-07 ... 0.000003 0.000015 -0.000006 0.000005 9.636484e-07 4.983198e-07 2.167651e-07 -3.995621e-08 0.000004 0.0
std 1.000402e+00 1.000403e+00 1.000402e+00 1.000403e+00 1.000403e+00 1.000402e+00 1.000403e+00 1.000402e+00 1.000402e+00 1.000403e+00 ... 1.000402 1.000401 1.000403 1.000403 1.000403e+00 1.000404e+00 1.000402e+00 1.000403e+00 1.000403 0.0
min -5.316681e-01 -3.310335e+00 -8.142446e-01 -3.841224e+00 -7.380965e-01 -8.398426e+00 -6.601254e-01 -9.343195e+00 -3.166287e-01 -2.853335e+00 ... -0.960921 -0.867209 -0.618117 -3.255454 -2.010187e+00 -1.036730e+00 -1.432054e+00 -1.641000e-01 -0.185063 0.0
25% -3.440345e-01 -6.943713e-01 -7.156037e-01 -6.391792e-01 -6.413019e-01 2.071848e-02 -6.075985e-01 4.836186e-02 -3.158985e-01 -8.247919e-02 ... -0.765611 -0.867209 -0.618117 -0.611274 -7.379700e-01 -1.036730e+00 -5.659849e-01 -1.572234e-01 -0.185063 0.0
50% -2.446804e-01 5.469952e-02 -3.795968e-01 1.303762e-01 -4.006212e-01 3.881401e-01 -3.949165e-01 3.654580e-01 -3.019657e-01 -8.247919e-02 ... -0.477552 0.232752 -0.618117 -0.295492 -1.065560e-01 -1.733252e-01 -2.543068e-01 -1.483909e-01 -0.185063 0.0
75% -1.290347e-02 7.609624e-01 3.308473e-01 8.049757e-01 2.461881e-01 4.709621e-01 2.078055e-01 4.172589e-01 -1.581467e-01 -8.247919e-02 ... 0.501137 0.232752 0.637297 0.258116 5.907115e-01 4.803927e-01 1.930406e-01 -1.237073e-01 -0.185063 0.0
max 1.349709e+01 1.670864e+00 6.149577e+00 1.403028e+00 6.420661e+00 4.816263e-01 7.212567e+00 4.259794e-01 1.086439e+01 1.100095e+01 ... 2.339471 4.632576 6.914347 3.438140 4.247005e+00 4.143688e+00 7.307405e+00 1.405786e+01 11.006802 0.0

8 rows × 22 columns


In [33]:
pd.DataFrame( test_3_X_scaled).describe()


Out[33]:
0 1 2 3 4 5 6 7 8 9 ... 12 13 14 15 16 17 18 19 20 21
count 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 4.100000e+01 ... 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.000000 41.0
mean 0.628913 -0.072235 -0.333583 -0.567236 -0.097946 0.133072 0.086335 0.041306 -0.060985 -8.247922e-02 ... -0.010889 0.152265 0.392336 2.098248 -0.321653 -0.084992 0.438351 0.218287 0.126233 0.0
std 1.313145 1.103456 0.897999 1.161787 1.093991 1.090196 1.292039 1.012287 0.953499 3.771569e-08 ... 0.825824 0.933028 0.981290 1.081335 1.239627 0.777570 1.294518 1.844804 1.401828 0.0
min -0.410250 -2.610751 -0.813554 -3.329840 -0.734176 -5.965058 -0.659601 -4.476038 -0.316629 -8.247919e-02 ... -0.960109 -0.867209 -0.618117 -0.508490 -1.870507 -1.036730 -0.816257 -0.161088 -0.185063 0.0
25% -0.266765 -0.744032 -0.764672 -1.177542 -0.658315 0.332455 -0.600345 0.339827 -0.316458 -8.247919e-02 ... -0.673804 -0.867209 -0.618117 1.453449 -1.097615 -0.605030 -0.301719 -0.150095 -0.185063 0.0
50% 0.130609 0.021227 -0.638665 -0.459539 -0.525770 0.456580 -0.394917 0.403236 -0.313516 -8.247919e-02 ... -0.042360 0.232752 0.637297 2.103961 -0.539676 -0.239744 0.151922 -0.126737 -0.185063 0.0
75% 0.814958 0.682428 -0.392678 0.222527 -0.070596 0.476920 -0.202196 0.420466 -0.298387 -8.247919e-02 ... 0.307476 0.232752 0.637297 3.330932 0.224026 0.258375 0.697668 -0.069630 -0.185063 0.0
max 5.715701 1.616837 4.231605 1.397880 4.701602 0.480505 4.735194 0.424651 5.287445 -8.247919e-02 ... 2.339471 2.432664 3.148115 3.438140 3.262834 2.416884 6.892943 11.675599 6.947817 0.0

8 rows × 22 columns


In [27]:
%time yhat_test3 = SVM_3.make_predictions_parallel( test_3_X_scaled)


CPU times: user 14.6 s, sys: 490 ms, total: 15.1 s
Wall time: 15.1 s

In [28]:
yhat_test3 = np.sign( yhat_test3[0]);

In [29]:
(yhat_test3 == test_3_y).sum()/float(len(test_3_y))


Out[29]:
0.87804878048780488

Developing Platt scaling functionality to make ad hoc estimates of class probability (the method is named make_prob_Pratt in SVM.py)
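
Platt's method fits a sigmoid to the raw SVM outputs $\hat{y}$,

$$P(y = 1 \mid \hat{y}) = \frac{1}{1 + \exp( A\hat{y} + B )},$$

with scalars $A, B$ chosen to minimize the binary cross-entropy against the labels on a calibration set; this is exactly what the hand-rolled train function below does by gradient descent.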


In [38]:
SVM_3._yhat.get_value()


Out[38]:
CudaNdarray([ 1.14547348  0.57705605  1.57161105  1.07463706  1.48355114  0.23874146
  0.86475205  1.70737624  0.81584662  1.0122205   0.80445641  1.08217835
  1.10863554  0.42177004 -0.11415616  1.08211875  0.42191112  0.41249883
 -0.26516664 -0.52185935  0.70107007  0.57608938  0.43174618  0.96968025
  2.43406391  0.22953208  2.14952993  0.54495817  1.58349645  0.55356413
 -0.07381859  0.58271039  1.30273211  0.97237408  0.10405514  0.31538069
  1.04433548  1.70931029  0.01500738  0.74253023  0.5940876 ])

In [41]:
yhat_test3[0]


Out[41]:
CudaNdarray([ 1.14547348  0.57705605  1.57161105  1.07463706  1.48355114  0.23874146
  0.86475205  1.70737624  0.81584662  1.0122205   0.80445641  1.08217835
  1.10863554  0.42177004 -0.11415616  1.08211875  0.42191112  0.41249883
 -0.26516664 -0.52185935  0.70107007  0.57608938  0.43174618  0.96968025
  2.43406391  0.22953208  2.14952993  0.54495817  1.58349645  0.55356413
 -0.07381859  0.58271039  1.30273211  0.97237408  0.10405514  0.31538069
  1.04433548  1.70931029  0.01500738  0.74253023  0.5940876 ])

In [42]:
np.sign( yhat_test3[0])


Out[42]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1., -1.,  1.,  1.,  1., -1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [45]:
test_3_y


Out[45]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [30]:
yhat_test3


Out[30]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1., -1.,  1., -1.,  1., -1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [31]:
np.place( yhat_test3, yhat_test3 < 0., 0.)

In [34]:
yhat_test3


Out[34]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  0.,  1.,  0.,  1.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [50]:
yPratt_test_results = SVM_3.make_prob_Pratt(yhat_test3)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-50-ab87504f0f27> in <module>()
----> 1 Pratt_test_results = SVM_3.make_prob_Pratt(yhat_test3)

/home/topolo/PropD/MLgrabbag/ML/SVM.py in make_prob_Pratt(self, y, alpha, training_steps)
    739                 costfunctional = T.nnet.binary_crossentropy( Prob_1_given_yhat, y_sh)
    740 
--> 741                 DA, DB = T.grad(costfunctional, [A,B])  # the gradient of costfunctional, with respect to A,B, respectively
    742 		train = theano.function(inputs=[],outputs=[Prob_1_given_yhat, costfunctional],
    743 								updates=[(A,A-alpha*DA),(B,B-alpha*DB)],name="train")

/home/topolo/PropD/Theano/theano/gradient.pyc in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected, null_gradients)
    435 
    436     if cost is not None and cost.ndim != 0:
--> 437         raise TypeError("cost must be a scalar.")
    438 
    439     if isinstance(wrt, set):

TypeError: cost must be a scalar.

In [52]:
alpha = np.float32(0.01)
yhat = SVM_3._yhat
y_sh = theano.shared( yhat_test3.astype(theano.config.floatX ) )
A = theano.shared( np.float32( np.random.rand() ) )
B = theano.shared( np.float32( np.random.rand() ) )
Prob_1_given_yhat = np.float32(1.)/(np.float32(1.)+ T.exp(A*yhat +B)) 
costfunctional = T.nnet.binary_crossentropy( Prob_1_given_yhat, y_sh).mean()
DA, DB = T.grad(costfunctional, [A,B])
train = theano.function(inputs=[],outputs=[Prob_1_given_yhat, costfunctional],
                        updates=[(A,A-alpha*DA),(B,B-alpha*DB)],name="train")
probabilities = theano.function(inputs=[], outputs=Prob_1_given_yhat,name="probabilities")

In [54]:
training_steps=10000
for i in range(training_steps):
    pred,err = train()

probabilities_vals = probabilities()

In [57]:
print(len(yhat_test3))
print(len(probabilities_vals))


41
41

In [62]:
probabilities_vals


Out[62]:
array([ 0.99928313,  0.98043251,  0.99994075,  0.99891531,  0.9999007 ,
        0.87377053,  0.99630618,  0.99997318,  0.9950884 ,  0.99843794,
        0.99475175,  0.99896204,  0.99911076,  0.95282549,  0.46754223,
        0.99896169,  0.95286262,  0.9503265 ,  0.26628667,  0.07478557,
        0.99043196,  0.98032379,  0.95538086,  0.99799734,  0.99999964,
        0.86770695,  0.99999797,  0.97648513,  0.99994469,  0.97761393,
        0.5264734 ,  0.98105717,  0.9997142 ,  0.99802858,  0.75890678,
        0.91553086,  0.99870515,  0.99997354,  0.65151274,  0.99247736,
        0.98225546], dtype=float32)

In [61]:
(probabilities_vals > 0.5).astype(theano.config.floatX)


Out[61]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  0.,  1.,  1.,  1.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [33]:
np.place( yhat_test3, yhat_test3 < 0., 0.)

In [35]:
%time yPratt_test_results = SVM_3.make_prob_Pratt(yhat_test3)


CPU times: user 1.49 s, sys: 378 ms, total: 1.87 s
Wall time: 1.89 s

In [36]:
yPratt_test_results[0]


Out[36]:
array([ 0.99487936,  0.88368529,  0.99830699,  0.9960795 ,  0.99605054,
        0.70919895,  0.97908562,  0.99985695,  0.92074919,  0.99685007,
        0.99245584,  0.95796376,  0.91233993,  0.98523915,  0.15018506,
        0.98160523,  0.3685869 ,  0.93778259,  0.4924126 ,  0.37109119,
        0.76929235,  0.96158934,  0.85186851,  0.9810673 ,  0.99996877,
        0.94375938,  0.99997568,  0.98332351,  0.99735463,  0.90602487,
        0.45559299,  0.96871412,  0.99864727,  0.99937207,  0.95520949,
        0.91174346,  0.99876761,  0.99998116,  0.86224443,  0.99335116,
        0.9905811 ], dtype=float32)

In [37]:
(yPratt_test_results[0] > 0.7).astype(theano.config.floatX)


Out[37]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  0.,  1.,  0.,  1.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.], dtype=float32)

In [ ]: