Logistic Regression and Bad Initialization Value

In this lab, you will see what happens when you use the mean squared error (MSE) cost, or total loss, function and select bad initialization values for the parameters.
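
To see why a bad starting point matters, look at the gradient of the per-sample MSE loss: d/dw (y - sigmoid(w*x + b))^2 = -2*(y - s)*s*(1 - s)*x, where s = sigmoid(w*x + b). When w*x + b is far from zero, the sigmoid saturates, the factor s*(1 - s) is nearly zero, and gradient descent barely updates the parameters. Here is a minimal numerical sketch using the bad starting values w = -5 and b = -10 that this lab sets later:

In [ ]:
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w, b = -5.0, -10.0   # the bad initialization used later in this lab
x, y = 0.5, 1.0      # a sample with x > 0.2, so its label is 1
s = sigmoid(w * x + b)
grad_w = -2 * (y - s) * s * (1 - s) * x
print(s, grad_w)     # s is almost 0 and grad_w is on the order of 1e-6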


Import the following libraries:


In [ ]:
import numpy as np
import matplotlib.pyplot as plt 
from mpl_toolkits import mplot3d

In [ ]:
import torch
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn

Helper functions

The class plot_error_surfaces exists only to help you visualize the data space and the parameter space during training; it has nothing to do with PyTorch.


In [ ]:
class plot_error_surfaces(object):
    def __init__(self,w_range, b_range,X,Y,n_samples=30,go=True):
        W = np.linspace(-w_range, w_range, n_samples)
        B = np.linspace(-b_range, b_range, n_samples)
        w, b = np.meshgrid(W, B)    
        Z = np.zeros((n_samples, n_samples))   # loss value for each (w, b) grid point
        count1=0
        self.y=Y.numpy()
        self.x=X.numpy()
        for w1, b1 in zip(w, b):
            count2 = 0
            for w2, b2 in zip(w1, b1):
                # MSE between the labels and the sigmoid output for this (w, b) pair
                Z[count1, count2] = np.mean((self.y - 1 / (1 + np.exp(-(w2 * self.x + b2)))) ** 2)
                count2 += 1
    
            count1 +=1
        self.Z=Z
        self.w=w
        self.b=b
        self.W=[]
        self.B=[]
        self.LOSS=[]
        self.n=0
        if go:
            plt.figure(figsize=(7.5, 5))
            plt.axes(projection='3d').plot_surface(self.w, self.b, self.Z, rstride=1, cstride=1,cmap='viridis', edgecolor='none')
            plt.title('Loss Surface')
            plt.xlabel('w')
            plt.ylabel('b')
            plt.show()
            plt.figure()
            plt.title('Loss Surface Contour')
            plt.xlabel('w')
            plt.ylabel('b')
            plt.contour(self.w, self.b, self.Z)
            plt.show()
    def get_stuff(self, model, loss):
        # record the current parameter values and loss so they can be plotted later
        self.n = self.n + 1
        self.W.append(list(model.parameters())[0].item())
        self.B.append(list(model.parameters())[1].item())
        self.LOSS.append(loss)
        
    def final_plot(self): 
        ax = plt.axes(projection='3d')
        ax.plot_wireframe(self.w, self.b, self.Z)
        ax.scatter(self.W,self.B, self.LOSS, c='r', marker='x',s=200,alpha=1)
        plt.figure()
        plt.contour(self.w,self.b, self.Z)
        plt.scatter(self.W,self.B,c='r', marker='x')
        plt.xlabel('w')
        plt.ylabel('b')
        plt.show()
    def plot_ps(self):
        plt.subplot(121)
        plt.plot(self.x, self.y, 'ro', label="training points")
        plt.plot(self.x,self.W[-1]*self.x+self.B[-1],label="estimated line")
        plt.plot(self.x,1 / (1 + np.exp(-1*(self.W[-1]*self.x+self.B[-1]))),label='sigmoid')
        plt.xlabel('x')
        plt.ylabel('y')
        plt.ylim((-0.1, 2))
        plt.title('Data Space Iteration: '+str(self.n))
        plt.legend()
        plt.show()
        plt.subplot(122)
        plt.contour(self.w,self.b, self.Z)
        plt.scatter(self.W,self.B,c='r', marker='x')
        plt.title('Loss Surface Contour Iteration: ' + str(self.n))
        plt.xlabel('w')
        plt.ylabel('b')
        plt.show()

In [ ]:
torch.manual_seed(0)

In [ ]:
def PlotStuff(X, Y, model, epoch, leg=True):
    # plot the model output at this epoch against the training labels
    plt.plot(X.numpy(), model(X).detach().numpy(), label='epoch ' + str(epoch))
    plt.plot(X.numpy(), Y.numpy(), 'r')
    if leg:
        plt.legend()

Get Some Data


In [ ]:
class Data(Dataset):
    def __init__(self):
        self.x = torch.arange(-1, 1, 0.1).view(-1, 1)
        self.y = torch.zeros(self.x.shape[0], 1)
        self.y[self.x[:, 0] > 0.2] = 1   # label is 1 where x > 0.2, otherwise 0
        self.len = self.x.shape[0]
    def __getitem__(self,index):      
        return self.x[index],self.y[index]
    def __len__(self):
        return self.len

In [ ]:
data_set=Data()

In [ ]:
trainloader=DataLoader(dataset=data_set,batch_size=3)
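
As a quick check (not part of the original lab), you can pull one mini-batch from the loader to confirm the shapes; with batch_size=3, each batch is a pair of tensors of shape (3, 1):

In [ ]:
x, y = next(iter(trainloader))
print(x.shape, y.shape)  # torch.Size([3, 1]) torch.Size([3, 1])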

Create the Model and Total Loss Function (cost)

Create a custom module for logistic regression:


In [ ]:
class logistic_regression(nn.Module):
    def __init__(self, n_inputs):
        super(logistic_regression, self).__init__()
        self.linear = nn.Linear(n_inputs, 1)
    def forward(self, x):
        # apply the sigmoid to the linear output to get a probability
        yhat = torch.sigmoid(self.linear(x))
        return yhat
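
For reference, the same model can be built without a custom module; this nn.Sequential one-liner is equivalent (shown only for comparison, the lab uses the custom class above):

In [ ]:
model_sequential = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())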

Create a logistic regression object or model:


In [ ]:
model=logistic_regression(1)

Replace the randomly initialized parameter values with predetermined values that will not converge:


In [ ]:
model.state_dict()['linear.weight'].data[0]=torch.tensor([-5.0])
model.state_dict()['linear.bias'].data[0]=torch.tensor(-10.0)

In [ ]:
model.state_dict()
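
The output should show linear.weight set to -5 and linear.bias set to -10, confirming the assignment above.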

Create a plot_error_surfaces object to visualize the data space and the parameter space during training:


In [ ]:
get_surface=plot_error_surfaces(15,13,data_set[:][0],data_set[:][1],30)

Define the cost or criterion function:


In [ ]:
criterion_rms=nn.MSELoss()  # note: nn.MSELoss is the mean squared error, not the root mean square error
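
For comparison, the standard criterion for logistic regression is binary cross-entropy; it is shown here only for reference and is deliberately not used in this lab, whose point is to watch the MSE criterion fail:

In [ ]:
criterion_bce = nn.BCELoss()  # standard logistic regression criterion, not used below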

Set the learning rate and create an optimizer object:


In [ ]:
learning_rate=2

optimizer=torch.optim.SGD(model.parameters(), lr=learning_rate)

Train the Model via Mini-Batch Gradient Descent


In [ ]:
for epoch in range(100):
    
    for x,y in trainloader:
        #make a prediction 
        yhat= model(x)
        #calculate the loss
        loss = criterion_rms(yhat, y)
        #clear gradient
        optimizer.zero_grad()
        #Backward pass: compute gradient of the loss with respect to all the learnable parameters
        loss.backward()
        #the step function on an Optimizer makes an update to its parameters
        optimizer.step()
        #store the parameters and loss for plotting
        get_surface.get_stuff(model,loss.tolist())
    #plot every 20 epochs
    if epoch%20==0:
        get_surface.plot_ps()
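
In the contour plots, notice that the parameter values barely move from the starting point: the bad initialization places them in a flat region of the loss surface where the gradients are close to zero.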

Get the predicted class of each sample and calculate the accuracy on the training data:


In [ ]:
yhat=model(data_set.x)
label=yhat>0.5   # predicted class: 1 if the model output exceeds 0.5
print(torch.mean((label==data_set.y.type(torch.ByteTensor)).type(torch.float)))

Accuracy is 60%, compared to 100% in the last lab, which used a good initialization value.
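
To see the contrast yourself, here is a sketch (not part of the original lab) that retrains from PyTorch's default random initialization using the same loop; starting near zero keeps the sigmoid out of its saturated region, so training can make progress:

In [ ]:
model_good = logistic_regression(1)   # default random initialization
optimizer_good = torch.optim.SGD(model_good.parameters(), lr=learning_rate)
for epoch in range(100):
    for x, y in trainloader:
        loss = criterion_rms(model_good(x), y)
        optimizer_good.zero_grad()
        loss.backward()
        optimizer_good.step()
label_good = model_good(data_set.x) > 0.5
print(torch.mean((label_good == data_set.y.type(torch.ByteTensor)).type(torch.float)))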

About the Authors:

Joseph Santarcangelo has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition.

Other contributors: Michelle Carey, Mavis Zhou


Copyright © 2018 cognitiveclass.ai. This notebook and its source code are released under the terms of the MIT License.