Softmax Classifier

In this lab, you will use a single layer Softmax to classify handwritten digits from the MNIST database.

Define Softmax, Criterion Function, Optimizer, and Train the Model

Estimated Time Needed: 25 min

Helper functions



In [ ]:

    
!conda install -y torchvision
import torch 
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
import matplotlib.pylab as plt
import numpy as np

Use the following function to plot out the parameters of the Softmax function:



In [ ]:

    
def PlotParameters(model): 
    # Weight matrix of the linear layer: one row of 784 values per class.
    W=model.state_dict()['linear.weight'].data
    w_min=W.min().item()
    w_max=W.max().item()
    fig, axes = plt.subplots(2, 5)
    fig.subplots_adjust(hspace=0.01, wspace=0.1)
    for i,ax in enumerate(axes.flat):
        if i<10:
            # Set the label for the sub-plot.
            ax.set_xlabel("class: {0}".format(i))

            # Plot the weights of class i as a 28x28 image.
            ax.imshow(W[i,:].view(28,28), vmin=w_min, vmax=w_max, cmap='seismic')

            ax.set_xticks([])
            ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Use the following function to visualize the data:



In [ ]:

    
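def show_data(data_sample):
    # data_sample is an (image, label) pair; the image is reshaped to 28x28 for display.
    plt.imshow(data_sample[0].numpy().reshape(28,28),cmap='gray')
    # int() accepts both a plain Python int and a one-element tensor label.
    plt.title('y= '+ str(int(data_sample[1])))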

Prepare Data

Load the training dataset by setting the parameter train to True, and convert each image to a tensor by passing a transform object as the transform argument.



In [ ]:

    
train_dataset=dsets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
train_dataset

Load the testing dataset by setting the parameter train to False, and convert each image to a tensor by passing a transform object as the transform argument.



In [ ]:

    
validation_dataset=dsets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
validation_dataset

You can see that the labels are stored as type long (64-bit integers):



In [ ]:

    
train_dataset.targets.dtype

Data Visualization

Each element in the rectangular tensor is a number that represents a pixel intensity.

Print out the label of the fourth sample (index 3):



In [ ]:

    
train_dataset[3][1]

Plot the fourth sample (index 3):



In [ ]:

    
show_data(train_dataset[3])

You see that it is a 1. Now, plot the third sample (index 2):



In [ ]:

    
show_data(train_dataset[2])

Build a Softmax Classifier

Build a Softmax classifier class:



In [ ]:

    
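class SoftMax(nn.Module):
    def __init__(self,input_size,output_size):
        super(SoftMax,self).__init__()
        # One linear layer maps the flattened image to one score per class.
        self.linear=nn.Linear(input_size,output_size)
    def forward(self,x):
        # Return the raw scores (logits); nn.CrossEntropyLoss applies the softmax itself.
        z=self.linear(x)
        return z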
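
The forward method above returns raw scores (logits) rather than probabilities, because nn.CrossEntropyLoss applies the softmax internally. The following is a minimal sketch, assuming a random batch in place of MNIST images, of how the logits could be converted to probabilities with torch.softmax. The names `linear`, `x`, `logits`, and `probs` are illustrative and not part of the lab.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the lab's single linear layer (784 inputs, 10 classes).
linear = nn.Linear(28 * 28, 10)

# Hypothetical batch of 4 flattened "images" filled with random values.
x = torch.rand(4, 28 * 28)

logits = linear(x)                     # raw scores, shape (4, 10)
probs = torch.softmax(logits, dim=1)   # each row now sums to 1

# Softmax preserves the ordering of the scores, so the predicted class
# is the same whether it is read from the logits or the probabilities.
pred_from_logits = logits.argmax(dim=1)
pred_from_probs = probs.argmax(dim=1)
print(probs.sum(dim=1))
print(pred_from_logits, pred_from_probs)
```

This is why the training loop can take the maximum over the model output directly when it computes the predicted class.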

The Softmax function requires vector inputs. Note that each image tensor has the shape 1x28x28:



In [ ]:

    
train_dataset[0][0].shape

Flatten each image tensor into a single vector. After flattening, the vector holds 28 x 28 = 784 values.
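
A minimal sketch of the flattening step, assuming random tensors as stand-ins for MNIST images (the names `image` and `batch` are illustrative and not part of the lab):

```python
import torch

# Hypothetical stand-in for one MNIST sample: 1 channel, 28 x 28 pixels.
image = torch.rand(1, 28, 28)

# view(-1, 28*28) keeps the data and reinterprets it as rows of 784 values.
flat = image.view(-1, 28 * 28)
print(image.shape, flat.shape)

# A batch of 100 images flattens to 100 rows of 784 values each.
batch = torch.rand(100, 1, 28, 28)
flat_batch = batch.view(-1, 28 * 28)
print(flat_batch.shape)
```

The -1 lets PyTorch infer the number of rows, so the same call works for a single image and for a whole batch.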

Set the input size and output size:



In [ ]:

    
input_dim=28*28
output_dim=10
input_dim

Define the Softmax Classifier, Criterion Function, Optimizer, and Train the Model



In [ ]:

    
model=SoftMax(input_dim,output_dim)
model

View the size of the model parameters:



In [ ]:

    
print('W:',list(model.parameters())[0].size())
print('b',list(model.parameters())[1].size())
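
As a quick check on those sizes, the sketch below counts the parameters of a linear layer with the same dimensions. It assumes a standalone nn.Linear called `layer` in place of the lab's model.

```python
import torch.nn as nn

# Stand-in layer with the lab's dimensions: 784 inputs, 10 outputs.
layer = nn.Linear(28 * 28, 10)

n_weights = layer.weight.numel()   # 10 rows x 784 columns
n_biases = layer.bias.numel()      # one bias per class
total = sum(p.numel() for p in layer.parameters())
print(layer.weight.shape, layer.bias.shape, total)
```

Each of the 10 rows of the weight matrix holds the 784 weights for one digit class.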

You can convert the model parameters for each class into a rectangular grid. Plot the model parameters for each class:



In [ ]:

    
PlotParameters(model)
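
The reshaping that the plot relies on can be sketched as follows, assuming a standalone nn.Linear named `layer`. In the lab the state-dict key is 'linear.weight' because the layer is stored as the attribute linear; here it is simply 'weight'.

```python
import torch
import torch.nn as nn

# Stand-in for the lab's linear layer.
layer = nn.Linear(28 * 28, 10)

# state_dict() exposes the weight matrix: one row of 784 values per class.
W = layer.state_dict()['weight']

# Reinterpret the weights of class 0 as a 28 x 28 grid, ready for imshow.
grid = W[0, :].view(28, 28)
print(W.shape, grid.shape)
```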

Define the criterion (loss) function:



In [ ]:

    
criterion=nn.CrossEntropyLoss()
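
To illustrate what this criterion computes, the sketch below compares nn.CrossEntropyLoss with a manual calculation: take the log-softmax of the logits, pick the log-probability of the true class, negate it, and average over the batch. The logits and targets are made-up values, not lab data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical scores for 5 samples and 10 classes, with made-up labels.
logits = torch.randn(5, 10)
targets = torch.tensor([3, 0, 9, 1, 4])

# Built-in loss, as used in the lab.
criterion = nn.CrossEntropyLoss()
loss_builtin = criterion(logits, targets)

# Manual version: log-softmax, select the true-class entry, negate, average.
log_probs = torch.log_softmax(logits, dim=1)
picked = log_probs[torch.arange(5), targets]
loss_manual = -picked.mean()

print(loss_builtin.item(), loss_manual.item())
```

Because the softmax is part of the loss, the model itself should output logits and not probabilities.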

Define the optimizer:



In [ ]:

    
learning_rate=0.1
optimizer=torch.optim.SGD(model.parameters(), lr=learning_rate)
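
A single update of plain SGD (no momentum) moves each parameter by the learning rate times its gradient. The sketch below checks this on a toy two-element parameter `w` with a made-up quadratic loss. Neither is part of the lab.

```python
import torch

# Toy parameter and the same optimizer type and learning rate as the lab.
w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# Made-up loss whose gradient is 2 * w.
loss = (w ** 2).sum()

optimizer.zero_grad()
loss.backward()

# Expected result of the update rule w <- w - lr * grad, computed before the step.
expected = w.detach() - 0.1 * w.grad

optimizer.step()
print(w.detach(), expected)
```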

Define the dataset loader:



In [ ]:

    
train_loader=torch.utils.data.DataLoader(dataset=train_dataset,batch_size=100)
validation_loader=torch.utils.data.DataLoader(dataset=validation_dataset,batch_size=5000)
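
The following sketch shows how a DataLoader splits a dataset into batches. It assumes a small synthetic TensorDataset in place of MNIST, so the names `images`, `labels`, and `dataset` are illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in dataset: 1,000 fake images with integer labels 0-9.
images = torch.rand(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))
dataset = TensorDataset(images, labels)

# Same batch size as the lab's training loader.
loader = DataLoader(dataset=dataset, batch_size=100)

n_batches = len(loader)     # number of batches per pass over the data
x, y = next(iter(loader))   # first batch of images and labels
print(n_batches, x.shape, y.shape)
```

With the 60,000 MNIST training images and a batch size of 100, each epoch therefore runs 600 parameter updates.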

Train the model and determine validation accuracy (should take a few minutes):



In [ ]:

    
n_epochs=10
loss_list=[]
accuracy_list=[]
N_test=len(validation_dataset)
for epoch in range(n_epochs):
    for x, y in train_loader:
        # clear gradients from the previous step
        optimizer.zero_grad()
        # make a prediction
        z=model(x.view(-1,28*28))
        # calculate loss
        loss=criterion(z,y)
        # calculate gradients of parameters
        loss.backward()
        # update parameters
        optimizer.step()

    correct=0
    # perform a prediction on the validation data
    for x_test, y_test in validation_loader:
        z=model(x_test.view(-1,28*28))
        _,yhat=torch.max(z.data,1)
        correct+=(yhat==y_test).sum().item()

    accuracy=correct/N_test
    # record the loss of the last training batch and the validation accuracy once per epoch
    loss_list.append(loss.item())
    accuracy_list.append(accuracy)
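
The validation step could also be wrapped in a helper that disables gradient tracking, which is not needed for evaluation. The sketch below is one possible version. The function `evaluate_accuracy`, the toy model, and the synthetic data are assumptions made for illustration and are not part of the lab.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate_accuracy(model, loader):
    """Return the fraction of samples in loader that model classifies correctly."""
    correct = 0
    total = 0
    with torch.no_grad():                  # no gradients needed for evaluation
        for x, y in loader:
            z = model(x.view(-1, 28 * 28))
            yhat = z.argmax(dim=1)         # class with the largest score
            correct += (yhat == y).sum().item()
            total += y.size(0)
    return correct / total

# Toy model that always predicts class 3: zero weights, one non-zero bias.
toy_model = nn.Linear(28 * 28, 10)
with torch.no_grad():
    toy_model.weight.zero_()
    toy_model.bias.zero_()
    toy_model.bias[3] = 1.0

# Synthetic data: 150 samples labelled 3 and 50 samples labelled 7.
images = torch.rand(200, 1, 28, 28)
labels = torch.cat([torch.full((150,), 3, dtype=torch.long),
                    torch.full((50,), 7, dtype=torch.long)])
loader = DataLoader(TensorDataset(images, labels), batch_size=50)

acc = evaluate_accuracy(toy_model, loader)
print(acc)
```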

Analyze Results

Plot the training loss and the accuracy on the validation data:



In [ ]:

    
fig, ax1 = plt.subplots()
# left axis: loss recorded at the end of each epoch
color = 'tab:red'
ax1.plot(loss_list,color=color)
ax1.set_xlabel('epoch')
ax1.set_ylabel('loss (last training batch)',color=color)
ax1.tick_params(axis='y', labelcolor=color)

# right axis: validation accuracy, sharing the same x-axis
ax2 = ax1.twinx()
color = 'tab:blue'
ax2.set_ylabel('accuracy', color=color)
ax2.plot(accuracy_list, color=color)
ax2.tick_params(axis='y', labelcolor=color)
fig.tight_layout()

View the parameters for each class after training. You can see that the weights for each class now resemble the corresponding digit.



In [ ]:

    
PlotParameters(model)

Plot the first five misclassified samples:



In [ ]:

    
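count=0
for x,y in validation_dataset:
    # classify one flattened image
    z=model(x.reshape(-1,28*28))
    _,yhat=torch.max(z,1)
    # show the sample when the predicted class differs from the label
    if yhat.item()!=y:
        show_data((x,y))
        plt.show()
        print("yhat:",yhat.item())
        count+=1
    if count>=5:
        break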
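
When predictions for many samples are already available as a tensor, the misclassified positions can also be found without a Python loop. A minimal sketch, assuming made-up prediction and label tensors:

```python
import torch

# Hypothetical predictions and true labels for 10 samples.
yhat = torch.tensor([7, 2, 1, 0, 9, 1, 4, 9, 6, 9])
y = torch.tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])

# Indices where the prediction differs from the label.
wrong = torch.nonzero(yhat != y).flatten()

# Keep at most the first five, as in the loop above.
first_five = wrong[:5].tolist()
print(first_five)
```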

About the Authors:

Joseph Santarcangelo has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition.

Other contributors: Michelle Carey, Mavis Zhou
