Practice: Use the Sequential Constructor to Test the Sigmoid, Tanh, and ReLU Activation Functions on the MNIST Dataset

In this lab, you will test the sigmoid, tanh, and ReLU activation functions on the MNIST dataset.

Estimated Time Needed: 25 min

Import the following libraries:



In [ ]:

    
import torch 
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
import torch.nn.functional as F
import matplotlib.pylab as plt
import numpy as np

Neural Network Module and Training Function

Define a function to train the model. In this case, the function returns a Python dictionary to store the training loss and accuracy on the validation data.



In [ ]:

    
def train(model, criterion, train_loader, validation_loader, optimizer, epochs=100):
    # store the loss of every training iteration and the accuracy of every epoch
    useful_stuff = {'training_loss': [], 'validation_accuracy': []}

    for epoch in range(epochs):
        for x, y in train_loader:
            # clear gradients
            optimizer.zero_grad()
            # make a prediction (logits) on the flattened images
            z = model(x.view(-1, 28 * 28))
            # calculate loss
            loss = criterion(z, y)
            # calculate gradients of parameters
            loss.backward()
            # update parameters
            optimizer.step()
            useful_stuff['training_loss'].append(loss.item())

        correct = 0
        total = 0
        # gradients are not needed to evaluate the model
        with torch.no_grad():
            for x, y in validation_loader:
                # perform a prediction on the validation data
                yhat = model(x.view(-1, 28 * 28))
                _, label = torch.max(yhat, 1)
                correct += (label == y).sum().item()
                total += y.size(0)

        # percentage of validation samples classified correctly
        accuracy = 100 * (correct / total)
        useful_stuff['validation_accuracy'].append(accuracy)

    return useful_stuff

Prepare Data

Load the training dataset by setting the parameter train to True and convert it to a tensor by placing a transform object in the argument transform:



In [ ]:

    
train_dataset=dsets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())

Load the validation dataset (the MNIST test split) by setting the parameter train to False and convert it to a tensor by placing a transform object in the argument transform:



In [ ]:

    
validation_dataset=dsets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())

Create the training-data loader and the validation-data loader objects:



In [ ]:

    
train_loader=torch.utils.data.DataLoader(dataset=train_dataset,batch_size=2000,shuffle=True)
validation_loader=torch.utils.data.DataLoader(dataset=validation_dataset,batch_size=5000,shuffle=False)
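The batch size determines how many loss values the training function records. The sketch below uses a synthetic stand-in dataset (the name `fake_train` is hypothetical) with the same 60,000 samples as the MNIST training split, so it runs without downloading anything. It shows the number of iterations per epoch and the number of loss values collected over the 30 epochs used later in this lab.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in with as many samples as the MNIST training split (60,000)
fake_train = TensorDataset(torch.zeros(60000, 1),
                           torch.zeros(60000, dtype=torch.long))
fake_loader = DataLoader(dataset=fake_train, batch_size=2000, shuffle=True)

# One optimizer step (and one recorded loss value) per batch
iterations_per_epoch = len(fake_loader)
print(iterations_per_epoch)       # 30
print(iterations_per_epoch * 30)  # 900 loss values after 30 epochs
```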

Criterion Function

Create the criterion function:



In [ ]:

    
criterion=nn.CrossEntropyLoss()
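As a quick illustration of what the criterion expects, the sketch below passes raw logits and integer class labels to `nn.CrossEntropyLoss`. The tensors are made-up values, not MNIST data. With all-zero logits the predicted distribution is uniform over 10 classes, so the loss should equal ln(10).

```python
import math

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Hypothetical logits for 3 samples and 10 classes (all zeros = uniform prediction)
logits = torch.zeros(3, 10)
# Integer class labels, one per sample
labels = torch.tensor([0, 5, 9])

# The criterion applies log-softmax internally, so the model should output raw logits
loss = criterion(logits, labels)
print(loss.item(), math.log(10))
```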

Test Sigmoid, Tanh, and ReLU and Train the Model

Use the following parameters to construct the model:



In [ ]:

    
input_dim=28*28
hidden_dim=100
output_dim=10

Use nn.Sequential to build a neural network named `model` with one hidden layer and a sigmoid activation to classify the 10 digits from the MNIST dataset.
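A minimal sketch of one possible solution, assuming the `input_dim`, `hidden_dim`, and `output_dim` values defined above (repeated here so the snippet runs on its own). The last layer returns raw logits because `nn.CrossEntropyLoss` applies the softmax internally.

```python
import torch
import torch.nn as nn

# Dimensions from the lab, repeated so the snippet is self-contained
input_dim = 28 * 28
hidden_dim = 100
output_dim = 10

# One hidden layer with a sigmoid activation; the output layer returns logits
model = nn.Sequential(
    nn.Linear(input_dim, hidden_dim),
    nn.Sigmoid(),
    nn.Linear(hidden_dim, output_dim),
)

# Sanity check on a random batch shaped like four flattened MNIST images
logits = model(torch.rand(4, input_dim))
print(logits.shape)  # torch.Size([4, 10])
```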



In [ ]:



In [ ]:

    
learning_rate=0.01
optimizer=torch.optim.SGD(model.parameters(),lr=learning_rate)
training_results=train(model,criterion, train_loader,validation_loader, optimizer, epochs=30)

Train the network by using the tanh activation function:

Use nn.Sequential to build a neural network named `model_Tanh` with one hidden layer and a tanh activation to classify the 10 digits from the MNIST dataset.
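One way this could look is sketched below, again assuming the lab's layer sizes. Only the activation module changes relative to the sigmoid network. The check at the end confirms that the hidden activations stay within the tanh range of -1 to 1.

```python
import torch
import torch.nn as nn

# Layer sizes from the lab, repeated so the snippet is self-contained
input_dim, hidden_dim, output_dim = 28 * 28, 100, 10

# Same architecture as the sigmoid network, with nn.Tanh as the activation
model_Tanh = nn.Sequential(
    nn.Linear(input_dim, hidden_dim),
    nn.Tanh(),
    nn.Linear(hidden_dim, output_dim),
)

# Apply the first two modules by index to inspect the hidden activations
x = torch.rand(4, input_dim)
hidden = model_Tanh[1](model_Tanh[0](x))
print(hidden.min().item(), hidden.max().item())
```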



In [ ]:



In [ ]:

    
optimizer=torch.optim.SGD(model_Tanh.parameters(),lr=learning_rate)
training_results_tanch=train(model_Tanh,criterion, train_loader,validation_loader, optimizer, epochs=30)

Use nn.Sequential to build a neural network named `modelRelu` with one hidden layer and a ReLU activation to classify the 10 digits from the MNIST dataset.
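A possible sketch for the ReLU variant follows, under the same assumed layer sizes. ReLU sets negative pre-activations to zero, which the final check illustrates on a random batch.

```python
import torch
import torch.nn as nn

# Layer sizes from the lab, repeated so the snippet is self-contained
input_dim, hidden_dim, output_dim = 28 * 28, 100, 10

# Same architecture as before, with nn.ReLU as the activation
modelRelu = nn.Sequential(
    nn.Linear(input_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, output_dim),
)

# Hidden activations after ReLU are never negative
x = torch.rand(4, input_dim)
hidden = modelRelu[1](modelRelu[0](x))
print(hidden.min().item() >= 0)  # True
```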



In [ ]:



In [ ]:

    
optimizer=torch.optim.SGD(modelRelu.parameters(),lr=learning_rate)
training_results_relu=train(modelRelu,criterion, train_loader,validation_loader, optimizer, epochs=30)

Analyze Results

Compare the training loss for each activation:



In [ ]:

    
plt.plot(training_results_tanch['training_loss'],label='tanh')
plt.plot(training_results['training_loss'],label='sigmoid')
plt.plot(training_results_relu['training_loss'],label='relu')
plt.ylabel('loss')
plt.title('training loss iterations')
plt.legend()
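The training loss is recorded once per iteration, while the validation accuracy is recorded once per epoch, so the two plots use different x-axes. If a per-epoch view of the loss is preferred, the values can be averaged within each epoch. The sketch below uses a made-up loss curve (`fake_loss` is hypothetical) and assumes 30 iterations per epoch, which corresponds to 60,000 samples with a batch size of 2000.

```python
import numpy as np

# Assumptions: 30 iterations per epoch and 30 epochs, as in this lab's settings
iterations_per_epoch = 30
epochs = 30

# Hypothetical per-iteration loss values standing in for a 'training_loss' list
fake_loss = np.linspace(2.3, 0.5, iterations_per_epoch * epochs)

# One row per epoch, then average across the iterations in each row
loss_per_epoch = fake_loss.reshape(epochs, iterations_per_epoch).mean(axis=1)
print(loss_per_epoch.shape)  # (30,)
```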

Compare the validation accuracy for each model:



In [ ]:

    
plt.plot(training_results_tanch['validation_accuracy'],label='tanh')
plt.plot(training_results['validation_accuracy'],label='sigmoid')
plt.plot(training_results_relu['validation_accuracy'],label='relu')
plt.ylabel('validation accuracy')
plt.xlabel('epochs')
plt.legend()

About the Authors:

Joseph Santarcangelo has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition.

Other contributors: Michelle Carey, Mavis Zhou
