Hidden Layer Deep Network: Sigmoid, Tanh and Relu Activation Functions on the MNIST Dataset

In this lab, you will test the Sigmoid, Tanh and Relu activation functions on the MNIST dataset, using a network with two hidden layers.

Table of Contents

Neural Network Module and Training Function

Prepare Data

Define Neural Network, Criterion function, Optimizer and Train the Model

Test Sigmoid, Tanh and Relu

Analyse Results

Estimated Time Needed: 25 min

You'll need the following libraries:



In [1]:

    
!conda install -y torchvision
import torch 
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
import torch.nn.functional as F
import matplotlib.pylab as plt
import numpy as np
torch.manual_seed(2)

Solving environment: done

# All requested packages already installed.

    Out[1]:

<torch._C.Generator at 0x7f35780ed8d0>

Neural Network Module and Training Function

Define the neural network module (class) with two hidden layers. This version applies the sigmoid activation function after each hidden layer.



In [2]:

    
class Net(nn.Module):
    def __init__(self,D_in,H1,H2,D_out):
        super(Net,self).__init__()
        self.linear1=nn.Linear(D_in,H1)
        self.linear2=nn.Linear(H1,H2)
        self.linear3=nn.Linear(H2,D_out)
       
    def forward(self,x):
        x=torch.sigmoid(self.linear1(x)) 
        x=torch.sigmoid(self.linear2(x))
        x=self.linear3(x)
        return x

Define the class with the Tanh activation function.



In [3]:

    
class NetTanh(nn.Module):
    def __init__(self,D_in,H1,H2,D_out):
        super(NetTanh,self).__init__()
        self.linear1=nn.Linear(D_in,H1)
        self.linear2=nn.Linear(H1,H2)
        self.linear3=nn.Linear(H2,D_out)
        
    def forward(self,x):
        x=torch.tanh(self.linear1(x))
        x=torch.tanh(self.linear2(x))
        x=self.linear3(x)
        return x

Define the class with the Relu activation function.



In [4]:

    
class NetRelu(nn.Module):
    def __init__(self,D_in,H1,H2,D_out):
        super(NetRelu,self).__init__()
        self.linear1=nn.Linear(D_in,H1)
        self.linear2=nn.Linear(H1,H2)
        self.linear3=nn.Linear(H2,D_out)
      
    def forward(self,x):
        x=F.relu(self.linear1(x))  
        x=F.relu(self.linear2(x))
        x=self.linear3(x)
        return x
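The three classes above differ only in the activation applied after each hidden layer. As a rough, framework-free sketch (the helper names below are illustrative and not part of the lab), the functions and their derivatives can be written out directly. The derivative is what backpropagation multiplies through each layer: the sigmoid derivative never exceeds 0.25, while the tanh derivative reaches 1 at zero and the relu derivative is 1 for every positive input. This is one common explanation for why sigmoid networks tend to train more slowly.

```python
import math


def sigmoid(z):
    # squashes any real number into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    # derivative of the sigmoid; its maximum value is 0.25 at z = 0
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_grad(z):
    # derivative of tanh; its maximum value is 1 at z = 0
    return 1.0 - math.tanh(z) ** 2


def relu(z):
    # passes positive inputs through unchanged and zeroes out the rest
    return max(0.0, z)


def relu_grad(z):
    # derivative of relu; taken to be 0 at z = 0 by convention
    return 1.0 if z > 0 else 0.0


# compare the derivatives at a few sample inputs
for z in (-4.0, 0.0, 4.0):
    print(z, sigmoid_grad(z), tanh_grad(z), relu_grad(z))
```

Both sigmoid and tanh have derivatives close to zero for large positive or negative inputs, whereas relu keeps a derivative of 1 for any positive input.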

Define a function to train the model. The function returns a Python dictionary that stores the training loss for every iteration and the accuracy on the validation data for every epoch.



In [5]:

    
def train(model, criterion, train_loader, validation_loader, optimizer, epochs=100):
    # per-iteration training loss and per-epoch validation accuracy
    useful_stuff = {'training_loss': [], 'validation_accuracy': []}

    for epoch in range(epochs):
        for x, y in train_loader:
            # clear the gradients left over from the previous iteration
            optimizer.zero_grad()
            # flatten each 28x28 image to a 784-vector and compute the logits
            z = model(x.view(-1, 28 * 28))
            # calculate the loss
            loss = criterion(z, y)
            # calculate the gradients of the loss with respect to the parameters
            loss.backward()
            # update the parameters
            optimizer.step()
            useful_stuff['training_loss'].append(loss.item())

        correct = 0
        # gradients are not needed when evaluating on the validation data
        with torch.no_grad():
            for x, y in validation_loader:
                yhat = model(x.view(-1, 28 * 28))
                # the predicted class is the index of the largest logit
                _, label = torch.max(yhat, 1)
                correct += (label == y).sum().item()

        # use the loader's own dataset instead of relying on a global variable
        accuracy = 100 * (correct / len(validation_loader.dataset))
        useful_stuff['validation_accuracy'].append(accuracy)

    return useful_stuff
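The validation step in the training function picks, for each sample, the class whose logit is largest (the indices returned by torch.max along dimension 1) and counts how many of those predictions match the labels. A minimal pure-Python sketch of that counting logic, with a hypothetical helper name and made-up toy logits, might look like this:

```python
def batch_correct(logits, labels):
    # count the samples whose largest logit sits at the index of the true label
    correct = 0
    for row, label in zip(logits, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct


# toy batch: 4 samples, 3 classes (values invented for illustration)
toy_logits = [
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 0.9],
    [0.4, 0.3, 0.2],
]
toy_labels = [1, 0, 1, 0]

correct = batch_correct(toy_logits, toy_labels)
accuracy = 100 * correct / len(toy_labels)
print(correct, accuracy)
```

In the lab itself the count is accumulated over all validation batches before dividing by the size of the validation dataset.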

Prepare Data

Load the training dataset by setting the parameter train to True, and convert it to a tensor by passing a transform object to the transform argument.



In [6]:

    
train_dataset=dsets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())

Load the testing dataset, which this lab uses as validation data, by setting the parameter train to False, and convert it to a tensor by passing a transform object to the transform argument.



In [7]:

    
validation_dataset=dsets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
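For 8-bit images such as MNIST, transforms.ToTensor() rescales the pixel values from the range 0 to 255 into floats between 0.0 and 1.0, and the training function later flattens each 28x28 image into a vector of 784 values before passing it to the network. The following is a pure-Python sketch of those two steps on a made-up image; the helper names are hypothetical, and the real transform also adds a channel dimension that is omitted here.

```python
def to_unit_range(image):
    # rescale 8-bit pixel values (0..255) to floats in [0.0, 1.0]
    return [[pixel / 255.0 for pixel in row] for row in image]


def flatten(image):
    # concatenate the rows so that a 28x28 image becomes a 784-element vector
    return [pixel for row in image for pixel in row]


# toy 28x28 "image" with alternating black and white pixels
toy_image = [[0, 255] * 14 for _ in range(28)]

scaled = to_unit_range(toy_image)
vector = flatten(scaled)
print(len(vector), min(vector), max(vector))
```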

Create the criterion function.



In [8]:

    
criterion=nn.CrossEntropyLoss()
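nn.CrossEntropyLoss expects raw logits, which is why the forward methods above do not apply a softmax to the output of the last layer. For a single sample, the loss is the negative log of the softmax probability assigned to the true class. A minimal sketch of that calculation in plain Python (the function name is illustrative, and the averaging over the batch that the criterion performs by default is left out):

```python
import math


def cross_entropy(logits, target):
    # log-sum-exp with the maximum subtracted for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    # negative log-probability of the target class
    return log_sum_exp - logits[target]


# ten equal logits: every class gets probability 1/10, so the loss is ln(10)
uniform_loss = cross_entropy([0.0] * 10, 3)

# the target class has the largest logit, so the loss is smaller
confident_loss = cross_entropy([2.0, 1.0, 0.1], 0)

print(uniform_loss, confident_loss)
```

The first value, ln(10) or roughly 2.3, is a useful reference point: a ten-class network that has not yet learned anything typically starts with a training loss near it.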

Create the training-data loader and the validation-data loader objects.



In [9]:

    
train_loader=torch.utils.data.DataLoader(dataset=train_dataset,batch_size=2000,shuffle=True)
validation_loader=torch.utils.data.DataLoader(dataset=validation_dataset,batch_size=5000,shuffle=False)
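The batch sizes determine how many parameter updates happen in each epoch, and therefore how many entries the training-loss list will contain. A quick back-of-the-envelope check, assuming the standard MNIST split of 60,000 training and 10,000 test images and the 10 epochs that this lab sets in a later cell:

```python
import math

# standard MNIST split sizes (assumed here, not read from the dataset objects)
train_size = 60000
validation_size = 10000

# batch sizes used by the two data loaders
train_batch_size = 2000
validation_batch_size = 5000

# number of epochs, taken from the value the lab sets in a later cell
epochs = 10

iterations_per_epoch = math.ceil(train_size / train_batch_size)
validation_batches = math.ceil(validation_size / validation_batch_size)
total_iterations = iterations_per_epoch * epochs

print(iterations_per_epoch, validation_batches, total_iterations)
```

Under these assumptions the training-loss plot has 300 points on its horizontal axis, while the validation-accuracy plot has 10.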

Define Neural Network, Criterion function, Optimizer and Train the Model


Set the layer sizes for a network with two hidden layers of 50 neurons each.



In [11]:

    
input_dim=28*28
hidden_dim1=50
hidden_dim2=50
output_dim=10
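With these sizes, each of the three networks has the same number of trainable parameters, because an nn.Linear layer stores one weight per input-output pair plus one bias per output. A short sketch of the count (the helper name is hypothetical):

```python
def linear_params(n_in, n_out):
    # a fully connected layer has n_in * n_out weights and n_out biases
    return n_in * n_out + n_out


input_dim = 28 * 28
hidden_dim1 = 50
hidden_dim2 = 50
output_dim = 10

layer_sizes = [
    (input_dim, hidden_dim1),
    (hidden_dim1, hidden_dim2),
    (hidden_dim2, output_dim),
]
per_layer = [linear_params(n_in, n_out) for n_in, n_out in layer_sizes]
total_params = sum(per_layer)

print(per_layer, total_params)
```

Since the parameter counts are identical, any difference in the results below comes from the activation function (and the random initialisation), not from model capacity.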

Set the number of training epochs. The video uses 35 epochs, which may take a long time to run; 10 is enough for a first try.



In [12]:

    
cust_epochs = 10

Test Sigmoid, Tanh and Relu

Train the network using the Sigmoid activation function.



In [13]:

    
model=Net(input_dim,hidden_dim1,hidden_dim2,output_dim)

learning_rate=0.01
optimizer=torch.optim.SGD(model.parameters(),lr=learning_rate)
training_results=train(model,criterion, train_loader,validation_loader, optimizer, epochs=cust_epochs)
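torch.optim.SGD with its default settings (no momentum, no weight decay) updates each parameter by subtracting the learning rate times that parameter's gradient. A minimal sketch of one such step on plain Python lists, using the same learning rate of 0.01 and made-up parameter and gradient values:

```python
def sgd_step(params, grads, lr):
    # plain gradient descent: move each parameter against its gradient
    return [p - lr * g for p, g in zip(params, grads)]


# toy parameters and gradients (values invented for illustration)
params = [0.5, -1.0]
grads = [2.0, -4.0]

updated = sgd_step(params, grads, lr=0.01)
print(updated)
```

A positive gradient lowers the parameter and a negative gradient raises it, each by 1% of the gradient's magnitude at this learning rate.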

Train the network using the Tanh activation function.



In [14]:

    
model_Tanh=NetTanh(input_dim,hidden_dim1,hidden_dim2,output_dim)
optimizer=torch.optim.SGD(model_Tanh.parameters(),lr=learning_rate)
training_results_tanch=train(model_Tanh,criterion, train_loader,validation_loader, optimizer, epochs=cust_epochs)

Train the network using the Relu activation function.



In [15]:

    
modelRelu=NetRelu(input_dim,hidden_dim1,hidden_dim2,output_dim)
optimizer=torch.optim.SGD(modelRelu.parameters(),lr=learning_rate)
training_results_relu=train(modelRelu,criterion, train_loader,validation_loader, optimizer, epochs=cust_epochs)

Analyse Results

Compare the training loss for each activation function.



In [16]:

    
plt.plot(training_results_tanch['training_loss'],label='tanh')
plt.plot(training_results['training_loss'],label='sigmoid')
plt.plot(training_results_relu['training_loss'],label='relu')
plt.ylabel('loss')
plt.xlabel('iteration')
plt.title('training loss per iteration')
plt.legend()

    Out[16]:

<matplotlib.legend.Legend at 0x7f35212cba90>
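The training loss is recorded once per mini-batch, so the curves can look noisy and their horizontal axis does not line up with the per-epoch validation accuracy. One optional way to compare the two on the same scale is to average the loss within each epoch. The sketch below uses a hypothetical helper and made-up loss values; for the lab's data, the chunk size would be the number of iterations per epoch (30, assuming 60,000 training images and a batch size of 2000).

```python
def mean_per_epoch(losses, iterations_per_epoch):
    # split the per-iteration losses into consecutive epoch-sized chunks
    chunks = [
        losses[start:start + iterations_per_epoch]
        for start in range(0, len(losses), iterations_per_epoch)
    ]
    # average each chunk to get one loss value per epoch
    return [sum(chunk) / len(chunk) for chunk in chunks]


# toy example: 3 epochs with 2 iterations each (values invented for illustration)
toy_losses = [2.0, 1.0, 1.0, 0.5, 0.5, 0.25]
epoch_means = mean_per_epoch(toy_losses, 2)
print(epoch_means)
```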

Compare the validation accuracy for each model.



In [17]:

    
plt.plot(training_results_tanch['validation_accuracy'],label='tanh')
plt.plot(training_results['validation_accuracy'],label='sigmoid')
plt.plot(training_results_relu['validation_accuracy'],label='relu') 
plt.ylabel('validation accuracy')
plt.xlabel('epochs')
plt.legend()

    Out[17]:

<matplotlib.legend.Legend at 0x7f352132a358>

About the Authors:

Joseph Santarcangelo has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition.

Other contributors: Michelle Carey, Mavis Zhou
