Define the network:


In [1]:
import torch                        # PyTorch base package
from torch.autograd import Variable # Tensor wrapper that records gradients
import torch.nn as nn               # modules, layers, loss functions
import torch.nn.functional as F     # functional conv, pooling, activation, loss and normalization ops

In [11]:
class Net(nn.Module):
    
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        # 16*5*5 = 400: for a 32x32 input, two conv+pool stages leave 16 feature maps of 5x5
        self.fc1 = nn.Linear(16*5*5, 120) # Linear is a fully-connected (dense) layer
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        
    def forward(self, x):
        # Max pooling over a (2, 2) window
        # x = torch.nn.functional.max_pool2d(torch.nn.functional.relu(self.conv1(x)), (2,2))
        x = F.max_pool2d(F.relu(self.conv1(x)), (2,2))
        # If size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
    
    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

In [12]:
net = Net()
print(net)


Net(
  (conv1): Conv2d (1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d (6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120)
  (fc2): Linear(in_features=120, out_features=84)
  (fc3): Linear(in_features=84, out_features=10)
)

You just have to define the forward function, and the backward function (where the gradients are computed) is automatically defined for you using autograd. You can use any of the Tensor operations in the forward function.
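For example, a tiny autograd sketch (an added illustration, not part of the original notebook):


In [ ]:
x = Variable(torch.ones(2, 2), requires_grad=True)  # leaf Variable tracking gradients
y = (x * 3).sum()   # any Tensor operation can appear in a forward computation
y.backward()        # the backward pass is derived automatically by autograd
print(x.grad)       # dy/dx = 3 everywhere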

The learnable parameters of a model are returned by net.parameters().


In [13]:
pars = list(net.parameters())
print(len(pars))
print(pars[0].size())  # conv1's .weight


10
torch.Size([6, 1, 5, 5])
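
The 10 entries are the weight and bias of each of the five layers. To see which entry is which, one can iterate named_parameters() (a small sketch, not in the original):


In [ ]:
for name, p in net.named_parameters():
    print(name, p.size())   # e.g. conv1.weight -> torch.Size([6, 1, 5, 5])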

The input to forward is an autograd.Variable, and so is the output. NOTE: the expected input size for this net (LeNet) is 32x32. To use this net on the MNIST dataset, resize the images from the dataset to 32x32.
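
As an aside, a hedged sketch of such resizing with torchvision (assuming torchvision is installed; older releases name the resize transform Scale instead of Resize):


In [ ]:
import torchvision
import torchvision.transforms as transforms

# Resize each 28x28 MNIST image to 32x32, then convert it to a tensor.
transform = transforms.Compose([
    transforms.Resize(32),
    transforms.ToTensor(),
])
trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                      download=True, transform=transform)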


In [14]:
input = Variable(torch.randn(1, 1, 32, 32))
out = net(input)
print(out)


Variable containing:
 0.1121  0.1057  0.1048  0.0035  0.0175 -0.0641  0.0898 -0.0121  0.0252  0.0690
[torch.FloatTensor of size 1x10]

Zero the gradient buffers of all parameters and backprop with random gradients:


In [15]:
net.zero_grad()
out.backward(torch.randn(1, 10))
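
To verify the effect, a quick check of a parameter's .grad buffer (an added sketch, not from the original):


In [ ]:
net.zero_grad()
print(net.conv1.bias.grad)   # zeroed (or None if backward has never run)
out = net(input)             # a fresh forward pass builds a new graph
out.backward(torch.randn(1, 10))
print(net.conv1.bias.grad)   # a 6-element gradient, generally non-zero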

NOTE:

torch.nn only supports mini-batches. The entire torch.nn package only supports inputs that are a mini-batch of samples, and not a single sample.

For example, nn.Conv2d will take in a 4D Tensor of nSamples x nChannels x Height x Width.

If you have a single sample, just use input.unsqueeze(0) to add a fake batch dimension.
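
For instance (a minimal sketch, not from the original text):


In [ ]:
single = torch.randn(1, 32, 32)   # one sample: channels x height x width
batch = single.unsqueeze(0)       # insert a batch dimension at position 0
print(batch.size())               # torch.Size([1, 1, 32, 32])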

Before proceeding further, let's recap all the classes you've seen so far.

Recap:

  • torch.Tensor - A multi-dimensional array.
  • autograd.Variable - Wraps a Tensor and records the history of operations applied to it. Has the same API as a Tensor, with some additions like backward(). Also holds the gradient w.r.t. the tensor.
  • nn.Module - Neural network module. Convenient way of encapsulating parameters, with helpers for moving them to GPU, exporting, loading, etc.
  • nn.Parameter - A kind of Variable that is automatically registered as a parameter when assigned as an attribute to a Module (see the sketch after this list).
  • autograd.Function - Implements forward and backward definitions of an autograd operation. Every Variable operation creates at least one Function node that connects to the functions that created the Variable and encodes its history.
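
A minimal sketch of the nn.Parameter auto-registration (the Gain module here is hypothetical, purely for illustration):


In [ ]:
class Gain(nn.Module):
    def __init__(self):
        super(Gain, self).__init__()
        self.weight = nn.Parameter(torch.ones(3))  # auto-registered as a parameter

print(list(Gain().parameters()))   # contains the 3-element weight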

At this point, we covered:

  • Defining a neural network
  • Processing inputs and calling backward.

Still Left:

  • Computing the loss
  • Updating the weights of the network
