This notebook is a small demo of how to use gpumon in Jupyter notebooks, along with some convenience methods for working with GPUs. You will need PyTorch and Torchvision installed to run it, as well as the Python InfluxDB client.
To install PyTorch and associated requirements, run the following:
conda install pytorch torchvision cuda80 -c pytorch
To install the Python InfluxDB client, run:
pip install influxdb
See here for more details on the InfluxDB client.
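As an optional sanity check (a minimal sketch), importing the influxdb package and printing its version confirms the client is installed in the kernel this notebook is running on.
In [ ]:
import influxdb

# If this raises ImportError, install the client with `pip install influxdb`
print(influxdb.__version__)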
In [37]:
from gpumon import device_count, device_name
In [38]:
device_count() # Returns the number of GPUs available
Out[38]:
In [39]:
device_name() # Returns the type of GPU available
Out[39]:
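As a minimal sketch, the two gpumon helpers above can be combined into a one-line summary of the machine's GPU setup (no additional gpumon API is assumed beyond the calls already shown):
In [ ]:
# Print a short summary using the gpumon helpers demonstrated above
print('Found {} GPU(s) of type {}'.format(device_count(), device_name()))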
Let's create a simple CNN and run the CIFAR10 dataset against it to see the load on our GPU.
In [1]:
import torch
import torchvision
import torchvision.transforms as transforms
In [2]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64,
                                          shuffle=True, num_workers=4)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
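As a quick check (a minimal sketch, assuming the cell above downloaded CIFAR10 successfully), you can pull one mini-batch from the loader and confirm its shape: 64 images of size 3x32x32.
In [ ]:
# Grab a single mini-batch and inspect its shape
images, labels = next(iter(trainloader))
print(images.size())   # expected: torch.Size([64, 3, 32, 32])
print(labels.size())   # expected: torch.Size([64])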
In [3]:
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
net = Net()
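Before moving the network to the GPU, a quick forward pass on a random input confirms the architecture is wired correctly (a minimal sketch; the Variable wrapper matches the training code below and is a harmless no-op on newer PyTorch versions).
In [ ]:
# Sanity check: a random 'image' batch of shape (1, 3, 32, 32) should produce 10 class scores
dummy = Variable(torch.randn(1, 3, 32, 32))
print(net(dummy).size())   # expected: torch.Size([1, 10])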
In [4]:
net.cuda()
Out[4]:
In [5]:
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
In [6]:
from gpumon.influxdb import log_context
In [7]:
display_every_minibatches = 100
Be careful to specify the correct host and credentials in the log_context call below; they must match your InfluxDB setup.
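Optionally, before starting the long training run, you can confirm the database is reachable with the same credentials (a minimal sketch using the InfluxDB client's ping() method; the host and credentials here are the placeholder values used throughout this notebook).
In [ ]:
from influxdb import InfluxDBClient

# Ping the InfluxDB server to confirm it is reachable before training starts
check_client = InfluxDBClient(host='localhost', username='admin', password='password', database='gpudb')
print(check_client.ping())   # prints the InfluxDB server version if reachable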
In [8]:
with log_context('localhost', 'admin', 'password', 'gpudb', 'gpuseries'):
    for epoch in range(20):  # loop over the dataset multiple times
        running_loss = 0.0
        for i, data in enumerate(trainloader, 0):
            # get the inputs
            inputs, labels = data

            # wrap them in Variable
            inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.data[0]

        print('[%d] loss: %.3f' %
              (epoch + 1, running_loss / (i + 1)))

print('Finished Training')
If you had your Grafana dashboard running, you should have seen the measurements appear there. You can also pull the data from the database using the InfluxDB Python client.
In [22]:
from influxdb import InfluxDBClient, DataFrameClient
In [14]:
client = InfluxDBClient(host='localhost', username='admin', password='password', database='gpudb')
In [15]:
client.get_list_measurements()
Out[15]:
In [30]:
data = client.query('select * from gpuseries limit 10;')
In [31]:
type(data)
Out[31]:
In [32]:
data
Out[32]:
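The query above returns an influxdb ResultSet rather than a plain list; as a minimal sketch, its get_points() method yields each recorded point as a dictionary of field and tag values.
In [ ]:
# Iterate over the returned points; each point is a dict of field/tag values
for point in data.get_points():
    print(point)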
In [23]:
df_client = DataFrameClient(host='localhost', username='admin', password='password', database='gpudb')
In [33]:
df = df_client.query('select * from gpuseries limit 100;')['gpuseries']
In [36]:
df.head(100)
Out[36]:
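Since df is a pandas DataFrame indexed by timestamp, you can inspect which fields gpumon recorded and, for example, save them for later analysis (a minimal sketch; the exact column names depend on your gpumon version, and gpu_measurements.csv is just an illustrative filename).
In [ ]:
# Inspect the recorded fields and persist the measurements to disk
print(df.columns.tolist())
df.to_csv('gpu_measurements.csv')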
In [ ]: