MXNet

MXNet is a deep learning framework designed for both efficiency and flexibility. It lets you mix symbolic and imperative programming to maximize efficiency and productivity. At its core is a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory-efficient. The library is portable and lightweight, and it scales to multiple GPUs and multiple machines.
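
As a quick taste of the imperative side, MXNet.jl's NDArray type executes operations eagerly through the same scheduler, so array math can be mixed freely with the symbolic graphs built below. A minimal sketch, assuming a working MXNet.jl installation:

a = mx.ones((2, 3))   # NDArray of ones on the default (CPU) context
b = mx.zeros((2, 3))  # NDArray of zeros, same shape
c = a + b             # dispatched to the dependency scheduler, runs eagerly
copy(c)               # copy back to a plain Julia Array for inspection

The rest of this notebook uses the symbolic API.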

Simple multi-layer perceptron:

Here we build the CPU version of a 3-layer MLP to perform digit recognition on the MNIST dataset.


In [1]:
ENV["MXNET_HOME"] = "/home/ubuntu/mxnet/"
Base.compilecache("MXNet")
using MXNet

In [2]:
data = mx.Variable(:data)  # placeholder symbol for the input batch


Out[2]:
MXNet.mx.SymbolicNode(MXNet.mx.MX_SymbolHandle(Ptr{Void} @0x0000000002832110))

In [3]:
fc1  = mx.FullyConnected(data = data, name=:fc1, num_hidden=128)
act1 = mx.Activation(data = fc1, name=:relu1, act_type=:relu)
fc2  = mx.FullyConnected(data = act1, name=:fc2, num_hidden=64)
act2 = mx.Activation(data = fc2, name=:relu2, act_type=:relu)
fc3  = mx.FullyConnected(data = act2, name=:fc3, num_hidden=10)


Out[3]:
MXNet.mx.SymbolicNode(MXNet.mx.MX_SymbolHandle(Ptr{Void} @0x0000000002d1cdf0))

In [4]:
mlp  = mx.SoftmaxOutput(data = fc3, name=:softmax)


Out[4]:
MXNet.mx.SymbolicNode(MXNet.mx.MX_SymbolHandle(Ptr{Void} @0x0000000002d3e140))

In [5]:
# the same network as In [2]-[4], built in one shot with the @mx.chain macro
mlp = @mx.chain mx.Variable(:data)             =>
  mx.FullyConnected(name=:fc1, num_hidden=128) =>
  mx.Activation(name=:relu1, act_type=:relu)   =>
  mx.FullyConnected(name=:fc2, num_hidden=64)  =>
  mx.Activation(name=:relu2, act_type=:relu)   =>
  mx.FullyConnected(name=:fc3, num_hidden=10)  =>
  mx.SoftmaxOutput(name=:softmax)


Out[5]:
MXNet.mx.SymbolicNode(MXNet.mx.MX_SymbolHandle(Ptr{Void} @0x0000000002e184f0))
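
To sanity-check the composed symbol, we can ask it for its arguments, i.e. the input plus the weights and biases MXNet will allocate for each layer. A sketch of the call; the exact name list depends on the layer names chosen above:

mx.list_arguments(mlp)
# expected to include :data, :fc1_weight, :fc1_bias, ..., :softmax_label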

In [6]:
batch_size = 100
# data providers from the MNIST helper shipped with MXNet.jl's examples
include(Pkg.dir("MXNet", "examples", "mnist", "mnist-data.jl"))
train_provider, eval_provider = get_mnist_providers(batch_size)


Out[6]:
(MXNet.mx.MXDataProvider(MXNet.mx.MX_DataIterHandle(Ptr{Void} @0x000000000304d4f0),Tuple{Symbol,Tuple}[(:data,(784,100))],Tuple{Symbol,Tuple}[(:softmax_label,(100,))],100,true,true),MXNet.mx.MXDataProvider(MXNet.mx.MX_DataIterHandle(Ptr{Void} @0x0000000002db00b0),Tuple{Symbol,Tuple}[(:data,(784,100))],Tuple{Symbol,Tuple}[(:softmax_label,(100,))],100,true,true))

In [7]:
model = mx.FeedForward(mlp, context=mx.cpu())  # wrap the symbol in a model bound to the CPU


Out[7]:
MXNet.mx.FeedForward(MXNet.mx.SymbolicNode(MXNet.mx.MX_SymbolHandle(Ptr{Void} @0x0000000002e184f0)),[CPU0],#undef,#undef,#undef)
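
FeedForward also accepts a vector of contexts, which is how the multi-GPU scaling mentioned in the introduction is exposed: batches are split across the devices in data-parallel fashion. A sketch, assuming a CUDA-enabled libmxnet build with two GPUs:

model_gpu = mx.FeedForward(mlp, context=[mx.gpu(0), mx.gpu(1)])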

In [8]:
optimizer = mx.SGD(lr=0.1, momentum=0.9, weight_decay=0.00001)


Out[8]:
MXNet.mx.SGD(MXNet.mx.SGDOptions(0.1,0.9,0,1.0e-5,MXNet.mx.LearningRate.Fixed(0.1),MXNet.mx.Momentum.Fixed(0.9)),#undef)

In [9]:
@time mx.fit(model, optimizer, train_provider, n_epoch=20, eval_data=eval_provider)


INFO: Start training on [CPU0]
INFO: Initializing parameters...
INFO: Creating KVStore...
INFO: TempSpace: Total 0 MB allocated on CPU0
INFO: Start training...
INFO: == Epoch 001 ==========
INFO: ## Training summary
INFO:           accuracy = 0.7601
INFO:               time = 79.2076 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9498
INFO: == Epoch 002 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9591
INFO:               time = 80.3366 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9619
INFO: == Epoch 003 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9728
INFO:               time = 84.7242 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9649
INFO: == Epoch 004 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9788
INFO:               time = 82.3494 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9703
INFO: == Epoch 005 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9823
INFO:               time = 83.3827 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9728
INFO: == Epoch 006 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9847
INFO:               time = 81.1912 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9738
INFO: == Epoch 007 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9861
INFO:               time = 83.6950 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9730
INFO: == Epoch 008 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9876
INFO:               time = 87.6382 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9697
INFO: == Epoch 009 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9885
INFO:               time = 112.5983 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9729
INFO: == Epoch 010 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9892
INFO:               time = 92.7522 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9748
INFO: == Epoch 011 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9912
INFO:               time = 84.8109 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9784
INFO: == Epoch 012 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9912
INFO:               time = 82.3110 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9757
INFO: == Epoch 013 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9916
INFO:               time = 83.8452 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9763
INFO: == Epoch 014 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9932
INFO:               time = 84.9149 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9744
INFO: == Epoch 015 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9927
INFO:               time = 82.5378 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9774
INFO: == Epoch 016 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9930
INFO:               time = 82.0644 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9761
INFO: == Epoch 017 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9932
INFO:               time = 84.4499 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9750
INFO: == Epoch 018 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9943
INFO:               time = 81.0815 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9780
INFO: == Epoch 019 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9938
INFO:               time = 84.1596 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9760
INFO: == Epoch 020 ==========
INFO: ## Training summary
INFO:           accuracy = 0.9945
INFO:               time = 83.8395 seconds
INFO: ## Validation summary
INFO:           accuracy = 0.9757
1803.345689 seconds (31.89 M allocations: 1.069 GB, 0.05% gc time)
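
To keep the trained weights around, mx.fit accepts callbacks; mx.do_checkpoint writes the model to disk after each epoch. A sketch (the "mnist-mlp" prefix is our own choice, not part of the example):

mx.fit(model, optimizer, train_provider, n_epoch=20,
       eval_data=eval_provider,
       callbacks=[mx.do_checkpoint("mnist-mlp")])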

In [21]:
probs = mx.predict(model, eval_provider)


INFO: TempSpace: Total 0 MB allocated on CPU0
Out[21]:
10x10000 Array{Float32,2}:
 4.59429e-14  3.18306e-18  2.52542e-13  …  5.56595e-14  1.71275e-16
 3.70478e-8   8.59895e-10  1.0             9.69901e-20  1.14819e-24
 3.6383e-12   1.0          5.80729e-10     1.6547e-22   2.9313e-24 
 4.11983e-10  1.58855e-17  9.64722e-15     3.17896e-16  3.41109e-25
 1.65231e-9   3.37197e-20  1.59045e-8      1.07894e-25  8.58869e-25
 1.55073e-11  6.13501e-20  7.77721e-10  …  1.0          1.96767e-16
 1.51797e-14  1.56352e-17  6.11919e-10     1.80598e-11  1.0        
 1.0          1.86019e-16  1.92821e-7      2.16589e-20  1.73618e-34
 8.00314e-11  6.92101e-18  4.12114e-8      6.71468e-10  1.03235e-16
 2.6602e-8    1.79244e-25  2.14456e-11     8.00378e-22  1.8358e-22 
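
Each column of probs is a distribution over the ten digit classes, so the hard prediction for image i is the row index of the column maximum, minus one (labels run 0-9). A one-line sketch:

pred = Float32[indmax(probs[:, i]) - 1 for i in 1:size(probs, 2)]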

In [22]:
# collect all labels from eval data
labels = Array[]
for batch in eval_provider
    push!(labels, copy(mx.get(eval_provider, batch, :softmax_label)))
end
labels = cat(1, labels...)


Out[22]:
10000-element Array{Float32,1}:
 7.0
 2.0
 1.0
 0.0
 4.0
 1.0
 4.0
 9.0
 5.0
 9.0
 0.0
 6.0
 9.0
 ⋮  
 5.0
 6.0
 7.0
 8.0
 9.0
 0.0
 1.0
 2.0
 3.0
 4.0
 5.0
 6.0

In [23]:
# Now we compute the accuracy
correct = 0
for i = 1:length(labels)
    # labels are 0...9
    if indmax(probs[:,i]) == labels[i]+1
        correct += 1
    end
end
accuracy = 100correct/length(labels)
println(mx.format("Accuracy on eval set: {1:.2f}%", accuracy))


Accuracy on eval set: 97.31%

In [16]:
features = trainfeatures(60000)  # raw feature vector for training image 60000 (MNIST.jl helper)


Out[16]:
784-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 ⋮  
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

In [25]:
testfeatures(1)  # feature vector for the first test image


Out[25]:
784-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 ⋮  
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
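
Each 784-element vector is a flattened 28x28 grayscale image, which is why the leading and trailing entries are zero: they are the empty border pixels. To view one as an image matrix, reshape it; whether a transpose is also needed depends on how MNIST.jl stores the pixels:

img = reshape(features, 28, 28)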
