Creating Deep Networks

In this last section, we'll construct our networks from scratch.

The target problem is again classification of Higgs Boson data.

Let's load BIDMat/BIDMach.

In [ ]:
import BIDMat.{CMat,CSMat,DMat,Dict,IDict,Image,FMat,FND,GDMat,GMat,GIMat,GSDMat,GSMat,HMat,IMat,Mat,SMat,SBMat,SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.JPlotting._
import BIDMach.Learner
import BIDMach.models.{FM,GLM,KMeans,KMeansw,ICA,LDA,LDAgibbs,Model,NMF,RandomForest,SFA,SVD}
import BIDMach.networks.{Net}
import BIDMach.datasources.{DataSource,MatSource,FileSource,SFileSource}
import BIDMach.mixins.{CosineSim,Perplexity,Top,L1Regularizer,L2Regularizer}
import BIDMach.updaters.{ADAGrad,Batch,BatchNorm,IncMult,IncNorm,Telescoping}
import BIDMach.causal.{IPTW}

if (Mat.hasCUDA > 0) GPUmem

And define the root directory for this dataset.

In [ ]:
val dir = "/code/BIDMach/data/uci/Higgs/parts/"

Constructing a deep network Learner

The "Net" class is the parent class for deep networks. Defining a learner also configures a datasource, an optimization method, and possibly a regularizer.

In [ ]:
val (mm, opts) = Net.learner(dir+"data%03d.fmat.lz4", dir+"label%03d.fmat.lz4")

The next step is to define the network to run. First we set some options:

In [ ]:
opts.hasBias = true;                    // Include additive bias in linear layers
opts.links = iones(1,1);                // The link functions specify output loss, 1= logistic
opts.nweight = 1e-4f                    // weight for normalization layers

Before defining the network, we import a couple of classes that provide convenience functions for generating its nodes.

In [ ]:
import BIDMach.networks.layers.Node._
import BIDMach.networks.layers.NodeSet

Now we'll define the network itself. Each layer is represented by a function of one or more input layers. Optional arguments go in a second, curried parameter list (the second group of parentheses). The layer types include:

  • input layer - mandatory as the first layer.
  • linear layer - takes an input, and an optional output dimension and bias
  • sigmoid layer - σ or "sigmoid" takes a single input
  • tanh layer - "tanh" with a single input
  • rectifying layer - "rect" with a single input (output = max(0,input))
  • softplus layer - "softplus" with a single input
  • normalization layer - takes an input and a weight parameter
  • output GLM layer - expects a "links" option with integer values which specify the type of link function, 1=logistic
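
As a reminder of what the nonlinear layers in this list compute, here is a minimal plain-Scala sketch of the element-wise functions. The object name `Activations` and its methods are hypothetical helpers for illustration only; they are not part of BIDMach, whose layers apply the same kind of function to whole matrices. The softplus definition used here is the standard log(1 + e^x).

```scala
// Scalar versions of the element-wise nonlinearities named in the list above.
// Hypothetical illustration; not BIDMach code.
object Activations {
  // Logistic sigmoid: maps any real number into (0, 1)
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // Hyperbolic tangent: maps any real number into (-1, 1)
  def tanh(x: Double): Double = math.tanh(x)

  // Rectifier: output = max(0, input)
  def rect(x: Double): Double = math.max(0.0, x)

  // Softplus: a smooth approximation to the rectifier, log(1 + e^x)
  def softplus(x: Double): Double = math.log1p(math.exp(x))
}
```

Evaluating, say, `Seq(-2.0, 0.0, 2.0).map(Activations.rect)` in a cell shows how the rectifier zeroes out negative inputs while passing positive ones through unchanged.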

In [ ]:
val in = input;                                                  // An input node
val lin1 = linear(in)(outdim = 1000, hasBias = opts.hasBias);    // A linear layer
val sig1 = σ(lin1)                                               // A sigmoid layer
val norm1 = norm(sig1)(weight = opts.nweight)                    // A normalization layer
val lin2 = linear(norm1)(outdim = 1, hasBias = opts.hasBias);    // A linear layer
val out = glm(lin2)(irow(1))                                     // Output GLM layer

Finally we assemble the net by placing the elements in an array, and passing a NodeSet from them to the Learner.

In [ ]:
val mynodes = Array(in, lin1, sig1, norm1, lin2, out);
opts.nodeset = new NodeSet(mynodes.length, mynodes)

Tuning Options

Here are some tuning options:

In [ ]:
opts.nend = 10                         // The last file number in the datasource
opts.npasses = 5                       // How many passes to make over the data
opts.batchSize = 200                   // The minibatch size
opts.evalStep = 511                    // Number of minibatches between eval steps

opts.lrate = 0.01f;                    // Learning rate
opts.texp = 0.4f;                      // Time exponent for ADAGRAD
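
To get a feel for what a time exponent does, the sketch below assumes, purely for illustration, that the step size at update t decays as lrate * t^(-texp). The helper `StepSize.at` is hypothetical. The actual ADAGrad updater also rescales each gradient by its accumulated squared magnitude and may differ in detail, so treat this only as intuition for how `lrate` and `texp` interact.

```scala
// Hypothetical helper: time-decayed step size, assuming lrate * t^(-texp).
// This is an assumption for illustration, not the BIDMach implementation.
object StepSize {
  def at(lrate: Double, texp: Double, t: Int): Double = {
    require(t >= 1, "update counter starts at 1")
    lrate * math.pow(t.toDouble, -texp)
  }
}
```

Under this assumption, a larger `texp` makes the step size shrink faster as training proceeds, while `texp = 0` keeps it constant at `lrate`.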

You invoke the learner the same way as before.

In [ ]:

Now let's extract the model and use it to predict labels on a held-out sample of data.

In [ ]:
val model = mm.model.asInstanceOf[Net]

val ta = loadFMat(dir + "data%03d.fmat.lz4" format 10);
val tc = loadFMat(dir + "label%03d.fmat.lz4" format 10);

val (nn,nopts) = Net.predictor(model, ta);

Let's run the predictor.

In [ ]:

To evaluate, we extract the predictions as a floating-point matrix, and then compute an ROC curve from them. The mean of this curve is the AUC (Area Under the Curve).

In [ ]:
val pc = FMat(nn.preds(0))
val rc = roc(pc, tc, 1-tc, 1000);
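
To see why the mean of a sampled ROC curve approximates the AUC, here is a minimal plain-Scala sketch. It assumes the curve holds the true-positive rate at evenly spaced false-positive rates, and it ignores tied scores. The object `RocSketch` and its methods are hypothetical and are not the BIDMat `roc` function.

```scala
// Hypothetical illustration of "mean of the ROC curve = AUC"; not BIDMat code.
object RocSketch {
  // scores: predicted scores; labels: 1 = positive, 0 = negative.
  // Returns the true-positive rate at nx+1 evenly spaced false-positive rates.
  def rocCurve(scores: Array[Double], labels: Array[Int], nx: Int): Array[Double] = {
    val npos = labels.count(_ == 1)
    val nneg = labels.length - npos
    require(npos > 0 && nneg > 0, "need at least one positive and one negative")

    // Visit examples from highest to lowest score.
    val order = scores.indices.sortBy(i => -scores(i))

    // tprAtFp(j) = true-positive rate reached while exactly j negatives
    // have been ranked above the current threshold.
    val tprAtFp = new Array[Double](nneg + 1)
    var tp = 0
    var fp = 0
    for (i <- order) {
      if (labels(i) == 1) tp += 1
      else { tprAtFp(fp) = tp.toDouble / npos; fp += 1 }
    }
    tprAtFp(nneg) = tp.toDouble / npos

    // Sample the step function on the grid 0, 1/nx, ..., 1.
    Array.tabulate(nx + 1)(k => tprAtFp((k.toLong * nneg / nx).toInt))
  }

  // Averaging evenly spaced samples approximates the area under the curve.
  def auc(curve: Array[Double]): Double = curve.sum / curve.length
}
```

A perfectly ranked sample gives a mean of 1, and a perfectly inverted one gives a mean close to 0; the small residual comes from the finite sampling grid.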

In [ ]:

Tune It!

Try varying your net's design to see how accurate it can be. Feel free to write procedural code, i.e. generate your net using a loop and customize the layer sizes. What was your final accuracy?
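
One possible starting point for a procedural design is to compute the hidden-layer widths first and then build the nodes from them. The sketch below is plain Scala and covers only the width computation. `LayerPlan.widths` is a hypothetical helper, and the halving rule is just one arbitrary choice among many.

```scala
// Hypothetical helper for planning hidden-layer sizes; not part of BIDMach.
object LayerPlan {
  // Widths of `depth` hidden layers: start at `first`, halve at each layer,
  // and never drop below `minWidth`.
  def widths(first: Int, depth: Int, minWidth: Int = 8): Seq[Int] =
    Seq.iterate(first, depth)(w => math.max(minWidth, w / 2))
}
```

For example, `LayerPlan.widths(1000, 4)` yields 1000, 500, 250, 125. Each width could then be passed as the `outdim` of a linear layer inside a loop, collecting the resulting nodes into the array given to `NodeSet`, in the same way as the hand-written network above.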

In [ ]: