In this last section, we'll construct our network from scratch.
The target problem is again classification of Higgs Boson data.
Let's load BIDMat/BIDMach:
In [ ]:
import BIDMat.{CMat,CSMat,DMat,Dict,IDict,Image,FMat,FND,GDMat,GMat,GIMat,GSDMat,GSMat,HMat,IMat,Mat,SMat,SBMat,SDMat}
import BIDMat.MatFunctions._
import BIDMat.SciFunctions._
import BIDMat.Solvers._
import BIDMat.JPlotting._
import BIDMach.Learner
import BIDMach.models.{FM,GLM,KMeans,KMeansw,ICA,LDA,LDAgibbs,Model,NMF,RandomForest,SFA,SVD}
import BIDMach.networks.{Net}
import BIDMach.datasources.{DataSource,MatSource,FileSource,SFileSource}
import BIDMach.mixins.{CosineSim,Perplexity,Top,L1Regularizer,L2Regularizer}
import BIDMach.updaters.{ADAGrad,Batch,BatchNorm,IncMult,IncNorm,Telescoping}
import BIDMach.causal.{IPTW}
Mat.checkMKL
Mat.checkCUDA
Mat.setInline
if (Mat.hasCUDA > 0) GPUmem
And define the root directory for this dataset.
In [ ]:
val dir = "/code/BIDMach/data/uci/Higgs/parts/"
The "Net" class is the parent class for deep networks. By defining a learner, we also configure a datasource, an optimization method, and (optionally) a regularizer.
In [ ]:
val (mm, opts) = Net.learner(dir+"data%03d.fmat.lz4", dir+"label%03d.fmat.lz4")
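The %03d in the file names is a printf-style pattern: the datasource substitutes the part number, so the files it reads are data000.fmat.lz4, data001.fmat.lz4, and so on.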
The next step is to define the network to run. First we set some options:
In [ ]:
opts.hasBias = true;      // Include an additive bias in the linear layers
opts.links = iones(1,1);  // The link functions specify the output loss; 1 = logistic
opts.nweight = 1e-4f      // Weight for the normalization layers
Before defining the network itself, we'll import the convenience functions that generate the nodes in the network, along with the NodeSet class.
In [ ]:
import BIDMach.networks.layers.Node._
import BIDMach.networks.layers.NodeSet
Now we'll define the network itself. Each layer is built by a function of its input layer (or layers), and optional arguments are supplied in curried form (the second group of parentheses). The layers we use are:
In [ ]:
val in = input; // An input node
val lin1 = linear(in)(outdim = 1000, hasBias = opts.hasBias); // A linear layer
val sig1 = σ(lin1) // A sigmoid layer
val norm1 = norm(sig1)(weight = opts.nweight) // A normalization layer
val lin2 = linear(norm1)(outdim = 1, hasBias = opts.hasBias); // A linear layer
val out = glm(lin2)(irow(1)) // Output GLM layer; irow(1) selects the logistic link
Finally, we assemble the net by placing the nodes in an array, wrapping that array in a NodeSet, and handing it to the learner through its options.
In [ ]:
val mynodes = Array(in, lin1, sig1, norm1, lin2, out);
opts.nodeset = new NodeSet(mynodes.length, mynodes)
Here are some tuning options:
In [ ]:
opts.nend = 10 // The last file number in the datasource
opts.npasses = 5 // How many passes to make over the data
opts.batchSize = 200 // The minibatch size
opts.evalStep = 511 // Number of minibatches between eval steps
opts.lrate = 0.01f; // Learning rate
opts.texp = 0.4f; // Time exponent for ADAGRAD
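A note on the last two options: lrate is the base learning rate and texp controls how it decays over time. As a rough, hedged description of the ADAGrad-style schedule (an assumption about the updater, not something spelled out in this notebook), the effective step size after t minibatches behaves like lrate * t^(-texp), so a larger texp decays the rate faster.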
You invoke the learner the same way as before.
In [ ]:
mm.train
Now let's extract the model and use it to predict labels on a held-out sample of data.
In [ ]:
val model = mm.model.asInstanceOf[Net]                    // extract the trained network
val ta = loadFMat(dir + "data%03d.fmat.lz4" format 10);   // held-out data part (file 10)
val tc = loadFMat(dir + "label%03d.fmat.lz4" format 10);  // held-out labels
val (nn,nopts) = Net.predictor(model, ta);
nopts.batchSize = 10000
Let's run the predictor:
In [ ]:
nn.predict
To evaluate, we extract the predictions as a floating-point matrix (FMat) and then compute a ROC curve from them. Since the curve is sampled at evenly spaced points, its mean approximates the AUC (Area Under the Curve).
In [ ]:
val pc = FMat(nn.preds(0))         // predictions for the held-out part
val rc = roc(pc, tc, 1-tc, 1000);  // ROC curve sampled at 1000 points
mean(rc)                           // the AUC
In [ ]:
plot(rc)
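The AUC above summarizes ranking quality; the exercise below also asks about accuracy, which you can get by thresholding the predicted probabilities. Here is a minimal sketch, assuming pc and tc are 1 x n row matrices as loaded above (the variable names are illustrative):
In [ ]:
val predLabels = pc > 0.5f                     // 0/1 predictions at a 0.5 threshold
val errCount = sum(abs(predLabels - tc), 2)    // 1x1 matrix: number of misclassified examples
val accuracy = 1f - errCount(0) / tc.ncols     // accuracy on the held-out part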
Try varying your net's design to see how accurate you can make it. Feel free to write procedural code, i.e. generate your net using a loop and customize the layer sizes. What was your final accuracy?
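For example, here is a minimal sketch of generating the hidden layers in a loop. The layer widths are illustrative placeholders and the ArrayBuffer is just one convenient way to collect the nodes; treat this as a starting point under those assumptions, not a tuned design.
In [ ]:
import scala.collection.mutable.ArrayBuffer
import BIDMach.networks.layers.Node        // the Node class itself, for the type annotations below

val widths = Array(500, 200, 100)          // hypothetical hidden-layer sizes; try your own
val nodes = ArrayBuffer[Node]()

val in0 = input                            // input node
nodes += in0
var last: Node = in0
for (w <- widths) {                        // one linear + sigmoid + norm block per hidden layer
  val lin = linear(last)(outdim = w, hasBias = opts.hasBias)
  val sig = σ(lin)
  val nrm = norm(sig)(weight = opts.nweight)
  nodes ++= Seq(lin, sig, nrm)
  last = nrm
}
val linOut = linear(last)(outdim = 1, hasBias = opts.hasBias)
val outNode = glm(linOut)(irow(1))         // logistic output, as before
nodes ++= Seq(linOut, outNode)

opts.nodeset = new NodeSet(nodes.length, nodes.toArray)
Then re-run the training and evaluation steps above to see how the deeper net compares.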
In [ ]: