In [11]:
import mxnet as mx
import numpy as np
MXNet feeds data through Data Iterators. The code below illustrates how to wrap a dataset in an iterator that MXNet can consume. The data used in the example is made up of 2-D data points with corresponding integer labels. The function we are trying to learn is:
y = x1 + 2*x2,
where (x1, x2) is one training data point and y is the corresponding label.
In [12]:
#Training data
train_data = np.array([[1,2],[3,4],[5,6],[3,2],[7,1],[6,9]])
train_label = np.array([5,11,17,7,9,24])
batch_size = 1
#Evaluation Data
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11,26,16])
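As a quick sanity check (pure numpy, independent of MXNet), the training labels above do follow y = x1 + 2*x2, which is a dot product with the weight vector [1, 2]:

```python
import numpy as np

train_data = np.array([[1, 2], [3, 4], [5, 6], [3, 2], [7, 1], [6, 9]])
train_label = np.array([5, 11, 17, 7, 9, 24])

# y = x1 + 2*x2, expressed as a matrix-vector product
computed = train_data @ np.array([1, 2])
print(computed)  # [ 5 11 17  7  9 24]
```

The evaluation labels satisfy the same relation, so a model that recovers the weights [1, 2] should fit both sets exactly.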
Once we have the data ready, we need to put it into an iterator, specifying parameters such as 'batch_size' (how much data the iterator feeds during each pass) and 'shuffle' (whether or not the data will be shuffled).
In [13]:
train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label')
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)
In the above example, we have made use of NDArrayIter, which is used to iterate over numpy arrays. In general, there are many different types of iterators in MXNet based on the type of data you will be using. Their complete documentation can be found at: http://mxnet.io/api/python/io.html
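Conceptually, an iterator just shuffles the index order once and then yields (data, label) batches. The toy sketch below (illustrative only; the real NDArrayIter returns mx.io.DataBatch objects, handles padding, and more) shows the idea in plain numpy:

```python
import numpy as np

def toy_iter(data, label, batch_size, shuffle=False, seed=0):
    # Sketch of an NDArrayIter-style pass: optional shuffle, then
    # yield consecutive (data, label) slices of size batch_size.
    idx = np.arange(len(data))
    if shuffle:
        np.random.default_rng(seed).shuffle(idx)
    for start in range(0, len(data), batch_size):
        sel = idx[start:start + batch_size]
        yield data[sel], label[sel]

train_data = np.array([[1, 2], [3, 4], [5, 6], [3, 2], [7, 1], [6, 9]])
train_label = np.array([5, 11, 17, 7, 9, 24])

batches = list(toy_iter(train_data, train_label, batch_size=2))
print(len(batches))  # 3 batches of 2 samples each
```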
Defining and training a model in MXNet involves three kinds of components:
Model Class: The model class in MXNet defines the overall entity of the model. It holds the symbol we want to minimize and the training data and labels; additional parameters such as the learning rate and the optimization algorithm are also set at the model level.
Symbols: The actual MXNet network is defined using symbols. MXNet has different types of symbols, including data placeholders, neural network layers, and loss function symbols, depending on our requirements.
IO: The IO class, as we have already seen, works on the data, carrying out operations such as breaking the data into batches and shuffling it.
Symbols are the building blocks of the model: they are chained one after the other, each serving as input to the next, to create the network topology. More information about the different types of symbols can be found at: http://mxnet.io/api/python/symbol.html
In [14]:
X = mx.sym.Variable('data')
Y = mx.symbol.Variable('lin_reg_label')
fully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1)
lro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name="lro")
The above network uses two layers:
FullyConnected ('fc1'), the linear layer:
a. data: Input to the layer (the symbol whose output should be fed here)
b. num_hidden: Number of hidden dimensions, which specifies the size of the output of the layer
LinearRegressionOutput ('lro'), the loss layer:
a. data: Input to this layer (the symbol whose output should be fed here)
b. label: The training label against which the layer's input is compared to compute the L2 loss
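Taken together, the two layers define ordinary linear regression: the FullyConnected layer computes y_hat = Xw + b, and LinearRegressionOutput scores it with a squared-error criterion. The numpy sketch below mirrors that computation (illustrative only, not the MXNet implementation):

```python
import numpy as np

def forward(X, w, b):
    # FullyConnected with num_hidden=1 is an affine map: X @ w + b
    return X @ w + b

def l2_loss(pred, label):
    # LinearRegressionOutput optimizes a squared-error criterion;
    # here we report the mean squared error over the batch
    return np.mean((pred - label) ** 2)

X = np.array([[7, 2], [6, 10], [12, 2]], dtype=float)
y = np.array([11, 26, 16], dtype=float)

# With the true weights w = [1, 2] and b = 0, the loss is exactly zero
print(l2_loss(forward(X, np.array([1.0, 2.0]), 0.0), y))  # 0.0
```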
Note - Naming Convention: the label variable's name should be the same as the label_name parameter passed to your training data iterator. The default value is 'softmax_label', but we have updated it to 'lin_reg_label' in this tutorial, as you can see in Y = mx.symbol.Variable('lin_reg_label') and train_iter = mx.io.NDArrayIter(..., label_name='lin_reg_label').
Finally, the network is placed into a Module, where we specify the symbol whose value is to be minimised (in our case, lro, the linear regression output) together with the names of the data and label inputs; training settings such as the learning rate and the number of epochs are supplied later to fit().
In [15]:
model = mx.mod.Module(
    symbol=lro,  # network structure
    data_names=['data'],
    label_names=['lin_reg_label']
)
We can plot the network we have created in order to visualize it.
In [16]:
mx.viz.plot_network(symbol=lro)
Out[16]:
Once we have defined the model structure, the next step is to train the parameters of the model to fit the training data. This is done by using the fit() function of the Module class.
In [17]:
import logging
logging.basicConfig(level=logging.INFO)
model.fit(train_iter, eval_iter,
          optimizer_params={'learning_rate': 0.00005, 'momentum': 0.9},
          num_epoch=10,
          eval_metric='mse',
          batch_end_callback=mx.callback.Speedometer(batch_size, 2))
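Under the hood, fit() repeatedly draws batches from the iterator, runs a forward pass, and lets the optimizer apply the gradients. A bare-bones numpy version of the same idea (plain per-sample SGD without momentum, illustrative only) recovers the true weights from the training data:

```python
import numpy as np

train_data = np.array([[1, 2], [3, 4], [5, 6], [3, 2], [7, 1], [6, 9]], dtype=float)
train_label = np.array([5, 11, 17, 7, 9, 24], dtype=float)

w, b, lr = np.zeros(2), 0.0, 0.005
for epoch in range(2000):
    for x, y in zip(train_data, train_label):
        err = (x @ w + b) - y  # forward pass and residual
        w -= lr * err * x      # gradient of 0.5 * err**2 w.r.t. w
        b -= lr * err          # ... and w.r.t. b

print(np.round(w, 2), round(b, 2))  # approximately [1. 2.] and 0.0
```

Because the training data is noiseless and consistent with y = x1 + 2*x2, the loop converges to the exact interpolating solution.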
Once we have a trained model, we can do multiple things with it: use it for inference on new inputs, or evaluate it on held-out test data. Both are shown below.
In [18]:
#Inference
model.predict(eval_iter).asnumpy()
Out[18]:
We can also evaluate our model against a metric. In this example, we are evaluating our model's mean squared error (MSE) on the evaluation data.
In [19]:
#Evaluation
metric = mx.metric.MSE()
model.score(eval_iter, metric)
Out[19]:
Let us try adding some noise to the evaluation data and see how the MSE changes.
In [20]:
#Evaluation Data
eval_data = np.array([[7,2],[6,10],[12,2]])
eval_label = np.array([11.1,26.1,16.1]) #Adding 0.1 to each of the values
eval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)
model.score(eval_iter, metric)
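Since each label was shifted by 0.1, a model that had learned the function exactly would score an MSE of 0.1 squared, i.e. 0.01; the score reported above should be close to that. Checking the arithmetic in plain numpy (illustrative):

```python
import numpy as np

eval_data = np.array([[7, 2], [6, 10], [12, 2]], dtype=float)
noisy_label = np.array([11.1, 26.1, 16.1])

# A perfect model's predictions: y = x1 + 2*x2
pred = eval_data @ np.array([1.0, 2.0])
mse = np.mean((pred - noisy_label) ** 2)
print(round(mse, 4))  # 0.01
```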
Out[20]:
Finally, you can create your own metrics and use them to evaluate your model. More information on metrics here: http://mxnet-test.readthedocs.io/en/latest/api/metric.html