Neural networks

Neural networks in hep_ml are very simple, but flexible. They are built on top of the theano library.

hep_ml.nnet also provides tools to optimize any continuous expression as a decision function (there is an example below).

Downloading a dataset

Downloading the dataset from UCI and splitting it into train and test parts.


In [1]:
!wget -O ../data/MiniBooNE_PID.txt -nc https://archive.ics.uci.edu/ml/machine-learning-databases/00199/MiniBooNE_PID.txt


File ‘../data/MiniBooNE_PID.txt’ already there; not retrieving.

In [2]:
import numpy, pandas
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# the first row of the file holds the numbers of signal and background events;
# the remaining rows hold the features (signal events first, then background)
data = pandas.read_csv('../data/MiniBooNE_PID.txt', sep=r'\s+', skiprows=[0], header=None, engine='python')
labels = pandas.read_csv('../data/MiniBooNE_PID.txt', sep=' ', nrows=1, header=None)
labels = [1] * labels[1].values[0] + [0] * labels[2].values[0]
data.columns = ['feature_{}'.format(key) for key in data.columns]

train_data, test_data, train_labels, test_labels = train_test_split(data, labels, train_size=0.5, test_size=0.5, random_state=42)

Example of training a network

Training a multilayer perceptron with one hidden layer of 5 neurons. In most cases we simply use MLPClassifier with one or two hidden layers (a two-hidden-layer variant is sketched after this example).


In [3]:
from hep_ml.nnet import MLPClassifier
from sklearn.metrics import roc_auc_score

clf = MLPClassifier(layers=[5], epochs=500)
clf.fit(train_data, train_labels)


WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
Out[3]:
MLPClassifier(epochs=500, layers=[5], loss='log_loss', random_state=None,
       scaler='standard', trainer='irprop-', trainer_parameters=None)

In [4]:
proba = clf.predict_proba(test_data)
print('Test quality:', roc_auc_score(test_labels, proba[:, 1]))


Test quality: 0.9713069468459772

In [5]:
proba = clf.predict_proba(train_data)
print('Train quality:', roc_auc_score(train_labels, proba[:, 1]))


Train quality: 0.9712880497811155
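
A second hidden layer is added the same way, by passing two sizes in layers. A minimal sketch (not executed in this notebook; the sizes [10, 5] are just an illustration):

clf_deep = MLPClassifier(layers=[10, 5], epochs=500)
clf_deep.fit(train_data, train_labels)
print('Test quality:', roc_auc_score(test_labels, clf_deep.predict_proba(test_data)[:, 1]))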

Creating your own neural network

To create your own neural network, you define the parameters of the network and provide an activation function.

You are not limited to any particular structure in this function; hep_ml.nnet treats it as a black box for optimization.

The simplest way is to override the prepare method of AbstractNeuralNetworkClassifier.


In [6]:
from hep_ml.nnet import AbstractNeuralNetworkClassifier
from theano import tensor as T

class SimpleNeuralNetwork(AbstractNeuralNetworkClassifier):
    def prepare(self):
        # getting the number of neurons in the input, hidden and output layers
        # note that we support only one hidden layer here
        n1, n2, n3 = self.layers_
        
        # creating parameters of neural network
        W1 = self._create_matrix_parameter('W1', n1, n2)
        W2 = self._create_matrix_parameter('W2', n2, n3)
        
        # defining activation function
        def activation(input):
            first = T.nnet.sigmoid(T.dot(input, W1))
            return T.dot(first, W2)

        return activation

In [7]:
clf = SimpleNeuralNetwork(layers=[5], epochs=500)
clf.fit(train_data, train_labels)
print('Test quality:', roc_auc_score(test_labels, clf.predict_proba(test_data)[:, 1]))


Test quality: 0.9667676784980118
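
Since prepare is treated as a black box, the activation can have any structure. A minimal sketch (an illustration, not part of hep_ml): the same network with an additional linear "skip" connection from input to output.

class SkipConnectionNetwork(AbstractNeuralNetworkClassifier):
    def prepare(self):
        n1, n2, n3 = self.layers_
        W1 = self._create_matrix_parameter('W1', n1, n2)
        W2 = self._create_matrix_parameter('W2', n2, n3)
        # extra parameter matrix for the direct input -> output connection
        W3 = self._create_matrix_parameter('W3', n1, n3)

        def activation(input):
            hidden = T.nnet.sigmoid(T.dot(input, W1))
            return T.dot(hidden, W2) + T.dot(input, W3)

        return activation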

Example of a very specific neural network

This network has one hidden layer, but the layer is unusual: it takes pairwise correlations between features into account.


In [8]:
from hep_ml.nnet import PairwiseNeuralNetwork
clf = PairwiseNeuralNetwork(layers=[5], epochs=500)
clf.fit(train_data, train_labels)
print('Test quality:', roc_auc_score(test_labels, clf.predict_proba(test_data)[:, 1]))


Test quality: 0.9713998457401084

Fitting very specific expressions as estimators

One can use hep_ml.nnet to optimize any expression as a black box. For simplicity, let's assume we have only three variables: $\text{var}_1, \text{var}_2, \text{var}_3.$

And from physical intuition we are sure that this is a good expression to discriminate signal and background: $$\text{output} = c_1 \text{var}_1 + c_2 \log \left[ \exp(\text{var}_2 + \text{var}_3) + \exp(c_3) \right] + c_4 \dfrac{\text{var}_3}{\text{var}_2} + c_5 $$

Note: this is a made-up expression; in practice it comes from physical intuition (or from looking at the data).


In [9]:
class CustomNeuralNetwork(AbstractNeuralNetworkClassifier):
    def prepare(self):
        # getting the number of neurons in the input, hidden and output layers
        # (only the input size is used here)
        n1, n2, n3 = self.layers_
        # checking that we have three variables in input + a constant
        assert n1 == 3 + 1
        # creating parameters
        c1 = self._create_scalar_parameter('c1')
        c2 = self._create_scalar_parameter('c2')
        c3 = self._create_scalar_parameter('c3')
        c4 = self._create_scalar_parameter('c4')
        c5 = self._create_scalar_parameter('c5')
        
        # defining activation function
        def activation(input):
            v1, v2, v3 = input[:, 0], input[:, 1], input[:, 2]
            return c1 * v1 + c2 * T.log(T.exp(v2 + v3) + T.exp(c3)) + c4 * v3 / v2 + c5
        
        return activation

Writing custom pretransformer

Below we define a very simple scikit-learn transformer that transforms the distribution of each feature to be uniform on [0, 1].


In [10]:
from sklearn.base import BaseEstimator, TransformerMixin
from rep.utils import Flattener

class Uniformer(BaseEstimator, TransformerMixin):
    # flattens the distribution of each feature to uniform on [0, 1]
    def fit(self, X, y=None):
        self.transformers = []
        X = numpy.array(X, dtype=float)
        for column in range(X.shape[1]):
            self.transformers.append(Flattener(X[:, column]))
        return self
        
    def transform(self, X):
        X = numpy.array(X, dtype=float)
        assert X.shape[1] == len(self.transformers)
        for column, trans in enumerate(self.transformers):
            X[:, column] = trans(X[:, column])
        return X
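
Flattener from rep.utils maps each value to its empirical CDF, which is what makes the transformed feature uniform on [0, 1]. If rep is not installed, a minimal numpy substitute (a sketch; SimpleFlattener is a hypothetical name, not part of rep) can be dropped into Uniformer in place of Flattener:

class SimpleFlattener:
    def __init__(self, values):
        # remember the sorted reference sample
        self.sorted_values = numpy.sort(values)

    def __call__(self, x):
        # empirical CDF: fraction of reference values below x, lies in [0, 1]
        return numpy.searchsorted(self.sorted_values, x) / float(len(self.sorted_values))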

In [11]:
# selecting three features to train: 
train_features = train_data.columns[:3]

clf = CustomNeuralNetwork(layers=[5], epochs=1000, scaler=Uniformer())
clf.fit(train_data[train_features], train_labels)

print('Test quality:', roc_auc_score(test_labels, clf.predict_proba(test_data[train_features])[:, 1]))


Test quality: 0.9145783471715777
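
For comparison, one could train a generic MLPClassifier on the same three flattened features (a sketch, not executed here) to see what the hand-crafted expression gains or loses against a plain network:

clf_generic = MLPClassifier(layers=[5], epochs=1000, scaler=Uniformer())
clf_generic.fit(train_data[train_features], train_labels)
print('Test quality:', roc_auc_score(test_labels, clf_generic.predict_proba(test_data[train_features])[:, 1]))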

Ensembling of neural networks

Let's run the AdaBoost algorithm over neural networks. Boosting of neural networks is rarely seen in practice due to the high cost and minor gains (but it is not senseless).


In [12]:
from sklearn.ensemble import AdaBoostClassifier

base_nnet = MLPClassifier(layers=[5], scaler=Uniformer())
clf = AdaBoostClassifier(base_estimator=base_nnet, n_estimators=10)
clf.fit(train_data, train_labels)


/home/axelr/py36/lib/python3.6/site-packages/sklearn/ensemble/weight_boosting.py:29: DeprecationWarning: numpy.core.umath_tests is an internal NumPy module and should not be imported. It will be removed in a future NumPy release.
  from numpy.core.umath_tests import inner1d
Out[12]:
AdaBoostClassifier(algorithm='SAMME.R',
          base_estimator=MLPClassifier(epochs=100, layers=[5], loss='log_loss', random_state=None,
       scaler=Uniformer(), trainer='irprop-', trainer_parameters=None),
          learning_rate=1.0, n_estimators=10, random_state=None)

In [13]:
print('Test quality:', roc_auc_score(test_labels, clf.predict_proba(test_data)[:, 1]))


Test quality: 0.9771387726319765
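
To see how much each boosting stage contributes, one can evaluate the ensemble after every stage with scikit-learn's staged_predict_proba (a sketch, not executed in this notebook):

for stage, staged_proba in enumerate(clf.staged_predict_proba(test_data)):
    print(stage, roc_auc_score(test_labels, staged_proba[:, 1]))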
