Image classification of simulated AT-TPC events

Welcome to this project in applied machine learning. In this project we will tackle a simple two-class classification problem. The classes are simulated reaction types for the Ar(p, p') experiment conducted at MSU; in this task we'll focus on the classification itself and simply treat the experiment as a black box.

This is a completed notebook with solution examples. For your own implementation we suggest working in the project.ipynb notebook.

This project has three tasks with a recommendation for the time to spend on each task:

  • Preparation, Data exploration and standardization: 0.5hr
  • Model construction: 1hr
  • Hyperparameter tuning and performance validation: 1hr

The notebook project_solution.ipynb includes suggested solutions for each task, for reference or to let you move straight to the part of the project most appealing to your interests.

Preparation:

This project uses Python and the machine learning library Keras, as well as some functionality from NumPy and scikit-learn. We recommend Python version 3.4 or newer. These libraries can be installed on your system with the command pip3 install --user LIBRARY_NAME
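Assuming standard package names (the exact list may differ for your setup), the installation commands might look like:

```shell
# install the libraries for the current user; package names are assumed
# and may need adjusting for your environment
pip3 install --user numpy scikit-learn keras matplotlib
```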

Task 1: Data exploration and standardization

In machine learning, as in many other fields, the task of preparing data for analysis is as vital as it can be troublesome and tedious. In data analysis the researcher can expect to spend the majority of their time merely processing data to prepare it for analysis. In this project we will focus more on the entire pipeline of analysis, and so the data at hand has already been shaped into an image format suitable for our analysis.

Task 1a: Loading the data

The data is stored in the .npy format, which uses vectorized code to speed up the read process. The files pointed to in this task are downsampled images with dimensions $64 \times 64$ (if the images are too big for your laptop to handle, the script included in ../scripts/downsample_images.py can reduce the dimensions further).
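We treat the included downsampling script as a black box, but for intuition, downsampling by 2x2 block averaging can be sketched as follows (an illustrative assumption, not necessarily what downsample_images.py actually does):

```python
import numpy as np

# illustrative 2x2 block-averaging: reduce a 64x64 image to 32x32 by
# averaging each non-overlapping 2x2 block of pixels
image = np.random.rand(64, 64)
downsampled = image.reshape(32, 2, 32, 2).mean(axis=(1, 3))
print(downsampled.shape)  # (32, 32)
```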


In [1]:
import numpy as np # we'll be using this shorthand for the NumPy library throughout

dataset = np.load("../data/images/project_data.npy")
n_samples = dataset.shape[0]

print("Data shape: ", dataset.shape)


Data shape:  (8000, 64, 64, 1)

Task 1b: Inspecting the data

The data is stored as xy projections of the real events, which take place in a 3D volume. This allows a simple exploration of the data as images. In this task you should plot a few different events in a grid using matplotlib.pyplot


In [27]:
import matplotlib.pyplot as plt

rows = 2
cols = 2
n_plots = rows*cols
fig, axs = plt.subplots(nrows=rows, ncols=cols, figsize=(10, 10 ))


for row in axs: 
    for ax in row:
        """
        one of pythons most wonderful attributes is that if an object is iterable it can be
        directly iterated over, like above. 
        ax is an axis object from the 2d array of axis objects
        """

        which = np.random.randint(0, n_samples)
        ax.imshow(dataset[which].reshape(64, 64))
        ax.axis("off")


Task 1c: Standardizing the data

An important part of the preprocessing of data is the standardization of the input. The intuition here is simply that the model should expect similar values in the input to mean the same thing.

You should implement a standardization of the input. Perhaps the most common standardization is the centering of the mean of the distribution, and scaling by the standard deviation:

$X_s = \frac{X - \mu}{\sigma}$

Note that for our data we only want to standardize the signal part of each image; we know the rest is zero and we don't want the standardization to be unduly affected by it. This also means we don't necessarily want a zero mean for our signal distribution, so for this example we stick with the scaling:

$X_s = \frac{X}{\sigma}$

Another important point: already at this stage it is recommended to separate the data into train and test sets. The portion of the data held out for testing should be roughly 10-20%. Remember to compute the standardization variables from the training set only.



In [3]:
from sklearn.model_selection import train_test_split

targets = np.load("../data/targets/project_targets.npy")
train_X, test_X, train_y, test_y = train_test_split(dataset, targets, test_size=0.15)

nonzero_indices = np.nonzero(train_X)
nonzero_elements = train_X[nonzero_indices]

print("Train Mean: ", nonzero_elements.mean())
print("Train Std.: ", nonzero_elements.std())
print("-------------")
print("Test Mean: ", test_X[np.nonzero(test_X)].mean())
print("Test Std.: ", test_X[np.nonzero(test_X)].std())
print("############")

nonzero_scaled = nonzero_elements/nonzero_elements.std()
train_X[nonzero_indices] = nonzero_scaled
test_X[np.nonzero(test_X)] /= nonzero_elements.std()

print("Train Mean: ", nonzero_scaled.mean())
print("Train Std.: ", nonzero_scaled.std())
print("-------------")
print("Test Mean: ", test_X[np.nonzero(test_X)].mean())
print("Test Std.: ", test_X[np.nonzero(test_X)].std())


Train Mean:  1.1315160335486445
Train Std.:  4.3519420757141924
-------------
Test Mean:  1.3043909829464193
Test Std.:  5.012595631160625
############
Train Mean:  0.26000254917523286
Train Std.:  1.0
-------------
Test Mean:  0.29972618207064644
Test Std.:  1.1518066058675684

We also want to plot the data again to confirm that our scaling is sensible; you should reuse your code from above for this.


In [4]:
import matplotlib.pyplot as plt

rows = 2
cols = 2
n_plots = rows*cols
fig, axs = plt.subplots(nrows=rows, ncols=cols, figsize=(10, 10 ))


for row in axs: 
    for ax in row:
        """
        one of pythons most wonderful attributes is that if an object is iterable it can be
        directly iterated over, like above. 
        ax is an axis object from the 2d array of axis objects
        """

        which = np.random.randint(0, train_X.shape[0])
        ax.imshow(train_X[which].reshape(64, 64))
        ax.text(5, 5, "{}".format(int(train_y[which])), bbox={'facecolor': 'white', 'pad': 10})
        ax.axis("off")


Task 1d: Encoding the targets

For classification one ordinarily encodes the target as an n-element zero vector with a single element set to 1, indicating the target class. This is called one-hot encoding.

You should inspect the values of the target vectors and use the imported OneHotEncoder to convert the targets.

Note that this is not necessary for the two class case, but we do it to demonstrate a general approach.
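For intuition, one-hot encoding can also be done directly with NumPy. A minimal sketch, assuming integer class labels 0..n-1 (the OneHotEncoder used below handles the general case):

```python
import numpy as np

# minimal one-hot encoding: index rows of the identity matrix by label
labels = np.array([0, 1, 1, 0])
n_classes = 2
onehot = np.eye(n_classes)[labels]
print(onehot)
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]
```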


In [5]:
from sklearn.preprocessing import OneHotEncoder

# fit the encoder on the training targets only, then reuse it for the test set
encoder = OneHotEncoder(sparse=False, categories="auto").fit(train_y.reshape(-1, 1))
onehot_train_y = encoder.transform(train_y.reshape(-1, 1))
onehot_test_y = encoder.transform(test_y.reshape(-1, 1))

print("Onehot train targets:", onehot_train_y.shape)
print("Onehot test targets:",onehot_test_y.shape)


Onehot train targets: (6800, 2)
Onehot test targets: (1200, 2)

Prelude to task 2:

In task 2 we'll be constructing a CNN model for the two-class data. Before that it is useful to characterize the performance of a less complex model, for example logistic regression. For this task, then, you should construct a logistic regression model to classify the two-class problem.


In [26]:
from keras.models import Sequential, Model
from keras.layers import Dense
from keras.regularizers import l2
from keras.optimizers import SGD, adam


# flatten each 64x64x1 image to a 4096-element vector for the dense layer
flat_train_X = train_X.reshape(train_X.shape[0], -1)
flat_test_X = test_X.reshape(test_X.shape[0], -1)

logreg = Sequential()
logreg.add(Dense(2, kernel_regularizer=l2(0.01), activation="softmax"))

eta = 0.001
optimizer = SGD(eta)
logreg.compile(optimizer, loss="binary_crossentropy", metrics=["accuracy",])

history = logreg.fit(
    x=flat_train_X,
    y=onehot_train_y,
    batch_size=100,
    epochs=200,
    validation_split=0.15,
    verbose=2
    )


Train on 5780 samples, validate on 1020 samples
Epoch 1/200
 - 0s - loss: 0.7375 - acc: 0.4744 - val_loss: 0.7357 - val_acc: 0.4951
Epoch 2/200
 - 0s - loss: 0.7325 - acc: 0.4777 - val_loss: 0.7311 - val_acc: 0.5039
 ... (epochs 3-198 elided: the loss decreases slowly from 0.728 to 0.652 while validation accuracy climbs from roughly 0.51 to 0.62) ...
Epoch 199/200
 - 0s - loss: 0.6522 - acc: 0.6178 - val_loss: 0.6612 - val_acc: 0.6176
Epoch 200/200
 - 0s - loss: 0.6520 - acc: 0.6178 - val_loss: 0.6611 - val_acc: 0.6176

The logistic regression model clearly performs poorly, plateauing at roughly 62% validation accuracy. What about the data prohibits it from doing better?

Task 2a: Creating a model

In this task we will create a CNN with fully connected layers at the top for classification. You should base your code on Morten's example model. We suggest you complete one of the following for this task:

  1. Implement a class or function cnn that returns a compiled Keras model with an arbitrary number of convolutional and fully connected layers with optional configuration of regularization terms or layers.
  2. Implement a simple hard-coded function cnn that returns a Keras model object. The architecture should be specified in the function.

Both implementations should include multiple convolutional layers, ending with a couple of fully connected layers. The output of the network should be a softmax or log-softmax over the class logits.

You should experiment with where in the network you place the non-linearities and whether to use striding or pooling to reduce the input size, as well as with the use of padding: would you need it for the first layer?
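When weighing striding, pooling and padding it helps to compute the resulting spatial dimensions. A small helper using the standard convolution output-size formula (a sketch; the function name and example values are illustrative):

```python
def conv_output_size(n, f, s=1, p=0):
    """Spatial output size of a convolution with input size n,
    receptive field f, stride s and padding p: floor((n + 2p - f)/s) + 1."""
    return (n + 2 * p - f) // s + 1

# a 3x3 kernel with stride 1 and padding 1 ("same") preserves a 64x64 input
print(conv_output_size(64, 3, s=1, p=1))  # 64
# a 2x2 max pool with stride 2 halves it
print(conv_output_size(64, 2, s=2))  # 32
```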


In [6]:
model_config = {
    "n_conv":2,
    "receptive_fields":[3, 3],
    "strides":[1, 1,],
    "n_filters":[2, 2],
    "conv_activation":[1, 1],
    "max_pool":[1, 1],
    
    "n_dense":1,
    "neurons":[10,],
    "dense_activation":[1,]
    }

In [7]:
from keras.models import Sequential, Model
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D, ReLU, Input, Softmax
from keras.regularizers import l2

def create_convolutional_neural_network_keras(input_shape, config, n_classes=2):
    """
    Modified from MH Jensen's course on machine learning in physics: 
    https://github.com/CompPhysics/MachineLearningMSU/blob/master/doc/pub/CNN/ipynb/CNN.ipynb
    """
    
    model=Sequential()
    
    for i in range(config["n_conv"]):
        receptive_field = config["receptive_fields"][i]
        strides = config["strides"][i]
        n_filters = config["n_filters"][i]
        # "same" padding preserves the spatial dimensions; only the first
        # layer needs an explicit input_shape
        extra = {"input_shape": input_shape} if i == 0 else {}

        conv = Conv2D(
            n_filters,
            (receptive_field, receptive_field),
            padding="same",
            strides=strides,
            kernel_regularizer=l2(0.01),
            **extra
            )
        
        model.add(conv)
        
        pool = config["max_pool"][i]
        activation = config["conv_activation"][i]
        
        if activation:
            model.add(ReLU())
            
        if pool:
            model.add(MaxPooling2D(2))
    
    model.add(Flatten())
    
    for i in range(config["n_dense"]):
        n_neurons = config["neurons"][i]
        model.add(
            Dense(
                n_neurons,
                kernel_regularizer=l2(0.01)
            ))
        
        activation = config["dense_activation"][i]
        if activation:
            model.add(ReLU())
    
    model.add(
        Dense(
            n_classes,
            activation='softmax',
            kernel_regularizer=l2(0.01))
            )
    return model

model_o = create_convolutional_neural_network_keras(train_X.shape[1:], model_config, n_classes=2)
print(model_o.summary())


Using TensorFlow backend.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 64, 64, 2)         20        
_________________________________________________________________
re_lu_1 (ReLU)               (None, 64, 64, 2)         0         
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 32, 32, 2)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 2)         38        
_________________________________________________________________
re_lu_2 (ReLU)               (None, 32, 32, 2)         0         
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 16, 16, 2)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5130      
_________________________________________________________________
re_lu_3 (ReLU)               (None, 10)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 22        
=================================================================
Total params: 5,210
Trainable params: 5,210
Non-trainable params: 0
_________________________________________________________________
None

Task 2b: Plot your model

Keras provides a convenient utility for plotting your model architecture. You should both do this and inspect the model summary, to see how many trainable parameters you have and to confirm that your model is reasonably put together, with no dangling edges in the graph etc.


In [8]:
from keras.utils import plot_model

plot_model(model_o, to_file="convnet.png")

Task 2c: Compiling your model

With the model constructed it can now be compiled. Compiling entails unrolling the computational graph underlying the model and attaching losses at the layers you specify. For more complex models one can attach loss functions at arbitrary layers, or define a loss specific to your particular problem.

For our case we will simply use a categorical cross-entropy, which means our network outputs logits that we softmax to produce probabilities. In this task you should simply compile the above model with an optimizer of your choice and a categorical cross-entropy loss.


In [9]:
eta = 0.01
sgd = SGD(lr=eta)  # an alternative optimizer, unused below
optimizer = adam(lr=eta, beta_1=0.5)
model_o.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])

Task 2d: Running your model

Here you should simply use the .fit method of the model to train on the training set. Select a suitable subset of the training data to use for validation; this should not be the test set. Take care to note how many trainable parameters your model has: a model with $10^5$ parameters takes about a minute per epoch on a 7th-generation Intel i9 processor. If your laptop has an NVIDIA GPU, training should be considerably faster.

Hint: this model is quite easy to over-fit, so you should build your network with relatively low complexity (on the order of $10^3$ parameters).

The .fit method returns a history object that you can use to plot the progress of your training.


In [10]:
%matplotlib notebook 
import matplotlib.pyplot as plt
history = model_o.fit(
    x=train_X,
    y=onehot_train_y,
    batch_size=50,
    epochs=40,
    validation_split=0.15,
    verbose=2
    )


Train on 5780 samples, validate on 1020 samples
Epoch 1/40
 - 2s - loss: 0.6566 - acc: 0.6301 - val_loss: 0.6303 - val_acc: 0.6441
Epoch 2/40
 - 2s - loss: 0.6337 - acc: 0.6382 - val_loss: 0.6174 - val_acc: 0.6461
Epoch 3/40
 - 2s - loss: 0.6386 - acc: 0.6374 - val_loss: 0.6275 - val_acc: 0.6471
Epoch 4/40
 - 2s - loss: 0.6245 - acc: 0.6462 - val_loss: 0.6318 - val_acc: 0.6422
Epoch 5/40
 - 2s - loss: 0.6153 - acc: 0.6813 - val_loss: 0.5978 - val_acc: 0.8020
Epoch 6/40
 - 2s - loss: 0.6424 - acc: 0.6618 - val_loss: 0.6292 - val_acc: 0.6422
Epoch 7/40
 - 2s - loss: 0.6329 - acc: 0.6931 - val_loss: 0.6347 - val_acc: 0.8206
Epoch 8/40
 - 2s - loss: 0.6165 - acc: 0.7820 - val_loss: 0.6447 - val_acc: 0.7510
Epoch 9/40
 - 2s - loss: 0.5525 - acc: 0.8578 - val_loss: 0.5272 - val_acc: 0.8961
Epoch 10/40
 - 2s - loss: 0.5092 - acc: 0.8822 - val_loss: 0.5536 - val_acc: 0.8971
Epoch 11/40
 - 2s - loss: 0.5103 - acc: 0.8913 - val_loss: 0.5681 - val_acc: 0.8422
Epoch 12/40
 - 2s - loss: 0.5074 - acc: 0.8955 - val_loss: 0.5371 - val_acc: 0.8892
Epoch 13/40
 - 2s - loss: 0.4682 - acc: 0.9078 - val_loss: 0.4817 - val_acc: 0.8863
Epoch 14/40
 - 2s - loss: 0.4531 - acc: 0.9173 - val_loss: 0.4635 - val_acc: 0.9118
Epoch 15/40
 - 2s - loss: 0.4594 - acc: 0.9196 - val_loss: 0.5155 - val_acc: 0.9137
Epoch 16/40
 - 2s - loss: 0.4648 - acc: 0.9092 - val_loss: 0.5196 - val_acc: 0.8451
Epoch 17/40
 - 2s - loss: 0.4379 - acc: 0.9258 - val_loss: 0.4699 - val_acc: 0.9275
Epoch 18/40
 - 2s - loss: 0.5216 - acc: 0.8849 - val_loss: 0.6609 - val_acc: 0.7637
Epoch 19/40
 - 2s - loss: 0.4965 - acc: 0.9007 - val_loss: 0.4744 - val_acc: 0.9127
Epoch 20/40
 - 2s - loss: 0.4587 - acc: 0.9253 - val_loss: 0.4474 - val_acc: 0.9402
Epoch 21/40
 - 2s - loss: 0.4417 - acc: 0.9358 - val_loss: 0.4751 - val_acc: 0.9353
Epoch 22/40
 - 2s - loss: 0.4421 - acc: 0.9344 - val_loss: 0.4660 - val_acc: 0.9137
Epoch 23/40
 - 2s - loss: 0.4361 - acc: 0.9396 - val_loss: 0.4468 - val_acc: 0.9255
Epoch 24/40
 - 2s - loss: 0.4313 - acc: 0.9431 - val_loss: 0.5309 - val_acc: 0.8510
Epoch 25/40
 - 2s - loss: 0.4382 - acc: 0.9386 - val_loss: 0.4500 - val_acc: 0.9382
Epoch 26/40
 - 2s - loss: 0.4831 - acc: 0.9211 - val_loss: 0.7402 - val_acc: 0.7716
Epoch 27/40
 - 2s - loss: 0.5047 - acc: 0.9157 - val_loss: 0.4675 - val_acc: 0.9196
Epoch 28/40
 - 2s - loss: 0.4341 - acc: 0.9374 - val_loss: 0.4826 - val_acc: 0.9363
Epoch 29/40
 - 2s - loss: 0.4315 - acc: 0.9474 - val_loss: 0.4518 - val_acc: 0.9412
Epoch 30/40
 - 2s - loss: 0.4263 - acc: 0.9483 - val_loss: 0.4349 - val_acc: 0.9431
Epoch 31/40
 - 2s - loss: 0.4254 - acc: 0.9497 - val_loss: 0.4586 - val_acc: 0.9441
Epoch 32/40
 - 2s - loss: 0.4196 - acc: 0.9528 - val_loss: 0.4572 - val_acc: 0.9324
Epoch 33/40
 - 2s - loss: 0.5379 - acc: 0.8640 - val_loss: 0.4335 - val_acc: 0.9382
Epoch 34/40
 - 2s - loss: 0.4148 - acc: 0.9519 - val_loss: 0.4528 - val_acc: 0.9392
Epoch 35/40
 - 2s - loss: 0.4133 - acc: 0.9533 - val_loss: 0.4366 - val_acc: 0.9490
Epoch 36/40
 - 2s - loss: 0.4138 - acc: 0.9578 - val_loss: 0.4613 - val_acc: 0.9245
Epoch 37/40
 - 2s - loss: 0.4219 - acc: 0.9512 - val_loss: 0.4355 - val_acc: 0.9480
Epoch 38/40
 - 2s - loss: 0.4230 - acc: 0.9540 - val_loss: 0.4309 - val_acc: 0.9461
Epoch 39/40
 - 2s - loss: 0.4329 - acc: 0.9486 - val_loss: 0.4381 - val_acc: 0.9451
Epoch 40/40
 - 2s - loss: 0.4217 - acc: 0.9524 - val_loss: 0.4234 - val_acc: 0.9441

In [11]:
# copied from https://keras.io/visualization/
# Plot training & validation accuracy values
fig, axs = plt.subplots(figsize=(10, 8), nrows=2)
fig.suptitle('Model performance')
axs[0].plot(history.history['acc'], "x-", alpha=0.8)
axs[0].plot(history.history['val_acc'], "x-", alpha=0.8)

axs[0].set_ylabel('Accuracy')
axs[0].set_xlabel('Epoch')
axs[0].legend(['Train', 'Validation'], loc='upper left')


# Plot training & validation loss values

axs[1].plot(history.history['loss'], "o-", alpha=0.8)
axs[1].plot(history.history['val_loss'], "o-", alpha=0.8)

axs[1].set_ylabel('Loss')
axs[1].set_xlabel('Epoch')
axs[1].legend(['Train', 'Validation'], loc='upper left')


Out[11]:
<matplotlib.legend.Legend at 0x12d84b6a0>

3a: Performance validation

As mentioned in the lectures, machine learning models suffer from the problem of overfitting, as scaling to almost arbitrary complexity is simple with today's hardware. The challenge is then often that of finding the correct architecture and type of model for your problem. In this task you will be familiarized with some tools to monitor and estimate the degree of overfitting.

In task 1c you separated your data into training and test sets. The test set is used to estimate generalization performance; typically we use it to fine-tune the hyperparameters. Another trick is to use a validation set during training, which we implemented in the last task.
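As a reminder of how such a split can be made, here is a minimal sketch using scikit-learn's `train_test_split`. The arrays `X` and `y` below are random placeholders standing in for the image data and labels from task 1c, and the split fractions are illustrative, not prescribed:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 "images" of shape 64x64x1 with binary labels
X = np.random.rand(100, 64, 64, 1)
y = np.random.randint(0, 2, size=100)

# First hold out a test set used only for the final performance estimate,
# then carve a validation set out of what remains for use during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.15, random_state=42)
```

Keras' `validation_split` argument to `.fit` accomplishes the same thing as the second split, but doing it explicitly gives you control over shuffling and stratification.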

In this task we will start by attaching callbacks to the fitting process. They are listed in the Keras documentation: https://keras.io/callbacks/. Pick the ones you think are suitable for our problem and re-run the training from above.


In [12]:
from keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [EarlyStopping(min_delta=0.0001, patience=4), ModelCheckpoint("../checkpoints/ckpt")]

history = model_o.fit(
    x=train_X,
    y=onehot_train_y,
    batch_size=50,
    epochs=150,
    validation_split=0.15,
    verbose=2,
    callbacks=callbacks
    )


Train on 5780 samples, validate on 1020 samples
Epoch 1/150
 - 2s - loss: 0.4109 - acc: 0.9566 - val_loss: 0.4189 - val_acc: 0.9510
Epoch 2/150
 - 2s - loss: 0.4241 - acc: 0.9502 - val_loss: 0.4276 - val_acc: 0.9569
Epoch 3/150
 - 2s - loss: 0.4174 - acc: 0.9559 - val_loss: 0.4488 - val_acc: 0.9343
Epoch 4/150
 - 2s - loss: 0.4388 - acc: 0.9514 - val_loss: 0.4379 - val_acc: 0.9451
Epoch 5/150
 - 2s - loss: 0.4200 - acc: 0.9543 - val_loss: 0.4451 - val_acc: 0.9392

3b: Hyperparameter tuning

Hyperparameters, like the number of layers, the learning rate and others, can have a very big impact on model quality. The model performance should also be statistically quantified, using cross-validation, bootstrapped confidence intervals for your performance metrics, or other tools depending on the model. In this task you should implement a function or for-loop doing either a random search or a grid search over the parameters, and finally plot the results in a suitable way.
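One way to structure the grid search is sketched below. The function `build_and_evaluate` is a hypothetical placeholder: in your solution it would build, compile, train, and return a validation score for one hyperparameter setting (using the Keras code from task 2); here a toy scoring function stands in for it so the loop itself can be shown self-contained:

```python
from itertools import product

def grid_search(param_grid, build_and_evaluate):
    """Evaluate every combination in param_grid and return results
    sorted with the best validation score first."""
    results = []
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = build_and_evaluate(**params)
        results.append((score, params))
    return sorted(results, key=lambda r: r[0], reverse=True)

# Toy stand-in for "train a model, return validation accuracy";
# it just peaks at eta=0.01, n_filters=8 for illustration.
def toy_score(eta, n_filters):
    return 1.0 - abs(eta - 0.01) - 0.001 * abs(n_filters - 8)

grid = {"eta": [0.1, 0.01, 0.001], "n_filters": [4, 8, 16]}
best_score, best_params = grid_search(grid, toy_score)[0]
```

With real training runs you would also repeat each setting a few times (or cross-validate) before trusting the ranking, since single runs of a small network are noisy, as the epoch-to-epoch fluctuations above illustrate.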