In [0]:
# install software on the backend, which is located at
# Google's Super Secret Sky Server in an alternate universe.
# The backend is called a 'hosted runtime' if it is on their server.
# A local runtime would start a colab notebook on your machine locally.
# Think of google colab as a Google Docs version of Jupyter Notebooks
# remove display of install details
%%capture --no-display
# pip install
!pip install numpy matplotlib scipy pandas scikit-learn astropy seaborn ipython jupyter #standard install for DSFP
!pip install keras tensorflow # required for deep learning
!pip install pycm
In [0]:
# standard-ish imports
import numpy as np
import matplotlib.pyplot as plt
import time
import itertools
# non-standard, but stable package for confusion matrices
from pycm import ConfusionMatrix
# neural network / machine learning packages
from sklearn import metrics
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization
from keras import backend as K
Our main task will be to classify handwritten digits (is it a "zero" [0] or a "one" [1]? Ooooh, the suspense). This is genuinely useful, though, because it's an opportunity to separate the machine learning from the science of the data. MNIST is a benchmark data set for machine learning; astronomy doesn't yet have machine-learning-specific benchmark data sets of its own.
In [0]:
# import MNIST data
(x_train_temp, y_train_temp), (x_test_temp, y_test_temp) = mnist.load_data()
In [0]:
# Print the shapes
print("Train Data Shape:", x_train_temp.shape)
print("Test Data Shape:", x_test_temp.shape)
print("Train Label Shape:", y_train_temp.shape)
print("Test Label Shape:", y_test_temp.shape)
Do the shapes of the data and labels (for train and test, respectively) match? If they don't match now, Keras/TensorFlow will kindly yell at you later.
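As a quick sanity check, you can assert that the sample counts agree before training. A minimal sketch, using stand-in arrays with MNIST-like shapes rather than the real download (the `_demo` names are illustrative):

```python
import numpy as np

# Stand-in arrays with MNIST-like shapes (assumption: 60,000 training
# images of 28x28 pixels, one integer label per image)
x_train_demo = np.zeros((60000, 28, 28))
y_train_demo = np.zeros((60000,))

# The 0th axis (number of samples) must agree between data and labels,
# or model.fit will raise a shape error later
assert x_train_demo.shape[0] == y_train_demo.shape[0]
print("shapes agree:", x_train_demo.shape[0], "samples")
```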
In [0]:
# Print an example
print("Example:")
print("y_train[0] is the label for the 0th image, and it is a", y_train_temp[0])
print("x_train[0] is the image data, and you can kind of see the pattern in the array of numbers")
print(x_train_temp[0])
Can you see the pattern of the number in the array?
In [0]:
# Plot the data!
f = plt.figure()
f.add_subplot(1,2, 1)
plt.imshow(x_train_temp[0])
f.add_subplot(1,2, 2)
plt.imshow(x_train_temp[1])
plt.show(block=True)
In [0]:
print("Before:", np.min(x_train_temp), np.max(x_train_temp))
x_train = x_train_temp.astype('float32')
x_test = x_test_temp.astype('float32')
x_train /= 255
x_test /= 255
y_train = y_train_temp
y_test = y_test_temp
print("After:", np.min(x_train), np.max(x_train))
In [0]:
# Flatten the images
img_rows, img_cols = x_train[0].shape[0], x_train[0].shape[1]
img_size = img_rows*img_cols
x_train = x_train.reshape(x_train.shape[0], img_size)
x_test = x_test.reshape(x_test.shape[0], img_size)
print("New shape", x_train.shape)
Dense layers expect 1D input vectors, so each 28x28 image is flattened into a 784-element vector. This also increases the efficiency of the matrix algebra during network training and evaluation.
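The reshape keeps every pixel value; it only changes how they are indexed. A minimal sketch with a toy array (names here are illustrative):

```python
import numpy as np

# Two toy 3x3 "images"
imgs = np.arange(18).reshape(2, 3, 3)

# Flatten each image into a 9-element row vector
flat = imgs.reshape(imgs.shape[0], -1)
print(flat.shape)  # (2, 9)

# No information is lost: reshaping back recovers the originals exactly
assert np.array_equal(flat.reshape(2, 3, 3), imgs)
```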
In [0]:
# One-hot encoding
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
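`to_categorical` turns an integer label like 3 into a length-10 vector with a 1 in position 3. A pure-NumPy sketch of the same encoding (illustrative, not the Keras implementation):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Integer labels -> one-hot matrix (one row per label, rows sum to 1)."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1
    return out

encoded = one_hot(np.array([0, 3, 9]), num_classes=10)
print(encoded[1])  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```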
In [0]:
model = Sequential()
For later: check out the keras documentation to see what other kinds of model formats are available. What might they be useful for?
In [0]:
model.add(Dense(units=32, activation='sigmoid', input_shape=(img_size,))) # dense layer of 32 neurons
model.add(Dense(units=num_classes, activation='softmax')) # dense layer of 'num_classes' neurons, because these are the number of options for the classification
#model.add(BatchNormalization())
model.summary() # look at the network output
Next, compile the model, selecting three key options: the optimizer, the loss function, and the metric(s) to monitor.
In [0]:
model.compile(optimizer="sgd", loss='categorical_crossentropy', metrics=['accuracy'])
The optimizer is an element that I sometimes tune when I want more control over the training. Check the Keras docs for the options available for the optimizer.
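Under the hood, plain SGD steps each weight against its gradient. A one-update NumPy sketch (the weight and gradient values here are made up; 0.01 is the Keras SGD default learning rate):

```python
import numpy as np

w = np.array([0.5, -0.2])     # current weights (illustrative values)
grad = np.array([0.1, -0.3])  # gradient of the loss w.r.t. w (illustrative)
lr = 0.01                     # learning rate

# One SGD update: move each weight downhill along its gradient
w_new = w - lr * grad
print(w_new)  # w moves to [0.499, -0.197]
```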
In [0]:
# Training parameters
batch_size = 32 # number of images per gradient update (not per epoch)
num_epochs = 5 # number of epochs
validation_split = 0.8 # fraction of the training set that is for validation only
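Keras carves the validation set off the end of `x_train` (before shuffling), so with this split only 20% of the images are actually fit. The arithmetic, assuming the standard 60,000-image MNIST training set:

```python
n_train_total = 60000      # MNIST training-set size
validation_split = 0.8     # fraction held out for validation

n_val = int(n_train_total * validation_split)
n_fit = n_train_total - n_val
print(n_fit, "images for fitting,", n_val, "for validation")
# 12000 images for fitting, 48000 for validation
```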
In [0]:
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=num_epochs,
                    validation_split=validation_split,
                    verbose=True)
"Exerciiiiiiisee the neurons! This Network is cleeeah!"
(bonus points if you know the 90's movie, from which this is a near-quote.)
In [0]:
loss_train, acc_train = model.evaluate(x_train, y_train, verbose=False)
loss_test, acc_test = model.evaluate(x_test, y_test, verbose=False)
print(f'Train acc/loss: {acc_train:.3}, {loss_train:.3}')
print(f'Test acc/loss: {acc_test:.3}, {loss_test:.3}')
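For intuition, both reported numbers can be computed by hand from the one-hot labels and the predicted probabilities. A NumPy sketch on a tiny made-up batch (values are illustrative):

```python
import numpy as np

# Made-up predictions (rows are probability vectors) and one-hot labels
y_true = np.array([[1, 0, 0],
                   [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

# Categorical cross-entropy: mean of -log(probability assigned to the true class)
loss = -np.mean(np.log(np.sum(y_true * y_prob, axis=1)))

# Accuracy: fraction of samples where the argmaxes agree
acc = np.mean(np.argmax(y_prob, axis=1) == np.argmax(y_true, axis=1))
print(f"loss={loss:.3f}, acc={acc:.2f}")
```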
In [0]:
y_pred_train = model.predict(x_train, verbose=True)
y_pred_test = model.predict(x_test,verbose=True)
In [0]:
# set up figure
f = plt.figure(figsize=(12,5))
f.add_subplot(1,2, 1)
# plot accuracy as a function of epoch
plt.plot(history.history['acc'])      # newer Keras versions use the key 'accuracy'
plt.plot(history.history['val_acc'])  # ... and 'val_accuracy'
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['training', 'validation'], loc='best')
# plot loss as a function of epoch
f.add_subplot(1,2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['training', 'validation'], loc='best')
plt.show(block=True)
In [0]:
# Function: Convert from categorical back to numerical value
def convert_to_index(array_categorical):
    array_index = [np.argmax(array_temp) for array_temp in array_categorical]
    return array_index

def plot_confusion_matrix(cm,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    Plot a pycm ConfusionMatrix object.
    Normalization can be applied by setting `normalize=True`.
    Code Reference :
    http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
    This script is derived from the PyCM repository: https://github.com/sepandhaghighi/pycm
    """
    plt_cm = []
    for i in cm.classes:
        row = []
        for j in cm.classes:
            row.append(cm.table[i][j])
        plt_cm.append(row)
    plt_cm = np.array(plt_cm)
    if normalize:
        plt_cm = plt_cm.astype('float') / plt_cm.sum(axis=1)[:, np.newaxis]
    plt.imshow(plt_cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(cm.classes))
    plt.xticks(tick_marks, cm.classes, rotation=45)
    plt.yticks(tick_marks, cm.classes)
    fmt = '.2f' if normalize else 'd'
    thresh = plt_cm.max() / 2.
    for i, j in itertools.product(range(plt_cm.shape[0]), range(plt_cm.shape[1])):
        plt.text(j, i, format(plt_cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if plt_cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('Actual')
    plt.xlabel('Predicted')
In [0]:
# apply conversion function to data
y_test_ind = convert_to_index(y_test)
y_pred_test_ind = convert_to_index(y_pred_test)
# compute confusion matrix
cm_test = ConfusionMatrix(y_test_ind, y_pred_test_ind)
np.set_printoptions(precision=2)
# plot confusion matrix result
plt.figure()
plot_confusion_matrix(cm_test,title='cm')
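The same counts can be built directly with NumPy (or with `sklearn.metrics.confusion_matrix`). A sketch on toy labels, to make the row/column convention concrete:

```python
import numpy as np

y_true = [0, 0, 1, 1, 2]  # actual classes (toy example)
y_pred = [0, 1, 1, 1, 2]  # predicted classes (toy example)

num_classes = 3
cm = np.zeros((num_classes, num_classes), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[t, p] += 1  # rows = actual class, columns = predicted class

print(cm)
# [[1 1 0]
#  [0 2 0]
#  [0 0 1]]
```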
Problem 3f: Create a new diagnostic that shows examples of the images and how they were classified. For a given architecture and training, plot a grid of four columns of images, five rows deep; each row is another example of something that is classified. The four columns are:
This will be a grid of images that are examples of classifications. It is important for giving a sense of how well the network classifies different types of objects.
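One way to start the plotting scaffold, sketched with random arrays standing in for the images you would select (the file name and the selection logic are placeholders for you to fill in):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

n_rows, n_cols = 5, 4  # five examples per column, four classification columns
fig, axes = plt.subplots(n_rows, n_cols, figsize=(8, 10))
for i in range(n_rows):
    for j in range(n_cols):
        # Stand-in image; replace with an example drawn from column j's category
        axes[i, j].imshow(np.random.rand(28, 28), cmap="gray")
        axes[i, j].axis("off")
fig.tight_layout()
fig.savefig("classification_examples.png")
```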
Some advice for this problem:
Sub-problems
When developing an NN model, we think about the balance between accuracy, computation time, and network complexity. This exercise is intended to draw out some of the patterns in the confluence of those aspects. The goal of this problem is to produce plots that show general trends.
We have mock strong lensing data made with Lenspop.
In [0]:
!gsutil cp -r gs://lsst-dsfp2018/stronglenses ./
!ls .
In [0]:
MNIST is great for learning and, in some cases, benchmarking, but it's not the best for physics and astronomy problems. Our questions have a physical underpinning: there are first principles that underlie the images we make. So, when developing a neural network, it can be very useful to generate some data that morphologically looks like your astro data. This helps in developing the architecture before deploying it on more complicated data, as well as in diagnosing systematics in that data.