In [1]:
import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.utils import np_utils
%matplotlib inline
Here we use Keras to build a neural network that separates real asteroids from noise in the kbmod results. We have a dataset generated from a kbmod run that contains both real objects and false detections. We will split it into a training set and a test set and train a neural network to distinguish between the two.
In [2]:
data = np.genfromtxt('../data/postage_stamp_training.dat')
We first normalize each postage stamp so that its pixel values lie between 0.0 and 1.0.
In [3]:
for idx in range(len(data)):
    data[idx] -= np.min(data[idx])
    data[idx] /= np.max(data[idx])
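The loop above rescales each stamp in place. For reference, a vectorized equivalent is sketched below, assuming `data` is the 2-d array of flattened stamps loaded above; running either version gives the same result.

# Vectorized per-stamp min-max normalization: each row of `data`
# is one flattened 25 x 25 postage stamp.
stamp_min = data.min(axis=1, keepdims=True)
stamp_max = data.max(axis=1, keepdims=True)
data = (data - stamp_min) / (stamp_max - stamp_min)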
Next we classify the data by eye: 0 marks a false detection and 1 marks a true detection. We only show the first of the 10 sets of 100 stamps here; to step through the other sets, change the "set_on" parameter.
In [4]:
fig = plt.figure(figsize=(50, 25))
set_on = 1
print('Starting at %i' % int((set_on - 1)*100))
for i in range((set_on-1)*100, set_on*100):
    fig.add_subplot(10, 10, i-(set_on-1)*100+1)
    plt.imshow(data[i].reshape(25,25), cmap=plt.cm.Greys_r,
               interpolation=None)
    plt.title(str(i))
In [5]:
classes = np.zeros(1000)
The following are the positive identifications from our training set as classified by eye.
In [6]:
classes[[29, 37, 58, 79, 86, 99, 115, 118, 123, 130, 131,
135, 138, 142, 149, 157, 160, 165, 166, 172, 177,
227, 262, 347, 369, 393,
426, 468, 478, 530, 560, 567, 602, 681]] = 1.
In [7]:
plt.imshow(data[138].reshape(25,25), cmap=plt.cm.Greys_r,
interpolation=None)
Out[7]:
Now we will divide the data into training and test sets (70/30 split).
In [44]:
np.random.seed(42)
assignments = np.random.choice(np.arange(1000), replace=False, size=1000)
train = assignments[:700]
test = assignments[700:]
train_set = data[train]
test_set = data[test]
train_classes = classes[train]
test_classes = classes[test]
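Only a few dozen of the 1000 stamps are true detections, so a purely random split can leave the test set with very few positives. A stratified split preserves the class fractions in both sets; a minimal sketch, assuming scikit-learn is available (the notebook itself uses the random split above):

from sklearn.model_selection import train_test_split

# Stratified 70/30 split: keeps the fraction of true detections
# the same in the training and test sets.
train_set, test_set, train_classes, test_classes = train_test_split(
    data, classes, test_size=0.3, stratify=classes, random_state=42)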
In [45]:
from keras.models import Sequential
from keras.layers import Dense
In [46]:
model = Sequential()
We use a simple neural network with a single 128-unit hidden layer and sigmoid activation functions. Each 25 x 25 pixel postage stamp is passed in as a flattened 1-d array, giving 625 input features. The final output is a binary classification, where 1 means a positive identification as an asteroid-like object.
In [47]:
model.add(Dense(128, input_shape=(625,), activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
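For reference, the forward pass of this two-layer network is just two affine transforms, each followed by a sigmoid. A minimal numpy sketch of the same computation (the weight and bias names below are illustrative stand-ins for the parameters Keras learns):

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(stamp, W1, b1, W2, b2):
    # stamp: flattened 25 x 25 postage stamp, shape (625,)
    # W1: (625, 128), b1: (128,)  -- hidden layer
    # W2: (128, 1),   b2: (1,)    -- output layer
    hidden = sigmoid(stamp.dot(W1) + b1)
    return sigmoid(hidden.dot(W2) + b2)  # probability of a real object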
In [48]:
model.output_shape
Out[48]:
(None, 1)
The loss function we choose is the binary cross-entropy function.
In [49]:
model.compile(loss='binary_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
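Binary cross-entropy for a true label y and predicted probability p is -(y log p + (1 - y) log(1 - p)), averaged over the batch, so confident wrong predictions are penalized heavily. A quick numpy illustration with made-up numbers:

def binary_crossentropy(y_true, y_pred):
    # Mean of -(y*log(p) + (1-y)*log(1-p)) over the batch.
    return -np.mean(y_true * np.log(y_pred) +
                    (1. - y_true) * np.log(1. - y_pred))

y_true = np.array([1., 0., 1.])
y_pred = np.array([0.9, 0.1, 0.2])  # the last prediction is badly wrong
print(binary_crossentropy(y_true, y_pred))  # dominated by the -log(0.2) term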
We fit the model, training for 100 epochs (full passes through the training set).
In [50]:
model.fit(train_set, train_classes, batch_size=32, verbose=0, nb_epoch=100)
Out[50]:
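To watch for overfitting over those 100 epochs, part of the training set can be held out as validation data during the fit. A sketch (the epoch keyword is `nb_epoch` in the Keras version used here; newer releases call it `epochs`):

# Hold out 10% of the training stamps and track validation loss per epoch.
history = model.fit(train_set, train_classes, batch_size=32, verbose=0,
                    nb_epoch=100, validation_split=0.1)
# history.history holds the per-epoch loss and accuracy curves for
# both the training and validation portions.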
Once the model is fit, we evaluate how it does on the test set and calculate the accuracy. With our current training data, this fit reaches 100% accuracy.
In [51]:
score = model.evaluate(test_set, test_classes, verbose=0)
print(score, model.metrics_names)
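Because true detections are rare in this sample, overall accuracy can hide missed objects, so it is worth counting the different outcome types directly. A sketch using a 0.5 probability threshold on the model output:

# Count true/false positives and negatives on the test set.
probs = model.predict(test_set, batch_size=32).ravel()
preds = (probs > 0.5).astype(float)
true_pos  = np.sum((preds == 1.) & (test_classes == 1.))
false_pos = np.sum((preds == 1.) & (test_classes == 0.))
false_neg = np.sum((preds == 0.) & (test_classes == 1.))
true_neg  = np.sum((preds == 0.) & (test_classes == 0.))
print(true_pos, false_pos, false_neg, true_neg)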
Here we look at the predicted classes for the test set and plot one of the positive identifications.
In [52]:
class_results = model.predict_classes(test_set, batch_size=32)
In [38]:
pos_results = np.where(class_results==1.)[0]
print(pos_results)
In [28]:
plt.imshow(test_set[pos_results[6]].reshape(25,25), cmap=plt.cm.Greys_r,
interpolation=None)
Out[28]:
Satisfied with our results, we save the model for use in the real analysis.
In [19]:
model.save('../data/kbmod_model.h5')
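The saved HDF5 file stores the architecture, weights, and optimizer state, so the trained classifier can be reloaded later without redefining it. A sketch of reusing it (new_stamps is a hypothetical array of flattened, normalized 25 x 25 stamps):

from keras.models import load_model

# Reload the trained classifier and score a fresh batch of postage stamps.
reloaded = load_model('../data/kbmod_model.h5')
# scores = reloaded.predict(new_stamps)  # new_stamps: hypothetical input array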