This notebook performs the fish classification. We classify each fish into one of four classes: tuna (TUNA), LAG, DOL and SHARK. The detector saves a cropped image of each fish; here we take that image and classify it with a CNN.
In the original Kaggle competition there are six classes of fish: ALB, BET, YFT, DOL, LAG and SHARK. We started by trying to classify all of them, but three are very similar: ALB, BET and YFT. They are all tuna species, while the other fish come from different families. Classifying those three species separately was therefore difficult and the results were poor. We will briefly compare both approaches in the presentation, but here we only upload the classifier with four classes.
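The merge of the three tuna species into one class can be sketched as a simple label mapping. This is only an illustration of the idea described above: the names `KAGGLE_TO_MERGED` and `merge_tuna_label` are hypothetical and do not appear elsewhere in the notebook.

```python
# Hypothetical mapping from the six Kaggle fish labels to the four classes
# used in this notebook (ALB, BET and YFT are all tuna species).
KAGGLE_TO_MERGED = {
    'ALB': 'TUNA',
    'BET': 'TUNA',
    'YFT': 'TUNA',
    'DOL': 'DOL',
    'LAG': 'LAG',
    'SHARK': 'SHARK',
}

def merge_tuna_label(kaggle_label):
    """Return the four-class label for one of the six Kaggle labels."""
    return KAGGLE_TO_MERGED[kaggle_label]

merged = [merge_tuna_label(label) for label in ['ALB', 'YFT', 'SHARK']]
print(merged)  # ['TUNA', 'TUNA', 'SHARK']
```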
In [1]:
from PIL import Image
import tensorflow as tf
import numpy as np
import scipy
import os
import cv2
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import log_loss
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.convolutional import Convolution2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dense
from keras.layers.core import Dropout
from keras import backend as K
import matplotlib.pyplot as plt
In [2]:
#Define some values and constants
fish_classes = ['TUNA','DOL','SHARK','LAG']
fish_classes_test = fish_classes
number_classes = len(fish_classes)
main_path_train = '../train_cut_oversample'
main_path_test = '../test'
channels = 3
ROWS_RESIZE = 100
COLS_RESIZE = 100
Now we read the data from the location where the fish detection part has stored the images.
We also preprocess the images slightly to bring them all to the same size (100x100). The aspect ratio of the images is important, so instead of simply resizing the image we have written the function resize(im). This function takes an image and resizes its longest side to 100 while keeping the aspect ratio, so the short side ends up smaller than 100 pixels. The result is pasted onto the middle of a white 100x100 layer, which leaves white pixels on two of its sides. This is not optimal, but it is still better than changing the aspect ratio. We also tried other background colors, but the best results were achieved with white.
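To make the arithmetic of this step concrete, here is a minimal sketch that computes only the geometry (new size and paste offset), without touching any pixels. The helper name `letterbox_geometry` is hypothetical, and the rounding of the resized dimensions is an assumption that may differ by a pixel from what the image library actually produces.

```python
def letterbox_geometry(rows, cols, target=100):
    """Return (new_rows, new_cols, left, top) for a target x target canvas.

    The longest side is scaled to `target`; the offsets center the short side.
    Rounding to the nearest pixel is an assumption of this sketch.
    """
    ratio = target / float(max(rows, cols))
    new_rows = int(round(rows * ratio))
    new_cols = int(round(cols * ratio))
    left = target // 2 - new_cols // 2
    top = target // 2 - new_rows // 2
    return new_rows, new_cols, left, top

# A tall 200x50 crop becomes 100x25 and is shifted horizontally.
print(letterbox_geometry(200, 50))  # (100, 25, 38, 0)
# A wide 50x200 crop becomes 25x100 and is shifted vertically.
print(letterbox_geometry(50, 200))  # (25, 100, 0, 38)
```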
In [13]:
# Get data and preprocess it
def resize(image):
    # Scale the longest side to ROWS_RESIZE, keeping the aspect ratio
    rows = image.shape[0]
    cols = image.shape[1]
    dominant = max(rows,cols)
    ratio = ROWS_RESIZE/float(dominant)
    im_res = scipy.misc.imresize(image,ratio)
    rows = im_res.shape[0]
    cols = im_res.shape[1]
    im_res = Image.fromarray(im_res)
    # White canvas; PIL sizes are (width, height) and paste offsets must be integers
    layer = Image.new('RGB',(COLS_RESIZE,ROWS_RESIZE),(255,255,255))
    if rows > cols:
        layer.paste(im_res,(COLS_RESIZE//2-cols//2,0))
    if cols > rows:
        layer.paste(im_res,(0,ROWS_RESIZE//2-rows//2))
    if rows == cols:
        layer.paste(im_res,(0,0))
    return np.array(layer)
X_train = []
y_labels = []
for classes in fish_classes:
    path_class = os.path.join(main_path_train,classes)
    y_class = np.tile(classes,len(os.listdir(path_class)))
    y_labels.extend(y_class)
    for image in os.listdir(path_class):
        path = os.path.join(path_class,image)
        im = scipy.misc.imread(path)
        im = resize(im)
        X_train.append(np.array(im))
X_train = np.array(X_train)
# Convert labels into one hot vectors
y_labels = LabelEncoder().fit_transform(y_labels)
y_train = np_utils.to_categorical(y_labels)
X_test = []
y_test = []
for classes in fish_classes_test:
    path_class = os.path.join(main_path_test,classes)
    y_class = np.tile(classes,len(os.listdir(path_class)))
    y_test.extend(y_class)
    for image in os.listdir(path_class):
        path = os.path.join(path_class,image)
        im = scipy.misc.imread(path)
        im = resize(im)
        X_test.append(np.array(im))
X_test = np.array(X_test)
# Convert labels into one hot vectors
y_test = LabelEncoder().fit_transform(y_test)
y_test = np_utils.to_categorical(y_test)
X_train = np.reshape(X_train,(X_train.shape[0],ROWS_RESIZE,COLS_RESIZE,channels))
X_test = np.reshape(X_test,(X_test.shape[0],ROWS_RESIZE,COLS_RESIZE,channels))
print('X_train shape: ',X_train.shape)
print('y_train shape: ',y_train.shape)
print('X_test shape: ',X_test.shape)
print('y_test shape: ',y_test.shape)
The data is now organized in the following way:
-The training set contains 23581 images of size 100x100x3 (RGB).
-There are 4 possible classes: LAG, SHARK, DOL and TUNA.
-The test set contains 400 images of the same size, 100 per class.
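As a rough illustration of how the class names become one-hot vectors, the sketch below mimics the encoding in plain Python. It assumes that the label encoder assigns indices in sorted (alphabetical) order rather than in the order of `fish_classes`; the helper `one_hot` and the name `class_to_index` are hypothetical.

```python
fish_classes = ['TUNA', 'DOL', 'SHARK', 'LAG']

# Assumption: indices follow the sorted class names, not the list order above.
sorted_classes = sorted(fish_classes)
class_to_index = {name: i for i, name in enumerate(sorted_classes)}

def one_hot(label):
    """Return a one-hot list for the given class name."""
    vector = [0.0] * len(sorted_classes)
    vector[class_to_index[label]] = 1.0
    return vector

print(sorted_classes)    # ['DOL', 'LAG', 'SHARK', 'TUNA']
print(one_hot('SHARK'))  # [0.0, 0.0, 1.0, 0.0]
```

Under this assumption, column 0 of y_train and y_test corresponds to DOL and column 3 to TUNA, which matters when reading the predicted probabilities.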
We are now ready to build and train the classifier. The CNN has seven convolutional layers, four pooling layers and three fully connected layers at the end. Dropout is applied after the first two fully connected layers to reduce overfitting. The loss function is the multi-class log loss, because it is the one Kaggle uses in the competition. The optimizer is stochastic gradient descent.
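A quick way to sanity-check this architecture is to trace the spatial size of the feature maps. The sketch below assumes that the convolutions preserve the spatial size ('same' padding, stride 1), that each 2x2 pooling halves it rounding down, and that the last convolutional block has 24 filters; `trace_spatial_size` is a hypothetical helper.

```python
def trace_spatial_size(size, num_pools, pool=2):
    """Return the side length after each 2x2 pooling, starting from `size`."""
    sizes = [size]
    for _ in range(num_pools):
        size = size // pool  # pooling without padding rounds down
        sizes.append(size)
    return sizes

sizes = trace_spatial_size(100, 4)
flat_features = sizes[-1] * sizes[-1] * 24  # 24 filters in the last conv layer

print(sizes)          # [100, 50, 25, 12, 6]
print(flat_features)  # 864
```

Under these assumptions the first fully connected layer receives 864 inputs per image.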
In [4]:
def center_normalize(x):
    return (x-K.mean(x))/K.std(x)
# Convolutional net
model = Sequential()
model.add(Activation(activation=center_normalize,input_shape=(ROWS_RESIZE,COLS_RESIZE,channels)))
model.add(Convolution2D(6,20,20,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(2,2),dim_ordering='tf'))
model.add(Convolution2D(12,10,10,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(Convolution2D(12,10,10,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(2,2),dim_ordering='tf'))
model.add(Convolution2D(24,5,5,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(Convolution2D(24,5,5,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(2,2),dim_ordering='tf'))
model.add(Convolution2D(24,5,5,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(Convolution2D(24,5,5,border_mode='same',activation='relu',dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(2,2),dim_ordering='tf'))
model.add(Flatten())
model.add(Dense(4092,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1024,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(number_classes))
model.add(Activation('softmax'))
print(model.summary())
model.compile(optimizer='sgd',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(X_train,y_train,nb_epoch=1,verbose=1)
Out[4]:
Because there are many images, training takes around one hour. Once it is done, we pass the test set to the classifier and measure its accuracy.
In [15]:
(loss,accuracy) = model.evaluate(X_test,y_test,verbose=1)
print('accuracy',accuracy)
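To show what the two reported numbers measure, here is a minimal pure-Python sketch of the multi-class log loss and the accuracy, computed from one-hot labels and predicted probabilities. The function names and the toy values are illustrative only and are not results from this notebook; the clipping constant `eps` is an assumption.

```python
import math

def multiclass_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p in zip(true_row, prob_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total -= t * math.log(p)
    return total / len(y_true)

def accuracy_score(y_true, y_prob):
    """Fraction of rows whose most probable class is the true class."""
    hits = 0
    for true_row, prob_row in zip(y_true, y_prob):
        if true_row.index(max(true_row)) == prob_row.index(max(prob_row)):
            hits += 1
    return hits / float(len(y_true))

# Toy example with three samples and four classes (made-up numbers).
y_true_toy = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
y_prob_toy = [[0.7, 0.1, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1], [0.6, 0.2, 0.1, 0.1]]

print(multiclass_log_loss(y_true_toy, y_prob_toy))
print(accuracy_score(y_true_toy, y_prob_toy))
```

The third toy sample is misclassified with a low probability on the true class, so it dominates the log loss even though it counts as just one error for the accuracy.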