This example demonstrates how to use a pretrained Inception-BatchNorm network.
The network is described in
Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).
For the network structure, see the Composite Symbol demo.
The pre-trained Inception-BatchNorm network can be downloaded from: http://data.mxnet.io/mxnet/models/imagenet/inception-bn.tar.gz This model achieves 72.5% top-1 accuracy and 90.8% top-5 accuracy on the ImageNet validation set.
In [8]:
%matplotlib inline
import mxnet as mx
import logging
import numpy as np
from skimage import io, transform
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
In [9]:
# Load the pre-trained model
prefix = "Inception/Inception-BN"
num_round = 126
model = mx.model.FeedForward.load(prefix, num_round, ctx=mx.gpu(), numpy_batch_size=1)
In [10]:
# mean image
mean_rgb = np.array([123.68, 116.779, 103.939])
mean_rgb = mean_rgb.reshape((3, 1, 1))
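The `(3, 1, 1)` shape lets NumPy broadcast the per-channel mean across the spatial dimensions of a `(3, 224, 224)` image, which is exactly what the preprocessing function below relies on. A minimal sketch with a synthetic image (not the notebook's actual data):

```python
import numpy as np

# per-channel ImageNet mean, shaped so it broadcasts over (3, H, W)
mean_rgb = np.array([123.68, 116.779, 103.939]).reshape((3, 1, 1))

# synthetic CHW "image" with a constant pixel value
img = np.full((3, 224, 224), 128.0)

normed = img - mean_rgb  # broadcasts over the two spatial axes
print(normed.shape)      # (3, 224, 224)
print(normed[0, 0, 0])   # 128 - 123.68, i.e. approximately 4.32
```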
In [11]:
# if you like, you can plot the network
# mx.viz.plot_network(model.symbol, shape={"data" : (1, 3, 224, 224)})
In [12]:
# load synset (text label)
synset = [l.strip() for l in open('Inception/synset.txt').readlines()]
Next, we show how to use this model to classify an image. The image was taken in Notre-Dame Basilica, Montreal.
In [13]:
def PreprocessImage(path, show_img=False):
    # load image
    img = io.imread(path)
    print("Original Image Shape: ", img.shape)
    # crop the image from the center
    short_edge = min(img.shape[:2])
    yy = int((img.shape[0] - short_edge) / 2)
    xx = int((img.shape[1] - short_edge) / 2)
    crop_img = img[yy : yy + short_edge, xx : xx + short_edge]
    # resize to 224 x 224
    resized_img = transform.resize(crop_img, (224, 224))
    if show_img:
        io.imshow(resized_img)
    # transform.resize returns floats in [0, 1], so scale back to [0, 256)
    sample = np.asarray(resized_img) * 256
    # swap axes to turn the image from (224, 224, 3) into (3, 224, 224)
    sample = np.swapaxes(sample, 0, 2)
    sample = np.swapaxes(sample, 1, 2)
    # subtract the per-channel mean
    normed_img = sample - mean_rgb
    print(normed_img.shape, sample.shape, mean_rgb.shape)
    normed_img = normed_img.reshape((1, 3, 224, 224))
    return normed_img
# Get preprocessed batch (single image batch)
batch = PreprocessImage('ILSVRC2012_val_00000001.JPEG', True)
# Get prediction probability of 1000 classes from model
prob = model.predict(batch)[0]
# Argsort, get prediction index from largest prob to lowest
pred = np.argsort(prob)[::-1]
# Get top1 label
top1 = synset[pred[0]]
print("Top1: ", top1)
# Get top5 label
top5 = [synset[pred[i]] for i in range(5)]
print("Top5: ", top5)
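`np.argsort` returns indices in ascending order, so reversing it yields class indices from the highest probability down. A toy sketch with a hand-made probability vector (not the model's actual output):

```python
import numpy as np

prob = np.array([0.05, 0.6, 0.1, 0.05, 0.2])  # toy "class probabilities"
pred = np.argsort(prob)[::-1]                 # indices, largest prob first
print(pred[:3])                               # [1 4 2]
```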
Extracting features follows steps similar to the CIFAR-10 example.
In [14]:
# get internals from model's symbol
internals = model.symbol.get_internals()
# get feature layer symbol out of internals
fea_symbol = internals["global_pool_output"]
# Make a new model from an internal symbol. We can reuse all the parameters
# from the model trained before. In this case we must set `allow_extra_params`
# to True, because we don't need the params of the FullyConnected symbol.
feature_extractor = mx.model.FeedForward(ctx=mx.gpu(), symbol=fea_symbol,
                                         numpy_batch_size=1,
                                         arg_params=model.arg_params,
                                         aux_params=model.aux_params,
                                         allow_extra_params=True)
# predict feature
global_pooling_feature = feature_extractor.predict(batch)
print(global_pooling_feature.shape)
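The `global_pool` layer averages each channel over all spatial positions, collapsing an `(N, C, H, W)` activation into an `(N, C, 1, 1)` feature. A minimal NumPy sketch of that operation, using synthetic data rather than the model's actual activations (the 1024-channel, 7x7 shape is illustrative):

```python
import numpy as np

# synthetic feature map: batch of 1, 1024 channels, 7x7 spatial grid
fea_map = np.random.rand(1, 1024, 7, 7)

# global average pooling: mean over the spatial axes, keeping their dims
pooled = fea_map.mean(axis=(2, 3), keepdims=True)
print(pooled.shape)  # (1, 1024, 1, 1)
```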