In this tutorial, we'll go through how to practically apply deep learning to Image Classification and Object Detection.
In [1]:
import graphlab
import graphlab.mxnet
graphlab.canvas.set_target('ipynb')
Suppose we need to classify products into Backpacks and Mountain Bikes. We know that deep learning models are state-of-the-art in image classification, so let's load a model that has been trained on ImageNet and apply it after loading in our dataset.
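Before running the real model, the idea behind top-k prediction can be sketched in plain numpy: a classifier produces one score per label, and a `predict_topk`-style call keeps the k highest-scoring labels. The label names and scores below are made up for illustration.

```python
import numpy as np

def predict_topk(scores, labels, k=1):
    """Return the k (label, score) pairs with the highest scores."""
    order = np.argsort(scores)[::-1][:k]  # indices of the k largest scores
    return [(labels[i], float(scores[i])) for i in order]

labels = ['backpack', 'mountain bike', 'sleeping bag']
scores = np.array([0.7, 0.2, 0.1])  # hypothetical class probabilities
print(predict_topk(scores, labels, k=2))
```

The real model does the same thing, only with 1000 ImageNet classes and scores produced by the network.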
In [2]:
products_train = graphlab.SFrame('products_train.sf/')
products_test = graphlab.SFrame('products_test.sf/')
In [3]:
products_test['image'].show()
In [4]:
pretrained_model = graphlab.mxnet.pretrained_model.load_path('mxnet_models/imagenet1k_inception_bn/')
In [5]:
predictions = pretrained_model.predict_topk(products_test.head(10), k=1)
In [6]:
predictions['label']
Out[6]:
As you can see above, the label set is overly large, and the predicted labels don't match our actual labels of 'Backpack' and 'Mountain Bike'.
Transfer learning is a method for adapting an existing model to a new task. We can use this idea with neural networks: vectorize each image using the neural network (a process called extracting features), then put a simple classifier on top. This works well in practice.
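As a toy illustration of this recipe (made-up data; a fixed random projection stands in for the pretrained network, and a nearest-centroid rule stands in for the classifier), the pipeline is: push images through a frozen network to get feature vectors, then fit a simple classifier on those vectors.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in for a pretrained network: a fixed (frozen) random projection
# mapping raw 100-pixel "images" to 16-dimensional feature vectors.
W = rng.randn(100, 16)
def extract_features(images):
    return images.dot(W)

# Two made-up classes of "images" with different pixel statistics.
backpacks = rng.randn(20, 100) + 1.0
bikes = rng.randn(20, 100) - 1.0
X = extract_features(np.vstack([backpacks, bikes]))
y = np.array(['backpack'] * 20 + ['bike'] * 20)

# A minimal classifier on top of the extracted features: nearest class centroid.
centroids = {c: X[y == c].mean(axis=0) for c in ('backpack', 'bike')}
def classify(image):
    f = extract_features(image[None, :])[0]
    return min(centroids, key=lambda c: np.linalg.norm(f - centroids[c]))

print(classify(rng.randn(100) + 1.0))  # a new "backpack-like" image
```

The real workflow below is the same shape: `extract_features` comes from the pretrained network, and a logistic classifier replaces the nearest-centroid rule.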
In [7]:
# may take a while; skip if the SFrame already contains this column
products_train['extracted_features'] = pretrained_model.extract_features(products_train)
In [8]:
# may take a while; skip if the SFrame already contains this column
products_test['extracted_features'] = pretrained_model.extract_features(products_test)
In [9]:
transfer_model = graphlab.logistic_classifier.create(products_train, features=['extracted_features'], target='label', validation_set=products_test)
Validation Accuracy appears to be quite good.
Let us try searching for visually similar products.
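The query the nearest-neighbors model performs can be sketched in plain numpy: compute the distance from the query's feature vector to every reference vector and keep the closest rows. The vectors below are toy stand-ins, not real extracted features.

```python
import numpy as np

# Toy stand-ins for extracted feature vectors (one row per reference item).
reference = np.array([[0.0, 0.0],
                      [1.0, 1.0],
                      [5.0, 5.0]])
reference_labels = [0, 1, 2]  # row numbers, like the SFrame's 'id' column

def query_nearest(q, k=2):
    """Return the k reference row ids closest to query vector q."""
    dists = np.linalg.norm(reference - q, axis=1)  # Euclidean distance to each row
    order = np.argsort(dists)[:k]
    return [reference_labels[i] for i in order]

print(query_nearest(np.array([0.9, 1.2])))
```

Visually similar products are simply the rows whose extracted feature vectors sit closest to the query's.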
In [10]:
products_all = products_test.append(products_train)
products_all = products_all.add_row_number()
In [11]:
nearest_neighbors_model = graphlab.nearest_neighbors.create(products_all, features=['extracted_features'])
In [12]:
query = products_all[0:1]
query['image'].show()
In [13]:
query_results = nearest_neighbors_model.query(query)
In [14]:
query_results
Out[14]:
In [15]:
filtered_results = products_all.filter_by(query_results['reference_label'], 'id')
In [16]:
filtered_results['image'].show()
If you want a network that is more custom-tailored to your task, you can use a pre-defined network architecture (one that has worked on other tasks) on your own data. Let's try it here.
In [17]:
network = graphlab.deeplearning.create(products_train, target='label')
In [18]:
network
Out[18]:
In [19]:
# resize to the 224x224x3 input shape the network expects
# (the training images are assumed to already be this size)
products_test['image'] = graphlab.image_analysis.resize(products_test['image'], 224, 224, 3)
In [20]:
neural_net_model = graphlab.neuralnet_classifier.create(products_train, network=network, features=['image'], target='label', validation_set=products_test, max_iterations=3)
You may want to customize the architecture as well. For instance, you may want more hidden units in a fully connected layer. Typically, one modifies an existing network architecture instead of building one completely from scratch.
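Conceptually, the network is just a list of layer descriptions, and customizing it means editing that list in place. The strings below are toy stand-ins for the real layer objects, mirroring the `FullConnectionLayer(200)` swap performed below.

```python
# A toy stand-in for a network's ordered layer list.
layers = ['conv(64)', 'pool', 'conv(128)', 'full(100)', 'softmax']

# Swap the fully connected layer for a wider one, keeping the rest intact.
layers[3] = 'full(200)'
print(layers)
```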
In [21]:
network
Out[21]:
Let's add some hidden units to the network.
In [22]:
network.layers[3] = graphlab.deeplearning.layers.FullConnectionLayer(200)
In [23]:
network
Out[23]:
In [24]:
neural_net_model = graphlab.neuralnet_classifier.create(products_train, network=network, features=['image'], target='label', validation_set=products_test, max_iterations=3)
Sometimes there are many objects in an image, and it may be important to identify each separately. Or it may be important to locate a particular object within an image. This is called Object Detection.
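A detector returns, conceptually, one record per detected object: a class label, a confidence score, and a bounding box. The records below are made up for illustration, and the filtering step mirrors the `filter_by(['backpack'], 'class')` call used later.

```python
# Toy detections, one dict per detected object.
detections = [
    {'class': 'backpack', 'confidence': 0.92, 'box': [40, 30, 120, 160]},
    {'class': 'person',   'confidence': 0.88, 'box': [10, 5, 200, 300]},
    {'class': 'backpack', 'confidence': 0.35, 'box': [300, 40, 340, 90]},
]

def filter_detections(dets, cls, min_confidence=0.5):
    """Keep detections of one class above a confidence threshold."""
    return [d for d in dets
            if d['class'] == cls and d['confidence'] >= min_confidence]

backpacks = filter_detections(detections, 'backpack')
print(len(backpacks))  # 1: the low-confidence backpack is dropped
```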
In [25]:
detector_query = graphlab.image_analysis.load_images('detection_query')
In [26]:
detector_query['image'][0].show()
In [27]:
detector = graphlab.mxnet.pretrained_model.load_path('mxnet_models/coco_vgg_16/')
In [28]:
detections = detector.detect(detector_query['image'][0])
In [29]:
detections
Out[29]:
In [30]:
backpack_detections = detections.filter_by(['backpack'], 'class')
In [31]:
backpack_detections
Out[31]:
In [32]:
visualize = detector.visualize_detection(detector_query['image'][0], backpack_detections)
In [33]:
visualize.show()
Now, let us take the backpack identified in the image, crop it out, and find the most similar images in our products SFrame.
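On a plain numpy image array, the crop is just slicing; the helper below for `graphlab.Image` does the same thing plus a round-trip through PIL. The box follows the `[x1, y1, x2, y2]` (left, upper, right, lower) convention that PIL's `crop` expects.

```python
import numpy as np

def crop_array(img, box):
    """Crop a (height, width, channels) image array to box = [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = (int(c) for c in box)
    return img[y1:y2, x1:x2]

img = np.arange(6 * 8 * 3).reshape(6, 8, 3)  # toy 6x8 RGB image
patch = crop_array(img, [2, 1, 5, 4])
print(patch.shape)  # (3, 3, 3): rows y1..y2, columns x1..x2
```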
In [34]:
def crop(gl_img, box):
    """Crop a graphlab.Image to box = [x1, y1, x2, y2] and return a new graphlab.Image."""
    _format = {'JPG': 0, 'PNG': 1, 'RAW': 2, 'UNDEFINED': 3}
    pil_img = gl_img._to_pil_image()
    cropped = pil_img.crop([int(c) for c in box])
    width, height = cropped.size
    if cropped.mode == 'L':  # grayscale: one byte per pixel
        image_data = bytearray([z for z in cropped.getdata()])
        channels = 1
    elif cropped.mode == 'RGB':  # three bytes per pixel
        image_data = bytearray([z for l in cropped.getdata() for z in l])
        channels = 3
    else:  # assume RGBA: four bytes per pixel
        image_data = bytearray([z for l in cropped.getdata() for z in l])
        channels = 4
    format_enum = _format['RAW']
    image_data_size = len(image_data)
    img = graphlab.Image(_image_data=image_data, _width=width, _height=height,
                         _channels=channels, _format_enum=format_enum,
                         _image_data_size=image_data_size)
    return img
In [35]:
cropped = crop(detector_query['image'][0], backpack_detections['box'][0])
In [36]:
cropped.show()
In [37]:
query_sf = graphlab.SFrame({'image' : [cropped]})
In [38]:
query_sf['image'].show()
In [39]:
query_sf['extracted_features'] = pretrained_model.extract_features(query_sf)
In [40]:
query_sf
Out[40]:
In [41]:
products_all
Out[41]:
In [42]:
similar_backpacks = nearest_neighbors_model.query(query_sf)
In [43]:
similar_backpacks
Out[43]:
In [44]:
filtered_similar_backpacks = products_all.filter_by(similar_backpacks['reference_label'], 'id')
In [45]:
filtered_similar_backpacks['image'].show()