In [1]:
import graphlab

Load the CIFAR-10 training subset (2005 images, 4 labels)


In [2]:
image_train = graphlab.SFrame('image_train_data_2/')



[INFO] Start server at: ipc:///tmp/graphlab_server-7626 - Server binary: /usr/local/lib/python2.7/dist-packages/graphlab/unity_server - Server log: /tmp/graphlab_server_1449378909.log
[INFO] GraphLab Server Version: 1.6.1

In [3]:
image_train.head()


Out[3]:
id   image                 label       deep_features                                          image_array
24   Height: 32 Width: 32  bird        [0.242871761322, 1.09545373917, 0.0, ...               [73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ...
33   Height: 32 Width: 32  cat         [0.525087952614, 0.0, 0.0, 0.0, 0.0, 0.0, ...          [7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ...
36   Height: 32 Width: 32  cat         [0.566015958786, 0.0, 0.0, 0.0, 0.0, 0.0, ...          [169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ...
70   Height: 32 Width: 32  dog         [1.12979578972, 0.0, 0.0, 0.778194487095, 0.0, ...     [154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ...
90   Height: 32 Width: 32  bird        [1.71786928177, 0.0, 0.0, 0.0, 0.0, 0.0, ...           [216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ...
97   Height: 32 Width: 32  automobile  [1.57818555832, 0.0, 0.0, 0.0, 0.0, 0.0, ...           [33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ...
107  Height: 32 Width: 32  dog         [0.0, 0.0, 0.220677852631, 0.0, ...                    [97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ...
121  Height: 32 Width: 32  bird        [0.0, 0.23753464222, 0.0, 0.0, 0.0, 0.0, ...           [93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ...
136  Height: 32 Width: 32  automobile  [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.5737862587, 0.0, ...  [35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ...
138  Height: 32 Width: 32  bird        [0.658935725689, 0.0, 0.0, 0.0, 0.0, 0.0, ...          [205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ...
[10 rows x 5 columns]

Creating nearest neighbors model for retrieving images using deep features


In [4]:
knn_model = graphlab.nearest_neighbors.create(image_train,
                                             features=['deep_features'],
                                             label='id')


PROGRESS: Starting brute force nearest neighbors model training.
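
The training message above reports a brute-force search. As a rough illustration of that idea, here is a minimal sketch in plain Python rather than GraphLab Create. It assumes Euclidean distance (the notebook does not state which distance the model selected), and the function names and toy feature vectors are hypothetical, not taken from the dataset.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_query(reference, query_vector, k=5):
    """Return the k closest reference rows as (label, distance, rank) tuples.

    `reference` is a list of (label, feature_vector) pairs. Every row is
    compared against the query, which is what makes the search brute force.
    """
    scored = sorted((euclidean(vec, query_vector), label) for label, vec in reference)
    return [(label, dist, rank)
            for rank, (dist, label) in enumerate(scored[:k], start=1)]

# Toy data: hypothetical ids with 3-dimensional features (not CIFAR-10 values).
toy_reference = [
    (24, [0.0, 1.0, 0.0]),
    (33, [3.0, 4.0, 0.0]),
    (36, [0.0, 0.0, 0.0]),
    (70, [1.0, 1.0, 1.0]),
]
neighbors = brute_force_query(toy_reference, [0.0, 0.0, 0.0], k=3)
print(neighbors)
```

In this sketch, a query vector that also appears in the reference set comes back first with distance 0.0, the same pattern a query drawn from the training table would be expected to show.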

Using knn model to find similar images


In [5]:
cat = image_train[18:19]

In [6]:
graphlab.canvas.set_target('ipynb')

In [7]:
cat['image'].show()



In [8]:
knn_model.query(cat)


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 10.652ms     |
PROGRESS: | Done         |         | 100         | 186.896ms    |
PROGRESS: +--------------+---------+-------------+--------------+
Out[8]:
query_label reference_label distance rank
0 384 0.0 1
0 6910 36.9403137951 2
0 39777 38.4634888975 3
0 36870 39.7559623119 4
0 41734 39.7866014148 5
[5 rows x 4 columns]


In [10]:
def get_images_from_ids(query_result):
    # Keep only the training rows whose 'id' appears among the query's reference labels.
    return image_train.filter_by(query_result['reference_label'], 'id')
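
The helper above maps the reference labels returned by a query back to rows of the training table. A minimal plain-Python sketch of that membership filter follows. The function name, the `row` field, and the id 500 are hypothetical; only the three ids 384, 6910, and 39777 are copied from Out[8].

```python
def filter_rows_by_ids(rows, ids, key="id"):
    """Keep the rows whose `key` value appears in `ids` (membership filter)."""
    wanted = set(ids)
    return [row for row in rows if row[key] in wanted]

# Toy stand-in for the training table; the `row` field is made up.
toy_rows = [
    {"id": 384, "row": 0},
    {"id": 500, "row": 1},
    {"id": 6910, "row": 2},
    {"id": 39777, "row": 3},
]
# Reference labels listed in a different order than the table.
reference_labels = [39777, 384, 6910]
matched = filter_rows_by_ids(toy_rows, reference_labels)
print([row["id"] for row in matched])
```

Note that this sketch returns rows in table order, not in the order of the id list. Whether `filter_by` keeps the rank order of the query result is not something this notebook shows, so it may be worth checking before relying on it.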

In [11]:
cat_neighbors = get_images_from_ids(knn_model.query(cat))


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 8.325ms      |
PROGRESS: | Done         |         | 100         | 187.995ms    |
PROGRESS: +--------------+---------+-------------+--------------+

In [12]:
cat_neighbors['image'].show()



In [13]:
car = image_train[8:9]

In [14]:
car['image'].show()



In [15]:
get_images_from_ids(knn_model.query(car))['image'].show()


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 14.863ms     |
PROGRESS: | Done         |         | 100         | 185.29ms     |
PROGRESS: +--------------+---------+-------------+--------------+

Create a lambda to find and show the nearest-neighbor images


In [16]:
show_neighbors = lambda i: get_images_from_ids(knn_model.query(image_train[i:i+1]))['image'].show()

In [17]:
show_neighbors(8)


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 9.391ms      |
PROGRESS: | Done         |         | 100         | 182.124ms    |
PROGRESS: +--------------+---------+-------------+--------------+

In [18]:
show_neighbors(26)


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 11.755ms     |
PROGRESS: | Done         |         | 100         | 196.706ms    |
PROGRESS: +--------------+---------+-------------+--------------+

In [19]:
show_neighbors(122)


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 9.103ms      |
PROGRESS: | Done         |         | 100         | 190.016ms    |
PROGRESS: +--------------+---------+-------------+--------------+

In [20]:
show_neighbors(1222)


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 8.623ms      |
PROGRESS: | Done         |         | 100         | 181.369ms    |
PROGRESS: +--------------+---------+-------------+--------------+

In [21]:
show_neighbors(2000)


PROGRESS: Starting pairwise querying.
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | Query points | # Pairs | % Complete. | Elapsed Time |
PROGRESS: +--------------+---------+-------------+--------------+
PROGRESS: | 0            | 1       | 0.0498753   | 10.369ms     |
PROGRESS: | Done         |         | 100         | 183.209ms    |
PROGRESS: +--------------+---------+-------------+--------------+

In [23]:
image_train['label'].sketch_summary()


Out[23]:
+------------------+-------+----------+
|       item       | value | is exact |
+------------------+-------+----------+
|      Length      |  2005 |   Yes    |
| # Missing Values |   0   |   Yes    |
| # unique values  |   4   |    No    |
+------------------+-------+----------+

Most frequent items:
+-------+------------+-----+-----+------+
| value | automobile | cat | dog | bird |
+-------+------------+-----+-----+------+
| count |    509     | 509 | 509 | 478  |
+-------+------------+-----+-----+------+

In [30]:
dog_sframe = image_train[image_train['label'] == 'dog']

In [31]:
dog_sframe.show()



In [32]:
len(dog_sframe)


Out[32]:
509

In [33]:
cat_sframe = image_train[image_train['label'] == 'cat']

In [34]:
bird_sframe = image_train[image_train['label'] == 'bird']

In [35]:
automobile_sframe = image_train[image_train['label'] == 'automobile']

In [38]:
len(cat_sframe), len(bird_sframe), len(automobile_sframe), len(dog_sframe)


Out[38]:
(509, 478, 509, 509)
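
The cells above build one table per label with a boolean mask and then compare the table sizes. The same bookkeeping can be sketched in plain Python. The rows below are the first six id/label pairs from Out[3]; the helper name `split_by_label` is hypothetical.

```python
from collections import Counter, defaultdict

def split_by_label(rows, key="label"):
    """Group rows into one list per distinct label value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)

# First six id/label pairs shown in Out[3].
toy_rows = [
    {"id": 24, "label": "bird"},
    {"id": 33, "label": "cat"},
    {"id": 36, "label": "cat"},
    {"id": 70, "label": "dog"},
    {"id": 90, "label": "bird"},
    {"id": 97, "label": "automobile"},
]

label_counts = Counter(row["label"] for row in toy_rows)
groups = split_by_label(toy_rows)
print(label_counts.most_common())
print({label: len(rows) for label, rows in groups.items()})
```

A quick sanity check in either form is that the per-label sizes add up to the table length, as 509 + 478 + 509 + 509 = 2005 does for the training table here.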

In [39]:
cat_sframe.show()



In [40]:
dog_sframe.show()



In [41]:
bird_sframe.show()



In [42]:
automobile_sframe.show()