In [1]:
import graphlab as gl

Load a common image analysis dataset


In [2]:
image_train = gl.SFrame('image_train_data/')
image_test = gl.SFrame('image_test_data/')


[INFO] This non-commercial license of GraphLab Create is assigned to iliassweb@gmail.com and will expire on September 22, 2016. For commercial licensing options, visit https://dato.com/buy/.

[INFO] Start server at: ipc:///tmp/graphlab_server-4558 - Server binary: /home/zax/anaconda/lib/python2.7/site-packages/graphlab/unity_server - Server log: /tmp/graphlab_server_1446351785.log
[INFO] GraphLab Server Version: 1.6.1

Exploring the image data


In [3]:
gl.canvas.set_target('ipynb')

In [5]:
image_train['image'].show()


Train a classifier on the raw image pixels using logistic regression


In [7]:
raw_pixel_model = gl.logistic_classifier.create(image_train,
                                                target='label',
                                                features=['image_array'])


PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 1900
PROGRESS: Number of classes           : 4
PROGRESS: Number of feature columns   : 1
PROGRESS: Number of unpacked features : 3072
PROGRESS: Number of coefficients    : 9219
PROGRESS: Starting L-BFGS
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 6        | 0.000017  | 2.374218     | 0.357368          | 0.352381            |
PROGRESS: | 2         | 8        | 1.000000  | 3.355262     | 0.384737          | 0.438095            |
PROGRESS: | 3         | 9        | 1.000000  | 3.957353     | 0.430526          | 0.466667            |
PROGRESS: | 4         | 10       | 1.000000  | 4.560749     | 0.436842          | 0.523810            |
PROGRESS: | 5         | 11       | 1.000000  | 5.135515     | 0.445263          | 0.504762            |
PROGRESS: | 6         | 12       | 1.000000  | 5.740434     | 0.463684          | 0.533333            |
PROGRESS: | 10        | 16       | 1.000000  | 8.168517     | 0.505263          | 0.590476            |
PROGRESS: +-----------+----------+-----------+--------------+-------------------+---------------------+
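
The counts in the training log are consistent with one another if we assume the usual multinomial logistic regression setup, where one class serves as the reference and every other class gets one weight per feature plus an intercept. That is an inference from the numbers, not something the log states. The 3 colour channels are also an assumption (3072 / (32 * 32) = 3). A small check:

```python
# Values copied from the training log above.
num_classes = 4
num_unpacked_features = 3072

# Assumption: each 32x32 image is stored with 3 colour channels.
height, width, channels = 32, 32, 3
assert height * width * channels == num_unpacked_features

# Assumption: one class is the reference, so (classes - 1) weight vectors
# are fit, each with one weight per pixel value plus one intercept.
num_coefficients = (num_classes - 1) * (num_unpacked_features + 1)
print(num_coefficients)  # 9219, matching "Number of coefficients" in the log
```

With 9219 coefficients and only 1900 training examples, the raw-pixel model has far more parameters than data points, which may help explain the modest accuracy in the table.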

Make a prediction with the simple model based on raw pixels


In [8]:
image_test[0:3]['image'].show()



In [10]:
image_test[0:3]['label']


Out[10]:
dtype: str
Rows: 3
['cat', 'automobile', 'cat']

In [11]:
raw_pixel_model.predict(image_test[0:3])


Out[11]:
dtype: str
Rows: 3
['bird', 'cat', 'bird']
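
To make the comparison explicit, the two outputs above can be lined up element by element. This is plain Python on the printed values, not a GraphLab call:

```python
# Labels and predictions copied from Out[10] and Out[11].
true_labels = ['cat', 'automobile', 'cat']
predicted = ['bird', 'cat', 'bird']

# Compare each prediction with its true label.
matches = [t == p for t, p in zip(true_labels, predicted)]
num_correct = sum(matches)

print(matches)      # [False, False, False]
print(num_correct)  # 0
```

None of the three test images is classified correctly by the raw-pixel model.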

Evaluate the raw pixel model on test data


In [12]:
raw_pixel_model.evaluate(image_test)


Out[12]:
{'accuracy': 0.477, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     bird     |       dog       |  173  |
 |     dog      |       cat       |  238  |
 |     cat      |       cat       |  334  |
 |     bird     |    automobile   |  123  |
 |  automobile  |    automobile   |  616  |
 |     dog      |    automobile   |   97  |
 |     dog      |       dog       |  406  |
 |     cat      |       dog       |  287  |
 |  automobile  |       dog       |  105  |
 |  automobile  |       bird      |  121  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}
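
Only 10 of the 16 confusion-matrix rows are printed, so the 0.477 accuracy cannot be recomputed from the visible table. The sketch below shows how accuracy relates to a full confusion matrix. The helper name `accuracy_from_confusion` is hypothetical and the counts are made-up toy values, not results from this notebook:

```python
def accuracy_from_confusion(rows):
    """Accuracy = correctly classified count / total count.

    `rows` is a list of (target_label, predicted_label, count) tuples,
    mirroring the three columns of the confusion matrix above.
    """
    total = sum(count for _, _, count in rows)
    correct = sum(count for target, predicted, count in rows
                  if target == predicted)
    return float(correct) / total


# Toy counts for illustration only (NOT the notebook's data).
toy_rows = [
    ('cat', 'cat', 3),
    ('cat', 'dog', 1),
    ('dog', 'dog', 4),
    ('dog', 'cat', 2),
]
print(accuracy_from_confusion(toy_rows))  # 0.7
```

The rows where target_label equals predicted_label are the correct predictions; every other row is an error.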

Can we improve the model using deep features?


In [14]:
len(image_train)


Out[14]:
2005

In [19]:
deep_learning_model = gl.load_model('imagenet_model')
image_train['deep_features'] = deep_learning_model.extract_features(image_train)


---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-19-31442fbefb7b> in <module>()
----> 1 deep_learning_model = gl.load_model('imagenet_model')
      2 image_train['deep_features'] = deep_learning_model.extract_features(image_train)

/home/zax/anaconda/lib/python2.7/site-packages/graphlab/toolkits/_model.pyc in load_model(location)
     61         else:
     62             # Not a ToolkitError so try unpickling the model.
---> 63             unpickler = gl_pickle.GLUnpickler(location)
     64 
     65             # Get the version

/home/zax/anaconda/lib/python2.7/site-packages/graphlab/_gl_pickle.pyc in __init__(self, filename)
    450         else:
    451             if not _os.path.exists(filename):
--> 452                 raise IOError('%s is not a valid file name.' % filename)
    453 
    454         # GLC 1.3 Pickle file

IOError: imagenet_model is not a valid file name.
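
The traceback shows that gl.load_model fails because nothing named imagenet_model exists in the working directory. The image_train.head() output (Out[18]) lists a deep_features column, which suggests the features are already stored with the data and the extraction step can be skipped. A minimal sketch of a guard, assuming that reading is right; the helper name `ensure_deep_features` is hypothetical, and the loader is passed in so the helper itself does not import GraphLab:

```python
import os


def ensure_deep_features(frame, model_path, load_model):
    """Add a 'deep_features' column to `frame` unless it already exists.

    `load_model` is a callable such as gl.load_model (hypothetical usage).
    """
    # The data set may already ship with precomputed deep features.
    if 'deep_features' in frame.column_names():
        return frame

    # Fail early with a message that names the missing path.
    if not os.path.exists(model_path):
        raise IOError('No saved model at %r (working directory: %s)'
                      % (model_path, os.getcwd()))

    model = load_model(model_path)
    frame['deep_features'] = model.extract_features(frame)
    return frame
```

In this notebook it could be called as ensure_deep_features(image_train, 'imagenet_model', gl.load_model), which would return image_train unchanged if the column is already present.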

In [18]:
image_train.head()


Out[18]:
+-----+----------------------+------------+-------------------------------------------------------+--------------------------------------------------------+
| id  | image                | label      | deep_features                                         | image_array                                            |
+-----+----------------------+------------+-------------------------------------------------------+--------------------------------------------------------+
| 24  | Height: 32 Width: 32 | bird       | [0.242871761322, 1.09545373917, 0.0, ...              | [73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ...   |
| 33  | Height: 32 Width: 32 | cat        | [0.525087952614, 0.0, 0.0, 0.0, 0.0, 0.0, ...         | [7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ... |
| 36  | Height: 32 Width: 32 | cat        | [0.566015958786, 0.0, 0.0, 0.0, 0.0, 0.0, ...         | [169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ...           |
| 70  | Height: 32 Width: 32 | dog        | [1.12979578972, 0.0, 0.0, 0.778194487095, 0.0, ...    | [154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ...         |
| 90  | Height: 32 Width: 32 | bird       | [1.71786928177, 0.0, 0.0, 0.0, 0.0, 0.0, ...          | [216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ...         |
| 97  | Height: 32 Width: 32 | automobile | [1.57818555832, 0.0, 0.0, 0.0, 0.0, 0.0, ...          | [33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ...   |
| 107 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.220677852631, 0.0, ...                   | [97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ... |
| 121 | Height: 32 Width: 32 | bird       | [0.0, 0.23753464222, 0.0, 0.0, 0.0, 0.0, ...          | [93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ...      |
| 136 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.5737862587, 0.0, ... | [35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ...   |
| 138 | Height: 32 Width: 32 | bird       | [0.658935725689, 0.0, 0.0, 0.0, 0.0, 0.0, ...         | [205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ...         |
+-----+----------------------+------------+-------------------------------------------------------+--------------------------------------------------------+
[10 rows x 5 columns]


Given the deep features, let's train a classifier


In [ ]:
deep_features_model = gl.logistic_classifier.create(image_train,
                                                    features=['deep_features'],
                                                    target='label')