Session 4: Visualizing Representations

Creative Applications of Deep Learning with Google's TensorFlow. Parag K. Mital, Kadenze, Inc.

Learning Goals

  • Learn how to inspect deep networks by visualizing their gradients
  • Learn how to "deep dream" with different objective functions and regularization techniques
  • Learn how to "stylize" an image using content and style losses from different images

Introduction

So far, we've seen that a deep convolutional network can get very high accuracy classifying the MNIST dataset, a dataset of handwritten digits numbered 0 - 9. What happens when the number of classes grows beyond 10 possibilities? Or when the images get much larger? We're going to explore a few new datasets and bigger and better models to try and find out. We'll then explore a few interesting visualization techniques to help us understand what the networks are representing in their deeper layers, and how these techniques can be used for some very interesting creative applications.

Deep Convolutional Networks

Almost 30 years of computer vision and machine learning research has processed images much like what we saw at the end of Session 1: you take an image, convolve it with a set of edge detectors like the Gabor filter we created, and then threshold the result to find more interesting features such as corners, or look at histograms counting the edges at particular orientations within a window. In the previous session, we started to see how deep learning has allowed us to move away from hand-crafted features such as Gabor-like filters to letting the data discover its own representations. But how well does it scale?
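As a refresher, the kind of hand-crafted edge detector being described can be written in a few lines of NumPy. This is a minimal sketch only; the function name and parameter choices here are my own, not the session's code:

```python
import numpy as np

def gabor_kernel(ksize=32, theta=0.0, sigma=8.0, wavelength=16.0):
    """A 2D Gaussian envelope modulated by an oriented sinusoid."""
    ys, xs = np.mgrid[-ksize // 2:ksize // 2, -ksize // 2:ksize // 2]
    # Rotate the coordinate frame so the sinusoid runs along `theta`.
    x_theta = xs * np.cos(theta) + ys * np.sin(theta)
    envelope = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    carrier = np.sin(2.0 * np.pi * x_theta / wavelength)
    return envelope * carrier

# A bank of four orientations: a tiny hand-crafted "layer" of edge detectors.
kernels = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving an image with each kernel in such a bank gives one response map per orientation; a deep network's first convolutional layer learns filters that often end up looking remarkably similar, but from data rather than by hand.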

A seminal shift in the perceived capabilities of deep neural networks occurred in 2012. A network dubbed AlexNet, after its primary author, Alex Krizhevsky, achieved remarkable performance on one of the most difficult computer vision datasets at the time, ImageNet. <TODO: Insert montage of ImageNet>. ImageNet is the dataset used in a yearly challenge called the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), started in 2010. The dataset contains nearly 1.2 million images composed of 1000 different types of objects, with anywhere between 600 and 1200 images per object. <TODO: Histogram of object labels>

Up until now, the largest number of labels we've considered is 10! The images were also very small, only 28 x 28 pixels, and didn't even have color.

Let's look at a state-of-the-art network that has already been trained on ImageNet.

Loading a Pretrained Network

We can use an existing network by loading a trained model's weights into a network definition. The network definition specifies the set of operations in the TensorFlow graph: how the input image is manipulated and filtered in order to get from an image to a probability over which 1 of 1000 possible objects it depicts. Restoring the model's weights then fills in the values of every parameter in the network, learned through gradient descent. Luckily, many researchers release their model definitions and weights, so we don't have to train them ourselves: we just load them up and can use the model straight away. That's very lucky for us, because these models take a lot of time, CPU, memory, and money to train.

To get the files required for these models, you'll need to download them from the resources page.

First, let's import some necessary libraries.


In [1]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import IPython.display as ipyd
from libs import gif, nb_utils

In [2]:
# Bit of formatting because I don't like the default inline code style:
from IPython.core.display import HTML
HTML("""<style> .rendered_html code { 
    padding: 2px 4px;
    color: #c7254e;
    background-color: #f9f2f4;
    border-radius: 4px;
} </style>""")


Out[2]:

Start an interactive session:


In [3]:
sess = tf.InteractiveSession()

Now we'll load Google's Inception model, a pretrained classification network built using the ImageNet database. I've included some helper functions for getting this model loaded and set up with TensorFlow.


In [4]:
from libs import inception
net = inception.get_inception_model()

Here's a little extra that wasn't in the lecture. We can visualize the graph definition using the nb_utils module's show_graph function. This function is taken from an example in the TensorFlow repo, so I can't take credit for it! It uses TensorBoard, TensorFlow's web interface for visualizing graphs and training performance, which we sadly didn't have enough time to discuss.


In [5]:
nb_utils.show_graph(net['graph_def'])


We'll now get the graph definition from the storage container and tell TensorFlow to use it as its own graph. This will add all the computations we need to compute the entire deep net, as well as all of its pre-trained parameters.


In [6]:
tf.import_graph_def(net['graph_def'], name='inception')

In [7]:
net['labels']


Out[7]:
[(0, 'dummy'),
 (1, 'kit fox'),
 (2, 'English setter'),
 (3, 'Siberian husky'),
 (4, 'Australian terrier'),
 (5, 'English springer'),
 (6, 'grey whale'),
 (7, 'lesser panda'),
 (8, 'Egyptian cat'),
 (9, 'ibex'),
 (10, 'Persian cat'),
 (11, 'cougar'),
 (12, 'gazelle'),
 (13, 'porcupine'),
 (14, 'sea lion'),
 (15, 'malamute'),
 (16, 'badger'),
 (17, 'Great Dane'),
 (18, 'Walker hound'),
 (19, 'Welsh springer spaniel'),
 (20, 'whippet'),
 (21, 'Scottish deerhound'),
 (22, 'killer whale'),
 (23, 'mink'),
 (24, 'African elephant'),
 (25, 'Weimaraner'),
 (26, 'soft-coated wheaten terrier'),
 (27, 'Dandie Dinmont'),
 (28, 'red wolf'),
 (29, 'Old English sheepdog'),
 (30, 'jaguar'),
 (31, 'otterhound'),
 (32, 'bloodhound'),
 (33, 'Airedale'),
 (34, 'hyena'),
 (35, 'meerkat'),
 (36, 'giant schnauzer'),
 (37, 'titi'),
 (38, 'three-toed sloth'),
 (39, 'sorrel'),
 (40, 'black-footed ferret'),
 (41, 'dalmatian'),
 (42, 'black-and-tan coonhound'),
 (43, 'papillon'),
 (44, 'skunk'),
 (45, 'Staffordshire bullterrier'),
 (46, 'Mexican hairless'),
 (47, 'Bouvier des Flandres'),
 (48, 'weasel'),
 (49, 'miniature poodle'),
 (50, 'Cardigan'),
 (51, 'malinois'),
 (52, 'bighorn'),
 (53, 'fox squirrel'),
 (54, 'colobus'),
 (55, 'tiger cat'),
 (56, 'Lhasa'),
 (57, 'impala'),
 (58, 'coyote'),
 (59, 'Yorkshire terrier'),
 (60, 'Newfoundland'),
 (61, 'brown bear'),
 (62, 'red fox'),
 (63, 'Norwegian elkhound'),
 (64, 'Rottweiler'),
 (65, 'hartebeest'),
 (66, 'Saluki'),
 (67, 'grey fox'),
 (68, 'schipperke'),
 (69, 'Pekinese'),
 (70, 'Brabancon griffon'),
 (71, 'West Highland white terrier'),
 (72, 'Sealyham terrier'),
 (73, 'guenon'),
 (74, 'mongoose'),
 (75, 'indri'),
 (76, 'tiger'),
 (77, 'Irish wolfhound'),
 (78, 'wild boar'),
 (79, 'EntleBucher'),
 (80, 'zebra'),
 (81, 'ram'),
 (82, 'French bulldog'),
 (83, 'orangutan'),
 (84, 'basenji'),
 (85, 'leopard'),
 (86, 'Bernese mountain dog'),
 (87, 'Maltese dog'),
 (88, 'Norfolk terrier'),
 (89, 'toy terrier'),
 (90, 'vizsla'),
 (91, 'cairn'),
 (92, 'squirrel monkey'),
 (93, 'groenendael'),
 (94, 'clumber'),
 (95, 'Siamese cat'),
 (96, 'chimpanzee'),
 (97, 'komondor'),
 (98, 'Afghan hound'),
 (99, 'Japanese spaniel'),
 (100, 'proboscis monkey'),
 (101, 'guinea pig'),
 (102, 'white wolf'),
 (103, 'ice bear'),
 (104, 'gorilla'),
 (105, 'borzoi'),
 (106, 'toy poodle'),
 (107, 'Kerry blue terrier'),
 (108, 'ox'),
 (109, 'Scotch terrier'),
 (110, 'Tibetan mastiff'),
 (111, 'spider monkey'),
 (112, 'Doberman'),
 (113, 'Boston bull'),
 (114, 'Greater Swiss Mountain dog'),
 (115, 'Appenzeller'),
 (116, 'Shih-Tzu'),
 (117, 'Irish water spaniel'),
 (118, 'Pomeranian'),
 (119, 'Bedlington terrier'),
 (120, 'warthog'),
 (121, 'Arabian camel'),
 (122, 'siamang'),
 (123, 'miniature schnauzer'),
 (124, 'collie'),
 (125, 'golden retriever'),
 (126, 'Irish terrier'),
 (127, 'affenpinscher'),
 (128, 'Border collie'),
 (129, 'hare'),
 (130, 'boxer'),
 (131, 'silky terrier'),
 (132, 'beagle'),
 (133, 'Leonberg'),
 (134, 'German short-haired pointer'),
 (135, 'patas'),
 (136, 'dhole'),
 (137, 'baboon'),
 (138, 'macaque'),
 (139, 'Chesapeake Bay retriever'),
 (140, 'bull mastiff'),
 (141, 'kuvasz'),
 (142, 'capuchin'),
 (143, 'pug'),
 (144, 'curly-coated retriever'),
 (145, 'Norwich terrier'),
 (146, 'flat-coated retriever'),
 (147, 'hog'),
 (148, 'keeshond'),
 (149, 'Eskimo dog'),
 (150, 'Brittany spaniel'),
 (151, 'standard poodle'),
 (152, 'Lakeland terrier'),
 (153, 'snow leopard'),
 (154, 'Gordon setter'),
 (155, 'dingo'),
 (156, 'standard schnauzer'),
 (157, 'hamster'),
 (158, 'Tibetan terrier'),
 (159, 'Arctic fox'),
 (160, 'wire-haired fox terrier'),
 (161, 'basset'),
 (162, 'water buffalo'),
 (163, 'American black bear'),
 (164, 'Angora'),
 (165, 'bison'),
 (166, 'howler monkey'),
 (167, 'hippopotamus'),
 (168, 'chow'),
 (169, 'giant panda'),
 (170, 'American Staffordshire terrier'),
 (171, 'Shetland sheepdog'),
 (172, 'Great Pyrenees'),
 (173, 'Chihuahua'),
 (174, 'tabby'),
 (175, 'marmoset'),
 (176, 'Labrador retriever'),
 (177, 'Saint Bernard'),
 (178, 'armadillo'),
 (179, 'Samoyed'),
 (180, 'bluetick'),
 (181, 'redbone'),
 (182, 'polecat'),
 (183, 'marmot'),
 (184, 'kelpie'),
 (185, 'gibbon'),
 (186, 'llama'),
 (187, 'miniature pinscher'),
 (188, 'wood rabbit'),
 (189, 'Italian greyhound'),
 (190, 'lion'),
 (191, 'cocker spaniel'),
 (192, 'Irish setter'),
 (193, 'dugong'),
 (194, 'Indian elephant'),
 (195, 'beaver'),
 (196, 'Sussex spaniel'),
 (197, 'Pembroke'),
 (198, 'Blenheim spaniel'),
 (199, 'Madagascar cat'),
 (200, 'Rhodesian ridgeback'),
 (201, 'lynx'),
 (202, 'African hunting dog'),
 (203, 'langur'),
 (204, 'Ibizan hound'),
 (205, 'timber wolf'),
 (206, 'cheetah'),
 (207, 'English foxhound'),
 (208, 'briard'),
 (209, 'sloth bear'),
 (210, 'Border terrier'),
 (211, 'German shepherd'),
 (212, 'otter'),
 (213, 'koala'),
 (214, 'tusker'),
 (215, 'echidna'),
 (216, 'wallaby'),
 (217, 'platypus'),
 (218, 'wombat'),
 (219, 'revolver'),
 (220, 'umbrella'),
 (221, 'schooner'),
 (222, 'soccer ball'),
 (223, 'accordion'),
 (224, 'ant'),
 (225, 'starfish'),
 (226, 'chambered nautilus'),
 (227, 'grand piano'),
 (228, 'laptop'),
 (229, 'strawberry'),
 (230, 'airliner'),
 (231, 'warplane'),
 (232, 'airship'),
 (233, 'balloon'),
 (234, 'space shuttle'),
 (235, 'fireboat'),
 (236, 'gondola'),
 (237, 'speedboat'),
 (238, 'lifeboat'),
 (239, 'canoe'),
 (240, 'yawl'),
 (241, 'catamaran'),
 (242, 'trimaran'),
 (243, 'container ship'),
 (244, 'liner'),
 (245, 'pirate'),
 (246, 'aircraft carrier'),
 (247, 'submarine'),
 (248, 'wreck'),
 (249, 'half track'),
 (250, 'tank'),
 (251, 'missile'),
 (252, 'bobsled'),
 (253, 'dogsled'),
 (254, 'bicycle-built-for-two'),
 (255, 'mountain bike'),
 (256, 'freight car'),
 (257, 'passenger car'),
 (258, 'barrow'),
 (259, 'shopping cart'),
 (260, 'motor scooter'),
 (261, 'forklift'),
 (262, 'electric locomotive'),
 (263, 'steam locomotive'),
 (264, 'amphibian'),
 (265, 'ambulance'),
 (266, 'beach wagon'),
 (267, 'cab'),
 (268, 'convertible'),
 (269, 'jeep'),
 (270, 'limousine'),
 (271, 'minivan'),
 (272, 'Model T'),
 (273, 'racer'),
 (274, 'sports car'),
 (275, 'go-kart'),
 (276, 'golfcart'),
 (277, 'moped'),
 (278, 'snowplow'),
 (279, 'fire engine'),
 (280, 'garbage truck'),
 (281, 'pickup'),
 (282, 'tow truck'),
 (283, 'trailer truck'),
 (284, 'moving van'),
 (285, 'police van'),
 (286, 'recreational vehicle'),
 (287, 'streetcar'),
 (288, 'snowmobile'),
 (289, 'tractor'),
 (290, 'mobile home'),
 (291, 'tricycle'),
 (292, 'unicycle'),
 (293, 'horse cart'),
 (294, 'jinrikisha'),
 (295, 'oxcart'),
 (296, 'bassinet'),
 (297, 'cradle'),
 (298, 'crib'),
 (299, 'four-poster'),
 (300, 'bookcase'),
 (301, 'china cabinet'),
 (302, 'medicine chest'),
 (303, 'chiffonier'),
 (304, 'table lamp'),
 (305, 'file'),
 (306, 'park bench'),
 (307, 'barber chair'),
 (308, 'throne'),
 (309, 'folding chair'),
 (310, 'rocking chair'),
 (311, 'studio couch'),
 (312, 'toilet seat'),
 (313, 'desk'),
 (314, 'pool table'),
 (315, 'dining table'),
 (316, 'entertainment center'),
 (317, 'wardrobe'),
 (318, 'Granny Smith'),
 (319, 'orange'),
 (320, 'lemon'),
 (321, 'fig'),
 (322, 'pineapple'),
 (323, 'banana'),
 (324, 'jackfruit'),
 (325, 'custard apple'),
 (326, 'pomegranate'),
 (327, 'acorn'),
 (328, 'hip'),
 (329, 'ear'),
 (330, 'rapeseed'),
 (331, 'corn'),
 (332, 'buckeye'),
 (333, 'organ'),
 (334, 'upright'),
 (335, 'chime'),
 (336, 'drum'),
 (337, 'gong'),
 (338, 'maraca'),
 (339, 'marimba'),
 (340, 'steel drum'),
 (341, 'banjo'),
 (342, 'cello'),
 (343, 'violin'),
 (344, 'harp'),
 (345, 'acoustic guitar'),
 (346, 'electric guitar'),
 (347, 'cornet'),
 (348, 'French horn'),
 (349, 'trombone'),
 (350, 'harmonica'),
 (351, 'ocarina'),
 (352, 'panpipe'),
 (353, 'bassoon'),
 (354, 'oboe'),
 (355, 'sax'),
 (356, 'flute'),
 (357, 'daisy'),
 (358, "yellow lady's slipper"),
 (359, 'cliff'),
 (360, 'valley'),
 (361, 'alp'),
 (362, 'volcano'),
 (363, 'promontory'),
 (364, 'sandbar'),
 (365, 'coral reef'),
 (366, 'lakeside'),
 (367, 'seashore'),
 (368, 'geyser'),
 (369, 'hatchet'),
 (370, 'cleaver'),
 (371, 'letter opener'),
 (372, 'plane'),
 (373, 'power drill'),
 (374, 'lawn mower'),
 (375, 'hammer'),
 (376, 'corkscrew'),
 (377, 'can opener'),
 (378, 'plunger'),
 (379, 'screwdriver'),
 (380, 'shovel'),
 (381, 'plow'),
 (382, 'chain saw'),
 (383, 'cock'),
 (384, 'hen'),
 (385, 'ostrich'),
 (386, 'brambling'),
 (387, 'goldfinch'),
 (388, 'house finch'),
 (389, 'junco'),
 (390, 'indigo bunting'),
 (391, 'robin'),
 (392, 'bulbul'),
 (393, 'jay'),
 (394, 'magpie'),
 (395, 'chickadee'),
 (396, 'water ouzel'),
 (397, 'kite'),
 (398, 'bald eagle'),
 (399, 'vulture'),
 (400, 'great grey owl'),
 (401, 'black grouse'),
 (402, 'ptarmigan'),
 (403, 'ruffed grouse'),
 (404, 'prairie chicken'),
 (405, 'peacock'),
 (406, 'quail'),
 (407, 'partridge'),
 (408, 'African grey'),
 (409, 'macaw'),
 (410, 'sulphur-crested cockatoo'),
 (411, 'lorikeet'),
 (412, 'coucal'),
 (413, 'bee eater'),
 (414, 'hornbill'),
 (415, 'hummingbird'),
 (416, 'jacamar'),
 (417, 'toucan'),
 (418, 'drake'),
 (419, 'red-breasted merganser'),
 (420, 'goose'),
 (421, 'black swan'),
 (422, 'white stork'),
 (423, 'black stork'),
 (424, 'spoonbill'),
 (425, 'flamingo'),
 (426, 'American egret'),
 (427, 'little blue heron'),
 (428, 'bittern'),
 (429, 'crane'),
 (430, 'limpkin'),
 (431, 'American coot'),
 (432, 'bustard'),
 (433, 'ruddy turnstone'),
 (434, 'red-backed sandpiper'),
 (435, 'redshank'),
 (436, 'dowitcher'),
 (437, 'oystercatcher'),
 (438, 'European gallinule'),
 (439, 'pelican'),
 (440, 'king penguin'),
 (441, 'albatross'),
 (442, 'great white shark'),
 (443, 'tiger shark'),
 (444, 'hammerhead'),
 (445, 'electric ray'),
 (446, 'stingray'),
 (447, 'barracouta'),
 (448, 'coho'),
 (449, 'tench'),
 (450, 'goldfish'),
 (451, 'eel'),
 (452, 'rock beauty'),
 (453, 'anemone fish'),
 (454, 'lionfish'),
 (455, 'puffer'),
 (456, 'sturgeon'),
 (457, 'gar'),
 (458, 'loggerhead'),
 (459, 'leatherback turtle'),
 (460, 'mud turtle'),
 (461, 'terrapin'),
 (462, 'box turtle'),
 (463, 'banded gecko'),
 (464, 'common iguana'),
 (465, 'American chameleon'),
 (466, 'whiptail'),
 (467, 'agama'),
 (468, 'frilled lizard'),
 (469, 'alligator lizard'),
 (470, 'Gila monster'),
 (471, 'green lizard'),
 (472, 'African chameleon'),
 (473, 'Komodo dragon'),
 (474, 'triceratops'),
 (475, 'African crocodile'),
 (476, 'American alligator'),
 (477, 'thunder snake'),
 (478, 'ringneck snake'),
 (479, 'hognose snake'),
 (480, 'green snake'),
 (481, 'king snake'),
 (482, 'garter snake'),
 (483, 'water snake'),
 (484, 'vine snake'),
 (485, 'night snake'),
 (486, 'boa constrictor'),
 (487, 'rock python'),
 (488, 'Indian cobra'),
 (489, 'green mamba'),
 (490, 'sea snake'),
 (491, 'horned viper'),
 (492, 'diamondback'),
 (493, 'sidewinder'),
 (494, 'European fire salamander'),
 (495, 'common newt'),
 (496, 'eft'),
 (497, 'spotted salamander'),
 (498, 'axolotl'),
 (499, 'bullfrog'),
 (500, 'tree frog'),
 (501, 'tailed frog'),
 (502, 'whistle'),
 (503, 'wing'),
 (504, 'paintbrush'),
 (505, 'hand blower'),
 (506, 'oxygen mask'),
 (507, 'snorkel'),
 (508, 'loudspeaker'),
 (509, 'microphone'),
 (510, 'screen'),
 (511, 'mouse'),
 (512, 'electric fan'),
 (513, 'oil filter'),
 (514, 'strainer'),
 (515, 'space heater'),
 (516, 'stove'),
 (517, 'guillotine'),
 (518, 'barometer'),
 (519, 'rule'),
 (520, 'odometer'),
 (521, 'scale'),
 (522, 'analog clock'),
 (523, 'digital clock'),
 (524, 'wall clock'),
 (525, 'hourglass'),
 (526, 'sundial'),
 (527, 'parking meter'),
 (528, 'stopwatch'),
 (529, 'digital watch'),
 (530, 'stethoscope'),
 (531, 'syringe'),
 (532, 'magnetic compass'),
 (533, 'binoculars'),
 (534, 'projector'),
 (535, 'sunglasses'),
 (536, 'loupe'),
 (537, 'radio telescope'),
 (538, 'bow'),
 (539, 'cannon [ground]'),
 (540, 'assault rifle'),
 (541, 'rifle'),
 (542, 'projectile'),
 (543, 'computer keyboard'),
 (544, 'typewriter keyboard'),
 (545, 'crane'),
 (546, 'lighter'),
 (547, 'abacus'),
 (548, 'cash machine'),
 (549, 'slide rule'),
 (550, 'desktop computer'),
 (551, 'hand-held computer'),
 (552, 'notebook'),
 (553, 'web site'),
 (554, 'harvester'),
 (555, 'thresher'),
 (556, 'printer'),
 (557, 'slot'),
 (558, 'vending machine'),
 (559, 'sewing machine'),
 (560, 'joystick'),
 (561, 'switch'),
 (562, 'hook'),
 (563, 'car wheel'),
 (564, 'paddlewheel'),
 (565, 'pinwheel'),
 (566, "potter's wheel"),
 (567, 'gas pump'),
 (568, 'carousel'),
 (569, 'swing'),
 (570, 'reel'),
 (571, 'radiator'),
 (572, 'puck'),
 (573, 'hard disc'),
 (574, 'sunglass'),
 (575, 'pick'),
 (576, 'car mirror'),
 (577, 'solar dish'),
 (578, 'remote control'),
 (579, 'disk brake'),
 (580, 'buckle'),
 (581, 'hair slide'),
 (582, 'knot'),
 (583, 'combination lock'),
 (584, 'padlock'),
 (585, 'nail'),
 (586, 'safety pin'),
 (587, 'screw'),
 (588, 'muzzle'),
 (589, 'seat belt'),
 (590, 'ski'),
 (591, 'candle'),
 (592, "jack-o'-lantern"),
 (593, 'spotlight'),
 (594, 'torch'),
 (595, 'neck brace'),
 (596, 'pier'),
 (597, 'tripod'),
 (598, 'maypole'),
 (599, 'mousetrap'),
 (600, 'spider web'),
 (601, 'trilobite'),
 (602, 'harvestman'),
 (603, 'scorpion'),
 (604, 'black and gold garden spider'),
 (605, 'barn spider'),
 (606, 'garden spider'),
 (607, 'black widow'),
 (608, 'tarantula'),
 (609, 'wolf spider'),
 (610, 'tick'),
 (611, 'centipede'),
 (612, 'isopod'),
 (613, 'Dungeness crab'),
 (614, 'rock crab'),
 (615, 'fiddler crab'),
 (616, 'king crab'),
 (617, 'American lobster'),
 (618, 'spiny lobster'),
 (619, 'crayfish'),
 (620, 'hermit crab'),
 (621, 'tiger beetle'),
 (622, 'ladybug'),
 (623, 'ground beetle'),
 (624, 'long-horned beetle'),
 (625, 'leaf beetle'),
 (626, 'dung beetle'),
 (627, 'rhinoceros beetle'),
 (628, 'weevil'),
 (629, 'fly'),
 (630, 'bee'),
 (631, 'grasshopper'),
 (632, 'cricket'),
 (633, 'walking stick'),
 (634, 'cockroach'),
 (635, 'mantis'),
 (636, 'cicada'),
 (637, 'leafhopper'),
 (638, 'lacewing'),
 (639, 'dragonfly'),
 (640, 'damselfly'),
 (641, 'admiral'),
 (642, 'ringlet'),
 (643, 'monarch'),
 (644, 'cabbage butterfly'),
 (645, 'sulphur butterfly'),
 (646, 'lycaenid'),
 (647, 'jellyfish'),
 (648, 'sea anemone'),
 (649, 'brain coral'),
 (650, 'flatworm'),
 (651, 'nematode'),
 (652, 'conch'),
 (653, 'snail'),
 (654, 'slug'),
 (655, 'sea slug'),
 (656, 'chiton'),
 (657, 'sea urchin'),
 (658, 'sea cucumber'),
 (659, 'iron'),
 (660, 'espresso maker'),
 (661, 'microwave'),
 (662, 'Dutch oven'),
 (663, 'rotisserie'),
 (664, 'toaster'),
 (665, 'waffle iron'),
 (666, 'vacuum'),
 (667, 'dishwasher'),
 (668, 'refrigerator'),
 (669, 'washer'),
 (670, 'Crock Pot'),
 (671, 'frying pan'),
 (672, 'wok'),
 (673, 'caldron'),
 (674, 'coffeepot'),
 (675, 'teapot'),
 (676, 'spatula'),
 (677, 'altar'),
 (678, 'triumphal arch'),
 (679, 'patio'),
 (680, 'steel arch bridge'),
 (681, 'suspension bridge'),
 (682, 'viaduct'),
 (683, 'barn'),
 (684, 'greenhouse'),
 (685, 'palace'),
 (686, 'monastery'),
 (687, 'library'),
 (688, 'apiary'),
 (689, 'boathouse'),
 (690, 'church'),
 (691, 'mosque'),
 (692, 'stupa'),
 (693, 'planetarium'),
 (694, 'restaurant'),
 (695, 'cinema'),
 (696, 'home theater'),
 (697, 'lumbermill'),
 (698, 'coil'),
 (699, 'obelisk'),
 (700, 'totem pole'),
 (701, 'castle'),
 (702, 'prison'),
 (703, 'grocery store'),
 (704, 'bakery'),
 (705, 'barbershop'),
 (706, 'bookshop'),
 (707, 'butcher shop'),
 (708, 'confectionery'),
 (709, 'shoe shop'),
 (710, 'tobacco shop'),
 (711, 'toyshop'),
 (712, 'fountain'),
 (713, 'cliff dwelling'),
 (714, 'yurt'),
 (715, 'dock'),
 (716, 'brass'),
 (717, 'megalith'),
 (718, 'bannister'),
 (719, 'breakwater'),
 (720, 'dam'),
 (721, 'chainlink fence'),
 (722, 'picket fence'),
 (723, 'worm fence'),
 (724, 'stone wall'),
 (725, 'grille'),
 (726, 'sliding door'),
 (727, 'turnstile'),
 (728, 'mountain tent'),
 (729, 'scoreboard'),
 (730, 'honeycomb'),
 (731, 'plate rack'),
 (732, 'pedestal'),
 (733, 'beacon'),
 (734, 'mashed potato'),
 (735, 'bell pepper'),
 (736, 'head cabbage'),
 (737, 'broccoli'),
 (738, 'cauliflower'),
 (739, 'zucchini'),
 (740, 'spaghetti squash'),
 (741, 'acorn squash'),
 (742, 'butternut squash'),
 (743, 'cucumber'),
 (744, 'artichoke'),
 (745, 'cardoon'),
 (746, 'mushroom'),
 (747, 'shower curtain'),
 (748, 'jean'),
 (749, 'carton'),
 (750, 'handkerchief'),
 (751, 'sandal'),
 (752, 'ashcan'),
 (753, 'safe'),
 (754, 'plate'),
 (755, 'necklace'),
 (756, 'croquet ball'),
 (757, 'fur coat'),
 (758, 'thimble'),
 (759, 'pajama'),
 (760, 'running shoe'),
 (761, 'cocktail shaker'),
 (762, 'chest'),
 (763, 'manhole cover'),
 (764, 'modem'),
 (765, 'tub'),
 (766, 'tray'),
 (767, 'balance beam'),
 (768, 'bagel'),
 (769, 'prayer rug'),
 (770, 'kimono'),
 (771, 'hot pot'),
 (772, 'whiskey jug'),
 (773, 'knee pad'),
 (774, 'book jacket'),
 (775, 'spindle'),
 (776, 'ski mask'),
 (777, 'beer bottle'),
 (778, 'crash helmet'),
 (779, 'bottlecap'),
 (780, 'tile roof'),
 (781, 'mask'),
 (782, 'maillot'),
 (783, 'Petri dish'),
 (784, 'football helmet'),
 (785, 'bathing cap'),
 (786, 'teddy bear'),
 (787, 'holster'),
 (788, 'pop bottle'),
 (789, 'photocopier'),
 (790, 'vestment'),
 (791, 'crossword puzzle'),
 (792, 'golf ball'),
 (793, 'trifle'),
 (794, 'suit'),
 (795, 'water tower'),
 (796, 'feather boa'),
 (797, 'cloak'),
 (798, 'red wine'),
 (799, 'drumstick'),
 (800, 'shield'),
 (801, 'Christmas stocking'),
 (802, 'hoopskirt'),
 (803, 'menu'),
 (804, 'stage'),
 (805, 'bonnet'),
 (806, 'meat loaf'),
 (807, 'baseball'),
 (808, 'face powder'),
 (809, 'scabbard'),
 (810, 'sunscreen'),
 (811, 'beer glass'),
 (812, 'hen-of-the-woods'),
 (813, 'guacamole'),
 (814, 'lampshade'),
 (815, 'wool'),
 (816, 'hay'),
 (817, 'bow tie'),
 (818, 'mailbag'),
 (819, 'water jug'),
 (820, 'bucket'),
 (821, 'dishrag'),
 (822, 'soup bowl'),
 (823, 'eggnog'),
 (824, 'mortar'),
 (825, 'trench coat'),
 (826, 'paddle'),
 (827, 'chain'),
 (828, 'swab'),
 (829, 'mixing bowl'),
 (830, 'potpie'),
 (831, 'wine bottle'),
 (832, 'shoji'),
 (833, 'bulletproof vest'),
 (834, 'drilling platform'),
 (835, 'binder'),
 (836, 'cardigan'),
 (837, 'sweatshirt'),
 (838, 'pot'),
 (839, 'birdhouse'),
 (840, 'hamper'),
 (841, 'ping-pong ball'),
 (842, 'pencil box'),
 (843, 'pay-phone'),
 (844, 'consomme'),
 (845, 'apron'),
 (846, 'punching bag'),
 (847, 'backpack'),
 (848, 'groom'),
 (849, 'bearskin'),
 (850, 'pencil sharpener'),
 (851, 'broom'),
 (852, 'mosquito net'),
 (853, 'abaya'),
 (854, 'mortarboard'),
 (855, 'poncho'),
 (856, 'crutch'),
 (857, 'Polaroid camera'),
 (858, 'space bar'),
 (859, 'cup'),
 (860, 'racket'),
 (861, 'traffic light'),
 (862, 'quill'),
 (863, 'radio'),
 (864, 'dough'),
 (865, 'cuirass'),
 (866, 'military uniform'),
 (867, 'lipstick'),
 (868, 'shower cap'),
 (869, 'monitor'),
 (870, 'oscilloscope'),
 (871, 'mitten'),
 (872, 'brassiere'),
 (873, 'French loaf'),
 (874, 'vase'),
 (875, 'milk can'),
 (876, 'rugby ball'),
 (877, 'paper towel'),
 (878, 'earthstar'),
 (879, 'envelope'),
 (880, 'miniskirt'),
 (881, 'cowboy hat'),
 (882, 'trolleybus'),
 (883, 'perfume'),
 (884, 'bathtub'),
 (885, 'hotdog'),
 (886, 'coral fungus'),
 (887, 'bullet train'),
 (888, 'pillow'),
 (889, 'toilet tissue'),
 (890, 'cassette'),
 (891, "carpenter's kit"),
 (892, 'ladle'),
 (893, 'stinkhorn'),
 (894, 'lotion'),
 (895, 'hair spray'),
 (896, 'academic gown'),
 (897, 'dome'),
 (898, 'crate'),
 (899, 'wig'),
 (900, 'burrito'),
 (901, 'pill bottle'),
 (902, 'chain mail'),
 (903, 'theater curtain'),
 (904, 'window shade'),
 (905, 'barrel'),
 (906, 'washbasin'),
 (907, 'ballpoint'),
 (908, 'basketball'),
 (909, 'bath towel'),
 (910, 'cowboy boot'),
 (911, 'gown'),
 (912, 'window screen'),
 (913, 'agaric'),
 (914, 'cellular telephone'),
 (915, 'nipple'),
 (916, 'barbell'),
 (917, 'mailbox'),
 (918, 'lab coat'),
 (919, 'fire screen'),
 (920, 'minibus'),
 (921, 'packet'),
 (922, 'maze'),
 (923, 'pole'),
 (924, 'horizontal bar'),
 (925, 'sombrero'),
 (926, 'pickelhaube'),
 (927, 'rain barrel'),
 (928, 'wallet'),
 (929, 'cassette player'),
 (930, 'comic book'),
 (931, 'piggy bank'),
 (932, 'street sign'),
 (933, 'bell cote'),
 (934, 'fountain pen'),
 (935, 'Windsor tie'),
 (936, 'volleyball'),
 (937, 'overskirt'),
 (938, 'sarong'),
 (939, 'purse'),
 (940, 'bolo tie'),
 (941, 'bib'),
 (942, 'parachute'),
 (943, 'sleeping bag'),
 (944, 'television'),
 (945, 'swimming trunks'),
 (946, 'measuring cup'),
 (947, 'espresso'),
 (948, 'pizza'),
 (949, 'breastplate'),
 (950, 'shopping basket'),
 (951, 'wooden spoon'),
 (952, 'saltshaker'),
 (953, 'chocolate sauce'),
 (954, 'ballplayer'),
 (955, 'goblet'),
 (956, 'gyromitra'),
 (957, 'stretcher'),
 (958, 'water bottle'),
 (959, 'dial telephone'),
 (960, 'soap dispenser'),
 (961, 'jersey'),
 (962, 'school bus'),
 (963, 'jigsaw puzzle'),
 (964, 'plastic bag'),
 (965, 'reflex camera'),
 (966, 'diaper'),
 (967, 'Band Aid'),
 (968, 'ice lolly'),
 (969, 'velvet'),
 (970, 'tennis ball'),
 (971, 'gasmask'),
 (972, 'doormat'),
 (973, 'Loafer'),
 (974, 'ice cream'),
 (975, 'pretzel'),
 (976, 'quilt'),
 (977, 'maillot'),
 (978, 'tape player'),
 (979, 'clog'),
 (980, 'iPod'),
 (981, 'bolete'),
 (982, 'scuba diver'),
 (983, 'pitcher'),
 (984, 'matchstick'),
 (985, 'bikini'),
 (986, 'sock'),
 (987, 'CD player'),
 (988, 'lens cap'),
 (989, 'thatch'),
 (990, 'vault'),
 (991, 'beaker'),
 (992, 'bubble'),
 (993, 'cheeseburger'),
 (994, 'parallel bars'),
 (995, 'flagpole'),
 (996, 'coffee mug'),
 (997, 'rubber eraser'),
 (998, 'stole'),
 (999, 'carbonara'),
 ...]
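When you eventually run an image through the network, the softmax output is a vector of probabilities indexed the same way as this label list, so mapping a prediction back to a name is just a sort and a lookup. Here's a small sketch with made-up probabilities over a made-up four-entry label list (the real `net['labels']` has 1000+ entries, including the 'dummy' class at index 0):

```python
import numpy as np

# A toy stand-in for net['labels'].
labels = [(0, 'dummy'), (1, 'kit fox'), (2, 'English setter'), (3, 'Siberian husky')]

# Pretend softmax output over the same four classes (illustrative values only).
probabilities = np.array([0.01, 0.02, 0.07, 0.90])

# Indices sorted by descending probability, then looked up in the label list.
top = np.argsort(-probabilities)
top_labels = [(labels[i][1], float(probabilities[i])) for i in top]
```

With the real network you'd feed `probabilities` from the softmax output instead, but the index-to-name lookup works the same way.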

<TODO: visual of graph>

Let's have a look at the graph:


In [8]:
g = tf.get_default_graph()
names = [op.name for op in g.get_operations()]
names


Out[8]:
['inception/input',
 'inception/conv2d0_w',
 'inception/conv2d0_b',
 'inception/conv2d1_w',
 'inception/conv2d1_b',
 'inception/conv2d2_w',
 'inception/conv2d2_b',
 'inception/mixed3a_1x1_w',
 'inception/mixed3a_1x1_b',
 'inception/mixed3a_3x3_bottleneck_w',
 'inception/mixed3a_3x3_bottleneck_b',
 'inception/mixed3a_3x3_w',
 'inception/mixed3a_3x3_b',
 'inception/mixed3a_5x5_bottleneck_w',
 'inception/mixed3a_5x5_bottleneck_b',
 'inception/mixed3a_5x5_w',
 'inception/mixed3a_5x5_b',
 'inception/mixed3a_pool_reduce_w',
 'inception/mixed3a_pool_reduce_b',
 'inception/mixed3b_1x1_w',
 'inception/mixed3b_1x1_b',
 'inception/mixed3b_3x3_bottleneck_w',
 'inception/mixed3b_3x3_bottleneck_b',
 'inception/mixed3b_3x3_w',
 'inception/mixed3b_3x3_b',
 'inception/mixed3b_5x5_bottleneck_w',
 'inception/mixed3b_5x5_bottleneck_b',
 'inception/mixed3b_5x5_w',
 'inception/mixed3b_5x5_b',
 'inception/mixed3b_pool_reduce_w',
 'inception/mixed3b_pool_reduce_b',
 'inception/mixed4a_1x1_w',
 'inception/mixed4a_1x1_b',
 'inception/mixed4a_3x3_bottleneck_w',
 'inception/mixed4a_3x3_bottleneck_b',
 'inception/mixed4a_3x3_w',
 'inception/mixed4a_3x3_b',
 'inception/mixed4a_5x5_bottleneck_w',
 'inception/mixed4a_5x5_bottleneck_b',
 'inception/mixed4a_5x5_w',
 'inception/mixed4a_5x5_b',
 'inception/mixed4a_pool_reduce_w',
 'inception/mixed4a_pool_reduce_b',
 'inception/mixed4b_1x1_w',
 'inception/mixed4b_1x1_b',
 'inception/mixed4b_3x3_bottleneck_w',
 'inception/mixed4b_3x3_bottleneck_b',
 'inception/mixed4b_3x3_w',
 'inception/mixed4b_3x3_b',
 'inception/mixed4b_5x5_bottleneck_w',
 'inception/mixed4b_5x5_bottleneck_b',
 'inception/mixed4b_5x5_w',
 'inception/mixed4b_5x5_b',
 'inception/mixed4b_pool_reduce_w',
 'inception/mixed4b_pool_reduce_b',
 'inception/mixed4c_1x1_w',
 'inception/mixed4c_1x1_b',
 'inception/mixed4c_3x3_bottleneck_w',
 'inception/mixed4c_3x3_bottleneck_b',
 'inception/mixed4c_3x3_w',
 'inception/mixed4c_3x3_b',
 'inception/mixed4c_5x5_bottleneck_w',
 'inception/mixed4c_5x5_bottleneck_b',
 'inception/mixed4c_5x5_w',
 'inception/mixed4c_5x5_b',
 'inception/mixed4c_pool_reduce_w',
 'inception/mixed4c_pool_reduce_b',
 'inception/mixed4d_1x1_w',
 'inception/mixed4d_1x1_b',
 'inception/mixed4d_3x3_bottleneck_w',
 'inception/mixed4d_3x3_bottleneck_b',
 'inception/mixed4d_3x3_w',
 'inception/mixed4d_3x3_b',
 'inception/mixed4d_5x5_bottleneck_w',
 'inception/mixed4d_5x5_bottleneck_b',
 'inception/mixed4d_5x5_w',
 'inception/mixed4d_5x5_b',
 'inception/mixed4d_pool_reduce_w',
 'inception/mixed4d_pool_reduce_b',
 'inception/mixed4e_1x1_w',
 'inception/mixed4e_1x1_b',
 'inception/mixed4e_3x3_bottleneck_w',
 'inception/mixed4e_3x3_bottleneck_b',
 'inception/mixed4e_3x3_w',
 'inception/mixed4e_3x3_b',
 'inception/mixed4e_5x5_bottleneck_w',
 'inception/mixed4e_5x5_bottleneck_b',
 'inception/mixed4e_5x5_w',
 'inception/mixed4e_5x5_b',
 'inception/mixed4e_pool_reduce_w',
 'inception/mixed4e_pool_reduce_b',
 'inception/mixed5a_1x1_w',
 'inception/mixed5a_1x1_b',
 'inception/mixed5a_3x3_bottleneck_w',
 'inception/mixed5a_3x3_bottleneck_b',
 'inception/mixed5a_3x3_w',
 'inception/mixed5a_3x3_b',
 'inception/mixed5a_5x5_bottleneck_w',
 'inception/mixed5a_5x5_bottleneck_b',
 'inception/mixed5a_5x5_w',
 'inception/mixed5a_5x5_b',
 'inception/mixed5a_pool_reduce_w',
 'inception/mixed5a_pool_reduce_b',
 'inception/mixed5b_1x1_w',
 'inception/mixed5b_1x1_b',
 'inception/mixed5b_3x3_bottleneck_w',
 'inception/mixed5b_3x3_bottleneck_b',
 'inception/mixed5b_3x3_w',
 'inception/mixed5b_3x3_b',
 'inception/mixed5b_5x5_bottleneck_w',
 'inception/mixed5b_5x5_bottleneck_b',
 'inception/mixed5b_5x5_w',
 'inception/mixed5b_5x5_b',
 'inception/mixed5b_pool_reduce_w',
 'inception/mixed5b_pool_reduce_b',
 'inception/head0_bottleneck_w',
 'inception/head0_bottleneck_b',
 'inception/nn0_w',
 'inception/nn0_b',
 'inception/softmax0_w',
 'inception/softmax0_b',
 'inception/head1_bottleneck_w',
 'inception/head1_bottleneck_b',
 'inception/nn1_w',
 'inception/nn1_b',
 'inception/softmax1_w',
 'inception/softmax1_b',
 'inception/softmax2_w',
 'inception/softmax2_b',
 'inception/conv2d0_pre_relu/conv',
 'inception/conv2d0_pre_relu',
 'inception/conv2d0',
 'inception/maxpool0',
 'inception/localresponsenorm0',
 'inception/conv2d1_pre_relu/conv',
 'inception/conv2d1_pre_relu',
 'inception/conv2d1',
 'inception/conv2d2_pre_relu/conv',
 'inception/conv2d2_pre_relu',
 'inception/conv2d2',
 'inception/localresponsenorm1',
 'inception/maxpool1',
 'inception/mixed3a_1x1_pre_relu/conv',
 'inception/mixed3a_1x1_pre_relu',
 'inception/mixed3a_1x1',
 'inception/mixed3a_3x3_bottleneck_pre_relu/conv',
 'inception/mixed3a_3x3_bottleneck_pre_relu',
 'inception/mixed3a_3x3_bottleneck',
 'inception/mixed3a_3x3_pre_relu/conv',
 'inception/mixed3a_3x3_pre_relu',
 'inception/mixed3a_3x3',
 'inception/mixed3a_5x5_bottleneck_pre_relu/conv',
 'inception/mixed3a_5x5_bottleneck_pre_relu',
 'inception/mixed3a_5x5_bottleneck',
 'inception/mixed3a_5x5_pre_relu/conv',
 'inception/mixed3a_5x5_pre_relu',
 'inception/mixed3a_5x5',
 'inception/mixed3a_pool',
 'inception/mixed3a_pool_reduce_pre_relu/conv',
 'inception/mixed3a_pool_reduce_pre_relu',
 'inception/mixed3a_pool_reduce',
 'inception/mixed3a/concat_dim',
 'inception/mixed3a',
 'inception/mixed3b_1x1_pre_relu/conv',
 'inception/mixed3b_1x1_pre_relu',
 'inception/mixed3b_1x1',
 'inception/mixed3b_3x3_bottleneck_pre_relu/conv',
 'inception/mixed3b_3x3_bottleneck_pre_relu',
 'inception/mixed3b_3x3_bottleneck',
 'inception/mixed3b_3x3_pre_relu/conv',
 'inception/mixed3b_3x3_pre_relu',
 'inception/mixed3b_3x3',
 'inception/mixed3b_5x5_bottleneck_pre_relu/conv',
 'inception/mixed3b_5x5_bottleneck_pre_relu',
 'inception/mixed3b_5x5_bottleneck',
 'inception/mixed3b_5x5_pre_relu/conv',
 'inception/mixed3b_5x5_pre_relu',
 'inception/mixed3b_5x5',
 'inception/mixed3b_pool',
 'inception/mixed3b_pool_reduce_pre_relu/conv',
 'inception/mixed3b_pool_reduce_pre_relu',
 'inception/mixed3b_pool_reduce',
 'inception/mixed3b/concat_dim',
 'inception/mixed3b',
 'inception/maxpool4',
 'inception/mixed4a_1x1_pre_relu/conv',
 'inception/mixed4a_1x1_pre_relu',
 'inception/mixed4a_1x1',
 'inception/mixed4a_3x3_bottleneck_pre_relu/conv',
 'inception/mixed4a_3x3_bottleneck_pre_relu',
 'inception/mixed4a_3x3_bottleneck',
 'inception/mixed4a_3x3_pre_relu/conv',
 'inception/mixed4a_3x3_pre_relu',
 'inception/mixed4a_3x3',
 'inception/mixed4a_5x5_bottleneck_pre_relu/conv',
 'inception/mixed4a_5x5_bottleneck_pre_relu',
 'inception/mixed4a_5x5_bottleneck',
 'inception/mixed4a_5x5_pre_relu/conv',
 'inception/mixed4a_5x5_pre_relu',
 'inception/mixed4a_5x5',
 'inception/mixed4a_pool',
 'inception/mixed4a_pool_reduce_pre_relu/conv',
 'inception/mixed4a_pool_reduce_pre_relu',
 'inception/mixed4a_pool_reduce',
 'inception/mixed4a/concat_dim',
 'inception/mixed4a',
 'inception/mixed4b_1x1_pre_relu/conv',
 'inception/mixed4b_1x1_pre_relu',
 'inception/mixed4b_1x1',
 'inception/mixed4b_3x3_bottleneck_pre_relu/conv',
 'inception/mixed4b_3x3_bottleneck_pre_relu',
 'inception/mixed4b_3x3_bottleneck',
 'inception/mixed4b_3x3_pre_relu/conv',
 'inception/mixed4b_3x3_pre_relu',
 'inception/mixed4b_3x3',
 'inception/mixed4b_5x5_bottleneck_pre_relu/conv',
 'inception/mixed4b_5x5_bottleneck_pre_relu',
 'inception/mixed4b_5x5_bottleneck',
 'inception/mixed4b_5x5_pre_relu/conv',
 'inception/mixed4b_5x5_pre_relu',
 'inception/mixed4b_5x5',
 'inception/mixed4b_pool',
 'inception/mixed4b_pool_reduce_pre_relu/conv',
 'inception/mixed4b_pool_reduce_pre_relu',
 'inception/mixed4b_pool_reduce',
 'inception/mixed4b/concat_dim',
 'inception/mixed4b',
 'inception/mixed4c_1x1_pre_relu/conv',
 'inception/mixed4c_1x1_pre_relu',
 'inception/mixed4c_1x1',
 'inception/mixed4c_3x3_bottleneck_pre_relu/conv',
 'inception/mixed4c_3x3_bottleneck_pre_relu',
 'inception/mixed4c_3x3_bottleneck',
 'inception/mixed4c_3x3_pre_relu/conv',
 'inception/mixed4c_3x3_pre_relu',
 'inception/mixed4c_3x3',
 'inception/mixed4c_5x5_bottleneck_pre_relu/conv',
 'inception/mixed4c_5x5_bottleneck_pre_relu',
 'inception/mixed4c_5x5_bottleneck',
 'inception/mixed4c_5x5_pre_relu/conv',
 'inception/mixed4c_5x5_pre_relu',
 'inception/mixed4c_5x5',
 'inception/mixed4c_pool',
 'inception/mixed4c_pool_reduce_pre_relu/conv',
 'inception/mixed4c_pool_reduce_pre_relu',
 'inception/mixed4c_pool_reduce',
 'inception/mixed4c/concat_dim',
 'inception/mixed4c',
 'inception/mixed4d_1x1_pre_relu/conv',
 'inception/mixed4d_1x1_pre_relu',
 'inception/mixed4d_1x1',
 'inception/mixed4d_3x3_bottleneck_pre_relu/conv',
 'inception/mixed4d_3x3_bottleneck_pre_relu',
 'inception/mixed4d_3x3_bottleneck',
 'inception/mixed4d_3x3_pre_relu/conv',
 'inception/mixed4d_3x3_pre_relu',
 'inception/mixed4d_3x3',
 'inception/mixed4d_5x5_bottleneck_pre_relu/conv',
 'inception/mixed4d_5x5_bottleneck_pre_relu',
 'inception/mixed4d_5x5_bottleneck',
 'inception/mixed4d_5x5_pre_relu/conv',
 'inception/mixed4d_5x5_pre_relu',
 'inception/mixed4d_5x5',
 'inception/mixed4d_pool',
 'inception/mixed4d_pool_reduce_pre_relu/conv',
 'inception/mixed4d_pool_reduce_pre_relu',
 'inception/mixed4d_pool_reduce',
 'inception/mixed4d/concat_dim',
 'inception/mixed4d',
 'inception/mixed4e_1x1_pre_relu/conv',
 'inception/mixed4e_1x1_pre_relu',
 'inception/mixed4e_1x1',
 'inception/mixed4e_3x3_bottleneck_pre_relu/conv',
 'inception/mixed4e_3x3_bottleneck_pre_relu',
 'inception/mixed4e_3x3_bottleneck',
 'inception/mixed4e_3x3_pre_relu/conv',
 'inception/mixed4e_3x3_pre_relu',
 'inception/mixed4e_3x3',
 'inception/mixed4e_5x5_bottleneck_pre_relu/conv',
 'inception/mixed4e_5x5_bottleneck_pre_relu',
 'inception/mixed4e_5x5_bottleneck',
 'inception/mixed4e_5x5_pre_relu/conv',
 'inception/mixed4e_5x5_pre_relu',
 'inception/mixed4e_5x5',
 'inception/mixed4e_pool',
 'inception/mixed4e_pool_reduce_pre_relu/conv',
 'inception/mixed4e_pool_reduce_pre_relu',
 'inception/mixed4e_pool_reduce',
 'inception/mixed4e/concat_dim',
 'inception/mixed4e',
 'inception/maxpool10',
 'inception/mixed5a_1x1_pre_relu/conv',
 'inception/mixed5a_1x1_pre_relu',
 'inception/mixed5a_1x1',
 'inception/mixed5a_3x3_bottleneck_pre_relu/conv',
 'inception/mixed5a_3x3_bottleneck_pre_relu',
 'inception/mixed5a_3x3_bottleneck',
 'inception/mixed5a_3x3_pre_relu/conv',
 'inception/mixed5a_3x3_pre_relu',
 'inception/mixed5a_3x3',
 'inception/mixed5a_5x5_bottleneck_pre_relu/conv',
 'inception/mixed5a_5x5_bottleneck_pre_relu',
 'inception/mixed5a_5x5_bottleneck',
 'inception/mixed5a_5x5_pre_relu/conv',
 'inception/mixed5a_5x5_pre_relu',
 'inception/mixed5a_5x5',
 'inception/mixed5a_pool',
 'inception/mixed5a_pool_reduce_pre_relu/conv',
 'inception/mixed5a_pool_reduce_pre_relu',
 'inception/mixed5a_pool_reduce',
 'inception/mixed5a/concat_dim',
 'inception/mixed5a',
 'inception/mixed5b_1x1_pre_relu/conv',
 'inception/mixed5b_1x1_pre_relu',
 'inception/mixed5b_1x1',
 'inception/mixed5b_3x3_bottleneck_pre_relu/conv',
 'inception/mixed5b_3x3_bottleneck_pre_relu',
 'inception/mixed5b_3x3_bottleneck',
 'inception/mixed5b_3x3_pre_relu/conv',
 'inception/mixed5b_3x3_pre_relu',
 'inception/mixed5b_3x3',
 'inception/mixed5b_5x5_bottleneck_pre_relu/conv',
 'inception/mixed5b_5x5_bottleneck_pre_relu',
 'inception/mixed5b_5x5_bottleneck',
 'inception/mixed5b_5x5_pre_relu/conv',
 'inception/mixed5b_5x5_pre_relu',
 'inception/mixed5b_5x5',
 'inception/mixed5b_pool',
 'inception/mixed5b_pool_reduce_pre_relu/conv',
 'inception/mixed5b_pool_reduce_pre_relu',
 'inception/mixed5b_pool_reduce',
 'inception/mixed5b/concat_dim',
 'inception/mixed5b',
 'inception/avgpool0',
 'inception/head0_pool',
 'inception/head0_bottleneck_pre_relu/conv',
 'inception/head0_bottleneck_pre_relu',
 'inception/head0_bottleneck',
 'inception/head0_bottleneck/reshape/shape',
 'inception/head0_bottleneck/reshape',
 'inception/nn0_pre_relu/matmul',
 'inception/nn0_pre_relu',
 'inception/nn0',
 'inception/nn0/reshape/shape',
 'inception/nn0/reshape',
 'inception/softmax0_pre_activation/matmul',
 'inception/softmax0_pre_activation',
 'inception/softmax0',
 'inception/head1_pool',
 'inception/head1_bottleneck_pre_relu/conv',
 'inception/head1_bottleneck_pre_relu',
 'inception/head1_bottleneck',
 'inception/head1_bottleneck/reshape/shape',
 'inception/head1_bottleneck/reshape',
 'inception/nn1_pre_relu/matmul',
 'inception/nn1_pre_relu',
 'inception/nn1',
 'inception/nn1/reshape/shape',
 'inception/nn1/reshape',
 'inception/softmax1_pre_activation/matmul',
 'inception/softmax1_pre_activation',
 'inception/softmax1',
 'inception/avgpool0/reshape/shape',
 'inception/avgpool0/reshape',
 'inception/softmax2_pre_activation/matmul',
 'inception/softmax2_pre_activation',
 'inception/softmax2',
 'inception/output',
 'inception/output1',
 'inception/output2']

The input to the graph is stored in the first tensor output, and the probabilities of the 1000 possible object classes are in the last layer:


In [9]:
input_name = names[0] + ':0'
x = g.get_tensor_by_name(input_name)

In [10]:
softmax = g.get_tensor_by_name(names[-1] + ':0')

Predicting with the Inception Network

Let's try to use the network to predict now:


In [11]:
from skimage.data import coffee
og = coffee()
plt.imshow(og)
print(og.min(), og.max())


0 255

We'll crop and resize the image to 299 x 299 pixels. I've provided a simple helper function which will do this for us:
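To make the idea concrete, here is a rough sketch of what such a crop/resize step might do, using only NumPy with a naive nearest-neighbor resize. This is a hypothetical stand-in, not the actual `inception.preprocess` helper, which also subtracts the mean and uses a proper resampling filter:

```python
import numpy as np

def center_crop_resize(img, size=299):
    """Center-crop an H x W x C image to a square, then nearest-neighbor
    resize it to size x size.  A rough sketch of a crop/resize helper."""
    h, w = img.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    cropped = img[top:top + side, left:left + side]
    # Nearest-neighbor resampling via an integer index lookup
    idxs = np.arange(size) * side // size
    return cropped[idxs][:, idxs]

out = center_crop_resize(np.random.rand(400, 600, 3))
print(out.shape)  # (1st two dims become 299 x 299)
```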


In [12]:
# Note that in the lecture, I used a slightly different inception
# model, and this one requires us to subtract the mean from the input image.
# The preprocess function will also crop/resize the image to 299x299
img = inception.preprocess(og)
print(og.shape), print(img.shape)


(400, 600, 3)
(299, 299, 3)
/usr/local/lib/python3.5/dist-packages/skimage/transform/_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15.
  warn("The default mode, 'constant', will be changed to 'reflect' in "
Out[12]:
(None, None)

In [13]:
# So this will now be a different range than what we had in the lecture:
print(img.min(), img.max())


-117.0 138.0

As we've seen from the last session, our images must be shaped as a 4-dimensional shape describing the number of images, height, width, and number of channels. So our original 3-dimensional image of height, width, channels needs an additional dimension on the 0th axis.


In [14]:
img_4d = img[np.newaxis]
print(img_4d.shape)


(1, 299, 299, 3)

In [15]:
fig, axs = plt.subplots(1, 2)
axs[0].imshow(og)

# Note that unlike the lecture, we have to call the `inception.deprocess` function
# so that it adds back the mean!
axs[1].imshow(inception.deprocess(img))


Out[15]:
<matplotlib.image.AxesImage at 0x7fc8600adeb8>

In [16]:
res = np.squeeze(softmax.eval(feed_dict={x: img_4d}))

In [17]:
# Note that this network is slightly different than the one used in the lecture.
# Instead of just 1 output, there will be 16 outputs of 1008 probabilities.
# We only use the first 1000 probabilities (the extra ones are for negative/unseen labels)
res.shape


Out[17]:
(16, 1008)

After aggregating the 16 outputs, the result of the network is a 1008 element vector with probabilities for each class (we'll only need the first 1000). Inside our net dictionary are the labels for every element. We can sort these and use the labels of the 1000 classes to see what the top 5 predicted probabilities and labels are:


In [18]:
# Note that this is one way to aggregate the different probabilities.  We could also
# take the argmax.
res = np.mean(res, 0)
res = res / np.sum(res)

In [19]:
print([(res[idx], net['labels'][idx])
       for idx in res.argsort()[-5:][::-1]])


[(0.99849206, (947, 'espresso')), (0.000631253, (859, 'cup')), (0.00050241494, (953, 'chocolate sauce')), (0.00019483209, (844, 'consomme')), (0.00013370356, (822, 'soup bowl'))]

Visualizing Filters

Wow so it works! But how!? Well that's an ongoing research question. There have been a lot of great developments in the last few years to help us understand what might be happening. Let's first try to visualize the weights of the convolution filters, like we've done with our MNIST network before.


In [20]:
W = g.get_tensor_by_name('inception/conv2d0_w:0')
W_eval = W.eval()
print(W_eval.shape)


(7, 7, 3, 64)

With MNIST, each filter had a single input channel, since all of MNIST is grayscale. But in this case our input has 3 color channels, and so each convolution filter has 3 input channels as well. We can try to see every single individual filter channel using the library tool I've provided:


In [21]:
from libs import utils
W_montage = utils.montage_filters(W_eval)
plt.figure(figsize=(10,10))
plt.imshow(W_montage, interpolation='nearest')


Out[21]:
<matplotlib.image.AxesImage at 0x7fc860029588>

Or, we can also try to look at them as RGB filters, showing the influence of each color channel, for each neuron or output filter.


In [22]:
Ws = [utils.montage_filters(W_eval[:, :, [i], :]) for i in range(3)]
Ws = np.rollaxis(np.array(Ws), 0, 3)
plt.figure(figsize=(10,10))
plt.imshow(Ws, interpolation='nearest')


Out[22]:
<matplotlib.image.AxesImage at 0x7fc8587f14e0>

In order to better see what these are doing, let's normalize the filters' range:


In [23]:
# print (np.min(Ws), np.max(Ws))
Ws = (Ws / np.max(np.abs(Ws)) * 128 + 128).astype(np.uint8)
plt.figure(figsize=(10,10))
plt.imshow(Ws, interpolation='nearest')


Out[23]:
<matplotlib.image.AxesImage at 0x7fc85875b898>

Like with our MNIST example, we can probably guess what some of these are doing. They respond to edges, corners, and center-surround patterns: contrasts between two colors, such as red/green or blue/yellow. Interestingly, this is also what the neuroscience of vision tells us about how humans perceive color, through the opponency of red/green and blue/yellow channels. To get a better sense, we can try looking at the output of the convolution:


In [24]:
feature = g.get_tensor_by_name('inception/conv2d0_pre_relu:0')

Let's look at the shape:


In [25]:
layer_shape = tf.shape(feature).eval(feed_dict={x:img_4d})
print(layer_shape)


[  1 150 150  64]

So our original image, which was 1 x 299 x 299 x 3 color channels, now has 64 new channels of information. The image's height and width are also roughly halved, because of the stride of 2 in the convolution. We've just seen what each of the convolution filters looks like. Let's now see how they filter the image by looking at the resulting convolution.
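That halving follows directly from the stride. With 'SAME' padding, the output size of a strided convolution is the ceiling of the input size divided by the stride:

```python
import math

# With 'SAME' padding, output_size = ceil(input_size / stride).
# A 299 x 299 input through a stride-2 convolution:
out_size = math.ceil(299 / 2)
print(out_size)  # 150, matching the 1 x 150 x 150 x 64 shape above
```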


In [26]:
f = feature.eval(feed_dict={x: img_4d})
montage = utils.montage_filters(np.rollaxis(np.expand_dims(f[0], 3), 3, 2))
fig, axs = plt.subplots(1, 3, figsize=(20, 10))
axs[0].imshow(inception.deprocess(img))
axs[0].set_title('Original Image')
axs[1].imshow(Ws, interpolation='nearest')
axs[1].set_title('Convolution Filters')
axs[2].imshow(montage, cmap='gray')
axs[2].set_title('Convolution Outputs')


Out[26]:
<matplotlib.text.Text at 0x7fc8586b8550>

In [27]:
fig, axs = plt.subplots(1, 1, figsize=(10, 10))
plt.imshow(montage, cmap='gray')


Out[27]:
<matplotlib.image.AxesImage at 0x7fc8587ac390>


It's a little hard to see what's happening here, but let's try. The third filter, for instance, seems a lot like the gabor filter we created in the first session. It responds to horizontal edges, since it has a bright component at the top and a dark component on the bottom. Looking at the output of the convolution, we can see that the horizontal edges really pop out.
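To make that concrete, here is a toy horizontal-edge detector applied to an image with a horizontal step edge. The kernel is hand-made for illustration, not the actual learned Inception weights:

```python
import numpy as np

# A toy horizontal-edge kernel: bright row on top, dark row on bottom
# (hand-made for illustration, not the learned Inception filter).
kernel = np.array([[ 1.,  1.,  1.],
                   [ 0.,  0.,  0.],
                   [-1., -1., -1.]])

# A 6 x 6 image with a horizontal step edge halfway down
img = np.zeros((6, 6))
img[3:] = 1.0

# Valid cross-correlation: the response has the largest magnitude
# in the rows straddling the edge, and is zero in flat regions.
out = np.array([[(img[i:i + 3, j:j + 3] * kernel).sum()
                 for j in range(4)]
                for i in range(4)])
print(out)
# [[ 0.  0.  0.  0.]
#  [-3. -3. -3. -3.]
#  [-3. -3. -3. -3.]
#  [ 0.  0.  0.  0.]]
```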

Visualizing the Gradient

So this is a pretty useful technique for the first convolution layer. But when we get to the next layer, all of a sudden we have 64 different channels of information being fed to many more convolution filters of very high dimensionality. It's very hard to conceptualize that many dimensions, let alone try to figure out what the network could be doing with all the possible combinations it has with neurons in other layers.

If we want to understand what the deeper layers are really doing, we're going to have to start to use backprop to show us the gradients of a particular neuron with respect to our input image. Let's visualize the network's gradient activation when backpropagated to the original input image. This is effectively telling us which pixels are responding to the predicted class or given neuron.

We use a forward pass up to the layer that we are interested in, and then a backprop to help us understand what pixels in particular contributed to the final activation of that layer. We will need to create an operation which will find the max neuron of all activations in a layer, and then calculate the gradient of that objective with respect to the input image.


In [29]:
feature = g.get_tensor_by_name('inception/conv2d0_pre_relu:0')
gradient = tf.gradients(tf.reduce_max(feature, 3), x)

When we run this network now, we will specify the gradient operation we've created, instead of the softmax layer of the network. This will run a forward prop up to the layer we asked to find the gradient with, and then run a back prop all the way to the input image.


In [30]:
res = sess.run(gradient, feed_dict={x: img_4d})[0]

Let's visualize the original image and the output of the backpropagated gradient:


In [31]:
fig, axs = plt.subplots(1, 2)
axs[0].imshow(inception.deprocess(img))
axs[1].imshow(res[0])


Out[31]:
<matplotlib.image.AxesImage at 0x7fc86c098ac8>

Well, that looks like a complete mess! What we can do is normalize the activations in a way that lets us see them more in terms of the normal range of color values.


In [32]:
def normalize(img, s=0.1):
    '''Normalize the image range for visualization'''
    z = img / np.std(img)
    return np.uint8(np.clip(
        (z - z.mean()) / max(z.std(), 1e-4) * s + 0.5,
        0, 1) * 255)

In [33]:
r = normalize(res)
fig, axs = plt.subplots(1, 2)
axs[0].imshow(inception.deprocess(img))
axs[1].imshow(r[0])


Out[33]:
<matplotlib.image.AxesImage at 0x7fc852e5eac8>

Much better! This sort of makes sense! There are some strong edges and we can really see what colors are changing along those edges.

We can try this within individual layers as well, pulling out individual neurons to see what each of them is responding to. Let's first create a few functions which will help us visualize a single neuron in a layer, as well as every neuron of a layer:


In [34]:
def compute_gradient(input_placeholder, img, layer_name, neuron_i):
    feature = g.get_tensor_by_name(layer_name)
    gradient = tf.gradients(tf.reduce_mean(feature[:, :, :, neuron_i]), x)
    res = sess.run(gradient, feed_dict={input_placeholder: img})[0]
    return res

def compute_gradients(input_placeholder, img, layer_name):
    feature = g.get_tensor_by_name(layer_name)
    layer_shape = tf.shape(feature).eval(feed_dict={input_placeholder: img})
    gradients = []
    for neuron_i in range(layer_shape[-1]):
        gradients.append(compute_gradient(input_placeholder, img, layer_name, neuron_i))
    print(layer_shape[-1])
    return gradients

Now we can pass in a layer name and see the gradient of every neuron in that layer with respect to the input image as a montage. Let's try the second convolutional layer. This can take a while depending on your computer:


In [35]:
gradients = compute_gradients(x, img_4d, 'inception/conv2d1_pre_relu:0')
gradients_norm = [normalize(gradient_i[0]) for gradient_i in gradients]
montage = utils.montage(np.array(gradients_norm))


64

In [36]:
plt.figure(figsize=(12, 12))
plt.imshow(montage)


Out[36]:
<matplotlib.image.AxesImage at 0x7fc8311ced68>

Let's zoom in on the 4 gradient images in the top-left corner:


In [39]:
plt.figure(figsize=(12, 12))
plt.imshow(montage[:299*2,:299*2])


Out[39]:
<matplotlib.image.AxesImage at 0x7fc828d24b70>

So it's clear that each neuron is responding to some type of feature. It looks like a lot of them are interested in the texture of the cup, and seem to respond in different ways across the image. Some seem to be more interested in the shape of the cup, responding pretty strongly to the circular opening, while others seem to catch the liquid in the cup more. There even seems to be one that just responds to the spoon, and another which responds to only the plate.

Let's try to get a sense of how the activations in each layer progress. We can get every max pooling layer like so:


In [40]:
features = [name for name in names if 'maxpool' in name.split('/')[-1]]
print(features)


['inception/maxpool0', 'inception/maxpool1', 'inception/maxpool4', 'inception/maxpool10']

So I didn't mention what max pooling is. But it's a simple operation. You can think of it like a convolution, except instead of using a learned kernel, it simply takes the maximum value in each window ("max pooling"), or the average value ("average pooling").
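As a minimal NumPy sketch of the idea (not the graph's own op), here is 2 x 2 max pooling with a stride of 2 over a 2-D array:

```python
import numpy as np

def max_pool_2x2(img):
    """Max pooling: the max over each non-overlapping 2 x 2 window."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    # Reshape so each 2 x 2 window becomes its own pair of axes,
    # then take the max over those window axes.
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

a = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 2, 3],
              [4, 5, 6, 7]])
print(max_pool_2x2(a))
# [[4 8]
#  [9 7]]
```

Swapping `.max` for `.mean` in the last line gives average pooling instead.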

We'll now loop over every feature and create an operation that first finds the maximally activated neuron across channels. It will then sum those activations across every pixel, and calculate the gradient of that sum with respect to the input image.


In [41]:
n_plots = len(features) + 1
fig, axs = plt.subplots(1, n_plots, figsize=(20, 5))
base = img_4d
axs[0].imshow(inception.deprocess(img))
for feature_i, featurename in enumerate(features):
    feature = g.get_tensor_by_name(featurename + ':0')
    neuron = tf.reduce_max(feature, len(feature.get_shape())-1)
    gradient = tf.gradients(tf.reduce_sum(neuron), x)
    this_res = sess.run(gradient[0], feed_dict={x: base})[0]
    axs[feature_i+1].imshow(normalize(this_res))
    axs[feature_i+1].set_title(featurename)