In [ ]:
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Prerequisites:
1. Familiarity with Python
2. Completed Handbook 1/Part 3: Wide Convolutional Neural Networks
Objectives:
1. Branch Convolutions in an Inception v1 Module
2. Branch Convolutions in a ResNeXt Module
Let's create an Inception module.
We will use these approaches:
1. Dimensionality reduction: replacing one convolution in a pair with a bottleneck convolution (see the weight-count comparison just after this list).
2. Branching the input through multiple convolutions (wide).
3. Concatenating the branches back together.
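To see why the bottleneck in approach 1 pays off, here is a quick back-of-the-envelope comparison of weight counts for a 5x5 convolution applied directly versus behind a 1x1 bottleneck. The 192-channel input and the filter counts are hypothetical, picked for illustration; they are not values from this exercise.
In [ ]:
# Weights for a 5x5 convolution with 48 filters on a hypothetical 192-channel input
direct = 5 * 5 * 192 * 48 + 48
# Same 48 filters, but the input is first squeezed to 64 channels by a 1x1 bottleneck
bottleneck = (1 * 1 * 192 * 64 + 64) + (5 * 5 * 64 * 48 + 48)
print(direct, bottleneck)  # 230448 vs. 89200, roughly 2.6x fewer weights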
You fill in the blanks (replace the ??), make sure the code passes the Python interpreter, and then verify its correctness with the summary output. (One possible fill-in is shown after the summary.)
You will need to:
1. Set the filter size, strides and input vector for the bottleneck layer pair with the max pooling layer.
2. Set the filter size and input vector for the bottleneck layer pair with the 3x3 convolution.
3. Set the filter size and input vector for the bottleneck layer pair with the 5x5 convolution.
4. Concatenate the branches.
In [ ]:
from keras import Model, Input
from keras import layers
# Our hypothetical input to an inception module
x = inputs = Input((229, 229, 3))
# The inception branches (where x is the previous layer)
x1 = layers.MaxPooling2D((3, 3), strides=(1,1), padding='same')(x)
# Add the bottleneck after the 3x3 max pooling layer
# HINT: x1 is the branch for pooling + bottleneck. So the output from pooling is the input to the bottleneck
x1 = layers.Conv2D(64, ??, strides=??, padding='same')(??)
# Add the second branch which is a single bottleneck convolution
x2 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x) # passes straight through
x3 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x)
# Add the 3x3 convolutional layer after the bottleneck
# HINT: x3 is the branch for bottleneck + convolution. So the output from bottleneck is the input to the convolution
x3 = layers.Conv2D(96, ??, strides=(1, 1), padding='same')(??)
x4 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x)
# Add the 5x5 convolutional layer after the bottleneck
# HINT: x4 is the branch for bottleneck + convolution. So the output from bottleneck is the input to the convolution
x4 = layers.Conv2D(48, ??, strides=(1, 1), padding='same')(??)
# Concatenate the filters from each of the four branches
# HINT: List the branches (variable names) as a list
x = outputs = layers.concatenate([??, ??, ??, ??])
# Let's create a mini-inception neural network using a single inception v1 module
model = Model(inputs, outputs)
It should look like below:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 229, 229, 3) 0
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 229, 229, 3) 0 input_1[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 229, 229, 64) 256 input_1[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 229, 229, 64) 256 input_1[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 229, 229, 64) 256 max_pooling2d_3[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 229, 229, 64) 256 input_1[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 229, 229, 96) 55392 conv2d_3[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D) (None, 229, 229, 48) 76848 conv2d_5[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 229, 229, 272 0 conv2d_1[0][0]
conv2d_2[0][0]
conv2d_4[0][0]
conv2d_6[0][0]
==================================================================================================
Total params: 133,264
Trainable params: 133,264
Non-trainable params: 0
In [ ]:
model.summary()
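If you get stuck, here is one way to fill in the blanks. It is our reading of the hints, and it reproduces the filter counts and parameter totals in the summary above; compare it against your own answer rather than treating it as the only solution.
In [ ]:
from keras import Model, Input
from keras import layers

x = inputs = Input((229, 229, 3))

# Branch 1: 3x3 max pooling followed by a 1x1 bottleneck
x1 = layers.MaxPooling2D((3, 3), strides=(1, 1), padding='same')(x)
x1 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x1)

# Branch 2: a single 1x1 bottleneck
x2 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x)

# Branch 3: 1x1 bottleneck followed by a 3x3 convolution
x3 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x)
x3 = layers.Conv2D(96, (3, 3), strides=(1, 1), padding='same')(x3)

# Branch 4: 1x1 bottleneck followed by a 5x5 convolution
x4 = layers.Conv2D(64, (1, 1), strides=(1, 1), padding='same')(x)
x4 = layers.Conv2D(48, (5, 5), strides=(1, 1), padding='same')(x4)

# Concatenate the branches: 64 + 64 + 96 + 48 = 272 output filters
x = outputs = layers.concatenate([x1, x2, x3, x4])
model = Model(inputs, outputs)
model.summary()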
Let's create a ResNeXt module.
We will use these approaches:
1. Splitting and branching the input through parallel convolutions (wide); see the weight-count comparison after this list.
2. Concatenating the branches back together.
3. Dimensionality reduction by sandwiching the split/branch between two bottleneck convolutions.
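For intuition on why the split in approach 1 is cheap, compare the weights in one wide 3x3 convolution over all channels against the same channels split across parallel branches. The numbers below match this exercise's first group (128 channels, cardinality 32, so each branch sees 128 // 32 = 4 channels); the 148-parameter Conv2D layers in the expected summary further down are exactly these per-branch convolutions.
In [ ]:
# One 3x3 convolution over all 128 channels at once
full = 3 * 3 * 128 * 128 + 128      # 147,584 weights
# The same 128 channels split into 32 parallel 3x3 convolutions of 4 channels each
grouped = 32 * (3 * 3 * 4 * 4 + 4)  # 32 x 148 = 4,736 weights
print(full, grouped)                # roughly 31x fewer weights in the split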
You will need to:
1. Append each split (parallel) convolution to a list.
2. Set the number of input and output filters of each residual block group.
In [ ]:
from keras import Input, Model
from keras import layers
def _resnext_block(shortcut, filters_in, filters_out, cardinality=32, strides=(1, 1)):
    """ Construct a ResNeXt block
        shortcut   : previous layer and shortcut for identity link
        filters_in : number of filters (channels) at the input convolution
        filters_out: number of filters (channels) at the output convolution
        cardinality: width of cardinality layer
    """
    # Bottleneck layer
    # HINT: remember, it's all about 1's
    x = layers.Conv2D(filters_in, kernel_size=??, strides=??,
                      padding='same')(shortcut)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)

    # Cardinality (Wide) Layer
    filters_card = filters_in // cardinality
    groups = []
    for i in range(cardinality):
        # Split the input evenly across the parallel branches
        group = layers.Lambda(lambda z:
            z[:, :, :, i * filters_card : i * filters_card + filters_card])(x)
        # Maintain a list of parallel branches
        # HINT: You're building a list of the split inputs (group), each of which
        # is passed through a 3x3 convolution.
        groups.append(layers.Conv2D(filters_card, kernel_size=(3, 3),
                                    strides=strides, padding='same')(??))

    # Concatenate the outputs of the cardinality layer together
    # HINT: It's the list of parallel branches to concatenate
    x = layers.concatenate(??)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)

    # Bottleneck layer
    x = layers.Conv2D(filters_out, kernel_size=(1, 1), strides=(1, 1),
                      padding='same')(x)
    x = layers.BatchNormalization()(x)

    # Special case for the first ResNeXt block: project the shortcut with a
    # 1x1 convolution so its channel count matches the block output
    # (so we can add them)
    if shortcut.shape[-1] != filters_out:
        shortcut = layers.Conv2D(filters_out, kernel_size=(1, 1), strides=strides,
                                 padding='same')(shortcut)
        shortcut = layers.BatchNormalization()(shortcut)

    # Identity Link: Add the shortcut (input) to the output of the block
    x = layers.add([shortcut, x])
    x = layers.ReLU()(x)
    return x
# The input tensor
inputs = layers.Input(shape=(224, 224, 3))
# Stem Convolutional layer
x = layers.Conv2D(64, kernel_size=(7, 7), strides=(2, 2), padding='same')(inputs)
x = layers.BatchNormalization()(x)
x = layers.ReLU()(x)
x = layers.MaxPool2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)
# First ResNeXt Group, inputs are 128 filters and outputs are 256
# HINT: the second number will be twice as big as the first number
x = _resnext_block(x, ??, ??, strides=(2, 2))
for _ in range(2):
    x = _resnext_block(x, ??, ??)
# strided convolution to match the number of output filters on next block and reduce by 2
x = layers.Conv2D(512, kernel_size=(1, 1), strides=(2, 2), padding='same')(x)
# Second ResNeXt Group, inputs will be 256 and outputs will be 512
for _ in range(4):
    x = _resnext_block(x, ??, ??)
# strided convolution to match the number of output filters on next block and
# reduce by 2
x = layers.Conv2D(1024, kernel_size=(1, 1), strides=(2, 2), padding='same')(x)
# Third ResNeXt Group, inputs will be 512 and outputs 1024
for _ in range(6):
    x = _resnext_block(x, ??, ??)
# strided convolution to match the number of output filters on next block and
# reduce by 2
x = layers.Conv2D(2048, kernel_size=(1, 1), strides=(2, 2), padding='same')(x)
# Fourth ResNeXt Group, inputs will be 1024 and outputs will be 2048
for _ in range(3):
    x = _resnext_block(x, ??, ??)
# Final Dense Outputting Layer for 1000 outputs
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(1000, activation='softmax')(x)
model = Model(inputs, outputs)
It should look like below:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) (None, 224, 224, 3) 0
__________________________________________________________________________________________________
conv2d_36 (Conv2D) (None, 112, 112, 64) 9472 input_2[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 112, 112, 64) 256 conv2d_36[0][0]
__________________________________________________________________________________________________
re_lu_4 (ReLU) (None, 112, 112, 64) 0 batch_normalization_5[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 56, 56, 64) 0 re_lu_4[0][0]
__________________________________________________________________________________________________
conv2d_37 (Conv2D) (None, 56, 56, 128) 8320 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 56, 56, 128) 512 conv2d_37[0][0]
__________________________________________________________________________________________________
re_lu_5 (ReLU) (None, 56, 56, 128) 0 batch_normalization_6[0][0]
__________________________________________________________________________________________________
lambda_33 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_34 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_35 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_36 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_37 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_38 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_39 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_40 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_41 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_42 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_43 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_44 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_45 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_46 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_47 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_48 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_49 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_50 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_51 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_52 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_53 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_54 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_55 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_56 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_57 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_58 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_59 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_60 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_61 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_62 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_63 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
lambda_64 (Lambda) (None, 56, 56, 4) 0 re_lu_5[0][0]
__________________________________________________________________________________________________
conv2d_38 (Conv2D) (None, 56, 56, 4) 148 lambda_33[0][0]
__________________________________________________________________________________________________
conv2d_39 (Conv2D) (None, 56, 56, 4) 148 lambda_34[0][0]
__________________________________________________________________________________________________
conv2d_40 (Conv2D) (None, 56, 56, 4) 148 lambda_35[0][0]
__________________________________________________________________________________________________
conv2d_41 (Conv2D) (None, 56, 56, 4) 148 lambda_36[0][0]
__________________________________________________________________________________________________
conv2d_42 (Conv2D) (None, 56, 56, 4) 148 lambda_37[0][0]
__________________________________________________________________________________________________
conv2d_43 (Conv2D) (None, 56, 56, 4) 148 lambda_38[0][0]
__________________________________________________________________________________________________
conv2d_44 (Conv2D) (None, 56, 56, 4) 148 lambda_39[0][0]
__________________________________________________________________________________________________
conv2d_45 (Conv2D) (None, 56, 56, 4) 148 lambda_40[0][0]
__________________________________________________________________________________________________
conv2d_46 (Conv2D) (None, 56, 56, 4) 148 lambda_41[0][0]
__________________________________________________________________________________________________
conv2d_47 (Conv2D) (None, 56, 56, 4) 148 lambda_42[0][0]
__________________________________________________________________________________________________
conv2d_48 (Conv2D) (None, 56, 56, 4) 148 lambda_43[0][0]
__________________________________________________________________________________________________
conv2d_49 (Conv2D) (None, 56, 56, 4) 148 lambda_44[0][0]
__________________________________________________________________________________________________
conv2d_50 (Conv2D) (None, 56, 56, 4) 148 lambda_45[0][0]
__________________________________________________________________________________________________
conv2d_51 (Conv2D) (None, 56, 56, 4) 148 lambda_46[0][0]
__________________________________________________________________________________________________
conv2d_52 (Conv2D) (None, 56, 56, 4) 148 lambda_47[0][0]
__________________________________________________________________________________________________
conv2d_53 (Conv2D) (None, 56, 56, 4) 148 lambda_48[0][0]
__________________________________________________________________________________________________
conv2d_54 (Conv2D) (None, 56, 56, 4) 148 lambda_49[0][0]
__________________________________________________________________________________________________
conv2d_55 (Conv2D) (None, 56, 56, 4) 148 lambda_50[0][0]
__________________________________________________________________________________________________
conv2d_56 (Conv2D) (None, 56, 56, 4) 148 lambda_51[0][0]
__________________________________________________________________________________________________
conv2d_57 (Conv2D) (None, 56, 56, 4) 148 lambda_52[0][0]
__________________________________________________________________________________________________
conv2d_58 (Conv2D) (None, 56, 56, 4) 148 lambda_53[0][0]
__________________________________________________________________________________________________
conv2d_59 (Conv2D) (None, 56, 56, 4) 148 lambda_54[0][0]
__________________________________________________________________________________________________
conv2d_60 (Conv2D) (None, 56, 56, 4) 148 lambda_55[0][0]
__________________________________________________________________________________________________
conv2d_61 (Conv2D) (None, 56, 56, 4) 148 lambda_56[0][0]
__________________________________________________________________________________________________
conv2d_62 (Conv2D) (None, 56, 56, 4) 148 lambda_57[0][0]
__________________________________________________________________________________________________
conv2d_63 (Conv2D) (None, 56, 56, 4) 148 lambda_58[0][0]
__________________________________________________________________________________________________
conv2d_64 (Conv2D) (None, 56, 56, 4) 148 lambda_59[0][0]
__________________________________________________________________________________________________
conv2d_65 (Conv2D) (None, 56, 56, 4) 148 lambda_60[0][0]
__________________________________________________________________________________________________
conv2d_66 (Conv2D) (None, 56, 56, 4) 148 lambda_61[0][0]
__________________________________________________________________________________________________
conv2d_67 (Conv2D) (None, 56, 56, 4) 148 lambda_62[0][0]
__________________________________________________________________________________________________
conv2d_68 (Conv2D) (None, 56, 56, 4) 148 lambda_63[0][0]
__________________________________________________________________________________________________
conv2d_69 (Conv2D) (None, 56, 56, 4) 148 lambda_64[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate) (None, 56, 56, 128) 0 conv2d_38[0][0]
conv2d_39[0][0]
conv2d_40[0][0]
conv2d_41[0][0]
conv2d_42[0][0]
conv2d_43[0][0]
conv2d_44[0][0]
conv2d_45[0][0]
conv2d_46[0][0]
conv2d_47[0][0]
conv2d_48[0][0]
conv2d_49[0][0]
conv2d_50[0][0]
conv2d_51[0][0]
conv2d_52[0][0]
conv2d_53[0][0]
conv2d_54[0][0]
conv2d_55[0][0]
conv2d_56[0][0]
conv2d_57[0][0]
conv2d_58[0][0]
conv2d_59[0][0]
conv2d_60[0][0]
conv2d_61[0][0]
conv2d_62[0][0]
conv2d_63[0][0]
conv2d_64[0][0]
conv2d_65[0][0]
conv2d_66[0][0]
conv2d_67[0][0]
conv2d_68[0][0]
conv2d_69[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 56, 56, 128) 512 concatenate_2[0][0]
__________________________________________________________________________________________________
re_lu_6 (ReLU) (None, 56, 56, 128) 0 batch_normalization_7[0][0]
__________________________________________________________________________________________________
conv2d_71 (Conv2D) (None, 56, 56, 256) 16640 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
conv2d_70 (Conv2D) (None, 56, 56, 256) 33024 re_lu_6[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 56, 56, 256) 1024 conv2d_71[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 56, 56, 256) 1024 conv2d_70[0][0]
__________________________________________________________________________________________________
add_2 (Add) (None, 56, 56, 256) 0 batch_normalization_9[0][0]
batch_normalization_8[0][0]
... middle of the summary REMOVED for brevity ...
batch_normalization_53 (BatchNo (None, 7, 7, 1024) 4096 concatenate_17[0][0]
__________________________________________________________________________________________________
re_lu_51 (ReLU) (None, 7, 7, 1024) 0 batch_normalization_53[0][0]
__________________________________________________________________________________________________
conv2d_584 (Conv2D) (None, 7, 7, 2048) 2099200 re_lu_51[0][0]
__________________________________________________________________________________________________
batch_normalization_54 (BatchNo (None, 7, 7, 2048) 8192 conv2d_584[0][0]
__________________________________________________________________________________________________
add_17 (Add) (None, 7, 7, 2048) 0 re_lu_49[0][0]
batch_normalization_54[0][0]
__________________________________________________________________________________________________
re_lu_52 (ReLU) (None, 7, 7, 2048) 0 add_17[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_1 (Glo (None, 2048) 0 re_lu_52[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 1000) 2049000 global_average_pooling2d_1[0][0]
==================================================================================================
Total params: 26,493,160
Trainable params: 26,432,104
Non-trainable params: 61,056
In [ ]:
model.summary()
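As with the Inception exercise, here is our reading of the blanks; it reproduces the parameter counts in the summary above (for example, the 148-parameter grouped convolutions and the 8,320-parameter first bottleneck). These lines are not a standalone cell: each one replaces the corresponding ?? placeholder in the exercise cell above.
# Inside _resnext_block: the bottleneck is a 1x1 convolution with 1x1 strides
x = layers.Conv2D(filters_in, kernel_size=(1, 1), strides=(1, 1),
                  padding='same')(shortcut)

# Each split input (group) is passed through its own 3x3 convolution
groups.append(layers.Conv2D(filters_card, kernel_size=(3, 3),
                            strides=strides, padding='same')(group))

# Concatenate the list of parallel branches
x = layers.concatenate(groups)

# The group calls: output filters are always twice the input filters
x = _resnext_block(x, 128, 256, strides=(2, 2))
for _ in range(2):
    x = _resnext_block(x, 128, 256)
for _ in range(4):
    x = _resnext_block(x, 256, 512)
for _ in range(6):
    x = _resnext_block(x, 512, 1024)
for _ in range(3):
    x = _resnext_block(x, 1024, 2048)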