In this tutorial we will look at how TensorGraph layers work with TensorFlow eager execution. But before that, let's see what exactly TensorFlow eager execution is.
Eager execution is an imperative, define-by-run interface where operations are executed immediately as they are called from Python. In other words, eager execution is a feature that makes TensorFlow execute operations immediately: concrete values are returned instead of a computational graph to be executed later.
In [1]:
import tensorflow as tf
import tensorflow.contrib.eager as tfe
After importing the necessary modules, we invoke enable_eager_execution() at program startup.
In [2]:
tfe.enable_eager_execution()
Enabling eager execution changes how TensorFlow functions behave: Tensor objects return concrete values instead of being symbolic references to nodes in a static computational graph (non-eager mode). As a result, eager execution should be enabled at the very beginning of a program.
Note that with eager execution enabled, these operations consume and return multi-dimensional arrays as Tensor objects, similar to NumPy ndarrays.
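For example, here is a minimal sketch of that behaviour (the values in the comments are what this particular input produces):
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.add(a, 1.0)   # executes immediately, no session or graph needed
print(b)             # a concrete Tensor holding [[2. 3.] [4. 5.]]
print(b.numpy())     # convert the result to a NumPy ndarray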
In [3]:
import numpy as np
import deepchem as dc
from deepchem.models.tensorgraph import layers
In the following snippet we describe how to create a Dense layer in eager mode. The good thing about calling a layer as a function is that we don't have to call create_tensor() directly. This is identical to the TensorFlow API, so there is no conflict. And since eager mode is enabled, it returns concrete tensors right away.
In [4]:
# Initialize parameters
in_dim = 2
out_dim = 3
batch_size = 10
inputs = np.random.rand(batch_size, in_dim).astype(np.float32) #Input
layer = layers.Dense(out_dim) # Provide the number of output values as parameter. This creates a Dense layer
result = layer(inputs) # get the output tensor
print(result)
Creating a second Dense layer should produce different results.
In [5]:
layer2 = layers.Dense(out_dim)
result2 = layer2(inputs)
print(result2)
We can also execute the layer in eager mode to compute its output as a function of its inputs. If the layer defines any variables, they are created the first time it is invoked. This happens in exactly the same way as creating a single layer in non-eager mode.
The following is another way to create a layer in eager mode. Here create_tensor() is invoked by the layer's __call__() method. This gives us the advantage of directly passing the tensor as a parameter while constructing a TensorGraph layer.
In [6]:
x = layers.Dense(out_dim)(inputs)
print(x)
Dense layers are just one of the layers defined in DeepChem. Along with them there are several others like Conv1D, Conv2D, Conv3D, etc. We also take a look at how to construct a Conv1D layer below.
Basically this layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs.
When using this layer as the first layer in a model, provide an input_shape argument (a tuple of integers or None). When input_shape is passed as a tuple of integers, e.g. (2, 3), it means we are passing sequences of 2 three-dimensional vectors. When it is passed as (None, 3), it means we want variable-length sequences of 3-dimensional vectors.
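As a quick illustration of those shapes (a sketch; the batch dimension is prepended when the actual input arrays are built):
# input_shape=(2, 3): every example is a sequence of 2 three-dimensional vectors
fixed_len = np.random.rand(batch_size, 2, 3).astype(np.float32)
# input_shape=(None, 3): sequences of any length, e.g. 7 steps here
var_len = np.random.rand(batch_size, 7, 3).astype(np.float32)
print(fixed_len.shape, var_len.shape)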
In [7]:
from deepchem.models.tensorgraph.layers import Conv1D
In [8]:
width = 5
in_channels = 2
filters = 3
kernel_size = 2
batch_size = 5
inputs = np.random.rand(batch_size, width, in_channels).astype(np.float32)
layer = layers.Conv1D(filters, kernel_size)
result = layer(inputs)
print(result)
Again it should be noted that creating a second Conv1D layer would produce different results.
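For example, a sketch reusing the inputs from the previous cell (the second layer gets its own randomly initialized kernel, so its output will differ):
layer2 = layers.Conv1D(filters, kernel_size)  # a new layer with its own weights
result2 = layer2(inputs)
print(result2)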
So that's how we invoke different DeepChem layers in eager mode.
Another interesting point is that we can mix TensorFlow layers and DeepChem layers. Since they all take tensors as inputs and return tensors as outputs, you can take the output from one kind of layer and pass it as input to a different kind of layer, as the sketch below shows. But it should be noted that TensorFlow layers can't be added to a TensorGraph.
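Here is a minimal sketch of that idea (the layer sizes and the random input are arbitrary):
x = tf.random_normal([4, 8])
hidden = tf.layers.dense(x, units=16, activation=tf.nn.relu)  # a TensorFlow layer
output = layers.Dense(1)(hidden)                              # a DeepChem layer stacked on top
print(output)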
Now that we've generalised so much, we should check whether DeepChem really supplies a workflow for layers identical to TensorFlow's. For instance, let's consider the code where we create a Dense layer.
y = Dense(3)(input)
The above line creates a dense layer with three outputs. It initializes the weights and the biases, and then multiplies the input tensor by the weights.
Let's put the above statement in mathematical terms. A Dense layer has a matrix of weights of shape (M, N), where M is the number of outputs and N is the number of inputs. The first time we call it, the layer sets N based on the shape of the input we pass to it and creates the weight matrix.
In [9]:
_input = tf.random_normal([2, 3])
print(_input)
layer = layers.Dense(4) # A DeepChem Dense layer
result = layer(_input)
print(result)
This is exactly how a TensorFlow Dense layer works. It implements the same operation as DeepChem's Dense layer, i.e. outputs = activation(inputs · kernel + bias), where kernel is the weights matrix created by the layer and bias is a bias vector created by the layer.
In [10]:
result = tf.layers.dense(_input, units=4) # A tensorflow Dense layer
print(result)
We pass an input tensor to the TensorFlow Dense layer and receive an output tensor that has the same shape as the input, except that the last dimension is the size of the output space.
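A quick sanity check of those shapes (result here is the output of the TensorFlow layer above):
print(_input.shape)  # (2, 3) -- a batch of 2 three-dimensional inputs
print(result.shape)  # (2, 4) -- same leading dimension, last dimension = units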
Finding gradients under eager mode is very similar to the autograd API. The computational flow is clean and logical.
What happens is that different operations can occur during each call, so all forward operations are recorded to a tape, which is then played backwards when computing gradients. After the gradients have been computed, the tape is discarded.
In [11]:
def dense_squared(x):
    # Two stacked Dense layers, each producing a single output
    return layers.Dense(1)(layers.Dense(1)(x))

grad = tfe.gradients_function(dense_squared)
x_in = np.random.rand(1, 2).astype(np.float32)
print(dense_squared(x_in))
print(grad(x_in))
In the above example, the gradients_function call takes a Python function dense_squared() as an argument and returns a Python callable that computes the partial derivatives of dense_squared() with respect to its inputs.
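An equivalent way to do this, assuming your TensorFlow version exposes tf.GradientTape in eager mode, is to record the forward pass explicitly and then ask the tape for gradients:
x = tf.constant(3.0)
with tf.GradientTape() as tape:
    tape.watch(x)            # watch a plain tensor (variables are watched automatically)
    y = x * x                # forward operations are recorded on the tape
print(tape.gradient(y, x))   # plays the tape backwards -> 6.0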