TensorGraph Layers and TensorFlow eager

In this tutorial we will look at how TensorGraph layers work with TensorFlow eager execution. But before that, let's see what exactly TensorFlow eager is.

Eager execution is an imperative, define-by-run interface where operations are executed immediately as they are called from Python. Instead of building a computational graph to be executed later, TensorFlow returns concrete values right away. As a result:

  • It allows an imperative, NumPy-like coding style
  • It provides fast debugging with immediate run-time errors and integration with Python tools
  • It offers strong support for higher-order gradients

In [1]:
import tensorflow as tf
import tensorflow.contrib.eager as tfe



After importing the necessary modules, we invoke enable_eager_execution() at program startup.


In [2]:
tfe.enable_eager_execution()

Enabling eager execution changes how TensorFlow functions behave: Tensor objects return concrete values instead of being symbolic references to nodes in a static computational graph (non-eager mode). As a result, eager execution should be enabled at the beginning of a program.

Note that with eager execution enabled, operations consume and return multi-dimensional arrays as Tensor objects, similar to NumPy ndarrays.
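
As a quick check (this snippet is not part of the original notebook), any TensorFlow op now returns a concrete Tensor whose value can be inspected immediately:

print(tf.add(1, 2))                         # a concrete tf.Tensor holding 3
print(tf.matmul([[1., 2.]], [[3.], [4.]]))  # a concrete 1x1 tf.Tensor holding 11.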

Dense layer


In [3]:
import numpy as np
import deepchem as dc
from deepchem.models.tensorgraph import layers

In the following snippet we describe how to create a Dense layer in eager mode. The nice thing about calling a layer as a function is that we don't have to call create_tensor() directly. This matches the TensorFlow API and causes no conflict. And since eager mode is enabled, the call returns concrete tensors right away.


In [4]:
# Initialize parameters
in_dim = 2
out_dim = 3
batch_size = 10

inputs = np.random.rand(batch_size, in_dim).astype(np.float32)  # Input data

layer = layers.Dense(out_dim)  # Create a Dense layer; the argument is the number of output values
result = layer(inputs)  # Get the output tensor

print(result)


tf.Tensor(
[[-0.01966965 -0.36708647 -1.2114673 ]
 [ 0.4068503  -0.1943181  -1.1845192 ]
 [-0.1708395  -0.4865471  -1.4172423 ]
 [ 0.18667166 -0.24077846 -1.0544432 ]
 [ 0.47903097 -0.5307548  -2.4122753 ]
 [ 0.215595   -0.3206781  -1.3613585 ]
 [ 0.53623885 -0.29272282 -1.6845896 ]
 [ 0.33747557 -0.49973956 -2.1234663 ]
 [ 0.6471517  -0.29422784 -1.8340569 ]
 [ 0.52365005 -0.25159127 -1.5295881 ]], shape=(10, 3), dtype=float32)

Creating a second Dense layer produces different results, since its weights are initialized independently.


In [5]:
layer2 = layers.Dense(out_dim)
result2 = layer2(inputs)

print(result2)


tf.Tensor(
[[-0.2995599  -0.94036776 -0.12444651]
 [-0.7348614  -0.3941293  -0.6263947 ]
 [-0.19710277 -1.2823544   0.02952629]
 [-0.47212085 -0.5672255  -0.34971827]
 [-1.1339976  -1.2335718  -0.86162925]
 [-0.5831823  -0.7636563  -0.42140976]
 [-1.0011474  -0.6127606  -0.84064883]
 [-0.9108876  -1.189698   -0.6587274 ]
 [-1.1556706  -0.5890429  -0.9902593 ]
 [-0.9471516  -0.5110685  -0.80683327]], shape=(10, 3), dtype=float32)

We can also execute the layer in eager mode to compute its output as a function of its inputs. If the layer defines any variables, they are created the first time it is invoked. This happens in exactly the same way that we would create a single layer in non-eager mode.

The following is another way to create a layer in eager mode: create_tensor() is invoked by __call__(). This lets us pass the input tensor directly as a parameter while constructing a TensorGraph layer.


In [6]:
x = layers.Dense(out_dim)(inputs)

print(x)


tf.Tensor(
[[ 0.5611591  -0.22970255  0.11273235]
 [ 0.07900402 -0.62115765 -0.75677997]
 [ 0.8194277  -0.13113126  0.43268698]
 [ 0.263784   -0.38960344 -0.31656086]
 [ 0.5461697  -0.939682   -0.82996976]
 [ 0.36857378 -0.4793542  -0.35699493]
 [ 0.15906434 -0.84395605 -0.990051  ]
 [ 0.57359767 -0.7488086  -0.559262  ]
 [ 0.10336368 -0.9777839  -1.2067692 ]
 [ 0.10391745 -0.8005077  -0.9737376 ]], shape=(10, 3), dtype=float32)

Conv1D layer

Dense is one of the layers defined in DeepChem. Along with it there are several others, such as Conv1D, Conv2D, Conv3D, etc. Below we take a look at how to construct a Conv1D layer.

This layer creates a convolution kernel that is convolved with the layer input over a single spatial (or temporal) dimension to produce a tensor of outputs. When using this layer as the first layer in a model, provide an input_shape argument (a tuple of integers or None).

When input_shape is passed as a tuple of integers, e.g. (2, 3), it means we are passing a sequence of 2 three-dimensional vectors; when it is passed as (None, 3), it means we want variable-length sequences of three-dimensional vectors. In the example below the input has width 5 and the kernel size is 2, so with no padding the output width is 5 - 2 + 1 = 4, giving a result of shape (batch_size, 4, filters).


In [7]:
from deepchem.models.tensorgraph.layers import Conv1D

In [8]:
width = 5
in_channels = 2
filters = 3
kernel_size = 2
batch_size = 5

inputs = np.random.rand(batch_size, width, in_channels).astype(
    np.float32)

layer = layers.Conv1D(filters, kernel_size)

result = layer(inputs)
print(result)


tf.Tensor(
[[[-0.4163494   0.82620853 -0.19792846]
  [ 0.11079708  0.63339764 -0.4348114 ]
  [-0.08490932  0.9501296  -0.78048754]
  [-0.11654931  1.3595564  -0.81889915]]

 [[-0.04658069  0.32090923 -0.01684245]
  [ 0.42333373  0.5015999  -0.51524043]
  [-0.14566499  1.3786271  -0.8008883 ]
  [-0.39936984  1.0127177  -0.41432446]]

 [[-0.36023882  1.2730068  -0.6852635 ]
  [-0.33337924  0.9590143  -0.4819267 ]
  [-0.17271379  0.38323316 -0.16304907]
  [ 0.03019462  0.4663608  -0.29353675]]

 [[-0.37523568  1.1785955  -0.6067249 ]
  [-0.3328942   1.0252076  -0.6913832 ]
  [-0.32247764  0.54115754 -0.28387934]
  [ 0.07658018  0.7736342  -0.5747899 ]]

 [[-0.51029617  0.92260885 -0.25738803]
  [ 0.00583639  0.16965131 -0.01829494]
  [ 0.21363875  0.17591321 -0.17815764]
  [ 0.22821559  0.739758   -0.5263671 ]]], shape=(5, 4, 3), dtype=float32)

Again it should be noted that creating a second Conv1D layer would produce different results, since its weights are initialized independently.
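
For instance, a quick check (this snippet is not part of the original notebook) mirrors the Dense example above:

conv2 = layers.Conv1D(filters, kernel_size)  # independently initialized weights
print(conv2(inputs))  # same shape (5, 4, 3) as before, but different values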

So that's how we invoke different DeepChem layers in eager mode.

Another interesting point is that we can mix TensorFlow layers and DeepChem layers. Since they all take tensors as inputs and return tensors as outputs, you can take the output from one kind of layer and pass it as the input to a different kind of layer, as sketched below. Note, however, that TensorFlow layers can't be added to a TensorGraph.
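
As a minimal sketch (this snippet is not part of the original notebook; the shapes are arbitrary), we can feed the output of a DeepChem Dense layer straight into tf.layers.dense:

mixed_input = np.random.rand(4, 6).astype(np.float32)
hidden = layers.Dense(8)(mixed_input)      # DeepChem layer -> tf.Tensor of shape (4, 8)
output = tf.layers.dense(hidden, units=2)  # TensorFlow layer consuming that tensor
print(output.shape)                        # (4, 2)

The reverse direction works just as well, because in eager mode both kinds of layers are simply functions from tensors to tensors.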

Workflow of DeepChem layers

Now that we've generalised so much, let's actually check whether DeepChem supplies a workflow for layers identical to TensorFlow's. For instance, consider the code where we create a Dense layer:

y = Dense(3)(input)

The above line creates a Dense layer with three outputs. The layer initializes its weights and biases, multiplies the input tensor by the weight matrix, and adds the bias.

Let's put the above statement in some mathematical terms. A Dense layer has a matrix of weights of shape (M, N), where M is the number of outputs and N is the number of inputs. The first time we call it, the layer sets N based on the shape of the input we passed to it and creates the weight matrix.


In [9]:
_input = tf.random_normal([2, 3])
print(_input)

layer = layers.Dense(4) # A DeepChem Dense layer
result = layer(_input)
print(result)


tf.Tensor(
[[-1.2747569  -1.2716365  -1.6619799 ]
 [ 0.01867434 -0.25217438  0.9760794 ]], shape=(2, 3), dtype=float32)
tf.Tensor(
[[-4.077428    1.9740283  -3.2914114   0.6148914 ]
 [ 1.3582385  -0.1323605   0.70408916 -1.2962227 ]], shape=(2, 4), dtype=float32)

This is exactly how a TensorFlow Dense layer works. It implements the same operation as DeepChem's Dense layer, i.e. outputs = activation(dot(inputs, kernel) + bias), where kernel is the weight matrix created by the layer and bias is a bias vector created by the layer.


In [10]:
result = tf.layers.dense(_input, units=4) # A tensorflow Dense layer
print(result)


tf.Tensor(
[[ 0.10799867 -3.0421743   1.1339507   0.85969573]
 [ 0.49897006  0.54293513 -0.617443   -0.551059  ]], shape=(2, 4), dtype=float32)

We pass an input tensor to the TensorFlow Dense layer and receive an output tensor that has the same shape as the input, except that the last dimension is the size of the output space.
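
To make the formula concrete, here is a minimal sketch of the underlying operation applied to the _input tensor from above. W and b are hypothetical stand-ins for the kernel and bias variables a Dense layer would create, so the values (but not the shapes) will differ from the layer's output:

W = tf.random_normal([4, 3])  # weight matrix of shape (M, N) = (outputs, inputs)
b = tf.zeros([4])             # bias vector, one entry per output
manual = tf.matmul(_input, W, transpose_b=True) + b  # (2, 3) x (3, 4) -> (2, 4)
print(manual.shape)           # (2, 4), the same shape as the layer outputs above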

Gradients

Finding gradients in eager mode is very similar to the autograd API, and the computational flow is clean and logical. Because different operations can occur during each call, all forward operations are recorded to a tape, which is then played backwards when computing gradients. After the gradients have been computed, the tape is discarded.


In [11]:
def dense_squared(x):
    # Note: the body applies two Dense layers to the global `inputs` tensor
    # (from the Conv1D example above) rather than to the argument x.
    return layers.Dense(1)(layers.Dense(1)(inputs))

grad = tfe.gradients_function(dense_squared)

print(dense_squared(3.0))
print(grad(3.0))


tf.Tensor(
[[[0.70184654]
  [0.14101903]
  [0.40004665]
  [0.65922797]
  [0.7357256 ]]

 [[0.2765366 ]
  [0.07410333]
  [0.6447927 ]
  [0.6917898 ]
  [0.29831788]]

 [[0.70695525]
  [0.5157202 ]
  [0.31305453]
  [0.13231038]
  [0.25589827]]

 [[0.6361409 ]
  [0.4190178 ]
  [0.43067047]
  [0.17994796]
  [0.49510527]]

 [[0.7120138 ]
  [0.12389299]
  [0.05375208]
  [0.27753747]
  [0.6016071 ]]], shape=(5, 5, 1), dtype=float32)
[None]

In the above example, the gradients_function call takes a Python function, dense_squared(), as an argument and returns a Python callable that computes the partial derivatives of dense_squared() with respect to its arguments. Because the body of dense_squared() operates on the global inputs tensor rather than on its argument x, the output has shape (5, 5, 1) and the gradient with respect to the argument 3.0 comes back as [None].
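
As a quick illustration (this snippet is not part of the original notebook), here is a function that does use its argument, so gradients_function returns a real derivative:

def square(x):
    return tf.multiply(x, x)  # x^2

grad_square = tfe.gradients_function(square)
print(square(3.0))       # a tf.Tensor holding 9.0
print(grad_square(3.0))  # a one-element list holding 6.0, i.e. d(x^2)/dx = 2x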

