Activation functions

Why do we need tf.nn.softmax_cross_entropy_with_logits?


In [59]:
print(tf.nn.softmax_cross_entropy_with_logits.__doc__)


Computes softmax cross entropy between `logits` and `labels`.

  Measures the probability error in discrete classification tasks in which the
  classes are mutually exclusive (each entry is in exactly one class).  For
  example, each CIFAR-10 image is labeled with one and only one label: an image
  can be a dog or a truck, but not both.

  **NOTE:**  While the classes are mutually exclusive, their probabilities
  need not be.  All that is required is that each row of `labels` is
  a valid probability distribution.  If they are not, the computation of the
  gradient will be incorrect.

  If using exclusive `labels` (wherein one and only
  one class is true at a time), see `sparse_softmax_cross_entropy_with_logits`.

  **WARNING:** This op expects unscaled logits, since it performs a `softmax`
  on `logits` internally for efficiency.  Do not call this op with the
  output of `softmax`, as it will produce incorrect results.

  `logits` and `labels` must have the same shape `[batch_size, num_classes]`
  and the same dtype (either `float16`, `float32`, or `float64`).

  **Note that to avoid confusion, it is required to pass only named arguments to
  this function.**

  Args:
    _sentinel: Used to prevent positional parameters. Internal, do not use.
    labels: Each row `labels[i]` must be a valid probability distribution.
    logits: Unscaled log probabilities.
    dim: The class dimension. Defaulted to -1 which is the last dimension.
    name: A name for the operation (optional).

  Returns:
    A 1-D `Tensor` of length `batch_size` of the same type as `logits` with the
    softmax cross entropy loss.
  

Definition:

$softmax_i(x) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$

What we want:

$layer_i(x) = \frac{\exp((W x + b)_i)}{\sum_j \exp((W x + b)_j)}$

How we did it in practice:

tf.nn.softmax_cross_entropy_with_logits
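
A minimal usage sketch (the logits and labels below are made up): both tensors have shape [batch_size, num_classes], the arguments are passed by name as the docstring requires, and the op returns one loss value per example.

In [ ]:
import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 2.5, 0.3]])   # unscaled scores, shape [batch_size, num_classes]
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])   # each row is a valid probability distribution

loss = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print('per-example loss', sess.run(loss))  # 1-D tensor of length batch_size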

Why not FullyConnected + SoftMax?

Numerical error! Computing the softmax explicitly and then taking its logarithm overflows or underflows for large logits; fusing the softmax with the cross-entropy lets us apply the log-sum-exp trick:

$\sum_{i=1}^N \log softmax_{y_i}(x_i) = \sum_{i=1}^N \sum_{j=1}^C [y_i = j] \log softmax_j(x_i) =$

$\sum_{i=1}^N \sum_{j=1}^C [y_i = j]\left(x_{ij} - \log \sum_k \exp(x_{ik})\right) =$

$\sum_{i=1}^N \sum_{j=1}^C [y_i = j]\left(x_{ij} - \log\left(\exp(x_{i,\max}) \sum_k \exp(x_{ik} - x_{i,\max})\right)\right) =$

$\sum_{i=1}^N \sum_{j=1}^C [y_i = j]\left(x_{ij} - x_{i,\max} - \log \sum_k \exp(x_{ik} - x_{i,\max})\right)$

where $x_{i,\max} = \max_k x_{ik}$, so every exponent is $\le 0$ and nothing overflows.
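
To see the problem concretely, here is a small sketch (the logit values are deliberately extreme and purely illustrative): the naive log(softmax) loss degenerates to inf/nan, while the fused op stays finite.

In [ ]:
import tensorflow as tf

logits = tf.constant([[10., 1000., -10.]])  # the true class has a tiny logit relative to the max
labels = tf.constant([[1., 0., 0.]])

# Naive: compute softmax explicitly, then take the log.
naive = -tf.reduce_sum(labels * tf.log(tf.nn.softmax(logits)), axis=1)
# Fused: softmax + cross-entropy in one op, using the log-sum-exp trick.
fused = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)

with tf.Session() as sess:
    print('naive', sess.run(naive))  # inf/nan: softmax of the true class underflows to 0
    print('fused', sess.run(fused))  # ~990, computed stably in log-space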


In [60]:
import tensorflow as tf
from keras.layers.advanced_activations import LeakyReLU, PReLU

In [62]:
def LeakyRelu(x, alpha):
    # For 0 < alpha < 1: identity for x >= 0, slope alpha for x < 0.
    return tf.maximum(alpha*x, x)

In [48]:
with tf.Session() as sess:
    inp = tf.Variable(initial_value=tf.random_uniform(shape=[5], minval=-5, maxval=5, dtype=tf.float32))
    alpha = 0.5
    res = LeakyRelu(inp, alpha)
    sess.run(tf.global_variables_initializer())
    before, after = sess.run([inp, res])
    print('before', before)
    print('after', after)


before [-4.82597351  0.86987257  2.20547295  4.79587078  0.65573597]
after [-2.41298676  0.86987257  2.20547295  4.79587078  0.65573597]

In [50]:
def PRelu(x):
    # Like LeakyReLU, but the negative slope alpha is a trainable variable.
    # A random init can make alpha negative, as in the output below.
    alpha = tf.Variable(initial_value=tf.random_normal(shape=x.shape))
    return tf.where(x < 0, alpha * x, tf.nn.relu(x))

In [51]:
with tf.Session() as sess:
    inp = tf.Variable(initial_value=tf.random_uniform(shape=[5], minval=-5, maxval=5, dtype=tf.float32))
    alpha = 0.5
    res = PRelu(inp)
    sess.run(tf.global_variables_initializer())
    before, after = sess.run([inp, res])
    print('before', before)
    print('after', after)


before [ 0.5338788   3.46164608  4.41276932  1.99321985 -2.14739799]
after [ 0.5338788   3.46164608  4.41276932  1.99321985  3.72888207]
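
The LeakyReLU and PReLU imported from Keras above wrap the same ideas as layers. A rough sketch of how they are typically plugged into a model (the layer sizes here are arbitrary):

In [ ]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers.advanced_activations import LeakyReLU, PReLU

model = Sequential()
model.add(Dense(64, input_dim=20))
model.add(LeakyReLU(alpha=0.1))  # fixed negative slope
model.add(Dense(64))
model.add(PReLU())               # negative slope is learned during training
model.add(Dense(10, activation='softmax'))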

In [66]:
import numpy as np

def spp_layer(input_, levels=[2, 1], name='SPP_layer'):
    '''Multi-level SPP layer.
       Works for levels=[1, 2, 3, 6].'''
    shape = input_.get_shape().as_list()  # [batch, height, width, channels]
    with tf.variable_scope(name):
        pool_outputs = []
        for l in levels:
            # Pool the feature map into an (approximately) l x l grid of bins.
            pool = tf.nn.max_pool(input_, ksize=[1, np.ceil(shape[1] * 1. / l).astype(np.int32),
                                                 np.ceil(shape[2] * 1. / l).astype(np.int32), 1],
                                  strides=[1, np.floor(shape[1] * 1. / l + 1).astype(np.int32),
                                           np.floor(shape[2] * 1. / l + 1).astype(np.int32), 1],
                                  padding='SAME')
            # Flatten each level's pooled output to [batch, -1].
            pool_outputs.append(tf.reshape(pool, [shape[0], -1]))
        # Concatenate all levels into one fixed-length feature vector per image.
        spp_pool = tf.concat(pool_outputs, 1)
    return spp_pool
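
A minimal sketch of calling spp_layer (the input shape is arbitrary): whatever the spatial size of the feature map, each image is pooled into a feature vector of fixed length.

In [ ]:
with tf.Session() as sess:
    feature_map = tf.random_uniform(shape=[4, 13, 13, 8])  # [batch, height, width, channels]
    pooled = spp_layer(feature_map, levels=[2, 1])
    print('spp output shape', sess.run(pooled).shape)  # (4, 40): 2*2*8 + 1*1*8 features per image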

Spatial Pyramid Pooling coming soon

PR: https://github.com/tensorflow/tensorflow/pull/12852/files


In [ ]: