Ch 11: Concept 02

Embedding Lookup

Import TensorFlow and begin an interactive session:


In [1]:
import tensorflow as tf
sess = tf.InteractiveSession()

Let's say we only have 4 words in our vocabulary: "the", "fight", "wind", and "like".

Maybe each word is associated with a number.

Word     Number
'the'    17
'fight'  22
'wind'   35
'like'   51

In [2]:
embeddings_0d = tf.constant([17,22,35,51])

Or maybe they're associated with one-hot vectors.

Word     Vector
'the'    [1, 0, 0, 0]
'fight'  [0, 1, 0, 0]
'wind'   [0, 0, 1, 0]
'like'   [0, 0, 0, 1]

In [3]:
embeddings_4d = tf.constant([[1, 0, 0, 0],
                             [0, 1, 0, 0],
                             [0, 0, 1, 0],
                             [0, 0, 0, 1]])

This may sound over the top, but the embedding for each word can be any tensor you want, not just a scalar or a vector.

Word     Tensor
'the'    [[1, 0], [0, 0]]
'fight'  [[0, 1], [0, 0]]
'wind'   [[0, 0], [1, 0]]
'like'   [[0, 0], [0, 1]]

In [4]:
embeddings_2x2d = tf.constant([[[1, 0], [0, 0]],
                               [[0, 1], [0, 0]],
                               [[0, 0], [1, 0]],
                               [[0, 0], [0, 1]]])

Let's say we want to find the embeddings for the sentence "fight the wind". Since 'the' sits at index 0, 'fight' at index 1, and 'wind' at index 2, the sentence maps to the ids [1, 0, 2]:


In [5]:
ids = tf.constant([1, 0, 2])

We can use the embedding_lookup function provided by TensorFlow:


In [6]:
lookup_0d = sess.run(tf.nn.embedding_lookup(embeddings_0d, ids))
print(lookup_0d)


[22 17 35]
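The result matches plain integer indexing: the lookup gathers the entries at the given indices, so id 1 pulls 22, id 0 pulls 17, and id 2 pulls 35. A minimal sketch of the same idea, using NumPy fancy indexing as a stand-in for the TensorFlow call:

```python
import numpy as np

# Same vocabulary numbers as embeddings_0d above.
embeddings_0d = np.array([17, 22, 35, 51])
ids = np.array([1, 0, 2])  # "fight the wind"

# embedding_lookup behaves like gathering entries at the given indices.
print(embeddings_0d[ids])  # → [22 17 35]
```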

In [7]:
lookup_4d = sess.run(tf.nn.embedding_lookup(embeddings_4d, ids))
print(lookup_4d)


[[0 1 0 0]
 [1 0 0 0]
 [0 0 1 0]]

In [8]:
lookup_2x2d = sess.run(tf.nn.embedding_lookup(embeddings_2x2d, ids))
print(lookup_2x2d)


[[[0 1]
  [0 0]]

 [[1 0]
  [0 0]]

 [[0 0]
  [1 0]]]
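For the one-hot case, the lookup can also be viewed as multiplying a one-hot id matrix by the embedding matrix. A hedged NumPy sketch of that equivalence (illustrative only, not how TensorFlow implements the op internally):

```python
import numpy as np

# Same embedding matrix as embeddings_4d above.
embeddings_4d = np.array([[1, 0, 0, 0],
                          [0, 1, 0, 0],
                          [0, 0, 1, 0],
                          [0, 0, 0, 1]])
ids = np.array([1, 0, 2])  # "fight the wind"

# Gathering rows by index...
gathered = embeddings_4d[ids]

# ...gives the same result as one-hot encoding the ids
# and matrix-multiplying against the embedding matrix.
one_hot = np.eye(4, dtype=int)[ids]
via_matmul = one_hot @ embeddings_4d

print(np.array_equal(gathered, via_matmul))  # → True
```

The gather form is what you want in practice: it avoids materializing the one-hot matrix, which would be wastefully large for a real vocabulary.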