Generating an embedding visualization on TensorBoard

This is based on the great tutorial by @odduer, available here, which shows how to visualize CIFAR-10 on TensorBoard!

Here you'll find an example of word vector visualization :)

Load embedding

We'll be using an embedding matrix trained with GloVe, a word vector generation model. The matrix contains 400,000 word vectors, each with a dimensionality of 50.


In [30]:
import numpy as np

word_vector = np.load('word_vector.npy')
print(word_vector.shape)


(400000, 50)
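
If you don't already have word_vector.npy and word_list.npy, a minimal sketch of how they could be built from a raw GloVe text file is shown below (the filename glove.6B.50d.txt is an assumption, taken from the Stanford GloVe download):

import numpy as np

# assumed input: glove.6B.50d.txt from the Stanford GloVe release,
# one word followed by 50 floats per line
words, vectors = [], []
with open('glove.6B.50d.txt', encoding='utf-8') as f:
    for line in f:
        parts = line.rstrip().split(' ')
        words.append(parts[0])
        vectors.append([float(x) for x in parts[1:]])

np.save('word_list.npy', np.array(words))
np.save('word_vector.npy', np.array(vectors, dtype=np.float32))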

Choose a directory to save the embedding


In [31]:
LOG_DIR = 'tensorboard/embedding/'
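
The directory may not exist yet, so it's worth creating it before writing anything there; a minimal sketch:

import os

# create the log directory if it doesn't already exist
if not os.path.exists(LOG_DIR):
    os.makedirs(LOG_DIR)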

Generate metadata


In [32]:
import os

word_list = np.load('word_list.npy')  # this maps an index to a word
word_list = word_list.tolist()  # originally loaded as a numpy array

# create the metadata.tsv file
metadata_file = open(os.path.join(LOG_DIR, 'metadata.tsv'), 'w')

# write the header
metadata_file.write('Word\tIndex\n')

# write the index-to-word mapping
for index, word in enumerate(word_list):
    metadata_file.write(str(word) + '\t' + str(index) + '\n')

metadata_file.close()
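
As a quick sanity check, you can read back the first few rows of the generated file (optional sketch, reusing LOG_DIR from above):

with open(os.path.join(LOG_DIR, 'metadata.tsv')) as f:
    for _ in range(3):
        print(f.readline().rstrip())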

Generate TensorBoard Visualization


In [33]:
import os

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector
print('Tested with TensorFlow 1.2.0')
print('Your TensorFlow version:', tf.__version__)

# create a tf.Variable to store the embedding
embedding_var = tf.Variable(word_vector, name='embedding')

with tf.Session() as sess:
    # initialize variable
    sess.run(embedding_var.initializer)
    
    # creates a summary writer
    summary_writer = tf.summary.FileWriter(LOG_DIR)
    
    # add the embedding to the projector config
    config = projector.ProjectorConfig()
    embedding = config.embeddings.add()
    embedding.tensor_name = embedding_var.name

    # attach the metadata; comment out the line below if you
    # don't have a metadata file
    embedding.metadata_path = 'metadata.tsv'

    projector.visualize_embeddings(summary_writer, config)
    saver = tf.train.Saver([embedding_var])
    saver.save(sess, os.path.join(LOG_DIR, 'embedding.ckpt'), 1)


Tested with TensorFlow 1.2.0
Your TensorFlow version: 1.2.0

Done
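
To view the result, point TensorBoard at the log directory, for example with tensorboard --logdir=tensorboard/embedding/, and open the Embeddings tab in your browser.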

If you want to, you can also upload the metadata through the TensorBoard web interface :)


In [29]:
## TODO ## Insert screenshots of embedding