TensorFlow

Large-Scale Machine Learning on

Heterogeneous Distributed Systems





(Preliminary white paper, Google, November 9, 2015)

Outline

  • What is TensorFlow?
  • Programming Model and Basic Concepts
    • Operations and Kernels
    • Sessions
    • Variables
  • Implementation
    • Devices
    • Variables
  • Gradient Computation
  • Input Operations
  • Queues
  • Tools: TensorBoard
  • Tools: Visualization of Summary Data

What is TensorFlow?

TensorFlow is an interface for expressing machine learning algorithms on heterogeneous, distributed systems, together with an implementation for executing such algorithms.

A tensor is a typed, multi-dimensional array with support for a variety of element types.

Programming Model and Basic Concepts

  • Computations are described by a directed graph
    • Each node has zero or more inputs and outputs, and represents the instantiation of an operation
    • Values flowing along the edges are called tensors
  • Special nodes maintain persistent state and implement branching and looping control
  • Python and C++ are the supported front-end languages for building the computation graph
  • Example TensorFlow code fragment

[Figure: example TensorFlow code fragment, from the TensorFlow whitepaper]

[Figure: the corresponding computation graph, from the TensorFlow whitepaper]
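The dataflow-graph model above can be illustrated with a minimal sketch in plain Python. This is not the TensorFlow API; `Node` and its fields are hypothetical names used only to show nodes instantiating operations with tensors flowing along the edges between them.

```python
# Minimal sketch of a dataflow graph: each node instantiates an
# operation, and values (tensors) flow along edges from its inputs.
# Illustrative only; Node is not part of the TensorFlow API.

class Node:
    def __init__(self, name, op, inputs=()):
        self.name = name      # operation name, e.g. "Add"
        self.op = op          # callable implementing the operation
        self.inputs = inputs  # edges: values flow in from these nodes

    def evaluate(self, cache):
        # Each node is computed at most once per graph execution.
        if self not in cache:
            args = [n.evaluate(cache) for n in self.inputs]
            cache[self] = self.op(*args)
        return cache[self]

# Build a tiny graph computing c = (a + b) * b:
a = Node("a", lambda: 2.0)
b = Node("b", lambda: 3.0)
s = Node("Add", lambda x, y: x + y, (a, b))
c = Node("Mul", lambda x, y: x * y, (s, b))

print(c.evaluate({}))  # 15.0
```

Evaluating `c` pulls values through the graph edges, which mirrors how a TensorFlow execution propagates tensors from inputs to requested outputs.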

Operations and Kernels

  • An operation has a name and represents an abstract computation (e.g., “matrix multiply” or “add”).
  • A kernel is a particular implementation of an operation that can be run on a particular type of device (e.g., CPU or GPU).
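The operation/kernel split can be sketched as a registry keyed by operation name and device type. This is a hypothetical illustration; `KERNELS`, `register_kernel`, and `lookup_kernel` are not TensorFlow functions.

```python
# Sketch of the operation/kernel distinction: an operation is an
# abstract named computation; a kernel is a device-specific
# implementation registered for it. Illustrative only; this registry
# is not part of the TensorFlow API.

KERNELS = {}

def register_kernel(op_name, device, fn):
    KERNELS[(op_name, device)] = fn

def lookup_kernel(op_name, device):
    return KERNELS[(op_name, device)]

# Two kernels for the same abstract "Add" operation:
register_kernel("Add", "cpu", lambda x, y: x + y)  # CPU implementation
register_kernel("Add", "gpu", lambda x, y: x + y)  # stand-in for a GPU implementation

add_cpu = lookup_kernel("Add", "cpu")
print(add_cpu(2, 3))  # 5
```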
Sessions

  • Client programs interact with the TensorFlow system by creating a Session.
  • One of the main operations supported by a Session is Run.
  • For a Run call, TensorFlow computes the transitive closure of all nodes that must be executed in order to compute the requested outputs, and executes only those nodes.

In most computations a graph is executed multiple times; most tensors do not survive past a single execution of the graph.
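The transitive-closure behavior of Run can be sketched in plain Python. The graph representation and the `run` / `transitive_closure` helpers below are hypothetical; the point is that only nodes the requested outputs depend on are ever evaluated.

```python
# Sketch of what Session.Run does conceptually: given requested
# outputs, find the transitive closure of nodes they depend on and
# evaluate only those. Illustrative only; not the TensorFlow API.

# Graph as: name -> (function, list of input node names)
graph = {
    "a":      (lambda: 1.0, []),
    "b":      (lambda: 2.0, []),
    "add":    (lambda x, y: x + y, ["a", "b"]),
    "unused": (lambda x: x * 10, ["b"]),  # not needed for "add"
}

def transitive_closure(graph, outputs):
    needed, stack = set(), list(outputs)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(graph[name][1])
    return needed

def run(graph, outputs):
    values = {}
    def eval_node(name):
        if name not in values:
            fn, inputs = graph[name]
            values[name] = fn(*[eval_node(i) for i in inputs])
        return values[name]
    return [eval_node(o) for o in outputs]

print(run(graph, ["add"]))                       # [3.0]
print(transitive_closure(graph, ["add"]))        # "unused" is excluded
```

Note that the `unused` node is never executed: it is outside the transitive closure of the requested output.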

Variables

A Variable is a special kind of operation that returns a handle to a persistent mutable tensor that survives across executions of a graph.
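A minimal sketch of that idea in plain Python (this `Variable` class and its `assign_add` method are illustrative stand-ins, not the TensorFlow implementation):

```python
# Sketch of a Variable: persistent mutable state that survives across
# graph executions, unlike ordinary intermediate tensors which are
# discarded after each Run. Illustrative only; not the TensorFlow API.

class Variable:
    def __init__(self, value):
        self.value = value        # persistent state

    def assign_add(self, delta):  # mutation survives the execution
        self.value = self.value + delta
        return self.value

counter = Variable(0.0)

# Each loop iteration plays the role of one execution of the graph;
# all of them share the same underlying buffer:
for _ in range(3):
    counter.assign_add(1.0)

print(counter.value)  # 3.0
```

This is the mechanism used to hold model parameters, which are typically updated in place on each training step.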

Implementation

  • The client uses the Session interface to communicate with the master and with one or more worker processes.
  • Worker processes are responsible for arbitrating access to computational devices such as CPU cores and GPU cards.
  • Workers execute graph nodes on those devices as instructed by the master.

Devices

  • Devices are the computational heart of TensorFlow.
  • Each worker manages one or more devices.
  • Example device names are "/job:localhost/device:cpu:0" or "/job:worker/task:17/device:gpu:3".

Gradient Computation

TensorFlow has built-in support for automatic gradient computation.

[Figure from the TensorFlow whitepaper]
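Automatic gradient computation works by walking the graph backwards from an output, propagating gradients through each operation. The sketch below is a minimal reverse-mode differentiation in plain Python; `Var`, `add`, `mul`, and `backward` are illustrative names, not the TensorFlow API.

```python
# Minimal sketch of reverse-mode automatic differentiation on a graph.
# Each node records, for every input, the local gradient of its output
# with respect to that input; backward() propagates gradients from the
# output toward the leaves. Illustrative only; not the TensorFlow API.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # (parent, local_gradient) pairs
        self.grad = 0.0

def add(x, y):
    return Var(x.value + y.value, [(x, 1.0), (y, 1.0)])

def mul(x, y):
    return Var(x.value * y.value, [(x, y.value), (y, x.value)])

def backward(out):
    # Simple traversal; correct here because each intermediate node
    # feeds exactly one consumer in this example graph.
    out.grad = 1.0
    stack = [out]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += node.grad * local
            stack.append(parent)

# f(x, y) = (x + y) * y at x=2, y=3: df/dx = y = 3, df/dy = x + 2y = 8
x, y = Var(2.0), Var(3.0)
f = mul(add(x, y), y)
backward(f)
print(x.grad, y.grad)  # 3.0 8.0
```

TensorFlow's implementation generalizes this idea: it extends the graph with gradient nodes so that gradients are themselves computed by ordinary graph execution.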

Input Operations

  • Special input operation nodes are provided for training large-scale models.
  • They are typically configured with a set of filenames to read examples from.
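A minimal sketch of such an input node in plain Python, assuming line-delimited example files (the `input_op` generator and the temporary shard files are illustrative, not the TensorFlow API):

```python
# Sketch of an input operation configured with a set of filenames:
# it reads examples from each file in turn and hands them to the
# rest of the graph. Illustrative only; not the TensorFlow API.

import os
import tempfile

def input_op(filenames):
    """Yield one example (here, one line) at a time from the files."""
    for path in filenames:
        with open(path) as f:
            for line in f:
                yield line.strip()

# Create two small "training shard" files for the demonstration:
paths = []
for i, rows in enumerate([["ex1", "ex2"], ["ex3"]]):
    fd, path = tempfile.mkstemp(suffix=f"-shard{i}.txt")
    with os.fdopen(fd, "w") as f:
        f.write("\n".join(rows))
    paths.append(path)

examples = list(input_op(paths))
print(examples)  # ['ex1', 'ex2', 'ex3']

for p in paths:  # clean up the demonstration files
    os.remove(p)
```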

Queues

  • Allow data to be handed off through Enqueue and Dequeue operations.
  • Allow different portions of the graph to execute asynchronously.
  • Allow input data to be prefetched into a queue while a previous batch is still being processed.
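The prefetching pattern above can be sketched with Python's standard `queue` and `threading` modules standing in for TensorFlow's queue operations:

```python
# Sketch of the queue idea: a producer thread enqueues (prefetches)
# input batches while the consumer dequeues and processes the previous
# one, so the two portions of the pipeline run asynchronously.
# Uses Python's queue module as a stand-in for TensorFlow queues.

import queue
import threading

q = queue.Queue(maxsize=2)  # bounded: at most 2 prefetched batches

def producer():
    for batch in range(5):
        q.put(batch)        # Enqueue: blocks while the queue is full
    q.put(None)             # sentinel: no more data

threading.Thread(target=producer).start()

processed = []
while True:
    batch = q.get()         # Dequeue: blocks until data is available
    if batch is None:
        break
    processed.append(batch * 10)  # stand-in for "consume the batch"

print(processed)  # [0, 10, 20, 30, 40]
```

The bounded queue size is what makes this prefetching rather than unbounded buffering: the producer runs ahead of the consumer by at most two batches.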

Tools: TensorBoard

TensorBoard allows visualization of the computation graph.

[Figure from the TensorFlow whitepaper]

Tools: Visualization of Summary Data

Allows examination of the state of various aspects of the model over time.

[Figure from the TensorFlow whitepaper]