DRAW: A Recurrent Neural Network For Image Generation (2015 ICML)

Deep Recurrent Attentive Writer (DRAW)

  • Iterative drawing (this paper) vs. one-shot drawing (previous approaches)
  • Mimic the foveation of the human eye
    • Spatial Attention
    • A sequential variational auto-encoding framework
  • Resembles the selective read and write operations developed for the Neural Turing Machine

Variational Auto-encoder

  • Autoencoder, Encoder, Decoder
  • Uses a variational upper bound on the negative log-likelihood of the data as the loss (i.e., the negative ELBO)
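
A minimal sketch of that loss, assuming a Bernoulli likelihood and a diagonal Gaussian posterior (all tensor names here are illustrative): reconstruction error plus KL is exactly a variational upper bound on -log p(x).

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon_logits, mu, logvar):
    # Reconstruction term: -log p(x|z) under a Bernoulli likelihood.
    rec = F.binary_cross_entropy_with_logits(
        x_recon_logits, x, reduction="sum")
    # KL(q(z|x) || p(z)) for a diagonal Gaussian q against a unit Gaussian prior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Negative ELBO = variational upper bound on -log p(x).
    return rec + kl
```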

Descriptions

  • Both the encoder and the decoder are recurrent networks
  • Moreover, the encoder is privy to the decoder’s previous outputs
  • A dynamically updated attention mechanism
    • restricts both
      • the input region observed by the encoder
      • the output region modified by the decoder.

Overall Networks
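
A minimal sketch of one DRAW timestep, assuming enc_rnn and dec_rnn are torch.nn.LSTMCell modules and read, write, q_linear are placeholder modules (the attentive read/write are sketched under Attention Mechanism below):

```python
import torch

def draw_step(x, canvas, h_enc, c_enc, h_dec, c_dec,
              enc_rnn, dec_rnn, q_linear, read, write):
    # Error image: what the current canvas has not yet explained.
    x_hat = x - torch.sigmoid(canvas)
    # Attentive read of the image and the error image.
    r = read(x, x_hat, h_dec)
    # The encoder is "privy to the decoder's previous outputs":
    # it sees both the read result and the previous decoder state.
    h_enc, c_enc = enc_rnn(torch.cat([r, h_dec], -1), (h_enc, c_enc))
    # Sample z_t from the approximate posterior Q(z_t | h_enc).
    mu, logvar = q_linear(h_enc).chunk(2, -1)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
    # Decoder RNN consumes z_t, then additively updates the canvas.
    h_dec, c_dec = dec_rnn(z, (h_dec, c_dec))
    canvas = canvas + write(h_dec)
    return canvas, h_enc, c_enc, h_dec, c_dec, mu, logvar
```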

Attention Mechanism
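
DRAW's selective attention is an N x N grid of 2-D Gaussian filters; the grid center, stride, variance, and intensity are emitted from the decoder state by a linear layer. A sketch of the filterbank and the read/write operations, with the batch dimension omitted and the parameters assumed already mapped from the raw linear outputs to image coordinates:

```python
import torch

def filterbank(g_x, g_y, sigma2, delta, N, A, B):
    # Filter centers: an N x N grid with stride delta around (g_x, g_y).
    i = torch.arange(N, dtype=torch.float32)
    mu_x = g_x + (i - N / 2 + 0.5) * delta           # (N,)
    mu_y = g_y + (i - N / 2 + 0.5) * delta
    a = torch.arange(A, dtype=torch.float32)          # column coordinates
    b = torch.arange(B, dtype=torch.float32)          # row coordinates
    # F_x[i, a] ~ exp(-(a - mu_x_i)^2 / (2 sigma^2)), rows normalized.
    F_x = torch.exp(-(a[None, :] - mu_x[:, None]) ** 2 / (2 * sigma2))
    F_y = torch.exp(-(b[None, :] - mu_y[:, None]) ** 2 / (2 * sigma2))
    F_x = F_x / F_x.sum(dim=1, keepdim=True).clamp(min=1e-8)
    F_y = F_y / F_y.sum(dim=1, keepdim=True).clamp(min=1e-8)
    return F_x, F_y

def attn_read(x, F_x, F_y, gamma):
    # Extract an N x N glimpse from a B x A image: gamma * F_y x F_x^T.
    return gamma * F_y @ x @ F_x.t()

def attn_write(w, F_x, F_y, gamma):
    # Project an N x N patch w back onto the canvas: (1/gamma) F_y^T w F_x.
    return (1.0 / gamma) * F_y.t() @ w @ F_x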

Attend, Infer, Repeat: Fast Scene Understanding with Generative Models (2016 NIPS)

Introduction

  • the intersection of structured probabilistic models and deep networks.
    • Prior work on deep generative methods (e.g., VAEs [9]) has been mostly unstructured
      • Pro: producing impressive samples and likelihood scores
      • Con: their representations have lacked interpretable meaning.

Inference

  • Attend, Infer, Repeat (AIR) framework
    • treat inference as an iterative process,
      • implemented as a recurrent neural network that attends to one object at a time
      • learns to use an appropriate number of inference steps for each image.
    • End-to-end learning is enabled by recent advances in amortized variational inference
      • e.g., combining gradient-based optimization for continuous latent variables with black-box optimization for discrete ones.
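
A rough sketch of that iterative inference loop (all module names are illustrative placeholders; in the paper z_what is inferred from an attended glimpse rather than directly from the RNN state):

```python
import torch

def air_inference(x, rnn, pres_head, where_head, what_head, max_steps=3):
    # x: (n, D) flattened images; rnn: torch.nn.LSTMCell;
    # pres_head / where_head / what_head: small networks (placeholders).
    n = x.size(0)
    h = x.new_zeros(n, rnn.hidden_size)
    c = torch.zeros_like(h)
    present = x.new_ones(n)          # 1 while an image still has objects left
    steps = []
    for _ in range(max_steps):
        h, c = rnn(x, (h, c))
        # Discrete latent z_pres: is there another object to explain?
        p_pres = torch.sigmoid(pres_head(h)).squeeze(-1)
        present = present * torch.bernoulli(p_pres)   # stays 0 once it hits 0
        # Continuous latents: where the object is, and what it looks like.
        z_where = where_head(h)      # e.g. scale and translation parameters
        z_what = what_head(h)        # appearance code
        steps.append((present, z_where, z_what))
        if present.sum() == 0:       # every image in the batch has stopped
            break
    return steps
```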

Learning
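
Learning maximizes the ELBO, but the presence variable is discrete, so its gradient cannot be reparameterized. A sketch of the hybrid estimator this implies: pathwise gradients for the continuous latents, plus a score-function (REINFORCE) term with a learned baseline for z_pres. All tensor names are assumptions.

```python
import torch

def air_surrogate_loss(log_lik, kl_cont, log_q_pres, baseline):
    # log_lik, kl_cont, log_q_pres, baseline: per-example (n,) tensors.
    elbo = log_lik - kl_cont
    # Pathwise part: elbo is differentiable w.r.t. the continuous latents
    # through the reparameterization trick.
    pathwise = elbo
    # Score-function part for the non-reparameterizable Bernoulli z_pres;
    # detach() blocks gradients through the learning signal itself, and
    # the baseline reduces the variance of the estimator.
    learning_signal = (elbo - baseline).detach()
    score = learning_signal * log_q_pres
    # Maximize the ELBO by minimizing the negative surrogate objective.
    surrogate = -(pathwise + score).mean()
    # Regression loss that trains the baseline network (NVIL-style).
    baseline_loss = ((elbo.detach() - baseline) ** 2).mean()
    return surrogate + baseline_loss
```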

Models

What + Where
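
Each object's latent code factors into z_what (appearance), z_where (pose), and z_pres (existence). One hedged sketch of how a glyph decoded from z_what might be rendered onto the canvas with a spatial transformer, assuming z_where = (scale, shift_x, shift_y); the exact parameterization in the paper may differ:

```python
import torch
import torch.nn.functional as F

def place_object(glyph, z_where, canvas_size):
    # glyph: (n, 1, gh, gw) image decoded from z_what;
    # z_where: (n, 3) tensor of (scale, shift_x, shift_y).
    s, tx, ty = z_where.unbind(-1)
    n = glyph.size(0)
    # affine_grid maps output (canvas) coordinates to input (glyph)
    # coordinates, so this is the inverse of the object's placement.
    theta = torch.zeros(n, 2, 3)
    theta[:, 0, 0] = s
    theta[:, 1, 1] = s
    theta[:, 0, 2] = tx
    theta[:, 1, 2] = ty
    grid = F.affine_grid(theta, [n, 1, canvas_size, canvas_size],
                         align_corners=False)
    # Sample the glyph at those coordinates to draw it onto the canvas.
    return F.grid_sample(glyph, grid, align_corners=False)
```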

Experiments

