DRAW: A Recurrent Neural Network For Image Generation (2015 ICML)
Deep Recurrent Attentive Writer (DRAW)
- Iterative drawing (this paper) vs. one-shot drawing (previous approaches)
- Mimics the foveation of the human eye
- Spatial Attention
- A sequential variational auto-encoding framework (sketched below)
- Resembles the selective read and write operations developed for the Neural Turing Machine
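A minimal sketch of the iterative-drawing loop, in PyTorch and heavily simplified: no spatial attention (the read is plain concatenation, the write is a full-image linear layer), and names such as `MiniDRAW` and `write_head` are mine, not the paper's:

```python
import torch
import torch.nn as nn

class MiniDRAW(nn.Module):
    """Simplified DRAW loop (no attention): encode -> sample z -> decode -> add to canvas."""
    def __init__(self, x_dim=784, h_dim=256, z_dim=10, T=10):
        super().__init__()
        self.T, self.h_dim = T, h_dim
        self.enc_rnn = nn.LSTMCell(2 * x_dim + h_dim, h_dim)   # reads [x, error image, h_dec]
        self.dec_rnn = nn.LSTMCell(z_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.write_head = nn.Linear(h_dim, x_dim)              # full-image "write" stand-in

    def forward(self, x):                                      # x: (batch, x_dim) in [0, 1]
        B = x.size(0)
        h_enc = c_enc = h_dec = c_dec = x.new_zeros(B, self.h_dim)
        canvas = x.new_zeros(B, x.size(1))
        kl = x.new_zeros(B)
        for _ in range(self.T):
            x_hat = x - torch.sigmoid(canvas)                  # error image: what is still missing
            h_enc, c_enc = self.enc_rnn(torch.cat([x, x_hat, h_dec], dim=1), (h_enc, c_enc))
            mu, logvar = self.mu(h_enc), self.logvar(h_enc)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            kl = kl + 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(dim=1)
            h_dec, c_dec = self.dec_rnn(z, (h_dec, c_dec))
            canvas = canvas + self.write_head(h_dec)           # accumulate strokes on the canvas
        return torch.sigmoid(canvas), kl                       # reconstruction + summed KL term
```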
Variational Auto-encoder
- An auto-encoder: an encoder compresses the input into latent codes, a decoder reconstructs the input from them
- Loss is the negative variational lower bound (ELBO), i.e. a variational upper bound on the negative log-likelihood of the data
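Concretely (standard VAE notation, mine rather than verbatim from the paper):

$$\log p(x) \;\ge\; \mathbb{E}_{q(z \mid x)}\big[\log p(x \mid z)\big] - \mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big)$$

so minimizing the loss $\mathcal{L} = \mathcal{L}^x + \mathcal{L}^z$ (reconstruction plus KL) minimizes an upper bound on $-\log p(x)$. In DRAW the latent term accumulates over the $T$ drawing steps: $\mathcal{L}^z = \sum_{t=1}^{T} \mathrm{KL}\big(Q(Z_t \mid h^{enc}_t)\,\|\,P(Z_t)\big)$.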
Descriptions
- both the encoder and decoder are recurrent networks
- moreover the encoder is privy to the decoder’s previous outputs
- A dynamically updated attention mechanism
- restricts both
- the input region observed by the encoder
- the output region modified by the decoder.
Overall Network
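One time step of the full system (equations as given in the paper; $c_t$ is the canvas):

$$
\begin{aligned}
\hat{x}_t &= x - \sigma(c_{t-1})\\
r_t &= \mathrm{read}(x, \hat{x}_t, h^{dec}_{t-1})\\
h^{enc}_t &= \mathrm{RNN}^{enc}\!\big(h^{enc}_{t-1}, [r_t, h^{dec}_{t-1}]\big)\\
z_t &\sim Q(Z_t \mid h^{enc}_t)\\
h^{dec}_t &= \mathrm{RNN}^{dec}(h^{dec}_{t-1}, z_t)\\
c_t &= c_{t-1} + \mathrm{write}(h^{dec}_t)
\end{aligned}
$$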
Attention Mechanism
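The paper's attention is a grid of $N \times N$ Gaussian filters whose center, stride, and precision are emitted by the decoder. A NumPy sketch of the read operation (function names are mine; `gX, gY, delta, sigma2, gamma` correspond to the paper's $g_X, g_Y, \delta, \sigma^2, \gamma$):

```python
import numpy as np

def filterbank(g, delta, sigma2, N, size):
    # Filter centers: mu_i = g + (i - N/2 - 0.5) * delta
    mu = g + (np.arange(N) - N / 2 - 0.5) * delta           # (N,)
    a = np.arange(size)                                     # pixel coordinates 0..size-1
    F = np.exp(-((a[None, :] - mu[:, None]) ** 2) / (2 * sigma2))
    return F / (F.sum(axis=1, keepdims=True) + 1e-8)        # each filter row sums to 1

def attentive_read(x, gX, gY, delta, sigma2, gamma, N=5):
    """Extract an N x N glimpse from image x: gamma * F_Y @ x @ F_X^T."""
    FY = filterbank(gY, delta, sigma2, N, x.shape[0])       # rows (image height)
    FX = filterbank(gX, delta, sigma2, N, x.shape[1])       # cols (image width)
    return gamma * FY @ x @ FX.T
```

The write operation uses the transposed filters, $\mathrm{write}(h^{dec}_t) = \tfrac{1}{\hat\gamma}\hat{F}_Y^{\top} w_t \hat{F}_X$, with its own attention parameters emitted from $h^{dec}_t$.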
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models (2016 NIPS)
Introduction
- AIR sits at the intersection of structured probabilistic models and deep networks
- Prior work on deep generative methods (e.g., VAEs [9]) has been mostly unstructured
- Pro: produces impressive samples and likelihood scores
- Con: the learned representations lack interpretable meaning
Inference
- Attend, Infer, Repeat (AIR) framework
- treats inference as an iterative process
- implemented as a recurrent neural network that attends to one object at a time
- learns to use an appropriate number of inference steps for each image (see the sketch after this list)
- End-to-end learning is enabled by recent advances in amortized variational inference
- e.g., combining gradient-based optimization for continuous latent variables with black-box optimization for discrete ones
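A rough sketch of that inference loop, again in PyTorch and heavily simplified (class and attribute names are mine; the real model decodes `z_where`/`z_what` into rendered objects and trains against a reconstruction objective):

```python
import torch
import torch.nn as nn

class MiniAIRInference(nn.Module):
    """Simplified AIR inference: one attended object per step, halted by a Bernoulli z_pres."""
    def __init__(self, x_dim=2500, h_dim=256, z_what_dim=20, max_steps=3):
        super().__init__()
        self.max_steps, self.h_dim = max_steps, h_dim
        self.rnn = nn.LSTMCell(x_dim, h_dim)
        self.z_where = nn.Linear(h_dim, 3)          # window pose: scale + (x, y) shift
        self.z_what = nn.Linear(h_dim, z_what_dim)  # appearance code of the attended object
        self.z_pres = nn.Linear(h_dim, 1)           # logit of "there is another object"

    def forward(self, x):                           # x: (batch, x_dim) flattened image
        B = x.size(0)
        h = c = x.new_zeros(B, self.h_dim)
        present = x.new_ones(B)                     # stays 1 until z_pres = 0, then 0 forever
        steps = []
        for _ in range(self.max_steps):
            h, c = self.rnn(x, (h, c))
            p = torch.sigmoid(self.z_pres(h)).squeeze(1)
            z_pres = torch.bernoulli(p) * present   # discrete: no reparameterization path,
            present = present * z_pres              # hence black-box gradient estimators
            steps.append((self.z_where(h), self.z_what(h), z_pres))
        return steps
```

Gradients for the continuous `z_where`/`z_what` flow through the reparameterization trick; the discrete `z_pres` has no such path, which is why the paper falls back on black-box (score-function / NVIL-style) estimators for it.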