Sutton

Slides
- https://webdocs.cs.ualberta.ca/~sutton/609%20dropbox/slides%20(pdf%20and%20keynote)/

David Silver (DeepMind)

Slides
- http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html
YouTube playlist
- https://www.youtube.com/watch?v=KHZVXao4qXs&list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT&index=7

Eligibility Traces

Eligibility traces unify and generalize TD and Monte Carlo methods. When TD
methods are augmented with eligibility traces, they produce a family of methods
spanning a spectrum that has Monte Carlo methods at one end (λ = 1) and
one-step TD methods at the other (λ = 0).
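
The two ends of that spectrum can be seen in a minimal sketch of tabular TD(λ) prediction with accumulating traces. The function name `td_lambda`, the transition format `(state, reward, next_state, done)`, and the default step size are assumptions made for this illustration and do not come from the linked material.

```python
import numpy as np

def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) value prediction with accumulating traces.

    `episodes` is an iterable of trajectories; each trajectory is a list of
    (state, reward, next_state, done) tuples (a format assumed for this sketch).
    """
    V = np.zeros(n_states)
    for episode in episodes:
        e = np.zeros(n_states)  # eligibility traces, reset at episode start
        for s, r, s_next, done in episode:
            # Bootstrap from V[s_next] unless the transition ends the episode.
            target = r if done else r + gamma * V[s_next]
            delta = target - V[s]   # one-step TD error
            e *= gamma * lam        # decay every trace
            e[s] += 1.0             # accumulate the trace of the visited state
            V += alpha * delta * e  # credit all recently visited states
    return V

# One two-step episode: state 0 -> state 1 -> terminal, with reward 1 at the end.
episode = [(0, 0.0, 1, False), (1, 1.0, 1, True)]
v_td0 = td_lambda([episode], n_states=2, lam=0.0)  # only state 1 is updated
v_mc = td_lambda([episode], n_states=2, lam=1.0)   # both states are updated
```

With λ = 0 the traces vanish after every step, so only the state that was just left is updated. With λ = 1 and γ = 1 the final reward is credited to every state visited in the episode, which on this example matches a Monte Carlo update.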

Policy Gradient

Understanding Baseline
We need to approximate V instead of Q *
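
One way to see why a state-value baseline helps is the sketch below. It uses a single-state (bandit) problem with a softmax policy and computes the exact mean and variance of the score-function gradient estimate. The setup, the function names `softmax` and `pg_mean_and_variance`, and the reward values are assumptions chosen for illustration. The baseline is the expected reward under the policy, which plays the role of V in this one-state case.

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())  # shift for numerical stability
    return z / z.sum()

def pg_mean_and_variance(theta, rewards, baseline):
    """Exact mean and total variance of g(a) = grad log pi(a) * (r(a) - b)
    when the action a is drawn from a softmax policy with logits `theta`."""
    pi = softmax(theta)
    n = len(theta)
    score = np.eye(n) - pi                     # row a: grad_theta log pi(a)
    g = score * (rewards - baseline)[:, None]  # row a: estimate if a is sampled
    mean = pi @ g                              # expectation over actions
    var = (pi @ (g - mean) ** 2).sum()         # variance summed over components
    return mean, var

theta = np.zeros(2)
rewards = np.array([10.0, 11.0])
v = softmax(theta) @ rewards  # expected reward, i.e. V of the single state

mean_plain, var_plain = pg_mean_and_variance(theta, rewards, baseline=0.0)
mean_base, var_base = pg_mean_and_variance(theta, rewards, baseline=v)
```

Both calls return the same expected gradient, because the baseline term has zero expectation under the policy, while the variance is lower with the baseline. In a multi-state problem V is not known exactly and would have to be approximated, for example by a learned value function.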

UC Berkeley CS 294: Deep Reinforcement Learning, Spring 2017

Resources
- http://rll.berkeley.edu/deeprlcourse/

WildML Lecture Notes

Resources
- http://www.wildml.com/2016/10/learning-reinforcement-learning/

Modulabs (모두의 연구소)

Resources
- http://www.modulabs.co.kr/RL_library/3305