Lecture 11 - Deep Reinforcement Learning

Helpful Materials

1.0 - What is Machine Learning?

1.1 - Machine Learning Applications

2.0 - What is Deep Learning?

2.1 - Image Recognition

2.2 - Simple Neural Networks (Perceptron)

2.3 - Building Blocks of Neural Networks

2.4 - Deep NNs

2.5 - Training Neural Networks

2.5.1 - Optimisation / Function Approximation

2.6 - Types of Machine Learning

2.0 - Reinforcement Learning

2.1 - Deep RL

2.2 - Reinforcement Learning Recap

2.2.1 - Policy

2.3 - Deep RL Algorithms

Value Learning

  1. Find Q(s, a)
  2. a = argmax_a Q(s, a)

Policy Learning

  1. Find π(s)
  2. Sample a ~ π(s)
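
The two ways of choosing an action listed above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names `greedy_action` and `sample_action` and the example numbers are assumptions, and Q(s, a) and π(s) are represented as plain lists with one entry per action.

```python
import random

def greedy_action(q_values):
    """Value learning: return a = argmax_a Q(s, a) for one state."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def sample_action(action_probs, rng=random):
    """Policy learning: sample a ~ pi(s) from the action probabilities."""
    actions = range(len(action_probs))
    return rng.choices(actions, weights=action_probs, k=1)[0]

# Hypothetical values for a single state with three actions.
q_values = [0.1, 0.7, 0.2]       # Q(s, a) for a = 0, 1, 2
action_probs = [0.1, 0.7, 0.2]   # pi(s): probabilities summing to 1

value_based_choice = greedy_action(q_values)       # always the highest-Q action
policy_based_choice = sample_action(action_probs)  # random, weighted by pi(s)
```

The value-based choice is deterministic for a fixed Q, while the policy-based choice is stochastic, so repeated calls to `sample_action` can return different actions.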

2.4 - Value Learning RL

2.4.1 - Q-Value Approximation

We want to learn features from pixels!

2.4.2 - DNNs as a Q-Value Approximator

To do this, we can use a deep neural network (DNN) as a Q(s,a) approximator.
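
As a rough sketch of what "a DNN as a Q(s,a) approximator" can look like, the block below builds a tiny fully connected network that takes a state vector and returns one Q-value per action. Everything here is an assumption for illustration: the layer sizes, the helper names (`init_layer`, `linear`, `relu`, `q_network`), the random initialisation, and the use of a small state vector instead of pixels. A real pixel-based network would normally use convolutional layers and a deep learning library.

```python
import random

def init_layer(n_in, n_out, rng):
    """Create a dense layer with small random weights and zero biases."""
    return {
        "w": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }

def linear(layer, x):
    """Compute W x + b for one input vector x."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
        for row, b in zip(layer["w"], layer["b"])
    ]

def relu(x):
    """Element-wise ReLU non-linearity."""
    return [max(0.0, v) for v in x]

def q_network(params, state):
    """Map a state to a list of Q-values, one per action."""
    hidden = relu(linear(params["hidden"], state))
    return linear(params["out"], hidden)

# Hypothetical sizes: 4 state features, 8 hidden units, 3 actions.
rng = random.Random(0)
STATE_DIM, HIDDEN_DIM, NUM_ACTIONS = 4, 8, 3
params = {
    "hidden": init_layer(STATE_DIM, HIDDEN_DIM, rng),
    "out": init_layer(HIDDEN_DIM, NUM_ACTIONS, rng),
}

state = [0.5, -0.2, 0.1, 0.9]
q_values = q_network(params, state)  # one forward pass gives Q(s, a) for every a
best_action = max(range(NUM_ACTIONS), key=lambda a: q_values[a])
```

Outputting all Q-values in a single forward pass is a common design choice because the argmax over actions then needs only one network evaluation per state.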

2.4.3 - Training DQNs

🧠 This is very similar to using the TD target to move Q-Learning estimates closer to the actual value.
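
To make the TD-target connection concrete, here is a minimal sketch of the quantity a DQN is typically trained to minimise: the squared difference between the TD target r + γ max_a' Q(s', a') and the current estimate Q(s, a). The function names, the discount factor, the terminal-state handling, and the example numbers are assumptions for illustration, not taken from the lecture.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """TD target: r + gamma * max_a' Q(s', a'); just r if the episode ended."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def dqn_loss(q_values, action, target):
    """Squared error between the target and the predicted Q(s, a)."""
    return (target - q_values[action]) ** 2

# Hypothetical transition (s, a, r, s') with two actions.
q_s = [1.0, 2.0]      # network output Q(s, .) for the current state
q_next = [0.5, 3.0]   # network output Q(s', .) for the next state
target = td_target(reward=1.0, next_q_values=q_next, gamma=0.9)
loss = dqn_loss(q_s, action=0, target=target)
```

In training, the gradient of this loss with respect to the network weights would be used to nudge Q(s, a) toward the target, in the same spirit as the tabular Q-Learning update.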

2.4.4 - DQN for Atari

DQN Doesn't Always Work Well

2.5 - Downsides of Q-Learning

2.6 - Policy Gradient

2.6.1 - Training Policy Gradient

2.7 - Summary