April 8: Introduction, overview of the fields of reinforcement learning and robot learning
April 15: Bandit problems, Bayesian optimization
April 22: Categories of RL / MDP / value function / Bellman equations
April 30: DQN / value & policy iteration / RL as density estimation / MaxEnt and Boltzmann distribution / Policy gradient
May 13: Distributed Q-learning / QT-Opt / Policy learning
May 20: Policy gradient, REINFORCE / variance reduction / policy gradient with function approximation
May 27: On-policy actor-critic / NAC / A3C, TRPO / PPO / GAE
June 10: Off-policy actor-critic, DDPG / TD3 / KL-regularized RL / SAC / MPO
June 17: Inverse RL
June 24: Gradient-free methods / Model-based RL
July 1: Offline RL / AWAC / implicit Q-learning / TD3+BC / large models / diffusion policies
July 8: Advanced topics (hierarchical RL, goal-conditioned RL / HER, intrinsic reward)
July 15: Review & wrap-up