April 9: Introduction, overview of the fields of reinforcement learning and robot learning
April 16: Bandit problems, Bayesian optimization
April 23: Categories of RL / MDP / value function / Bellman equations
April 30: DQN / value & policy iteration / RL as density estimation / MaxEnt and Boltzmann distribution / Policy gradient
May 14: No lecture
May 21: Symposium on Robot Learning (in person, Takeda Science Frontier hall)
May 28: Policy gradient, REINFORCE / variance reduction / policy gradient with function approximation
June 4: On-policy actor-critic / NAC / A3C, TRPO / PPO / GAE
June 11: Off-policy actor-critic, DDPG / TD3 / SAC / QT-Opt / MPO
June 18: Inverse RL
June 25: Gradient-free methods / Model-based RL
July 2: Offline RL / AWAC / implicit Q-learning / TD3+BC / large models / diffusion policies
July 9: Advanced topics (Rainbow, Distributed RL, domain randomization, adversarial training, curriculum learning, HRL, goal-conditioned RL / HER, intrinsic reward)