GCL Special Lecture in Information Science and Technology VII (Reinforcement Learning)

This course provides an overview of reinforcement learning (RL) algorithms. RL has achieved remarkable success in various applications, including robotic manipulation and autonomous driving. RL is also used to fine-tune large language models, and its potential is still expanding. We start with basic concepts such as Markov decision processes and value functions, and then cover popular algorithms such as TD3 and Soft Actor-Critic (SAC).

Schedule

April 9: Introduction; overview of the fields of reinforcement learning and robot learning
April 16: Bandit problems, Bayesian optimization
April 23: Categories of RL / MDP / value function / Bellman equations
April 30: DQN / value and policy iteration / RL as density estimation / MaxEnt and Boltzmann distribution / Policy gradient
May 14: No lecture
May 21: Symposium on Robot Learning (in person, Takeda Science Frontier hall)
May 28: Policy gradient, REINFORCE / variance reduction / policy gradient with function approximation
June 4: On-policy actor-critic / NAC / A3C, TRPO / PPO / GAE
June 11: Off-policy actor-critic, DDPG / TD3 / SAC / QT-Opt / MPO
June 18: Inverse RL
June 25: Gradient-free methods / Model-based RL
July 2: Offline RL / AWAC / implicit Q-learning / TD3+BC / large models / diffusion policies
July 9: Advanced topics (Rainbow, Distributed RL, domain randomization, adversarial training, curriculum learning, HRL, goal-conditioned RL / HER, intrinsic reward)

Teaching Methods

Lectures are held online via Zoom, except on May 21. On May 21, there will be a symposium on robot learning organized by a lecturer of this course, and students are expected to attend the symposium in person at the Hongo Campus.

Grading

Attendance and reports.

Textbook

There is no specific textbook for this lecture.

Reference Books

Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. Second Edition. MIT Press, Cambridge, MA, 2018.

Notes on Taking the Course

We expect students to review the materials for the previous lecture every week. Basic knowledge of machine learning, linear algebra, and probability theory is required.