To prevent leakage of information about on-campus online classes, URLs, accounts, and classroom details have been removed.
Last updated: April 22, 2024

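The course schedule and classroom are subject to change, so always check UTAS for the latest information.
If you cannot access UTAS, please contact the instructor in charge or your department's academic affairs office.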

GCL Special Lecture in Information Science and Technology Ⅶ (Reinforcement Learning) / GCL情報理工学特別講義Ⅶ (強化学習)
This course provides an overview of reinforcement learning (RL) algorithms. RL has achieved remarkable success in applications such as robotic manipulation and autonomous driving. It is also used to fine-tune large language models, and its range of applications is still expanding. We start with basic concepts such as Markov decision processes and value functions, and then cover popular algorithms such as TD3 and Soft Actor-Critic (SAC).
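Timetable code / Common course code
4890-1066 / GIF-CO5527L1
Course name
GCL Special Lecture in Information Science and Technology Ⅶ (Reinforcement Learning)
Instructor
Takayuki Osa (長 隆之)
Semester
S1 S2
Period
Tuesday, 2nd period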
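Language of instruction
English
Credits
2
Course taught by an instructor with practical experience
No
Enrollment by students from other faculties
Offering department
Graduate School of Information Science and Technology
Course schedule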
April 9: Introduction, overview of the fields of reinforcement learning and robot learning
April 16: Bandit problems, Bayesian optimization
April 23: Categories of RL / MDP / value function / Bellman equations
April 30: DQN / value and policy iteration / RL as density estimation / MaxEnt and Boltzmann distribution / Policy gradient
May 14: No lecture
May 21: Symposium on Robot Learning (in person, Takeda Science Frontier hall)
May 28: Policy gradient, REINFORCE / variance reduction / policy gradient with function approximation
June 4: On-policy actor critic / NAC / A3C, TRPO / PPO / GAE
June 11: Off-policy actor critic, DDPG / TD3 / SAC / QT-Opt / MPO
June 18: Inverse RL
June 25: Gradient-free methods / Model-based RL
July 2: Offline RL / AWAC / implicit Q-learning / TD3+BC / large models / diffusion policies
July 9: Advanced topics (Rainbow, Distributed RL, domain randomization, adversarial training, curriculum learning, HRL, goal-conditioned RL / HER, intrinsic reward)
Teaching method
Lectures are held online via Zoom, except on May 21. On May 21, a symposium on robot learning organized by a lecturer of this course will take place, and students are expected to attend it in person at the Hongo campus.
Grading method
Attendance and reports.
Textbook
There is no specific textbook for this lecture.
Reference books
Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. Second Edition. MIT Press, Cambridge, MA, 2018.
Notes on taking this course
We expect students to review the materials for the previous lecture every week. Basic knowledge of machine learning, linear algebra, and probability theory is required.