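Reinforcement Learning mini-course. Students are welcome to take the course and to send us suggestions for improvement after completing it.
Course name: REINFORCEMENT LEARNING mini-course
Course schedule: 4 lectures in total, each designed to cover 2.5 hours of material, for 10 hours overall.
Assessment: 1 Assignment + 1 Project
Course materials: the lecture slides (PDF) will be made publicly available for download on this page once they have been fully compiled.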
Lecture 1:
Introduction to our RL mini-course
Introduction to Reinforcement Learning
Markov Decision Process
Tabular Q-learning
Deep Q-learning
Hands-on Deep Q-learning
Lecture 2:
Double Q-learning
Dueling Q-learning
Rainbow Q-learning
Vanilla Policy Gradient
Actor-Critic
Hands-on Actor-Critic
Lecture 3:
Deterministic Policy Gradient
Twin Delayed DDPG
Trust Region Policy Optimization
Proximal Policy Optimization
Soft Actor-Critic
Lecture 4:
Hindsight Experience Replay
Model-based Method: I2A
Exploration by Intrinsic Reward: RND
Multi-agent problem setting
IQL
QMIX
MADDPG
More Topics
Assignment 1 code download link
The Project for this course is to be completed individually; detailed grading criteria will be published on UMMoodle.
The papers covered in this course are listed below:
(To be updated!)