Talk Title: Model-Free Finite Horizon H-Infinity Control: An Off-Policy Approach with Double Minimax Q-learning
Speaker: Wen Yu (余文)
Time: August 28, 2025, 09:30-10:30
Venue: Room 228, Weixue Building, Lianhua Street Campus
Speaker Biography:
Wen Yu (余文) is a member of the Mexican Academy of Sciences, a full professor at the National Polytechnic Institute of Mexico, and is listed among the world's top 2% most highly cited scientists. He received his B.S. degree in Automation from Tsinghua University in 1990, and his M.S. and Ph.D. degrees in Automatic Control from Northeastern University in 1992 and 1995, respectively. From 1995 to 1996, he was a lecturer in the Department of Automatic Control at Northeastern University. Since 1996, he has been with the National Polytechnic Institute of Mexico. From 2002 to 2003, he held a research position at the Mexican Petroleum Institute. From 2006 to 2007, he was a senior visiting research fellow at Queen's University Belfast, UK, and from 2009 to 2010 he was a visiting associate professor at the University of California, Santa Cruz, USA. Since 2006, he has also been a visiting professor at Northeastern University.
He has published more than 500 academic papers, including over 200 journal papers, and has authored 8 monographs. He has supervised 38 Ph.D. theses and 40 master's theses. According to Google Scholar, his work has been cited more than 12,000 times, with an h-index of 52. He served as General Chair of the IEEE flagship conference SSCI 2023, and has served as an Associate Editor for journals including IEEE Transactions on Cybernetics, IEEE Transactions on Neural Networks and Learning Systems, Neurocomputing, Scientific Reports, and Intelligence & Robotics.
Abstract:
Finite horizon H-infinity control is essential for robust system design, particularly when guaranteed system performance is required over a specific time interval. Despite offering practical benefits over their infinite horizon counterparts, model-based finite horizon frameworks present complexities, notably the time-varying nature of the Difference Riccati Equation (DRE), which significantly complicates solutions for systems with unknown dynamics. This paper proposes a novel model-free method by leveraging off-policy reinforcement learning (RL), known for its superior data efficiency and flexibility compared to the on-policy methods prevalent in the model-free H-infinity control literature. Recognizing the unique challenges of off-policy RL within the inherent minimax optimization problem of H-infinity control, we propose the Neural Network-based Double Minimax Q-learning (NN-DMQ) algorithm. This algorithm is specifically designed to handle the adversarial interaction between the controller and the worst-case disturbance, while also mitigating the bias introduced by Q-value overestimation, which can destabilize learning. A key theoretical contribution of this work is a rigorous convergence proof of the proposed Double Minimax Q-learning (DMQ) algorithm, which provides strong guarantees for the algorithm's stability and its capability to learn the optimal finite horizon robust control and worst-case disturbance policies. Extensive experiments were performed to verify the effectiveness and robustness of our approach, illustrating its applicability to challenging real-world control problems with unknown dynamics.
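The talk's NN-DMQ algorithm uses neural-network function approximation; purely as an illustration of the underlying double minimax Q-learning idea it builds on (two Q estimates, one selecting the saddle-point action pair and the other evaluating it, trained on exploratory off-policy data over a finite horizon), the following minimal tabular Python sketch may help. The toy dynamics, stage cost, and all parameters below are assumptions made for illustration, not the method presented in the talk.

```python
import numpy as np

# Minimal tabular sketch of double minimax Q-learning for a finite-horizon
# zero-sum setting: the controller u minimizes, the disturbance w maximizes.
# Everything here (environment, cost, sizes) is an illustrative assumption.

rng = np.random.default_rng(0)
N, nS, nU, nW = 10, 5, 3, 3        # horizon, #states, #control actions, #disturbance actions
alpha, gamma_att = 0.1, 2.0        # learning rate, H-infinity attenuation level

# Two independent Q tables per time step to mitigate estimation bias.
QA = np.zeros((N + 1, nS, nU, nW))
QB = np.zeros((N + 1, nS, nU, nW))

def stage_cost(s, u, w):
    # Illustrative surrogate of the game cost ||z||^2 - gamma^2 ||w||^2.
    return (s - 2) ** 2 + u ** 2 - gamma_att ** 2 * w ** 2

def step(s, u, w):
    # Toy "unknown" dynamics: the learner only sees sampled transitions.
    return int(np.clip(s + u - w + rng.integers(-1, 2), 0, nS - 1))

for episode in range(2000):
    s = rng.integers(nS)
    for k in range(N):
        u, w = rng.integers(nU), rng.integers(nW)   # off-policy (exploratory) data
        c, s_next = stage_cost(s, u, w), step(s, u, w)
        if rng.random() < 0.5:                      # update one table, evaluate with the other
            Qsel, Qeval = QA, QB
        else:
            Qsel, Qeval = QB, QA
        # Select the minimax (saddle-point) action pair with Qsel, evaluate it with Qeval.
        u_star = np.argmin(Qsel[k + 1, s_next].max(axis=1))
        w_star = np.argmax(Qsel[k + 1, s_next, u_star])
        target = c + Qeval[k + 1, s_next, u_star, w_star]
        Qsel[k, s, u, w] += alpha * (target - Qsel[k, s, u, w])
        s = s_next

# Greedy minimizing control policy per time step and state, from the averaged tables.
Q = 0.5 * (QA + QB)
policy_u = Q.max(axis=3).argmin(axis=2)   # shape (N+1, nS)
```

The two-table structure is the standard double Q-learning device carried over to the minimax setting: decoupling action selection from action evaluation reduces the bias that a single maximizing/minimizing estimate would accumulate.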
All faculty and students are welcome to attend!
School of Information Science and Engineering
August 25, 2025