自适应动态规划(Adaptive dynamic programming,ADP)是最优控制领域新兴起的一种近似最优方法,是当前国际最优化领域的研究热点.ADP方法利用函数近似结构来近似哈密顿–雅可比–贝尔曼(Hamilton-Jacobi-Bellman,HJB)方程的解,采用离线迭代或者在线更新的方法,来获得系统的近似最优控制策略,从而能够有效地解决非线性系统的优化控制问题.本文按照ADP的结构变化、算法的发展和应用三个方面介绍ADP方法.对目前ADP方法的研究成果加以总结,并对这一研究领域仍需解决的问题和未来的发展方向作了进一步的展望.
Adaptive dynamic programming (ADP) is a novel approximate optimal control scheme, which has recently become a hot topic in the field of optimal control. As a standard approach in the field of ADP, a function approximation structure is used to approximate the solution of Hamilton-Jacobi-Bellman (HJB) equation. The approximate optimal control policy is obtained by using the offline iteration algorithm or the online update algorithm. This paper gives a review of ADP in the order of the variation on the structure of ADP scheme, the development of ADP algorithms and applications of ADP scheme, aiming to bring the reader into this novel field of optimization technology. Furthermore, the future studies are pointed out.