To overcome the curse of dimensionality that dynamic programming faces in high-dimensional optimization problems, function approximation is used to estimate the cost function, and an approximate dynamic programming solution is obtained through self-learning. The approach is suited to decision optimization and control problems for complex, nonlinear systems. The dual heuristic dynamic programming (DHP) algorithm is applied to the control of a cement burning system, with neural networks used to build a critic module and an action module that optimize the control of the system. Given a suitable optimization objective function, the critic module evaluates the quality of each action and feeds the result back to the action module, which outputs the adjustment for each parameter. Simulation results show that the system state variables can be held stably within a reasonable range.
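The critic/action interplay described above can be illustrated with a minimal sketch. It is not the cement burning model or the neural networks used in this work: it assumes a hypothetical scalar linear plant x' = a·x + b·u, a quadratic utility U = q·x² + r·u², and single-weight linear approximators in place of the networks. All names and constants (`A`, `B`, `Q`, `R`, `GAMMA`, `train_dhp`, `riccati_gain`) are illustrative assumptions. In DHP the critic estimates the derivative of the cost-to-go with respect to the state, λ(x) = ∂J/∂x, and the action module is adjusted along ∂U/∂u + γ·(∂x'/∂u)·λ(x').

```python
import numpy as np

# Hypothetical toy plant (an assumption, not the cement system):
#   x_next = A*x + B*u,   utility U = Q*x**2 + R*u**2,   discount GAMMA
A, B, Q, R, GAMMA = 0.9, 0.5, 1.0, 0.1, 0.95


def train_dhp(sweeps=500, lr_critic=0.05, lr_action=0.05):
    """Train a linear critic lambda(x) = w*x and a linear action u = -k*x."""
    w = 0.0  # critic weight: approximates dJ/dx
    k = 0.0  # action weight: state-feedback gain
    states = np.linspace(-1.0, 1.0, 21)  # fixed training states
    for _ in range(sweeps):
        for x in states:
            u = -k * x
            x_next = A * x + B * u
            lam_next = w * x_next  # critic estimate at the next state

            # Critic target: total derivative of U + GAMMA*J(x_next) w.r.t. x,
            # taken through both the plant and the action module.
            du_dx = -k
            target = (2 * Q * x + du_dx * 2 * R * u
                      + GAMMA * (A + B * du_dx) * lam_next)
            w += lr_critic * (target - w * x) * x

            # Action update: descend dJ/du = dU/du + GAMMA * B * lambda(x_next).
            # Since du/dk = -x, gradient descent on k adds lr * dJ/du * x.
            dj_du = 2 * R * u + GAMMA * B * lam_next
            k += lr_action * dj_du * x
    return w, k


def riccati_gain(iterations=2000):
    """Reference gain from the discounted scalar Riccati recursion."""
    p = Q
    for _ in range(iterations):
        p = Q + GAMMA * A * A * p * R / (R + GAMMA * B * B * p)
    return GAMMA * A * B * p / (R + GAMMA * B * B * p)


if __name__ == "__main__":
    w, k = train_dhp()
    print("learned gain:", k, "reference gain:", riccati_gain())
```

For this toy problem the optimal gain is known in closed form, so the learned gain can be compared against the Riccati reference. With a nonlinear plant such as a cement burning system, the linear weights would be replaced by neural networks and the plant derivatives by those of an identified model.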