Q-Learning-Based Dynamic Optimal CPS Control for Interconnected Power Grids

Journal: Proceedings of the CSEE (中国电机工程学报), 29(19), pp. 13-19, 5 July 2009
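Classification: TM732 [Electrical Engineering: Power Systems and Automation]
Author affiliations: [1] School of Electric Power, South China University of Technology, Guangzhou 510640, Guangdong Province; [2] Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
Funding: National Natural Science Foundation of China (50807016); Research Grants Council of the Hong Kong SAR (RGC No. PolyU G-U494); Natural Science Foundation of Guangdong Province (06300091).
Related project: Optimal relaxed control of AGC under the CPS standard and its Markov decision process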

Keywords: automatic generation control, Q-learning, Markov decision process, control performance standard, optimal control

Abstract (translated from Chinese):

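The automatic generation control (AGC) system of an interconnected power grid operating under the control performance standard (CPS) is a typical uncertain stochastic system. Applying a Q-learning algorithm grounded in Markov decision process (MDP) theory enables effective online learning of the control strategy and dynamic optimal decision-making. The CPS value is treated as the "reward" given by the "environment", that is, the power system containing AGC. The controller learns interactively through the closed-loop feedback structure formed by the Q-value function and the CPS control actions, and the learning objective is to maximize the long-term cumulative reward that the CPS actions obtain from the environment. A practical semi-supervised group pre-learning method is proposed, which addresses system stabilization and fast convergence during the trial-and-error pre-learning stage of the Q-learning controller. Simulation studies show that introducing Q-learning-based CPS control significantly enhances the robustness and adaptability of the whole AGC system and effectively raises the CPS compliance rate.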
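
For reference, the learning objective described in this abstract can be written with the standard one-step Q-learning update. This is the textbook form, not necessarily the exact expression used in the paper; the symbols are assumptions: s_t is the discretized system state, a_t the CPS control action, r_{t+1} a reward derived from the CPS value, \alpha the learning rate, and \gamma the discount factor.

```latex
% Long-term cumulative (discounted) reward that the controller seeks to maximize
\[
  R_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}, \qquad 0 \le \gamma < 1
\]
% Standard one-step Q-learning update of the Q-value function
\[
  Q(s_t, a_t) \leftarrow Q(s_t, a_t)
    + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
\]
```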

English abstract:

NERC's control performance standard (CPS) based automatic generation control (AGC) problem is a stochastic multistage decision problem, which can be suitably modeled as a reinforcement learning (RL) problem based on Markov decision process (MDP) theory. The paper chooses Q-learning as the RL algorithm and treats the CPS values as rewards from the interconnected power system. By adjusting a closed-loop CPS control rule to maximize the total reward during online learning, the optimal CPS control strategy is gradually obtained. A practical semi-supervised pre-learning method is introduced to enhance the stability and convergence of the Q-learning controllers. Two case studies show that the proposed controllers markedly enhance the robustness and adaptability of AGC systems while ensuring CPS compliance.
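
The closed-loop learning idea in the abstract can be illustrated with a minimal tabular Q-learning sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 7-bin state space, the three actions, the disturbance probabilities, the reward `-abs(nxt - TARGET)` standing in for a CPS-derived reward, and the hyperparameters. The semi-supervised pre-learning stage is not modeled.

```python
import random

# Hypothetical toy setup: 7 discretized bins of a CPS-like index,
# with the desired operating point in the middle bin.
N_STATES = 7
TARGET = 3
ACTIONS = (-1, 0, 1)  # lower, hold, or raise the regulation command by one bin


def step(state, action, rng):
    """Toy environment: apply the action plus a random load disturbance."""
    u = rng.random()
    noise = -1 if u < 0.1 else (1 if u > 0.9 else 0)
    nxt = min(N_STATES - 1, max(0, state + action + noise))
    reward = -abs(nxt - TARGET)  # stand-in for a reward derived from the CPS value
    return nxt, reward


def train(episodes=3000, horizon=30, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = rng.randrange(N_STATES)  # random start so every bin is visited
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])  # exploit
            s2, r = step(s, ACTIONS[a], rng)
            # One-step Q-learning update toward r + gamma * max_a' Q(s2, a')
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q


def greedy_policy(q):
    """Return the greedy action for each state bin."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned greedy policy tends to push the state back toward the target bin from either side, which mirrors, in a very simplified way, a controller that maximizes long-term reward from the environment.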
