This paper proposes a novel transfer bees optimizer (TBO) and applies it to reactive power optimization in power systems. The trial-and-error and reward mechanisms of Q-learning are used to construct the learning mode of the bees, and the behavior transfer technique from reinforcement learning is used to realize transfer learning for the bees. To overcome the curse of dimensionality that arises in multi-variable optimization problems, a state-combined-action chain is proposed that decomposes the state-action space into several lower-dimensional spaces, which markedly reduces the computational difficulty. Simulation results show that TBO preserves the quality of the optimal solution while converging roughly 4 to 67 times faster than conventional heuristic artificial intelligence (AI) algorithms, making it well suited to the fast solution of nonlinear programming problems in large-scale complex systems.
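To make the chain decomposition concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes each decision variable takes one of `n_actions` discrete values and that the action chosen for variable i-1 serves as the state for variable i, so one small Q-table per variable replaces a single joint table. The function names, the reward definition (negative objective value), and all parameter values are hypothetical; the bee roles and the behavior transfer step are not modeled because the abstract does not specify them.

```python
import random


def table_sizes(n_vars, n_actions):
    """Number of Q-values in one joint table vs. the chained tables."""
    joint = n_actions ** n_vars
    # Variable 0 has one dummy state; each later variable has n_actions states.
    chain = n_actions + (n_vars - 1) * n_actions ** 2
    return joint, chain


def chain_q_learning(objective, n_vars, n_actions, episodes=300,
                     alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Minimize `objective` over discrete variables using chained Q-tables.

    Assumption: the state of variable i is the action picked for variable i-1.
    """
    rng = random.Random(seed)
    # q[i][s][a]: value of action a for variable i given previous action s.
    q = [[[0.0] * n_actions for _ in range(1 if i == 0 else n_actions)]
         for i in range(n_vars)]
    best_x, best_f = None, float("inf")

    for _ in range(episodes):
        # Build one candidate solution by walking the chain (epsilon-greedy).
        x, state = [], 0
        for i in range(n_vars):
            row = q[i][state]
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=row.__getitem__)
            x.append(a)
            state = a

        f = objective(x)
        if f < best_f:
            best_x, best_f = x[:], f

        # Hypothetical reward: the negative objective, shared by every link.
        reward = -f
        state = 0
        for i, a in enumerate(x):
            future = max(q[i + 1][a]) if i + 1 < n_vars else 0.0
            q[i][state][a] += alpha * (reward + gamma * future - q[i][state][a])
            state = a

    return best_x, best_f, q


if __name__ == "__main__":
    # Toy objective standing in for a real power-system cost function.
    toy = lambda x: sum((a - 3) ** 2 for a in x)
    print(table_sizes(4, 5))
    print(chain_q_learning(toy, n_vars=4, n_actions=5)[:2])
```

With 4 variables and 5 actions each, the joint table would hold 625 values while the chained tables hold 80, and the gap widens exponentially as variables are added. This is the sense in which the decomposition sidesteps the curse of dimensionality.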