The convergence time of reinforcement learning grows exponentially with the dimensionality of the state-action space, so for large state spaces reinforcement learning algorithms converge too slowly to meet application requirements. In many application environments where the agents cooperate, distributed learning with multiple agents can partially solve this problem. Building on an evolutionary algorithm, operators such as agent reproduction and die-out are designed so that child agents inherit their parents' directional information in the state space and therefore find effective updates in the state-action space more quickly. Simulation experiments show that the algorithm achieves higher search efficiency and faster convergence than existing reinforcement learning methods.
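To make the idea of reproduction and die-out concrete, the following is a minimal sketch and not the authors' algorithm. Everything in it is an illustrative assumption: the one-dimensional corridor task, the shared Q-table used to model cooperation, the encoding of "directional information" as a single probability `p_right` that biases exploration, truncation selection of the faster half of the population, and Gaussian mutation of the inherited value.

```python
import random

N_STATES, GOAL = 8, 7            # assumed toy task: 1-D corridor, reward at the right end
ACTIONS = (-1, +1)               # action index 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.3


def run_episode(q, p_right, rng, max_steps=30):
    """Run one agent from state 0, updating the shared table q in place.

    Returns the number of steps needed to reach GOAL, or max_steps + 1 on failure.
    """
    s = 0
    for step in range(1, max_steps + 1):
        if rng.random() < EPSILON or q[s][0] == q[s][1]:
            # Exploration (and tie-breaking) follows the agent's inherited direction.
            a = 1 if rng.random() < p_right else 0
        else:
            a = 0 if q[s][0] > q[s][1] else 1
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if s2 == GOAL else 0.0
        target = reward if s2 == GOAL else GAMMA * max(q[s2])
        q[s][a] += ALPHA * (target - q[s][a])    # standard Q-learning update
        s = s2
        if s == GOAL:
            return step
    return max_steps + 1


def train(pop_size=8, generations=40, seed=0):
    """Evolve a population of agents that all learn into one shared Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    population = [rng.random() for _ in range(pop_size)]   # one p_right per agent
    for _ in range(generations):
        results = sorted((run_episode(q, p, rng), p) for p in population)
        survivors = [p for _, p in results[: pop_size // 2]]   # slower half dies out
        # Reproduction: each survivor spawns a child that inherits its direction,
        # perturbed by a small mutation and clipped to [0, 1].
        children = [min(1.0, max(0.0, p + rng.gauss(0.0, 0.05))) for p in survivors]
        population = survivors + children
    return q, population


if __name__ == "__main__":
    q, population = train()
    print("Q-value for moving right next to the goal:", round(q[GOAL - 1][1], 3))
    print("Surviving direction biases:", [round(p, 2) for p in population])
```

In this sketch, agents whose inherited bias points toward the goal finish sooner, survive selection, and pass that bias to their children, so later generations spend less exploration on unproductive moves while every agent's experience still accumulates in the shared table.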