To speed up learning in multirobot systems and make full use of the reinforcement learning experience and results of other robots within communication range, two interactive learning strategies based on the locally weighted k-nearest neighbor temporal difference (kNN-TD) algorithm are proposed. Assuming no time delay in communication between robots, and using a locally weighted k-nearest neighbor state selection method built on a state description that combines environment sensing and task information, each robot iteratively updates and optimizes its own Q-value table by comparing and analyzing it against the Q-value tables of the other robots within communication range. On this basis, asynchronous interactive reinforcement learning schemes are presented for multirobot systems under global communication and under local communication, respectively. Finally, simulation experiments verify the feasibility and effectiveness of the proposed schemes.
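As a rough illustration of the two ingredients named above, the following minimal sketch shows a locally weighted k-nearest neighbor TD update over a table of prototype states, followed by a step in which a robot folds in the Q-value tables received from other robots. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the inverse-squared-distance weighting, the learning parameters `alpha` and `gamma`, and in particular the elementwise-maximum merge rule, which stands in for the unspecified "comparison and analysis" of Q-value tables.

```python
import math


def knn_weights(state, prototypes, k):
    """Return normalized (index, weight) pairs for the k prototypes nearest to state.

    Assumed weighting: 1 / (1 + d^2), normalized so the weights sum to 1.
    """
    dists = sorted((math.dist(state, p), i) for i, p in enumerate(prototypes))
    raw = [(i, 1.0 / (1.0 + d * d)) for d, i in dists[:k]]
    total = sum(w for _, w in raw)
    return [(i, w / total) for i, w in raw]


def q_values(weights, q_table):
    """Locally weighted Q estimate for every action at the current state."""
    n_actions = len(q_table[0])
    return [sum(w * q_table[i][a] for i, w in weights) for a in range(n_actions)]


def td_update(q_table, weights, action, reward, next_q, alpha=0.1, gamma=0.9):
    """Q-learning style TD update, spread over the k neighbors by their weights."""
    current = sum(w * q_table[i][action] for i, w in weights)
    td_error = reward + gamma * max(next_q) - current
    for i, w in weights:
        q_table[i][action] += alpha * td_error * w


def merge_q_tables(own, others):
    """Assumed merge rule: keep the elementwise maximum over all tables.

    All tables are assumed to share the same prototype states and actions.
    """
    for other in others:
        for i, row in enumerate(other):
            for a, value in enumerate(row):
                if value > own[i][a]:
                    own[i][a] = value
```

In this sketch, a robot would call `knn_weights` and `td_update` after each of its own transitions, and call `merge_q_tables` whenever tables arrive from robots in range. Passing tables from all robots would correspond to the global communication case and passing only those within range to the local case; because each robot merges whenever data happens to arrive, the exchange is asynchronous. Taking the maximum is optimistic and can propagate overestimates, so a weighted average would be an equally plausible stand-in.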