To address the self-balancing motion control problem of two-wheeled robots, a Skinner operant conditioning learning mechanism based on the Boltzmann machine is proposed as a bionic autonomous learning algorithm for the robot. The algorithm uses the Metropolis criterion of the Boltzmann machine to balance exploration against exploitation in Skinner operant conditioning learning, and selects the optimal behavior with a certain probability according to a probabilistic orientation mechanism. The robot can thereby acquire bionic autonomous learning skills like those of a human or an animal in an unknown environment and achieve self-balancing motion control. Finally, simulation experiments were conducted to compare the Skinner operant conditioning learning algorithm based on the Boltzmann machine with the one based on a greedy strategy. The results show that the Boltzmann-machine-based algorithm gives the robot stronger balance control skills and better dynamic performance, reflecting the robot's autonomous learning characteristics.
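The abstract gives no formulas, so the following is only a minimal sketch of how a Metropolis-style acceptance test can trade exploration against exploitation, shown next to a purely greedy choice. It uses the generic Metropolis rule familiar from simulated annealing, not necessarily the authors' exact formulation. The function names, the `values` list of behavior estimates, and the `temperature` parameter are all hypothetical and introduced for illustration.

```python
import math
import random


def metropolis_select(values, current, temperature, rng=random):
    """Choose a behavior index with a Metropolis-style acceptance test.

    values      -- hypothetical value estimate for each behavior (higher is better)
    current     -- index of the behavior currently held
    temperature -- positive annealing temperature (assumed parameter)
    """
    # Propose a random candidate behavior (exploration).
    candidate = rng.randrange(len(values))
    delta = values[candidate] - values[current]
    # A candidate that is at least as good is always accepted (exploitation).
    if delta >= 0:
        return candidate
    # A worse candidate is accepted with probability exp(delta / T),
    # which shrinks toward zero as the temperature is lowered.
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return current


def greedy_select(values):
    """Baseline: always pick the behavior with the highest estimate."""
    return max(range(len(values)), key=values.__getitem__)
```

At a high temperature the selection is close to random, while at a low temperature it almost never moves to a worse behavior. Lowering the temperature over the course of learning therefore shifts the balance from exploration toward exploitation, whereas the greedy baseline never explores.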