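To address the problems that Bayesian reinforcement learning has a huge number of parameters, converges slowly, and cannot support online learning, a model-based factored Bayesian reinforcement learning method is proposed. First, the learning parameters are given a factored representation, which reduces the number of parameters to be learned. Then, Bayesian methods are used to learn from prior knowledge and observed data, optimizing the balance between exploration and exploitation. Finally, a point-based Bayesian reinforcement learning method is used to make the learning process converge quickly, thereby achieving online learning. Simulation results show that the algorithm can meet the performance requirements of real-time systems.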
The enormous number of parameters and slow convergence are major obstacles to online learning in model-based Bayesian reinforcement learning. To address them, this paper presents a model-based factored Bayesian reinforcement learning approach. First, factored representations are used to describe the dynamics with fewer parameters. Then, based on prior knowledge and observed data, this paper exploits model-based Bayesian reinforcement learning to provide an elegant solution to the optimal exploration-exploitation tradeoff. Finally, a point-based Bayesian reinforcement learning approach is proposed to speed up convergence and thereby achieve online learning. The experimental results show that the proposed approach approximates the underlying Bayesian reinforcement learning task well with guaranteed real-time performance.
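To illustrate why a factored representation reduces the number of parameters, the following minimal Python sketch compares the free parameters of a flat transition model with those of a factored one, and keeps a Dirichlet posterior as counts for a single conditional distribution. It is not the paper's implementation. The function and class names, the dependency structure (`parents`), and the uniform Dirichlet prior are all assumptions made for this illustration.

```python
from math import prod


def flat_param_count(var_sizes, n_actions):
    """Free parameters of a flat transition model P(s' | s, a)."""
    n_states = prod(var_sizes)
    # One distribution over n_states outcomes per (state, action) pair.
    return n_states * n_actions * (n_states - 1)


def factored_param_count(var_sizes, parents, n_actions):
    """Free parameters when variable i depends only on parents[i] (assumed structure)."""
    total = 0
    for i, size in enumerate(var_sizes):
        n_parent_configs = prod(var_sizes[p] for p in parents[i])
        # One distribution over `size` values per (parent configuration, action).
        total += n_parent_configs * n_actions * (size - 1)
    return total


class DirichletCPT:
    """Dirichlet posterior over one conditional distribution, stored as counts."""

    def __init__(self, n_values, prior_count=1.0):
        # Hypothetical uniform prior; prior knowledge would set these counts.
        self.counts = [prior_count] * n_values

    def update(self, observed_value):
        # Bayesian update for a Dirichlet prior: increment the observed count.
        self.counts[observed_value] += 1.0

    def mean(self):
        total = sum(self.counts)
        return [c / total for c in self.counts]


if __name__ == "__main__":
    sizes = [2, 2, 2, 2]                          # four binary state variables
    parents = [[0, 3], [1, 0], [2, 1], [3, 2]]    # each depends on itself and one neighbour
    print(flat_param_count(sizes, n_actions=2))               # 480
    print(factored_param_count(sizes, parents, n_actions=2))  # 32
```

In this toy setting the flat model needs 480 free parameters while the factored model needs 32, and the gap widens quickly as state variables are added. The sketch covers only the parameter-count and posterior-update ideas; the point-based planning step is not shown.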