A main problem with the HAMs family of hierarchical reinforcement learning (HRL) methods is that the state space is a joint space, formed by the cross-product of the machine states of the HAM and the states of the original MDP; subroutine-based state abstraction does not completely solve this problem. This paper analyzes the problem in detail and gives a formal description of the HAMs model from the viewpoint of "policy-coupled" semi-Markov decision processes (SMDPs). It then presents a series of formal definitions of HAMs-based homomorphisms and proves several practically useful theorems, which show that homomorphisms can effectively address the problem. On this basis, it summarizes several important observations on applying homomorphisms to state abstraction. Finally, a typical example is analyzed and verified using the proposed method.