This paper studies the limit average variance criterion for continuous-time Markov decision processes in Polish spaces. Using two approaches, it proves not only the existence of solutions to the variance minimization optimality equation and of a canonical variance minimal policy, but also the existence of solutions to the two variance minimization optimality inequalities and of a variance minimal policy that may not be canonical. An example is given to illustrate all of our conditions.