This paper studies the strong n-discount (n = -1, 0) and finite-horizon criteria for continuous-time Markov decision processes in Polish spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Under mild conditions, the authors prove the existence of strong n-discount (n = -1, 0) optimal stationary policies by developing two equivalence relations: one is between the standard expected average reward and strong (-1)-discount optimality, and the other is between the bias and strong 0-discount optimality. The authors also prove the existence of an optimal policy for a finite-horizon control problem by developing an interesting characterization of a canonical triplet.