This paper studies discounted semi-Markov decision processes with countable state spaces and nonnegative costs. We first construct a continuous-time semi-Markov decision process from a given semi-Markov decision kernel and each policy. Then, using a minimum nonnegative solution approach, we prove that the value function satisfies the optimality equation and that an ε-optimal stationary policy exists. We further give conditions for the existence of optimal policies and establish some of their properties. Finally, we develop a value iteration algorithm for computing the value function and present a numerical example.
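To make the value iteration idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the countable state space has been truncated to finitely many states, and that two hypothetical inputs are available: `cost[i][a]`, the expected discounted cost incurred in state `i` under action `a` until the next transition, and `disc_kernel[i][a][j]`, the discounted transition weight to state `j` (the Laplace transform of the semi-Markov kernel at the discount rate). Starting from the zero function mirrors the minimum nonnegative solution approach, since with nonnegative costs the iterates are nondecreasing.

```python
def value_iteration(cost, disc_kernel, tol=1e-10, max_iter=10_000):
    """Iterate v <- T v from v = 0 on a finite (truncated) state space.

    cost[i][a]           -- assumed expected discounted cost until the next jump
    disc_kernel[i][a][j] -- assumed discounted transition weight from i to j
    Both inputs are hypothetical names introduced for this illustration.
    """
    n = len(cost)
    v = [0.0] * n  # start from zero, as in a minimum nonnegative solution scheme
    for _ in range(max_iter):
        new_v = [
            min(
                cost[i][a] + sum(disc_kernel[i][a][j] * v[j] for j in range(n))
                for a in range(len(cost[i]))
            )
            for i in range(n)
        ]
        # Stop once successive iterates agree up to the tolerance.
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Toy one-state, one-action model: the iterates approach the fixed point
# of v = 1 + 0.5 * v.
toy_value = value_iteration([[1.0]], [[[0.5]]])
```

The stopping rule here is a simple sup-norm check between successive iterates; convergence to the true value function on the original countable state space depends on the paper's conditions, which this sketch does not verify.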