Total reward criteria for unconstrained/constrained continuous-time Markov decision processes

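ISSN: 1009-6124
Journal: Journal of Systems Science and Complexity (English edition)
Date: 0
Pages: 491-505
Language: Chinese
Classification: O211.62 [Science: Probability Theory and Mathematical Statistics; Science: Mathematics] TJ761.13 [Armament Science and Technology: Weapon Systems and Utilization Engineering]
Author affiliations: [1] School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou 510275, China; [2] School of Public Health and Tropical Medicine, Southern Medical University, Guangzhou 510515, China
Funding: This research is supported by the National Natural Science Foundation of China under Grant Nos. 10925107 and 60874004.
Related project: Research on advanced optimal control of stochastic dynamic systems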

Keywords: Markov decision process, continuous time, criterion, reward, constrained model, Lagrange multiplier, control system, constrained-optimal policy, continuous-time Markov decision process, optimal policy, total reward criterion, unbounded reward/cost and transition rates.

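Chinese abstract (translated):

This paper studies denumerable continuous-time Markov decision processes under the expected total reward criterion. The authors first study the unconstrained model with possibly unbounded transition rates and give suitable conditions on the primitive data of the controlled system under which the total reward optimality equation has a solution and an optimal stationary policy exists. They then impose a constraint on the expected total cost and consider the associated constrained model. Building on the results for the unconstrained model and using the Lagrange multiplier approach, they prove the existence of a constrained-optimal policy under some additional conditions. Finally, they apply the results to controlled queueing systems.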

English abstract:

This paper studies denumerable continuous-time Markov decision processes with expected total reward criteria. The authors first study the unconstrained model with possibly unbounded transition rates, and give suitable conditions on the controlled system's primitive data under which the authors show the existence of a solution to the total reward optimality equation and also the existence of an optimal stationary policy. Then, the authors impose a constraint on an expected total cost, and consider the associated constrained model. Based on the results about the unconstrained model and using the Lagrange multipliers approach, the authors prove the existence of constrained-optimal policies under some additional conditions. Finally, the authors apply the results to controlled queueing systems.
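For orientation, the objects named in the abstract can be written out as follows. This is only a sketch in standard notation for continuous-time Markov decision processes, not the paper's own statement: the symbols S (state space), A(i) (admissible actions), q(j|i,a) (transition rates), r and c (reward and cost rates), \gamma (initial distribution), d (constraint level) and \lambda (multiplier) are assumptions introduced here, and the optimality equation is shown in its formal version, that is, the discounted equation with discount rate zero. The conditions under which these expressions are well defined are exactly what the paper supplies and are not reproduced here.

```latex
% Assumed notation (not taken from the record above):
% x_t = state at time t, a_t = action at time t, \pi = policy.

% Expected total reward and expected total cost from initial state i
V_r(i,\pi) = \mathbb{E}_i^{\pi}\!\left[\int_0^{\infty} r(x_t,a_t)\,dt\right],
\qquad
V_c(i,\pi) = \mathbb{E}_i^{\pi}\!\left[\int_0^{\infty} c(x_t,a_t)\,dt\right]

% Formal total reward optimality equation (discounted equation with rate 0)
\sup_{a\in A(i)} \Big\{ r(i,a) + \sum_{j\in S} q(j\mid i,a)\,u(j) \Big\} = 0,
\qquad i\in S

% Constrained problem, with V(\gamma,\pi) = \sum_{i\in S}\gamma(i)\,V(i,\pi)
\sup_{\pi} V_r(\gamma,\pi)
\quad\text{subject to}\quad
V_c(\gamma,\pi) \le d

% Lagrange multiplier reduction: an unconstrained model with reward rate
r_\lambda(i,a) = r(i,a) - \lambda\,c(i,a), \qquad \lambda \ge 0
```

In this reading, the constrained model is handled by solving the unconstrained model with reward rate r_\lambda for each \lambda and then choosing \lambda so that the cost constraint is met.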
