Dongli Research Big Data Discovery System (DRDS)

First passage Markov decision processes with constraints and varying discount factors

ISSN: 0529-6579
Journal: 《中山大学学报：自然科学版》 (Journal of Sun Yat-sen University: Natural Science Edition)
Classification: O224 [Science: Operations Research and Cybernetics; Science: Mathematics] TJ761.13 [Weapons Science and Technology: Weapon Systems and Utilization Engineering]
Author affiliations: [1] School of Mathematics and Statistics, Zhaoqing University, Zhaoqing 526061, China; [2] School of Mathematics and Computational Science, Sun Yat-sen University, Guangzhou 510275, China
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61374067, 41271076).

Authors: Xiao WU[1,2], Xiaolong ZOU[2], Xianping GUO[2]

Keywords: Markov decision process, passage, factor, discount, optimality problem, linear programming, discrete time, action space, discrete-time Markov decision process (DTMDP), constrained optimality, varying discount factor, unbounded cost

Abstract:

This paper focuses on the constrained optimality problem (COP) of first passage discrete-time Markov decision processes (DTMDPs) in denumerable state and compact Borel action spaces with multiple constraints, state-dependent discount factors, and possibly unbounded costs. By means of the properties of the so-called occupation measure of a policy, we show that the constrained optimality problem is equivalent to an (infinite-dimensional) linear program on the set of occupation measures with some constraints, and thus prove the existence of an optimal policy under suitable conditions. Furthermore, using the equivalence between the constrained optimality problem and the linear program, we obtain an exact form of an optimal policy for the case of finite states and actions. Finally, as an example, a controlled queueing system is given to illustrate our results.

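Projects associated with this paper

Numerical simulation study of the long-term effect of thermokarst lakes on the thermal regime of permafrost on the Qinghai-Tibet Plateau

Journal papers: 10

Markov control processes with random termination time and uncertain discount factors

Journal papers: 8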

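Journal papers from the same projects

Complete convergence for Stout-type weighted sums of NA sequences

The law of the single logarithm for the R/S statistic

Complete convergence for weighted sums of END sequences

Complete convergence for Sung-type weighted sums of NOD sequences

Complete convergence for END sequences: the case where moments of all orders exist

On maximal inequalities for sequences of arbitrary identically distributed random variables and their applications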

STRONG N-DISCOUNT AND FINITE-HORIZON OPTIMALITY FOR CONTINUOUS-TIME MARKOV DECISION PROCESSES

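Numerical study of a one-dimensional melting problem with a periodically oscillating heat source

Numerical simulation of the effect of the lateral expansion rate of thermokarst lakes on the development of sub-lake taliks on the Qinghai-Tibet Plateau

Numerical study of a cylindrical solidification problem under low-temperature boundary conditions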

Modeled response of talik development under thermokarst lakes to permafrost thickness on the Qinghai-Tibet Plateau

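Numerical study of two types of melting problems with an oscillating boundary heat source

Existence of positive solutions for boundary value problems of delay differential equations

Existence of positive solutions for boundary value problems of differential equations on infinite intervals

A missile pursuit model based on higher-order differential equations

A numerical method for a class of phase-change heat conduction problems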

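Journal information

《中山大学学报：自然科学版》 (Journal of Sun Yat-sen University: Natural Science Edition)
Peking University Core Journal (2011 edition)

Supervising authority: Ministry of Education
Sponsor: Sun Yat-sen University
Editor-in-chief: Wang Jianhua
Address: No. 135 Xingang West Road, Guangzhou
Postal code: 510275
Email: xuebaozr@mail.sysn.edu.cn
Telephone: 020-84111990

International standard serial number (ISSN): 0529-6579
Domestic serial number (CN): 44-1241/N
Postal distribution code: 46-15

Awards:
National Outstanding University Natural Science Journal and Ministry of Education Outstanding Science and Technology Journal..., First Prize for Outstanding Science and Technology Journals of Guangdong Province, core journal in the comprehensive science and technology category of 《中文核心期刊要目总览》 (A Guide to the Core Journals of China), "Double-Effect" journal of the China Periodical Matrix

Indexed in the following databases:
Chemical Abstracts, web edition (USA), Mathematical Reviews, web edition (USA), CAB Abstracts (UK), Zentralblatt MATH (Germany), Abstract and Citation Database (Netherlands), Cambridge Scientific Abstracts (USA), Zoological Record (UK), China Science and Technology Core Journals (China), Peking University Core Journals, 2000, 2004, 2008, 2011, and 2014 editions (China), Royal Society of Chemistry abstracts (UK)

Citations: 18509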