A parallel scheduling algorithm for reinforcement learning in large state space
  • ISSN: 1000-7180
  • Journal: 《微电子学与计算机》 (Microelectronics & Computer)
  • Date: 0
  • Classification: TP18 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering] TP301.6 [Automation and Computer Technology: Computer Architecture; Automation and Computer Technology: Computer Science and Technology]
  • Author affiliations: [1] Institute of Computer Science and Technology, Soochow University, Suzhou 215006, China; [2] Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China; [3] Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Funding: This paper was supported by the National Natural Science Foundation of China (Grant Nos. 61272005, 61070223, 61103045, 60970015, and 61170020), Natural Science Foundation of Jiangsu (BK2012616, BK2009116), High School Natural Foundation of Jiangsu (09KJA520002), and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172012K04).
Abstract:

The main challenge in reinforcement learning is scaling up to larger and more complex problems. To address this scaling problem, a scalable reinforcement learning method, DCS-SRL, is proposed based on a divide-and-conquer strategy, and its convergence is proved. In this method, a learning problem in a large or continuous state space is decomposed into multiple smaller subproblems. Given a specific learning algorithm, each subproblem can be solved independently with limited available resources. The component solutions are then recombined to obtain the desired result. To address the question of how the scheduler should prioritize subproblems, a weighted priority scheduling algorithm is proposed. This scheduling algorithm ensures that computation is focused on the regions of the problem space that are expected to be maximally productive. To expedite the learning process, a new parallel method, called DCS-SPRL, is derived by combining DCS-SRL with a parallel scheduling architecture. In the DCS-SPRL method, the subproblems are distributed among processors that can work in parallel. The experimental results show that learning based on DCS-SPRL converges quickly and scales well.
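
The abstract does not give the decomposition rule or the priority formula, so the following Python sketch is only an illustration of the general pattern it describes: decompose, prioritize, solve subproblems in parallel, recombine. Everything in it is an assumption made for the example: the toy chain task, the fixed-size blocks, the use of the largest Bellman error times a per-block weight as the priority, and all names (`backup`, `residual`, `solve_block`, `run`). It is not the DCS-SRL or DCS-SPRL algorithm itself.

```python
from concurrent.futures import ThreadPoolExecutor

GAMMA = 0.9      # discount factor (assumed)
N_STATES = 40    # toy chain of states 0..39; state 39 is the terminal goal
BLOCK = 8        # states per subproblem (assumed partition size)


def backup(values, s):
    """One Bellman optimality backup for state s on a deterministic chain."""
    if s == N_STATES - 1:
        return 0.0  # the terminal state has value 0
    best = float("-inf")
    for nxt in (max(s - 1, 0), s + 1):  # two actions: move left, move right
        reward = 1.0 if nxt == N_STATES - 1 else 0.0
        best = max(best, reward + GAMMA * values[nxt])
    return best


def residual(values, block):
    """Largest Bellman error inside a block, used here as its raw priority."""
    return max(abs(backup(values, s) - values[s]) for s in block)


def solve_block(values, block, sweeps=50):
    """Solve one subproblem: sweep its states, holding outside values fixed."""
    local = list(values)  # private copy, so workers never write shared state
    for _ in range(sweeps):
        for s in block:
            local[s] = backup(local, s)
    return {s: local[s] for s in block}


def run(workers=2, tol=1e-6, weights=None, max_rounds=1000):
    """Repeatedly solve the highest-priority blocks in parallel and recombine."""
    blocks = [range(i, min(i + BLOCK, N_STATES))
              for i in range(0, N_STATES, BLOCK)]
    weights = weights or [1.0] * len(blocks)  # hypothetical per-block weights
    values = [0.0] * N_STATES
    for rounds in range(max_rounds):
        # Weighted priority: weight times the block's largest Bellman error.
        prio = [w * residual(values, b) for w, b in zip(weights, blocks)]
        if max(prio) < tol:
            return values, rounds
        # Schedule the `workers` most promising blocks for this round.
        top = sorted(range(len(blocks)), key=lambda i: -prio[i])[:workers]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            parts = list(pool.map(lambda i: solve_block(values, blocks[i]), top))
        # Recombine the component solutions into the global value table.
        for part in parts:
            for s, v in part.items():
                values[s] = v
    return values, max_rounds
```

On this toy chain, `values, rounds = run()` yields values that follow V(s) = 0.9^(38 - s) for the non-terminal states, since the only reward is collected on entering state 39. Threads are used only to keep the sketch short; for CPU-bound work in CPython, a process pool or separate machines would be needed to obtain a real speedup.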

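Journal information
  • 《微电子学与计算机》 (Microelectronics & Computer)
  • China Science and Technology Core Journal
  • Supervising authority: China Aerospace Science and Technology Corporation
  • Sponsor: 771 Research Institute, Ninth Academy of China Aerospace Science and Technology Corporation
  • Editor-in-chief: 李新龙 (Li Xinlong)
  • Address: 198 Taibai South Road, Yanta District, Xi'an
  • Postal code: 710065
  • Email: mc771@163.com
  • Phone: 029-82262687
  • International standard serial number: ISSN 1000-7180
  • Domestic serial number: CN 61-1123/TN
  • Postal distribution code: 52-16
  • Awards: Outstanding Aerospace Journal; First Prize, Shaanxi Province Outstanding Journal
  • Indexed in: Scopus (Netherlands); Japan Science and Technology Agency (JST) database (Japan); China Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)
  • Citations: 17909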