A New Multi-Agent Q-Learning Algorithm (一种新的多智能体Q学习算法)

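ISSN: 0254-4156
Journal: 自动化学报 (Acta Automatica Sinica)
Date: 0
Pages: 367-372
Language: Chinese
Classification: TP18 [Automation and Computer Technology: Control Science and Engineering; Automation and Computer Technology: Control Theory and Control Engineering]
Author affiliations: [1] School of Information Science and Engineering, Central South University, Changsha 410083; [2] Guizhou Expressway Development Corporation, Guiyang 550003
Funding: Supported by the Natural Science Foundation of Hunan Province (06JJ50144) and the National Science Fund for Distinguished Young Scholars (60425310)
Related project: Engineering Systems and Control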

Keywords: multi-agent systems, reinforcement learning, Q-learning

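Abstract (translated from the Chinese):

For multi-agent systems in non-deterministic Markov environments, a new multi-agent Q-learning algorithm is proposed. The algorithm learns the behavior policies of the other agents from statistics of the joint actions, and uses the full probability distribution over the agents' policy vectors to ensure that the optimal joint action is selected. The convergence and learning performance of the algorithm are also analyzed. Its application to the multi-agent system RoboCup further demonstrates the effectiveness and generalization ability of the algorithm.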

Abstract (original English):

Due to the presence of other agents, the environment of a multi-agent system (MAS) cannot simply be treated as a Markov decision process (MDP). Current reinforcement learning algorithms, which are based on MDPs, must be adapted before they can be applied to MAS. Based on an agent's independent learning ability, this paper proposes a novel Q-learning algorithm for MAS in which an agent learns the other agents' action policies by observing the joint action. The policies of the other agents are expressed as action probability distribution matrices, and a concise yet useful updating method for these matrices is proposed. The full joint probability of the distribution matrices guarantees that the learning agent chooses its optimal action. The convergence and performance of the proposed algorithm are analyzed theoretically. When applied to RoboCup, our algorithm showed high learning efficiency and good generalization ability. Finally, we briefly point out some directions for multi-agent reinforcement learning.
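The idea described in the abstract, learning the other agents' policies from observed joint actions and then choosing the action that is best in expectation under those learned policies, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: this record does not give the update rules, so the sketch assumes a single other agent, a simple frequency-count estimate of its per-state action distribution, and a standard tabular Q-learning update over joint actions. The class name `JointActionQLearner`, its methods, and all parameter defaults are hypothetical.

```python
import random
from collections import defaultdict


class JointActionQLearner:
    """Tabular Q-learning over joint actions (own action, other agent's action).

    Assumption: the other agent's policy is estimated by counting how often
    each of its actions is observed in each state. This is an illustrative
    stand-in, not the matrix update rule proposed in the paper.
    """

    def __init__(self, n_own, n_other, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_own = n_own
        self.n_other = n_other
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.rng = random.Random(seed)
        # q[state][own_action][other_action], initialised to zero.
        self.q = defaultdict(lambda: [[0.0] * n_other for _ in range(n_own)])
        # counts[state][other_action], initialised to 1 (uniform prior).
        self.counts = defaultdict(lambda: [1] * n_other)

    def other_policy(self, state):
        """Estimated action distribution of the other agent in this state."""
        counts = self.counts[state]
        total = sum(counts)
        return [c / total for c in counts]

    def expected_values(self, state):
        """Expected Q-value of each own action under the estimated policy."""
        p = self.other_policy(state)
        q_s = self.q[state]
        return [sum(p[b] * q_s[a][b] for b in range(self.n_other))
                for a in range(self.n_own)]

    def act(self, state):
        """Epsilon-greedy choice over the expected values."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_own)
        values = self.expected_values(state)
        return max(range(self.n_own), key=values.__getitem__)

    def update(self, state, own, other, reward, next_state):
        """Record the observed joint action, then apply the Q-learning update."""
        self.counts[state][other] += 1
        target = reward + self.gamma * max(self.expected_values(next_state))
        self.q[state][own][other] += self.alpha * (target - self.q[state][own][other])
```

With `epsilon=0` the learner acts greedily. In a one-state coordination game where the other agent always plays action 1 and only matching actions are rewarded, repeated calls to `update` drive the estimated policy toward action 1, and `act` then selects action 1 as well.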

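Journal information

Acta Automatica Sinica (《自动化学报》)
Chinese Science and Technology Core Journal

Supervising body: Chinese Academy of Sciences
Sponsors: Chinese Association of Automation; Institute of Automation, Chinese Academy of Sciences
Editor-in-chief: Wang Feiyue (王飞跃)
Address: 16 Donghuangchenggen North Street, Beijing
Postal code: 100717
Email: aas@ia.ac.cn
Tel: 010-64019820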

International standard serial number: ISSN 0254-4156
Domestic serial number: CN 11-2109/TP
Postal distribution code: 2-180

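Awards:
National Outstanding Journal Award (1997); Second Prize, Chinese Academy of Sciences Outstanding Journal Award (1985, 1990, 1996, 2000); National Journal Award (2002)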

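Indexed in (domestic and international databases):
Mathematical Reviews (online edition, USA); Zentralblatt MATH (Germany); Scopus abstract and citation database (Netherlands); Engineering Index (USA); Japan Science and Technology Agency database (Japan); Chinese Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011, and 2014 editions)

Citations: 27550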