Feature Selection and Feature Learning for High-dimensional Batch Reinforcement Learning: A Survey

ISSN: 0254-4156
Journal: Acta Automatica Sinica (《自动化学报》)
Classification: TP181; TP391.4
Author affiliation: [1] State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences
Funding: Supported by the National Natural Science Foundation of China (Nos. 61034002, 61233001 and 61273140)

Authors: De-Rong Liu[1], Hong-Liang Li[1], Ding Wang[1]

Keywords: Intelligent control, reinforcement learning, adaptive dynamic programming, feature selection, feature learning, big data.

Abstract:

Tremendous amounts of data are being generated and saved in many complex engineering and social systems every day. It is significant and feasible to utilize big data to make better decisions with machine learning techniques. In this paper, we focus on batch reinforcement learning (RL) algorithms for discounted Markov decision processes (MDPs) with large discrete or continuous state spaces, aiming to learn the best possible policy given a fixed amount of training data. Batch RL algorithms with handcrafted feature representations work well for low-dimensional MDPs. However, for many real-world RL tasks, which often involve high-dimensional state spaces, it is difficult or even infeasible to use feature engineering methods to design features for value function approximation. To cope with high-dimensional RL problems, the desire to obtain data-driven features has led to much work on incorporating feature selection and feature learning into traditional batch RL algorithms. We provide a comprehensive survey of automatic feature selection and unsupervised feature learning for high-dimensional batch RL. Moreover, we present recent theoretical developments on applying statistical learning to establish finite-sample error bounds for batch RL algorithms based on weighted Lp norms. Finally, we outline some future directions in the research of RL algorithms, theories and applications.
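
To make the idea of feature selection in batch RL concrete, the following is a minimal sketch, not taken from the paper, of one approach of this general kind: least-squares temporal difference (LSTD) policy evaluation on a fixed batch of transitions, combined with a greedy, matching-pursuit-style rule that repeatedly adds the feature most correlated with the current temporal-difference residual. The function names (`lstd`, `greedy_td_selection`, `make_batch`), the synthetic data generator, and the small ridge term are all assumptions made for illustration.

```python
import numpy as np


def lstd(phi, phi_next, rewards, gamma, ridge=1e-6):
    """LSTD weights for a fixed batch; `ridge` is an assumed stabilizer."""
    a_mat = phi.T @ (phi - gamma * phi_next) + ridge * np.eye(phi.shape[1])
    b_vec = phi.T @ rewards
    return np.linalg.solve(a_mat, b_vec)


def greedy_td_selection(phi, phi_next, rewards, gamma, n_select):
    """Greedily pick `n_select` features by correlation with the TD residual."""
    n, d = phi.shape
    mask = np.zeros(d, dtype=bool)   # which features are currently selected
    weights = np.zeros(0)            # weights of the selected features
    for _ in range(n_select):
        # TD residual of the current sparse value estimate on the batch
        v = phi[:, mask] @ weights
        v_next = phi_next[:, mask] @ weights
        residual = rewards + gamma * v_next - v
        # Score every unselected feature by its correlation with the residual
        score = np.abs(phi.T @ residual) / n
        score[mask] = -np.inf
        mask[int(np.argmax(score))] = True
        # Refit LSTD using only the selected features
        weights = lstd(phi[:, mask], phi_next[:, mask], rewards, gamma)
    return np.flatnonzero(mask).tolist(), weights


def make_batch(n=5000, d=20, gamma=0.9, decay=0.5, seed=0):
    """Hypothetical synthetic batch whose true value function uses 2 of d features."""
    rng = np.random.default_rng(seed)
    w_true = np.zeros(d)
    w_true[0], w_true[3] = 2.0, -1.5
    phi = rng.standard_normal((n, d))
    noise = rng.standard_normal((n, d))
    phi_next = decay * phi + np.sqrt(1 - decay**2) * noise
    # Reward chosen so that V(s) = phi(s) @ w_true satisfies the Bellman equation
    rewards = (1 - gamma * decay) * (phi @ w_true)
    return phi, phi_next, rewards, w_true


if __name__ == "__main__":
    phi, phi_next, rewards, w_true = make_batch()
    selected, weights = greedy_td_selection(phi, phi_next, rewards, 0.9, 2)
    print("selected features:", selected)
    print("estimated weights:", weights)
```

On this synthetic batch the selection step has an easy job, because the features are independent and only two of them carry value information; in real high-dimensional tasks, correlated features and limited data make both the selection and the resulting error bounds considerably harder.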

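Journal information

Acta Automatica Sinica (《自动化学报》)
Chinese Science and Technology Core Journal

Supervising body: Chinese Academy of Sciences
Sponsors: Chinese Association of Automation; Institute of Automation, Chinese Academy of Sciences
Editor-in-chief: 王飞跃 (Fei-Yue Wang)
Address: 16 Donghuangchenggen North Street, Beijing
Postal code: 100717
Email: aas@ia.ac.cn
Telephone: 010-64019820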

International standard serial number: ISSN 0254-4156
Domestic serial number: CN 11-2109/TP
Postal distribution code: 2-180

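Awards:
National Excellent Journal Award (1997); Second Prize, Chinese Academy of Sciences Excellent Journal Award (1985, 1990, 1996, 2000); National Journal Award (2002)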

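Indexed in domestic and international databases:
Mathematical Reviews, online edition (USA); Zentralblatt MATH (Germany); Scopus abstract and citation database (Netherlands); Engineering Index (USA); Japan Science and Technology Agency database (Japan); Chinese Science and Technology Core Journals (China); Peking University Core Journals (China; 2000, 2004, 2008, 2011 and 2014 editions)

Citations: 27550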