Search Results - Inner Mongolia University Library

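Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016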
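
The rules above can be sketched as a small query builder. This is only an illustration: the helper names `term`, `combine`, and `group` are hypothetical and are not part of the library's search system. It assumes the search box accepts the field-code syntax exactly as written in the help text (uppercase operators with one space on each side).

```python
"""Hypothetical helpers for composing queries in the catalog's field-code syntax."""
from typing import Iterable

# Field codes and operators as listed in the help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return one field-qualified term, e.g. term("K", "optimal control")."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, parts: Iterable[str]) -> str:
    """Join sub-expressions with an uppercase operator padded by single spaces."""
    op = operator.upper()
    if op not in OPERATORS:
        raise ValueError(f"unknown operator: {operator!r}")
    return f" {op} ".join(parts)


def group(expression: str) -> str:
    """Wrap a sub-expression in parentheses, as the help examples do."""
    return f"({expression})"


# Rebuild Example 1 from the help text.
subjects = group(combine("OR", [term("K", "图书馆学"), term("K", "情报学")]))
example_one = combine("AND", [subjects, term("A", "范并思"), term("Y", "1982-2016")])
print(example_one)
```

Normalizing the operator to uppercase inside `combine` guards against the easiest mistake the help text warns about, a lowercase operator.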

Search query: "Any field = 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2014"

247 records in total; records 11-20 are shown below.

Nonparametric Infinite Horizon Kullback-Leibler Stochastic Control

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Pan, Yunpeng; Theodorou, Evangelos A.

Affiliation: Georgia Inst Technol, Daniel Guggenheim Sch Aerosp Engn, Atlanta, GA 30332, USA

ISBN (print): 9781479945528

We present two nonparametric approaches to Kullback-Leibler (KL) control, or linearly-solvable Markov decision problem (LMDP) based on Gaussian processes (GP) and Nystrom approximation. Compared to recently developed parametric methods, the proposed data-driven frameworks feature accurate function approximation and efficient on-line operations. Theoretically, we derive the mathematical connection of KL control based on dynamic programming with earlier work in control theory which relies on information theoretic dualities for the infinite time horizon case. Algorithmically, we give explicit optimal control policies in nonparametric forms, and propose on-line update schemes with budgeted computational costs. Numerical results demonstrate the effectiveness and usefulness of the proposed frameworks.

Keywords: dynamic programming

Adaptive Dynamic Programming for Discrete-time LQR Optimal Tracking Control Problems with Unknown Dynamics

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Liu, Yang; Luo, Yanhong; Zhang, Huaguang

Affiliation: Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China

ISBN (print): 9781479945528

In this paper, an optimal tracking control approach based on an adaptive dynamic programming (ADP) algorithm is proposed to solve the linear quadratic regulation (LQR) problems for unknown discrete-time systems in an online fashion. First, we convert the optimal tracking problem into designing an infinite-horizon optimal regulator for the tracking error dynamics based on the system transformation. Then we expand the error state equation by the history data of control and state. The iterative ADP algorithms of policy iteration (PI) and value iteration (VI) are introduced to solve the value function of the controlled system. It is shown that the proposed ADP algorithm solves the LQR without requiring any knowledge of the system dynamics. The simulation results show the convergence and effectiveness of the proposed control scheme.

Keywords: Digital control systems

Convergent Reinforcement Learning Control with Neural Networks and Continuous Action Search

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Lee, Minwoo; Anderson, Charles W.

Affiliation: Colorado State Univ, Dept Comp Sci, Ft Collins, CO 80523, USA

ISBN (print): 9781479945528

We combine a convergent TD-learning method and direct continuous action search with neural networks for function approximation to obtain both stability and generalization over inexperienced state-action pairs. We extend linear Greedy-GQ to nonlinear neural networks for convergent learning. Direct continuous action search with back-propagation leads to efficient high-precision control. A high dimensional continuous state and action problem, octopus arm control, is examined to test the proposed algorithm. Comparing TD, linear Greedy-GQ, and nonlinear Greedy-GQ, we discuss how the correction term contributes to learning with nonlinear Greedy-GQ algorithm and how continuous action search contributes to learning speed and stability.

Keywords: reinforcement learning

Data-Driven Partially Observable Dynamic Processes Using Adaptive Dynamic Programming

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Zhong, Xiangnan; Ni, Zhen; Tang, Yufei; He, Haibo

Affiliation: Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881, USA

ISBN (print): 9781479945528

Adaptive dynamic programming (ADP) has been widely recognized as one of the "core methodologies" to achieve optimal control for intelligent systems in Markov decision process (MDP). Generally, ADP control design requires all the information of the system dynamics. However, in many practical situations, the measured input and output data can only represent part of the system states. This means the complete information of the system cannot be available in many real-world cases, which narrows the range of application of the ADP design. In this paper, we propose a data-driven ADP method to stabilize the system with partially observable dynamics based on neural network techniques. A state network is integrated into the typical actor-critic architecture to provide an estimated state from the measured input/output sequences. The theoretical analysis and the stability discussion of this data-driven ADP method are also provided. Two examples are studied to verify our proposed method.

Keywords: Markov processes

Neural-Network-Based Adaptive Dynamic Surface Control for MIMO Systems with Unknown Hysteresis

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Liu, Lei; Wang, Zhanshan; Shen, Zhengwei

Affiliation: Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Liaoning, Peoples R China

ISBN (print): 9781479945528

This paper focuses on the composite adaptive tracking control for a class of nonlinear multiple-input-multiple-output (MIMO) systems with unknown backlash-like hysteresis nonlinearities. A dynamic surface control method is incorporated into the proposed control strategy to eliminate the problem of explosion of complexity. Compared with some existing methods, the prediction error between system state and serial-parallel estimation model is combined with compensated tracking error to construct the adaptive laws for neural network (NN) weights. It is shown that the proposed control approach can guarantee that all the signals of the resulting closed-loop systems are semi-globally uniformly ultimately bounded and the tracking error converges to a small neighborhood. Finally, simulation results are provided to confirm the effectiveness of the proposed approaches.

Keywords: dynamic surface control; prediction error; backlash-like hysteresis; adaptive neural network control

Theoretical Analysis of a Reinforcement Learning based Switching Scheme

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Author: Heydari, Ali

Affiliation: South Dakota Sch Mines & Technol, Dept Mech Engn, Rapid City, SD 57701, USA

ISBN (print): 9781479945528

A reinforcement learning based scheme for optimal switching with an infinite-horizon cost function is briefly proposed in this paper. Several theoretical questions are shown to arise regarding its convergence, optimality of the result, and continuity of the limit function, to be uniformly approximated using parametric function approximators. The main contribution of the paper is providing rigorous answers for the questions, where, sufficient conditions for convergence, optimality, and continuity are provided.

Keywords: function approximation; learning (artificial intelligence); infinite-horizon cost function; optimal switching; parametric function approximators; reinforcement learning based switching scheme; approximation methods; artificial neural networks; convergence; cost function; optimal control; schedules; switches; theoretical analysis

Model-Based Multi-Objective Reinforcement Learning

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Wiering, Marco A.; Withagen, Maikel; Drugan, Madalina M.

Affiliations: Univ Groningen, Inst Artificial Intelligence, NL-9700 AB Groningen, Netherlands; Vrije Univ Brussel, Artificial Intelligence Lab, Ixelles, Belgium

ISBN (print): 9781479945528

This paper describes a novel multi-objective reinforcement learning algorithm. The proposed algorithm first learns a model of the multi-objective sequential decision making problem, after which this learned model is used by a multi-objective dynamic programming method to compute Pareto optimal policies. The advantage of this model-based multi-objective reinforcement learning method is that once an accurate model has been estimated from the experiences of an agent in some environment, the dynamic programming method will compute all Pareto optimal policies. Therefore it is important that the agent explores the environment in an intelligent way by using a good exploration strategy. In this paper we have supplied the agent with two different exploration strategies and compare their effectiveness in estimating accurate models within a reasonable amount of time. The experimental results show that our method with the best exploration strategy is able to quickly learn all Pareto optimal policies for the Deep Sea Treasure problem.

Keywords: Pareto optimisation; decision making; dynamic programming; learning (artificial intelligence); Pareto optimal policies; deep sea treasure problem; model-based multiobjective reinforcement learning; multiobjective dynamic programming method; multiobjective sequential decision making problem; computational modeling; heuristic algorithms; Markov processes; vectors; exploration strategy; Markov chain; agents; optimal strategy

Model-free Q-learning over Finite Horizon for Uncertain Linear Continuous-time Systems

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Xu, Hao; Jagannathan, S.

Affiliations: Texas A&M Univ, Coll Sci & Engn, Corpus Christi, TX 78412, USA; Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO, USA

ISBN (print): 9781479945528

In this paper, a novel optimal control over finite horizon has been introduced for linear continuous-time systems by using adaptive dynamic programming (ADP). First, a new time-varying Q-function parameterization and its estimator are introduced. Subsequently, Q-function estimator is tuned online by using both Bellman equation in integral form and terminal cost. Eventually, near optimal control gain is obtained by using the Q-function estimator. All the closed-loop signals are shown to be bounded by using Lyapunov stability analysis where bounds are functions of initial conditions and final time while the estimated control signal converges close to the optimal value. The simulation results illustrate the effectiveness of the proposed scheme.

Keywords: adaptive dynamic programming (ADP); Q-learning; optimal control; Riccati equation; forward-in-time

Continuous-Time Differential Dynamic Programming with Terminal Constraints

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Sun, Wei; Theodorou, Evangelos A.; Tsiotras, Panagiotis

ISBN (print): 9781479945528

In this work, we revisit the continuous-time differential dynamic programming (DDP) approach for solving optimal control problems with terminal state constraints. We derive two algorithms, each for a different order of expansion of the system dynamics, and we investigate their performance in terms of their convergence speed. Compared to previous work, we provide a set of backward differential equations for the value function expansion by relaxing the assumption that the initial nominal control must be very close to the optimal control solution. We apply the derived algorithms to two classical optimal control problems, namely, the inverted pendulum and the Dreyfus rocket problem, and show the benefit of second order expansion.

Keywords: Inverted pendulum

Adaptive Fault Identification for a Class of Nonlinear Dynamic Systems

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Wu, Li-Bing; Ye, Dan; Zhao, Xin-Gang

Affiliations: Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China; Univ Sci & Technol Liaoning, Coll Sci, Anshan 114051, Liaoning, Peoples R China; Chinese Acad Sci, State Key Lab Robot, Shenyang 110016, Liaoning, Peoples R China; Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110016, Liaoning, Peoples R China

ISBN (print): 9781479945528

This paper is concerned with the diagnosis problem of actuator faults for a class of nonlinear systems. It is assumed that the upper bound of the Lipschitz constant of the nonlinearity in the faulty system is unknown. Then, a new nonlinear observer for fault diagnosis based on an adaptive estimator is proposed. Moreover, by making use of the designed adaptive observer with on-line update control law without sigma-modification condition to approximate the faulty system, it is proved that the estimation error of the adaptive control parameter, the output observation error and the error between the system fault and the corresponding estimate value are uniformly ultimately bounded via Lyapunov stability analysis. Finally, simulation examples are provided to illustrate the efficiency of the proposed fault identification approach.

Keywords: Errors

Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021