Search results - Inner Mongolia University Library

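Help

Field codes:

T=title (book title, article title), A=author (responsible party), K=subject term, P=publication name, PU=publisher name, O=organization (author affiliation, degree-granting institution, patent applicant), L=Chinese Library Classification number, C=subject classification number, U=all fields, Y=year (year of publication, degree year, standard issue year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject terms "library science" or "information science", author Fan Bingsi, years 1982-2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", years 2011-2016)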
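
The field codes and operator rules above can be illustrated with a small Python sketch. The helper names `field` and `combine` are hypothetical and are not part of the library's system; the sketch only assembles query strings in the documented syntax and does not send them anywhere.

```python
# Hypothetical helpers: they only format strings according to the rules above.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def field(code: str, value: str) -> str:
    """Return one CODE=value term, rejecting unknown field codes."""
    if code not in FIELD_CODES:
        raise ValueError(f"unknown field code: {code}")
    return f"{code}={value}"


def combine(operator: str, *terms: str, group: bool = False) -> str:
    """Join terms with an uppercase operator padded by one space on each side."""
    if operator not in OPERATORS:
        raise ValueError("operator must be uppercase AND, OR, or NOT")
    joined = f" {operator} ".join(terms)
    return f"({joined})" if group else joined


# Rebuild the first documented example from its parts.
subject = combine("OR", field("K", "图书馆学"), field("K", "情报学"), group=True)
query = combine("AND", subject, field("A", "范并思"), field("Y", "1982-2016"))
print(query)
```

Rejecting lowercase operators up front mirrors the rule that operators must be uppercase.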

Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"

1,020 records in total; showing records 591-600

Model-based multi-objective reinforcement learning

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Marco A. Wiering, Maikel Withagen, Mădălina M Drugan. Affiliations: Institute of Artificial Intelligence, University of Groningen, The Netherlands; Artificial Intelligence Lab, Vrije Universiteit Brussel, Belgium.

This paper describes a novel multi-objective reinforcement learning algorithm. The proposed algorithm first learns a model of the multi-objective sequential decision making problem, after which this learned model is used by a multi-objective dynamic programming method to compute Pareto optimal policies. The advantage of this model-based multi-objective reinforcement learning method is that once an accurate model has been estimated from the experiences of an agent in some environment, the dynamic programming method will compute all Pareto optimal policies. Therefore it is important that the agent explores the environment in an intelligent way by using a good exploration strategy. In this paper we have supplied the agent with two different exploration strategies and compare their effectiveness in estimating accurate models within a reasonable amount of time. The experimental results show that our method with the best exploration strategy is able to quickly learn all Pareto optimal policies for the Deep Sea Treasure problem.

Keywords: Computational modeling; Pareto optimization; Learning (artificial intelligence); Heuristic algorithms; Dynamic programming; Vectors; Markov processes
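
The notion of Pareto optimality used in the abstract above can be made concrete with a short Python sketch. It is not the authors' algorithm: it only filters a list of made-up two-objective returns down to the non-dominated ones, and the names `dominates` and `pareto_front` are hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def dominates(a: Vector, b: Vector) -> bool:
    """True if a is at least as good as b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(values: List[Vector]) -> List[Vector]:
    """Keep only the vectors that no other vector dominates."""
    return [v for v in values if not any(dominates(w, v) for w in values)]


# Illustrative (treasure value, negative time) returns for five policies.
returns = [(1.0, -1.0), (2.0, -3.0), (3.0, -5.0), (2.0, -4.0), (0.5, -2.0)]
front = pareto_front(returns)
print(front)
```

Here `(2.0, -4.0)` is dropped because `(2.0, -3.0)` reaches the same value sooner, and `(0.5, -2.0)` is dropped because `(1.0, -1.0)` is better in both objectives.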

Adaptive dynamic programming for discrete-time LQR optimal tracking control problems with unknown dynamics

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Yang Liu, Yanhong Luo, Huaguang Zhang. Affiliation: School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China.

ISBN: 9781479945511 (print)

In this paper, an optimal tracking control approach based on an adaptive dynamic programming (ADP) algorithm is proposed to solve linear quadratic regulation (LQR) problems for unknown discrete-time systems in an online fashion. First, we convert the optimal tracking problem into designing an infinite-horizon optimal regulator for the tracking error dynamics based on a system transformation. Then we expand the error state equation using the history data of control and state. Iterative ADP algorithms based on policy iteration (PI) and value iteration (VI) are introduced to solve for the value function of the controlled system. It is shown that the proposed ADP algorithm solves the LQR problem without requiring any knowledge of the system dynamics. The simulation results show the convergence and effectiveness of the proposed control scheme.

Keywords: Heuristic algorithms; Trajectory; Dynamic programming; Equations; Algorithm design and analysis; History; Optimal control
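
The value iteration mentioned in the abstract above can be sketched for the simplest case, a scalar discrete-time LQR problem. Unlike the paper's approach, this sketch assumes the dynamics `a` and `b` are known; it is the textbook Riccati recursion, not the authors' data-driven algorithm, and the function name is hypothetical.

```python
def lqr_value_iteration(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Value iteration for x[k+1] = a*x[k] + b*u[k] with stage cost
    q*x**2 + r*u**2. Returns (p, gain), where the value function is
    p*x**2 and the control law is u = -gain*x."""
    p = 0.0  # start from the zero value function
    for _ in range(max_iter):
        # Scalar Riccati update: one Bellman backup of the quadratic value.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    gain = a * b * p / (r + b * b * p)
    return p, gain


# Illustrative parameters (assumed, not from the paper).
p_star, gain = lqr_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
print(p_star, gain)
```

For `a = b = q = r = 1` the fixed point satisfies `p**2 - p - 1 = 0`, so `p_star` converges to the golden ratio, which gives a simple way to check the recursion.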

Near-optimality bounds for greedy periodic policies with application to grid-level storage

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Yuhai Hu, Boris Defourny. Affiliation: Department of Industrial & Systems Engineering, Lehigh University, USA.

This paper is concerned with periodic Markov Decision Processes, as a simplified but already rich model for nonstationary infinite-horizon problems involving seasonal effects. Considering the class of policies greedy for periodic approximate value functions, we establish improved near-optimality bounds for such policies, and derive a corresponding value-iteration algorithm suitable for periodic problems. The effectiveness of a parallel implementation of the algorithm is demonstrated on a grid-level storage control problem that involves stochastic electricity prices following a daily cycle.

Keywords: Silicon; Markov processes; Approximation algorithms; Approximation methods; Modeling; Electricity; Dynamic programming

Tunable and generic problem instance generation for multi-objective reinforcement learning

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Deon Garrett, Jordi Bieger, Kristinn R. Thórisson. Affiliation: Icelandic Institute for Intelligent Machines, Reykjavík University, Iceland.

A significant problem facing researchers in reinforcement learning, and particularly in multi-objective learning, is the dearth of good benchmarks. In this paper, we present a method and software tool enabling the creation of random problem instances, including multi-objective learning problems, with specific structural properties. This tool, called Merlin (for Multi-objective Environments for reinforcement learning), provides the ability to control these features in predictable ways, thus allowing researchers to begin to build a more detailed understanding about what features of a problem interact with a given learning algorithm to improve or degrade the algorithm's performance. We present this method and tool, and briefly discuss the controls provided by the generator, its supported options, and their implications on the generated benchmark instances.

Keywords: Learning (artificial intelligence); Correlation; Generators; Covariance matrices; Benchmark testing; Heuristic algorithms; Optimization

Continuous-time differential dynamic programming with terminal constraints

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Wei Sun, Evangelos A. Theodorou, Panagiotis Tsiotras. Affiliation: Mobile and Internet Systems Laboratory, University College Cork, Ireland.

In this work, we revisit the continuous-time differential dynamic programming (DDP) approach for solving optimal control problems with terminal state constraints. We derive two algorithms, each for a different order of expansion of the system dynamics, and we investigate their performance in terms of convergence speed. Compared to previous work, we provide a set of backward differential equations for the value function expansion by relaxing the assumption that the initial nominal control must be very close to the optimal control solution. We apply the derived algorithms to two classical optimal control problems, namely the inverted pendulum and the Dreyfus rocket problem, and show the benefit of second-order expansion.

Keywords: Optimal control; Heuristic algorithms; Differential equations; Equations; Convergence; Rockets; Trajectory

On-policy Q-learning for adaptive optimal control

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Sumit Kumar Jha, Shubhendu Bhasin. Affiliation: Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, India.

This paper presents a novel on-policy Q-learning approach for finding the optimal control policy online for continuous-time linear time-invariant (LTI) systems with completely unknown dynamics. The proposed result estimates the unknown parameters of the optimal control policy based on the fixed-point equation involving the Q-function. Gradient-based update laws, derived by minimizing the Bellman error, are used to achieve online adaptation of the parameters under a persistence of excitation condition. A novel asymptotically convergent state derivative estimator is presented to ensure that the proposed result is independent of knowledge of the system dynamics. Simulation results are presented to validate the theoretical development.

Keywords: Optimal control; Adaptation models; Convergence; Estimation error; Adaptive systems; Mathematical model; Equations

Using supervised training signals of observable state dynamics to speed-up and improve reinforcement learning

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Daniel L Elliott, Charles Anderson. Affiliation: Dept of Computer Science, Colorado State University.

A common complaint about reinforcement learning (RL) is that it is too slow to learn a value function that gives good performance. This issue is exacerbated in continuous state spaces. This paper presents a straightforward approach to speeding up and even improving RL solutions by reusing features learned during a pre-training phase prior to Q-learning. During pre-training, the agent is taught to predict the state change given a state/action pair. The effect of pre-training is examined using the model-free Q-learning approach but could readily be applied to a number of RL approaches, including model-based RL. The analysis of the results provides ample evidence that the features learned during pre-training are the reason behind the improved RL performance.

Keywords: Artificial neural networks; Data models; Training; Learning (artificial intelligence); Heuristic algorithms; Supervised learning; Computational modeling

Using approximate dynamic programming for estimating the revenues of a hydrogen-based high-capacity storage device

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Vincent François-Lavet, Raphael Fonteneau, Damien Ernst. Affiliation: Department of Electrical Engineering and Computer Science, University of Liège, Belgium.

This paper proposes a methodology to estimate the maximum revenue that can be generated by a company that operates a high-capacity storage device to buy or sell electricity on the day-ahead electricity market. The methodology exploits the dynamic programming (DP) principle and is specified for hydrogen-based storage devices that use electrolysis to produce hydrogen and fuel cells to generate electricity from hydrogen. Experimental results are generated using historical data of energy prices on the Belgian market. They show how the storage capacity and other parameters of the storage device influence the optimal revenue. The main conclusion drawn from the experiments is that it may be advisable to invest in large storage tanks to exploit the inter-seasonal price fluctuations of electricity.

Keywords: Electricity; Hydrogen; Fuel cells; Electrochemical processes; Hydrogen storage; Dynamic programming
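
The dynamic programming principle mentioned in the abstract above can be illustrated with a deliberately simplified backward recursion. This is a sketch under strong assumptions that are not from the paper: lossless storage, at most one unit bought or sold per period, and made-up prices. The function name `max_storage_revenue` is hypothetical.

```python
def max_storage_revenue(prices, capacity, initial_level=0):
    """Backward dynamic programming over (period, storage level).

    Assumptions (illustrative only): storage is lossless, one unit can be
    bought or sold per period, and energy left at the end has no value.
    """
    # value[level] = best revenue obtainable from the current period onward.
    value = [0.0] * (capacity + 1)
    for price in reversed(prices):
        new_value = []
        for level in range(capacity + 1):
            best = value[level]  # stay idle
            if level < capacity:
                best = max(best, -price + value[level + 1])  # buy one unit
            if level > 0:
                best = max(best, price + value[level - 1])  # sell one unit
            new_value.append(best)
        value = new_value
    return value[initial_level]


# Made-up hourly prices; buy at 10, sell at 50, buy at 20, sell at 60.
revenue = max_storage_revenue([30, 10, 50, 20, 60], capacity=1)
print(revenue)
```

Raising `capacity` in this toy model shows how storage size changes the attainable revenue, which is the kind of sensitivity the abstract describes.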

Neural-network-based adaptive dynamic surface control for MIMO systems with unknown hysteresis

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Lei Liu, Zhanshan Wang, Zhengwei Shen. Affiliation: College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China.

ISBN: 9781479945511 (print)

This paper focuses on the composite adaptive tracking control for a class of nonlinear multiple-input-multiple-output (MIMO) systems with unknown backlash-like hysteresis nonlinearities. A dynamic surface control method is incorporated into the proposed control strategy to eliminate the problem of explosion of complexity. Compared with some existing methods, the prediction error between system state and serial-parallel estimation model is combined with compensated tracking error to construct the adaptive laws for neural network (NN) weights. It is shown that the proposed control approach can guarantee that all the signals of the resulting closed-loop systems are semi-globally uniformly ultimately bounded and the tracking error converges to a small neighborhood. Finally, simulation results are provided to confirm the effectiveness of the proposed approaches.

Keywords: Hysteresis; Approximation methods; Adaptive systems; MIMO; Educational institutions; Nonlinear systems; Vectors

Convergent reinforcement learning control with neural networks and continuous action search

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Minwoo Lee, Charles W. Anderson. Affiliation: Department of Computer Science, Colorado State University, Fort Collins, CO, USA.

We combine a convergent TD-learning method and direct continuous action search with neural networks for function approximation to obtain both stability and generalization over inexperienced state-action pairs. We extend linear Greedy-GQ to nonlinear neural networks for convergent learning. Direct continuous action search with back-propagation leads to efficient high-precision control. A high-dimensional continuous state and action problem, octopus arm control, is examined to test the proposed algorithm. Comparing TD, linear Greedy-GQ, and nonlinear Greedy-GQ, we discuss how the correction term contributes to learning with the nonlinear Greedy-GQ algorithm and how continuous action search contributes to learning speed and stability.

Keywords: Function approximation; Neural networks; Approximation algorithms; Vectors; Learning (artificial intelligence); Legged locomotion

No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
