检索结果-内蒙古大学图书馆

您好，读者！请登录

内蒙古大学图书馆

首页
概况
党建
资源
服务
科研支持
- 论文收录引用证明
- 科技查新
知识产权
档案馆
帮助

咨询与建议

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

您的常用邮箱：*

您的手机号码：*

问题描述：

当前已输入0个字，您还可以输入200个字

全部搜索
期刊论文
图书
学位论文
标准
纸本馆藏
外文资源发现
数据库导航
超星发现

高级检索

分类表

所选分类

>> <<

限定检索结果

标题

标题
作者
主题词
出版物名称
出版社
机构
学科分类号
摘要
ISBN
ISSN
基金资助
索书号

作者

作者
标题
主题词
出版物名称
出版社
机构
学科分类号
摘要
ISBN
ISSN
基金资助
索书号

文献类型

748 篇 会议
271 篇 期刊文献
4 册 图书

馆藏范围

1,023 篇 电子文献
1 种 纸本馆藏

日期分布

学科分类号

712 篇 工学
- 520 篇 计算机科学与技术...
- 381 篇 电气工程
- 278 篇 控制科学与工程
- 153 篇 软件工程
- 79 篇 信息与通信工程
- 40 篇 交通运输工程
- 23 篇 仪器科学与技术
- 20 篇 机械工程
- 9 篇 生物工程
- 8 篇 电子科学与技术（可...
- 7 篇 力学（可授工学、理...
- 7 篇 土木工程
- 6 篇 动力工程及工程热...
- 6 篇 石油与天然气工程
- 4 篇 生物医学工程（可授...
- 3 篇 材料科学与工程（可...
- 3 篇 化学工程与技术
- 3 篇 航空宇航科学与技...
- 3 篇 安全科学与工程
118 篇 理学
- 98 篇 数学
- 32 篇 系统科学
- 22 篇 统计学（可授理学、...
- 10 篇 生物学
- 8 篇 物理学
- 4 篇 化学
66 篇 管理学
- 63 篇 管理科学与工程(可...
- 14 篇 工商管理
- 5 篇 图书情报与档案管...
5 篇 经济学
- 4 篇 应用经济学
3 篇 法学
- 3 篇 社会学
2 篇 医学
1 篇 教育学

主题

313 篇 reinforcement le...
216 篇 dynamic programm...
206 篇 optimal control
107 篇 adaptive dynamic...
104 篇 adaptive dynamic...
97 篇 learning
88 篇 neural networks
78 篇 heuristic algori...
68 篇 reinforcement le...
58 篇 learning (artifi...
54 篇 nonlinear system...
53 篇 convergence
51 篇 control systems
51 篇 mathematical mod...
48 篇 approximate dyna...
44 篇 approximation al...
43 篇 equations
42 篇 adaptive control
41 篇 artificial neura...
41 篇 cost function

机构

41 篇 chinese acad sci...
27 篇 univ rhode isl d...
17 篇 tianjin univ sch...
16 篇 univ sci & techn...
16 篇 univ illinois de...
15 篇 northeastern uni...
14 篇 beijing normal u...
13 篇 northeastern uni...
13 篇 guangdong univ t...
12 篇 northeastern uni...
9 篇 natl univ def te...
8 篇 ieee
8 篇 univ chinese aca...
7 篇 univ chinese aca...
7 篇 cent south univ ...
7 篇 southern univ sc...
7 篇 beijing univ tec...
6 篇 chinese acad sci...
6 篇 missouri univ sc...
5 篇 nanjing univ pos...

作者

54 篇 liu derong
37 篇 wei qinglai
29 篇 he haibo
22 篇 wang ding
21 篇 xu xin
19 篇 jiang zhong-ping
17 篇 lewis frank l.
17 篇 yang xiong
17 篇 zhang huaguang
17 篇 ni zhen
16 篇 zhao bo
15 篇 gao weinan
14 篇 zhao dongbin
13 篇 derong liu
13 篇 zhong xiangnan
12 篇 si jennie
10 篇 jagannathan s.
10 篇 dongbin zhao
10 篇 song ruizhuo
9 篇 abouheaf mohamme...

语言

992 篇 英文
25 篇 其他
6 篇 中文

检索条件"任意字段=IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"

共 1023 条记录，以下是711-720 订阅

全选清除本页清除全部题录导出标记到"检索档案"

详细简洁

排序：

相关度排序

相关度排序
时效性降序
时效性升序

adaptive sample collection using active learning for kernel-based approximate policy iteration

Adaptive sample collection using active learning for kernel-...

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Liu, Chunming Xu, Xin Haiyun Hu Dai, Bin College of Mechatronics and Automation National University of Defense Technology Changsha 410073 China

ISBN: (纸本)9781424498888

Approximate policy iteration (API) has been shown to be a class of reinforcement learning methods with stability and sample efficiency. However, sample collection is still an open problem which is critical to the performance of API methods. In this paper, a novel adaptive sample collection strategy using active learning-based exploration is proposed to enhance the performance of kernel-based API. In this strategy, an online kernel-based least squares policy iteration (KLSPI) method is adopted to construct nonlinear features and approximate the Q-function simultaneously. Therefore, more representative samples can be obtained for value function approximation. Simulation results on typical learning control problems illustrate that by using the proposed strategy, the performance of KLSPI can be improved remarkably. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

Bayesian active learning with basis functions

Bayesian active learning with basis functions

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Ryzhov, Ilya O. Powell, Warren B. Operations Research and Financial Engineering Princeton University Princeton NJ 08544 United States

ISBN: (纸本)9781424498888

A common technique for dealing with the curse of dimensionality in approximate dynamic programming is to use a parametric value function approximation, where the value of being in a state is assumed to be a linear combination of basis functions. Even with this simplification, we face the exploration/exploitation dilemma: an inaccurate approximation may lead to poor decisions, making it necessary to sometimes explore actions that appear to be suboptimal. We propose a Bayesian strategy for active learning with basis functions, based on the knowledge gradient concept from the optimal learning literature. The new method performs well in numerical experiments conducted on an energy storage problem. © 2011 ieee.

关键词： dynamic programming

来源：评论

学校读者我要写书评

暂无评论

Feedback controller parameterizations for reinforcement learning

Feedback controller parameterizations for Reinforcement Lear...

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Roberts, John W. Manchester, Ian R. Tedrake, Russ CSAIL MIT Cambridge MA 02139 United States

ISBN: (纸本)9781424498888

reinforcement learning offers a very general framework for learning controllers, but its effectiveness is closely tied to the controller parameterization used. Especially when learning feedback controllers for weakly stable systems, ineffective parameterizations can result in unstable controllers and poor performance both in terms of learning convergence and in the cost of the resulting policy. In this paper we explore four linear controller parameterizations in the context of REINFORCE, applying them to the control of a reaching task with a linearized flexible manipulator. We find that some natural but naive parameterizations perform very poorly, while the Youla Parameterization (a popular parameterization from the controls literature) offers a number of robustness and performance advantages. © 2011 ieee.

关键词： Parameterization

来源：评论

学校读者我要写书评

暂无评论

Grounding subgoals in information transitions

Grounding subgoals in information transitions

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Van Dijk, Sander G. Polani, Daniel Adaptive Systems Research Group University of Hertfordshire Hatfield United Kingdom

ISBN: (纸本)9781424498888

In reinforcement learning problems, the construction of subgoals has been identified as an important step to speed up learning and to enable skill transfer. For this purpose, one typically extracts states from various saliency properties of an MDP transition graph, most notably bottleneck states. Here we introduce an alternative approach to this problem: assuming a family of MDPs with multiple goals but with a fixed transition graph, we introduce the relevant goal information as the amount of Shannon information that the agent needs to maintain about the current goal at a given state to select the appropriate action. We show that there are distinct transition states in the MDP at which new relevant goal information has to be considered for selecting the next action. We argue that these transition states can be interpreted as subgoals for the current task class, and we use these states to automatically create a hierarchical policy, according to the well-established Options model for hierarchical reinforcement learning. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

Protecting against evaluation overfitting in empirical reinforcement learning

Protecting against evaluation overfitting in empirical reinf...

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Whiteson, Shimon Tanner, Brian Taylor, Matthew E. Stone, Peter Informatics Institute University of Amsterdam Netherlands Department of Computing Science University of Alberta Canada Department of Computer Science Lafayette College United States Department of Computer Science University of Texas Austin United States

ISBN: (纸本)9781424498888

Empirical evaluations play an important role in machine learning. However, the usefulness of any evaluation depends on the empirical methodology employed. Designing good empirical methodologies is difficult in part because agents can overfit test evaluations and thereby obtain misleadingly high scores. We argue that reinforcement learning is particularly vulnerable to environment overfitting and propose as a remedy generalized methodologies, in which evaluations are based on multiple environments sampled from a distribution. In addition, we consider how to summarize performance when scores from different environments may not have commensurate values. Finally, we present proof-of-concept results demonstrating how these methodologies can validate an intuitively useful range-adaptive tile coding method. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

Active exploration by searching for experiments that falsify the computed control policy

Active exploration by searching for experiments that falsify...

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Fonteneau, Raphael Murphy, Susan A. Wehenkel, Louis Ernst, Damien Department of Electrical Engineering and Computer Science University of Liège Belgium Department of Statistics University of Michigan United States

ISBN: (纸本)9781424498888

We propose a strategy for experiment selection - in the context of reinforcement learning - based on the idea that the most interesting experiments to carry out at some stage are those that are the most liable to falsify the current hypothesis about the optimal control policy. We cast this idea in a context where a policy learning algorithm and a model identification method are given a priori. Experiments are selected if, using the learnt environment model, they are predicted to yield a revision of the learnt control policy. Algorithms and simulation results are provided for a deterministic system with discrete action space. They show that the proposed approach is promising. © 2011 ieee.

关键词： learning algorithms

来源：评论

学校读者我要写书评

暂无评论

Approximate reinforcement learning: An overview

Approximate reinforcement learning: An overview

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Buşoniu, Lucian Ernst, Damien De Schutter, Bart Babuška, Robert Delft Center for Systems and Control Delft Univ. of Technology Netherlands Research Associate of the FRS-FNRS Systems and Modeling Unit University of Liège Liège Belgium

ISBN: (纸本)9781424498888

reinforcement learning (RL) allows agents to learn how to optimally interact with complex environments. Fueled by recent advances in approximation-based algorithms, RL has obtained impressive successes in robotics, artificial intelligence, control, operations research, etc. However, the scarcity of survey papers about approximate RL makes it difficult for newcomers to grasp this intricate field. With the present overview, we take a step toward alleviating this situation. We review methods for approximate RL, starting from their dynamic programming roots and organizing them into three major classes: approximate value iteration, policy iteration, and policy search. Each class is subdivided into representative categories, highlighting among others offline and online algorithms, policy gradient methods, and simulation-based techniques. We also compare the different categories of methods, and outline possible ways to enhance the reviewed algorithms. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

dynamic lead time promising

Dynamic lead time promising

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Reindorp, Matthew J. Fu, Michael C. Department of Industrial Engineering and Innovation Sciences Eindhoven University of Technology Netherlands Robert H. Smith School of Business Institute for Systems Research University of Maryland United States

ISBN: (纸本)9781424498888

We consider a make-to-order business that serves customers in multiple priority classes. Orders from customers in higher classes bring greater revenue, but they expect shorter lead times than customers in lower classes. In making lead time promises, the firm must recognize preexisting order commitments, uncertainty over future demand from each class, and the possibility of supply chain disruptions. We model this scenario as a Markov decision problem and use reinforcement learning to determine the firm's lead time policy. In order to achieve tractability on large problems, we utilize a sequential decision-making approach that effectively allows us to eliminate one dimension from the state space of the system. Initial numerical results from the sequential dynamic approach suggest that the resulting policies more closely approximate optimal policies than static optimization approaches. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

On learning with imperfect representations

On learning with imperfect representations

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Kalyanakrishnan, Shivaram Stone, Peter Department of Computer Science University of Texas at Austin 1616 Guadalupe St Austin TX 78701 United States

ISBN: (纸本)9781424498888

In this paper we present a perspective on the relationship between learning and representation in sequential decision making tasks. We undertake a brief survey of existing real-world applications, which demonstrates that the classical tabular representation seldom applies in practice. Specifically, several practical tasks suffer from state aliasing, and most demand some form of generalization and function approximation. Coping with these representational aspects thus becomes an important direction for furthering the advent of reinforcement learning in practice. The central thesis we present in this position paper is that in practice, learning methods specifically developed to work with imperfect representations are likely to perform better than those developed for perfect representations and then applied in imperfect- representation settings. We specify an evaluation criterion for learning methods in practice, and propose a framework for their synthesis. In particular, we highlight the degrees of representational bias prevalent in different learning methods. We reference a variety of relevant literature as a background for this introspective essay. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

Application of reinforcement learning-based algorithms in CO2 allowance and electricity markets

Application of reinforcement learning-based algorithms in CO...

引用

ieee symposium on adaptive dynamic programming and reinforcement learning

作者： Nanduri, Vishnuteja Department of Industrial and Manufacturing Engineering University of Wisconsin-Milwaukee Milwaukee WI 53211 United States

ISBN: (纸本)9781424498888

Climate change is one of the most important challenges faced by the world this century. In the U.S., the electric power industry is the largest emitter of CO2, contributing to the climate crisis. Federal emissions control bills in the form of cap-and-trade programs are currently idling in the U.S. Congress. In the mean time, ten states in the northeastern U.S. have adopted a regional cap-and-trade program to reduce CO2 levels and also to increase investments in cleaner technologies. Many of the states in which the cap-and-trade programs are active operate under a restructured market paradigm, where generators compete to supply power. This research presents a bi-level game-theoretic model to capture competition between generators in cap-and-trade markets and restructured electricity markets. The solution to the game-theoretic model is obtained using a reinforcement learning based algorithm. © 2011 ieee.

关键词： reinforcement learning

来源：评论

学校读者我要写书评

暂无评论

没有更多数据了...

全选清除本页清除全部题录导出标记到“检索档案”

共103页 << < 68 69 70 71 72 73 74 75 76 77 > >>

检索报告对象比较合并检索0

隐藏清空

合并搜索

回到顶部

执行限定条件

内容：

评分：

请选择保存的检索档案：

请选择收藏分类：

订阅名称：

通借通还

温馨提示：

图书名称：

借书校区：

取书校区：

手机号码：

邮箱地址：

一卡通帐号：

电话和邮箱必须正确填写，我们会与您联系确认。

联系人：

所在院系：

联系邮箱：

联系电话：

内蒙古自治区呼和浩特市赛罕区大学西街235号邮编: 010021

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：