
Refine Search Results

Document Type

  • 140 conference papers
  • 7 journal articles

Collection

  • 147 electronic documents
  • 0 print holdings

Date Distribution

Subject Classification

  • 71 papers: Engineering
    • 66 papers: Computer Science and Technology...
    • 15 papers: Software Engineering
    • 11 papers: Electrical Engineering
    • 10 papers: Control Science and Engineering
    • 2 papers: Instrument Science and Technology
    • 2 papers: Information and Communication Engineering
    • 1 paper: Mechanics (degree in Engineering or Sci...
    • 1 paper: Mechanical Engineering
    • 1 paper: Architecture
  • 11 papers: Science
    • 10 papers: Mathematics
    • 2 papers: Systems Science
    • 2 papers: Statistics (degree in Science or...
  • 7 papers: Management
    • 6 papers: Management Science and Engineering (de...
    • 3 papers: Business Administration
    • 1 paper: Library, Information and Archival Man...
  • 3 papers: Economics
    • 3 papers: Applied Economics

Topics

  • 76 papers: dynamic programm...
  • 39 papers: learning
  • 26 papers: optimal control
  • 25 papers: reinforcement le...
  • 15 papers: function approxi...
  • 15 papers: control systems
  • 14 papers: approximation al...
  • 14 papers: equations
  • 13 papers: neural networks
  • 13 papers: stochastic proce...
  • 12 papers: convergence
  • 10 papers: state-space meth...
  • 10 papers: cost function
  • 9 papers: mathematical mod...
  • 8 papers: trajectory
  • 8 papers: approximation me...
  • 7 papers: approximate dyna...
  • 7 papers: algorithm design...
  • 7 papers: adaptive control
  • 7 papers: heuristic algori...

Institutions

  • 4 papers: school of inform...
  • 4 papers: department of in...
  • 3 papers: department of el...
  • 3 papers: northeastern uni...
  • 3 papers: univ texas autom...
  • 3 papers: arizona state un...
  • 3 papers: robotics institu...
  • 3 papers: univ illinois de...
  • 2 papers: princeton univ d...
  • 2 papers: national science...
  • 2 papers: college of mecha...
  • 2 papers: key laboratory o...
  • 2 papers: univ utrecht dep...
  • 2 papers: department of op...
  • 1 paper: inria
  • 1 paper: computational le...
  • 1 paper: school of automa...
  • 1 paper: univ cincinnati ...
  • 1 paper: toyota technol c...
  • 1 paper: neuroinformatics...

Authors

  • 5 papers: liu derong
  • 4 papers: xu xin
  • 4 papers: martin riedmille...
  • 4 papers: huaguang zhang
  • 4 papers: marco a. wiering
  • 4 papers: zhang huaguang
  • 4 papers: si jennie
  • 4 papers: derong liu
  • 3 papers: hado van hasselt
  • 3 papers: lewis frank l.
  • 3 papers: dongbin zhao
  • 3 papers: powell warren b.
  • 3 papers: warren b. powell
  • 3 papers: riedmiller marti...
  • 2 papers: manuel loth
  • 2 papers: van hasselt hado
  • 2 papers: preux philippe
  • 2 papers: hu dewen
  • 2 papers: jennie si
  • 2 papers: philippe preux

Language

  • 142 papers: English
  • 5 papers: Other

Search query: "Any field = 2007 IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning, ADPRL 2007"
147 records; showing 101-110
Toward effective combination of off-line and on-line training in ADP framework
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Danil Prokhorov (Toyota Technical Center, Ann Arbor, MI, USA)
We are interested in finding the most effective combination between off-line and on-line/real-time training in approximate dynamic programming. We introduce our approach of combining proven off-line methods of trainin...
An Optimal ADP Algorithm for a High-Dimensional Stochastic Control Problem
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Juliana Nascimento, Warren Powell (Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA)
We propose a provably optimal approximate dynamic programming algorithm for a class of multistage stochastic problems, taking into account that the probability distribution of the underlying stochastic process is not ...
Reinforcement Learning in Continuous Action Spaces
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Hado van Hasselt, Marco A. Wiering (Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands)
A fair amount of research has been done on reinforcement learning in continuous environments, but research on problems where the actions can also be chosen from a continuous space is much more limited. We present a new ...
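The continuous-action setting described in the abstract above can be illustrated with a small actor-critic loop: a continuous action is drawn with Gaussian exploration around the actor's output, and the actor is nudged toward actions that produced a positive temporal-difference error. This is a minimal toy sketch under assumptions of ours (linear features, a hypothetical 1-D point-mass task), not the algorithm from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D toy task: a continuous action nudges the state; the
# reward is the negative squared next state, so the origin is optimal.
def step(state, action):
    next_state = state + 0.1 * action
    return next_state, -next_state ** 2

def features(s):
    return np.array([s, 1.0])          # tiny linear feature vector

v_w = np.zeros(2)                      # critic weights: V(s) = v_w . phi(s)
a_w = np.zeros(2)                      # actor weights: mean action = a_w . phi(s)
alpha, gamma, sigma = 0.05, 0.95, 0.3

state = 1.0
for _ in range(2000):
    phi = features(state)
    mean_action = float(a_w @ phi)
    # Gaussian exploration around the actor's mean, clipped to a safe range
    action = float(np.clip(mean_action + sigma * rng.normal(), -1.0, 1.0))
    next_state, reward = step(state, action)
    delta = reward + gamma * (v_w @ features(next_state)) - v_w @ phi
    v_w += alpha * delta * phi         # TD(0) critic update
    if delta > 0:                      # the explored action beat expectations:
        a_w += alpha * (action - mean_action) * phi   # move the mean toward it
    state = next_state
```

Updating the actor only on positive TD errors is one simple way to handle continuous actions without differentiating through the critic.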
The Effect of Bootstrapping in Multi-Automata Reinforcement Learning
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Maarten Peeters, Katja Verbeeck, Ann Nowe (Computational Modeling Laboratory, Vrije Universiteit Brussel, Brussels, Belgium)
Learning automata have been shown to be an excellent tool for creating learning multi-agent systems. Most algorithms used in current automata research expect the environment to end in an explicit end-stage. In this end-stag...
Kernelizing LSPE(λ)
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Tobias Jung (University of Mainz, Germany), Daniel Polani (University of Hertfordshire, UK)
We propose the use of kernel-based methods as the underlying function approximator in the least-squares based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular we present the 'kernelization' of m...
Optimal control applied to Wheeled Mobile Vehicles
IEEE International Symposium on Intelligent Signal Processing
Authors: Gomez, M.; Martinez, T.; Sanchez, S.; Meziat, D. (Univ Alcala, Escuela Politecn Super, Dept Automat, Alcala De Henares, Spain; Univ Alicante, Escuela Politecn Super, Ingn Sistemas Teoria Señal, Dept Fis, Alicante, Spain)
The goal of the work described in this paper is to develop a particular optimal control technique based on a Cell Mapping technique in combination with the Q-learning reinforcement learning method to control wheeled ...
Particle Swarm Optimized Adaptive Dynamic Programming
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Dongbin Zhao, Jianqiang Yi, Derong Liu (Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL, USA)
Particle swarm optimization is used for the training of the action network and critic network of the adaptive dynamic programming approach. The typical structures of the adaptive dynamic programming and particle swarm...
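The abstract above pairs particle swarm optimization with the action and critic networks of ADP. The core mechanic, a swarm searching a weight space while tracking personal and global bests, can be sketched on a stand-in least-squares objective; the data, weights, and hyperparameters here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy objective standing in for a critic-network training loss:
# find weights w so that a linear model matches target outputs y.
X = rng.normal(size=(50, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true

def loss(w):
    return float(np.mean((X @ w - y) ** 2))

# Basic global-best particle swarm optimization over the weight space.
n_particles, dim, iters = 20, 3, 200
pos = rng.normal(size=(n_particles, dim))   # particle positions = weight vectors
vel = np.zeros_like(pos)
pbest = pos.copy()                          # each particle's personal best
pbest_val = np.array([loss(p) for p in pos])
gbest = pbest[np.argmin(pbest_val)].copy()  # swarm's global best

w_inertia, c1, c2 = 0.7, 1.5, 1.5
for _ in range(iters):
    r1 = rng.random((n_particles, dim))
    r2 = rng.random((n_particles, dim))
    vel = w_inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([loss(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved] = pos[improved]
    pbest_val[improved] = vals[improved]
    gbest = pbest[np.argmin(pbest_val)].copy()
```

Because PSO needs only loss evaluations, not gradients, the same loop applies unchanged when the "network" being trained is not differentiable.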
Identifying trajectory classes in dynamic tasks
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Stuart O. Anderson, Siddhartha S. Srinivasa (Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA; Intel Research Pittsburgh, Intel Corporation, Pittsburgh, PA, USA)
Using domain knowledge to decompose difficult control problems is a widely used technique in robotics. Previous work has automated the process of identifying some qualitative behaviors of a system, finding a decomposi...
Model-Based Reinforcement Learning in Factored-State MDPs
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Alexander L. Strehl (Department of Computer Science, Rutgers University, Piscataway, NJ, USA)
We consider the problem of learning in a factored-state Markov decision process that is structured to allow a compact representation. We show that the well-known algorithm, factored Rmax, performs near-optimally on al...
Q-Learning with Continuous State Spaces and Finite Decision Set
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Kengy Barty, Pierre Girardeau, Jean-Sebastien Roy, Cyrille Strugarek (EDF Research and Development, Clamart, France)
This paper aims to present an original technique to compute the optimal policy of a Markov decision problem with a continuous state space and discrete decision variables. We propose an extension of the Q-learni...