Search Results - Inner Mongolia University Library

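Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = institution (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject term "library science" or "information science", author Fan Bingsi, years 1982-2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", any field containing C++ or Basic, subject term not Visual, years 2011-2016)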
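As a rough illustration of the syntax above, the following Python sketch assembles an expression from field codes and operators. The helper names `term`, `any_of`, and `all_of` are hypothetical and are not part of the library's search interface; only the field codes and the uppercase, space-delimited operators come from the help text.

```python
# Hypothetical helpers for composing advanced-search expressions.
# Only the field codes and the operator spelling come from the help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}={value}"


def any_of(*terms: str) -> str:
    """Join terms with OR (one space on each side) and parenthesize them."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND (one space on each side)."""
    return " AND ".join(terms)


# Rebuild the first example from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Validating the field code up front is a design choice of this sketch; the search form itself may handle unknown codes differently.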

Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"

1,023 records in total; showing records 851-860

Ramp metering based on on-line ADHDP (λ) controller

International Joint Conference on Neural Networks (IJCNN)

Authors: Xuerui Bai; Dongbin Zhao; Jianqiang Yi; Jing Xu
Affiliations: Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Arizona, Tucson, USA

Increasing dependence on car-based travel has led to the daily occurrence of freeway congestion around the world. To relieve the worsening congestion and the problems it brings, an effective, fast, and robust method is needed. Ramp metering has been developed as a traffic management strategy to alleviate congestion on freeways, but it doesn't work well under uncertainty. In this paper, to handle uncertain conditions, an on-line learning control method based on the fundamental principle of reinforcement learning is proposed. The method is adaptive dynamic programming (ADP), and eligibility traces are introduced to speed up learning. Eligibility traces and ADP are then combined into a new traffic-responsive control method, called action-dependent heuristic dynamic programming based on eligibility traces (ADHDP(λ)). ADHDP(λ) is an approximate optimal ramp metering method. Simulation studies on a hypothetical freeway indicate good control performance of the proposed real-time traffic controller.

Keywords: Traffic control; Artificial neural networks; dynamic programming; Training; Vehicles; Equations; learning
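To make the eligibility-trace idea in this abstract concrete, here is a minimal sketch of tabular TD(λ) with accumulating traces. It is not the paper's neural-network ADHDP(λ) ramp-metering controller: the deterministic chain environment, the function name `td_lambda_chain`, and all parameter values are assumptions chosen only to show how a trace spreads each temporal-difference error back over recently visited states.

```python
def td_lambda_chain(n_states=5, gamma=0.9, lam=0.8, alpha=0.1, episodes=1000):
    """Tabular TD(lambda) with accumulating traces on a deterministic chain.

    The agent always steps from state s to s + 1. Leaving the last state
    ends the episode with reward 1; every other reward is 0.
    """
    values = [0.0] * n_states
    for _ in range(episodes):
        traces = [0.0] * n_states
        for s in range(n_states):
            terminal = s == n_states - 1
            reward = 1.0 if terminal else 0.0
            next_value = 0.0 if terminal else values[s + 1]
            td_error = reward + gamma * next_value - values[s]
            traces[s] += 1.0  # mark the state that was just visited
            for i in range(n_states):
                values[i] += alpha * td_error * traces[i]
                traces[i] *= gamma * lam  # older states get less credit
    return values


values = td_lambda_chain()
print([round(v, 4) for v in values])
```

In this toy chain the exact value of state `s` is `gamma ** (n_states - 1 - s)`, so the learned table can be checked directly against it.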

A biologically-inspired computational model for transformation invariant target recognition

International Joint Conference on Neural Networks (IJCNN)

Authors: Khan M. Iftekharuddin; Yaqin Li
Affiliation: Intelligence System and Image Processing Lab, Department of Electrical and Computer Engineering, University of Memphis, Memphis, TN, USA

Transformation-invariant image recognition has been an active research area due to its widespread applications in a variety of fields such as military operations, robotics, medical practices, geographic scene analysis, and many others. One of the primary challenges is the detection and recognition of objects in the presence of transformations such as resolution, rotation, translation, scale and occlusion. In this work, we investigate a biologically-inspired computational modeling approach that exploits reinforcement learning (RL) for transformation-invariant image recognition. The RL is implemented in an adaptive critic design (ACD) framework to approximate neuro-dynamic programming. Two ACD algorithms, heuristic dynamic programming (HDP) and dual heuristic dynamic programming (DHP), are investigated and compared for transformation-invariant recognition. The two learning algorithms are evaluated statistically using simulated transformations in 2-D images as well as with a large-scale UMIST 2-D face database with pose variations. Our simulations show promising results for both HDP and DHP for transformation-invariant image recognition as well as face authentication. Comparing the two algorithms, DHP outperforms HDP in learning capability, as DHP generally takes fewer steps to perform a successful recognition task. On the other hand, HDP is more robust than DHP in terms of success rate across the database when applied in a stochastic and uncertain environment, and the computational complexity of HDP is much lower.

Keywords: Artificial neural networks; Visualization; Biology; Biological system modeling; Image recognition; dynamic programming; Image resolution

An approximate dynamic programming strategy for responsive traffic signal control

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Cai, Chen
Affiliation: Univ Coll London, Ctr Transport Studies, London WC1E 6BT, England

ISBN: (print) 9781424407064

This paper proposes an approximate dynamic programming strategy for responsive traffic signal control. It is the first attempt to optimize the signal control objective dynamically through adaptive approximation of the value function. The proposed value function approximation is separable and independent of exogenous factors. The algorithm updates the approximated value function progressively in operation, while preserving the structural property of the control problem. The convergence and performance of the algorithm have been tested in a range of experiments. It is concluded that the new strategy is as good as the best existing control strategies while being efficient and simple in computation. It also has the potential to be extended to multi-phase signal control at isolated junctions and to decentralized network operation.

Keywords: dynamic programming; Traffic control; Function approximation; Communication system traffic control; adaptive control; Roads; learning; Testing; Delay; Vehicle safety

Particle swarm optimized adaptive dynamic programming

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Dongbin Zhao; Jianqiang Yi; Liu, Derong
Affiliations: Chinese Acad Sci, Inst Automat, Key Lab Complex Syst & Intelligence Sci, Beijing 100080, Peoples R China; Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607, USA

ISBN: (print) 9781424407064

Particle swarm optimization is used to train the action network and critic network of the adaptive dynamic programming approach. The typical structures of adaptive dynamic programming and particle swarm optimization are adopted for comparison with other learning algorithms such as the gradient descent method. Besides simulation on the balancing of a cart-pole plant, a more complex plant, the pendulum robot (pendubot), is used to test the learning performance. Compared to traditional adaptive dynamic programming approaches, the proposed evolutionary learning strategy is shown to converge faster and to be more efficient. Furthermore, the structure becomes simpler because the plant model does not need to be identified beforehand.

Keywords: adaptive dynamic programming; Particle swarm optimization; Pendubot; Pole balancing
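For readers unfamiliar with the optimizer named in this abstract, the following is a minimal sketch of a generic particle swarm optimizer. It minimizes a simple test function rather than training action and critic networks as the paper does; the function names, the inertia and acceleration coefficients, and the `sphere` cost are all illustrative assumptions.

```python
import random


def pso_minimize(cost, dim, n_particles=30, iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5, span=5.0):
    """Minimize `cost` over `dim` real variables with a basic particle swarm."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-span, span) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's best position
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum with pulls toward both bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


def sphere(x):
    """Toy cost with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, dim=2)
print(best_x, best_f)
```

Because the update needs only cost evaluations and no gradients, the same loop could in principle search over network weights, which is the role the abstract describes for it.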

Using ADP to understand and replicate brain intelligence: the next level design

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Werbos, Paul J.
Affiliation: Natl Sci Fdn, Arlington, VA 22203, USA

ISBN: (print) 9781424407064

Since the 1960s I have proposed that we could understand and replicate the highest level of intelligence seen in the brain by building ever more capable and general systems for adaptive dynamic programming (ADP), which is like "reinforcement learning" but based on approximating the Bellman equation and allowing the controller to know its utility function. Growing empirical evidence on the brain supports this approach. Adaptive critic systems now meet tough engineering challenges and provide a kind of first-generation model of the brain. Lewis, Prokhorov and I have early second-generation work. Mammal brains possess three core capabilities, creativity/imagination and ways to manage spatial and temporal complexity, even beyond the second generation. This paper reviews previous progress and describes new tools and approaches to overcome the spatial complexity gap.

Keywords: dynamic programming

Discrete-time adaptive dynamic programming using wavelet basis function neural networks

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Jin, Ning; Liu, Derong; Huang, Ting; Pang, Zhongyu
Affiliation: Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607, USA

ISBN: (print) 9781424407064

Dynamic programming for discrete-time systems is difficult due to the "curse of dimensionality": one has to find a series of control actions that must be taken in sequence, hoping that this sequence will lead to the optimal performance cost, but the total cost of those actions will be unknown until the end of that sequence. In this paper, we present our work on adaptive dynamic programming (ADP) for nonlinear discrete-time systems using neural networks. The neural network adopted here is the wavelet basis function (WBF) neural network. We examine the performance of an ADP algorithm using WBF neural networks. The comparison shows that when WBF neural networks are employed, the ADP algorithm trains faster than when PBF neural networks are employed.

Keywords: dynamic programming

Using reward-weighted regression for reinforcement learning of task space control

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Peters, Jan; Schaal, Stefan
Affiliation: Univ So Calif, Los Angeles, CA 90089, USA

ISBN: (print) 9781424407064

Many robot control problems of practical importance, including task or operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.

Keywords: reinforcement learning
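The reduction described in this abstract, from immediate-reward learning to a weighted regression, can be sketched on a toy problem. The code below is not the authors' algorithm: it fits a scalar linear policy, uses a fixed exponential reward transformation with an assumed constant `beta` instead of the adaptive transformation the abstract mentions, and the target gain, noise level, and sample counts are all invented for illustration.

```python
import math
import random


def reward_weighted_regression(target_gain=2.0, beta=5.0, sigma=0.5,
                               iterations=20, samples=200, seed=0):
    """Fit a scalar policy gain by regressing actions on states,
    weighting each sample by its transformed immediate reward."""
    rng = random.Random(seed)
    gain = 0.0  # policy: action = gain * state + exploration noise
    for _ in range(iterations):
        num = 0.0
        den = 0.0
        for _ in range(samples):
            state = rng.uniform(-1.0, 1.0)
            action = gain * state + rng.gauss(0.0, sigma)
            # Toy cost: squared distance from the (hidden) ideal action.
            cost = (action - target_gain * state) ** 2
            weight = math.exp(-beta * cost)  # fixed reward transformation
            num += weight * action * state
            den += weight * state * state
        gain = num / den  # weighted least-squares solution for the gain
    return gain


learned_gain = reward_weighted_regression()
print(learned_gain)
```

Each iteration is a closed-form weighted least-squares fit, so the policy moves toward high-reward actions without a gradient step size to tune.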

Reinforcement learning by backpropagation through an LSTM model/critic

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Bakker, Bram
Affiliation: Univ Amsterdam, Inst Informat, Intelligent Syst Lab Amsterdam, NL-1098 SJ Amsterdam, Netherlands

ISBN: (print) 9781424407064

This paper describes backpropagation through an LSTM recurrent neural network model/critic, for reinforcement learning tasks in partially observable domains. This combines the advantage of LSTM's strength at learning long-term temporal dependencies to infer states in partially observable tasks, with the advantage of being able to learn high-dimensional and/or continuous actions with backpropagation's focused credit assignment mechanism.

Keywords: Backpropagation; State-space methods; Recurrent neural networks; Neural networks; Observability; dynamic programming; learning systems; Intelligent systems; Intelligent networks; Laboratories

Online reinforcement learning neural network controller design for nanomanipulation

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Yang, Qinmin; Jagannathan, S.
Affiliation: Univ Missouri, Dept Elect & Comp Engn, Rolla, MO 65401, USA

ISBN: (print) 9781424407064

In this paper, a novel reinforcement learning neural network (NN)-based controller, referred to as an adaptive critic controller, is proposed for affine nonlinear discrete-time systems with applications to nanomanipulation. In the online NN reinforcement learning method, one NN is designated as the critic NN, which approximates the long-term cost function by assuming that the states of the nonlinear system are available for measurement. An action NN is employed to derive an optimal control signal to track a desired system trajectory while minimizing the cost function. Online weight tuning schemes for these two NNs are also derived. By using the Lyapunov approach, the uniformly ultimate boundedness (UUB) of the tracking error and weight estimates is shown. Nanomanipulation means manipulating objects of nanometer size. It takes several hours to perform a simple task in the nanoscale world. To accomplish the task automatically, the proposed online learning control design is evaluated for the task of nanomanipulation and verified in a simulation environment.

Keywords: neural network; reinforcement learning; on-line learning; dynamic programming; Lyapunov method; nanomanipulation

Continuous-time ADP for linear systems with partially unknown dynamics

IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning

Authors: Vrabie, Draguna; Abu-Khalaf, Murad; Lewis, Frank L.; Wang, Youyi
Affiliations: Univ Texas, Automat & Robot Res Inst, Ft Worth, TX 76118, USA; Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore

ISBN: (print) 9781424407064

Approximate dynamic programming has been formulated and applied mainly to discrete-time systems. Expressing the ADP concept for continuous-time systems raises difficult issues related to sampling time and system model knowledge requirements. This paper presents a novel online adaptive critic (AC) scheme, based on approximate dynamic programming (ADP), to solve the infinite horizon optimal control problem for continuous-time dynamical systems, thus bringing together concepts from the fields of computational intelligence and control theory. Only partial knowledge about the system model is used, as knowledge about the plant internal dynamics is not needed. The method is thus useful to determine the optimal controller for plants with partially unknown dynamics. It is shown that the proposed iterative ADP algorithm is in fact a Quasi-Newton method to solve the underlying Algebraic Riccati Equation (ARE) of the optimal control problem. An initial gain that determines a stabilizing control policy is not required. In control theory terms, this paper develops a direct adaptive control algorithm for obtaining the optimal control solution without knowing the system A matrix.

Keywords: approximate dynamic programming; adaptive critics; policy iterations; V-learning
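The link between policy iteration and the Algebraic Riccati Equation mentioned in this abstract can be seen in the scalar case. The sketch below is the classical model-based Newton-type policy iteration, not the paper's online algorithm: it uses the drift coefficient `a` directly and needs a stabilizing initial gain, whereas the abstract says the proposed method needs neither. The function name and the numeric values are assumptions for illustration.

```python
import math


def policy_iteration_scalar(a, b, q, r, gain, iterations=20):
    """Policy iteration for dx/dt = a*x + b*u with cost integral of
    q*x**2 + r*u**2 and feedback u = -gain * x.

    The initial `gain` must make the closed loop stable (a - b*gain < 0).
    """
    p = None
    for _ in range(iterations):
        closed_loop = a - b * gain
        if closed_loop >= 0:
            raise ValueError("gain is not stabilizing")
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * gain * gain) / (-2.0 * closed_loop)
        # Policy improvement: k = b*p / r.
        gain = b * p / r
    return p, gain


a, b, q, r = 1.0, 1.0, 1.0, 1.0
p, k = policy_iteration_scalar(a, b, q, r, gain=2.0)
# Stabilizing root of the scalar ARE 2*a*p - (b*p)**2 / r + q = 0.
p_exact = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
print(p, p_exact)
```

With these values the iterates approach `1 + sqrt(2)` within a few steps, which illustrates the fast, Newton-like convergence toward the ARE solution.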

Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
