Search Results - Inner Mongolia University Library

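Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "excluding". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
(subject is library science or information science, author is 范并思, years 1982 to 2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016
(publication is 计算机应用与软件, any field contains C++ or Basic, subject excludes Visual, years 2011 to 2016)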
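
As a rough illustration of the rules above, the sketch below assembles a query string from field codes and uppercase operators separated by single spaces. The helper names (`term`, `any_of`, `all_of`, `year_range`) are hypothetical and are not part of the library's system; the code only builds text to paste into the advanced search box and does not call any catalog API.

```python
# Field codes listed in the catalog's help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value expression, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def any_of(*terms: str) -> str:
    """Join expressions with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join expressions with AND (uppercase, one space on each side)."""
    return " AND ".join(terms)


def year_range(start: int, end: int) -> str:
    """Return a Y=start-end expression for a closed range of years."""
    if start > end:
        raise ValueError("start year must not exceed end year")
    return term("Y", f"{start}-{end}")


# Rebuild the first search example from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    year_range(1982, 2016),
)
print(query)
```

Validating the field code up front is a design choice of this sketch; how the catalog itself handles an unknown code is not stated on the page.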


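Refine search results

Document type

748 conference papers
271 journal articles
4 books

Holdings

1,023 electronic documents
1 print holding

Discipline classification

712 Engineering
- 520 Computer Science and Technology...
- 381 Electrical Engineering
- 278 Control Science and Engineering
- 153 Software Engineering
- 79 Information and Communication Engineering
- 40 Transportation Engineering
- 23 Instrument Science and Technology
- 20 Mechanical Engineering
- 9 Bioengineering
- 8 Electronic Science and Technology (...
- 7 Mechanics (...
- 7 Civil Engineering
- 6 Power Engineering and Engineering Thermo...
- 6 Oil and Natural Gas Engineering
- 4 Biomedical Engineering (...
- 3 Materials Science and Engineering (...
- 3 Chemical Engineering and Technology
- 3 Aeronautical and Astronautical Science and Tech...
- 3 Safety Science and Engineering
118 Science
- 98 Mathematics
- 32 Systems Science
- 22 Statistics (...
- 10 Biology
- 8 Physics
- 4 Chemistry
66 Management
- 63 Management Science and Engineering (...
- 14 Business Administration
- 5 Library, Information and Archives Manage...
5 Economics
- 4 Applied Economics
3 Law
- 3 Sociology
2 Medicine
1 Education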

Subject

313 reinforcement le...
216 dynamic programm...
206 optimal control
107 adaptive dynamic...
104 adaptive dynamic...
97 learning
88 neural networks
78 heuristic algori...
68 reinforcement le...
58 learning (artifi...
54 nonlinear system...
53 convergence
51 control systems
51 mathematical mod...
48 approximate dyna...
44 approximation al...
43 equations
42 adaptive control
41 artificial neura...
41 cost function

Institution

41 chinese acad sci...
27 univ rhode isl d...
17 tianjin univ sch...
16 univ sci & techn...
16 univ illinois de...
15 northeastern uni...
14 beijing normal u...
13 northeastern uni...
13 guangdong univ t...
12 northeastern uni...
9 natl univ def te...
8 ieee
8 univ chinese aca...
7 univ chinese aca...
7 cent south univ ...
7 southern univ sc...
7 beijing univ tec...
6 chinese acad sci...
6 missouri univ sc...
5 nanjing univ pos...

Author

54 liu derong
37 wei qinglai
29 he haibo
22 wang ding
21 xu xin
19 jiang zhong-ping
17 lewis frank l.
17 yang xiong
17 zhang huaguang
17 ni zhen
16 zhao bo
15 gao weinan
14 zhao dongbin
13 derong liu
13 zhong xiangnan
12 si jennie
10 jagannathan s.
10 dongbin zhao
10 song ruizhuo
9 abouheaf mohamme...

Language

992 English
25 Other
6 Chinese

Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"

1,023 records in total; showing records 771-780


IEEE SSCI 2011 - Symposium Series on Computational Intelligence - IEEE ALIFE 2011: 2011 IEEE Symposium on Artificial Life

Symposium Series on Computational Intelligence, IEEE SSCI 2011 - 2011 IEEE Symposium on Artificial Life, IEEE ALIFE 2011

ISBN: (print) 9781612840635

The proceedings contain 30 papers. The topics discussed include: computation of population spatial distribution in individual-based ecosystem simulation; towards imitation-enhanced reinforcement learning in multi-agent systems; biologically inspired design principles for scalable, robust, adaptive, decentralized search and automated response (RADAR); look-ahead relevant information: reducing cognitive burden over prolonged tasks; information storage and transfer in the synchronization process in locally-connected networks; from babbling towards first words: the emergence of speech in a robot in real-time interaction; evolving robot controllers in PDL using genetic programming; ecosystemic methods for creative domains: niche construction and boundary formation; an interactive electronic art system based on artificial ecosystemics; network representation of cellular automata; and study of inheritable mutations in von Neumann self-reproducing automata using the GOLLY simulator.


Active exploration for robot parameter selection in episodic reinforcement learning

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Oliver Kroemer, Jan Peters (Max-Planck Institute, Tubingen, Germany)

As the complexity of robots and other autonomous systems increases, it becomes more important that these systems can adapt and optimize their settings actively. However, such optimization is rarely trivial. Sampling from the system is often expensive in terms of time and other costs, and excessive sampling should therefore be avoided. The parameter space is also usually continuous and multi-dimensional. Given the inherent exploration-exploitation dilemma of the problem, we propose treating it as an episodic reinforcement learning problem. In this reinforcement learning framework, the policy is defined by the system's parameters and the rewards are given by the system's performance. The rewards accumulate during each episode of a task. In this paper, we present a method for efficiently sampling and optimizing in continuous multidimensional spaces. The approach is based on Gaussian process regression, which can represent continuous non-linear mappings from parameters to system performance. We employ an upper confidence bound policy, which explicitly manages the trade-off between exploration and exploitation. Unlike many other policies for this kind of problem, we do not rely on a discretization of the action space. The presented method was evaluated on a real robot. The robot had to learn grasping parameters in order to adapt its grasping execution to different objects. The proposed method was also tested on a more general gain tuning problem. The results of the experiments show that the presented method can quickly determine suitable parameters and is applicable to real online learning applications.

Keywords: Robots; Ground penetrating radar; Tuning; Upper bound; Kernel; Grasping; Convergence

Directed exploration of policy space using support vector classifiers

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Ioannis Rexakis, Michail G. Lagoudakis (Department of Electronic and Computer Engineering, Technical University of Crete, Crete, Greece)

Good policies in reinforcement learning problems typically exhibit significant structure. Several recent learning approaches based on the approximate policy iteration scheme suggest the use of classifiers for capturing this structure and representing policies compactly. Nevertheless, the space of possible policies, even under such structured representations, is huge and needs to be explored carefully to avoid computationally expensive simulations (rollouts) needed to probe the improved policy and obtain training samples at various points over the state space. Regarding rollouts as a scarce resource, we propose a method for directed exploration of policy space using support vector classifiers. We use a collection of binary support vector classifiers to represent policies, whereby each of these classifiers corresponds to a single action and captures the parts of the state space where this action dominates over the other actions. After an initial training phase with rollouts uniformly distributed over the entire state space, we use the support vectors of the classifiers to identify the critical parts of the state space with boundaries between different action choices in the represented policy. The policy is subsequently improved by probing the state space only at points around the support vectors that are distributed perpendicularly to the separating border. This directed focus on critical parts of the state space iteratively leads to the gradual refinement and improvement of the underlying policy and delivers excellent control policies in only a few iterations with a conservative use of rollouts. We demonstrate the proposed approach on three standard reinforcement learning domains: inverted pendulum, mountain car, and acrobot.

Keywords: Support vector machines; Training; Learning; Space exploration; Probes; Training data; Markov processes

2011 IEEE International Symposium on Intelligent Control, ISIC 2011

ISBN: (print) 9781457711046

The proceedings contain 39 papers. The topics discussed include: optimal network localization by particle swarm optimization; a framework for adaptive tuning of distributed model predictive controllers by Lagrange multipliers; probabilistic fault detection and handling algorithm for testing stability control systems with a drive-by-wire vehicle; distance-based control of cycle-free persistent formations; satellite formation flying with input saturation: an LMI approach; performance information in risk-averse control of model-following systems; an interpolation method of multiple terminal iterative learning control; formation control of mobile agent groups based on localization; image-correlation data association with phase-varying uncertainty techniques; approximate dynamic programming for stochastic systems with additive and multiplicative noise; and iterative learning control for discrete linear systems with zero Markov parameters using repetitive process stability theory.


Approximate dynamic programming for stochastic systems with additive and multiplicative noise

IEEE International Symposium on Intelligent Control (ISIC)

Authors: Yu Jiang, Zhong-Ping Jiang (Department of Electrical and Computer Engineering, Polytechnic Institute of New York University, Brooklyn, NY, USA; College of Engineering, Beijing University, China)

This paper studies the stochastic optimal control problem with additive and multiplicative noise via reinforcement learning (RL) and approximate/adaptive dynamic programming (ADP). Using Itô calculus, a policy iteration algorithm is derived in the presence of both additive and multiplicative noise. It is shown that the expectation of the approximated cost matrix is guaranteed to converge to the solution of a certain algebraic Riccati equation that gives rise to the optimal cost value. Furthermore, the covariance of the approximated cost matrix can be reduced by increasing the length of the time interval between two consecutive iterations. Finally, the efficiency of the proposed ADP methodology is illustrated in a numerical example.

Keywords: Noise; Additives; Symmetric matrices; Covariance matrix; Steady-state; Approximation algorithms; Convergence

Model-free H∞ stochastic optimal design for unknown linear networked control system zero-sum games via Q-learning

IEEE International Symposium on Intelligent Control (ISIC)

Authors: Hao Xu, S. Jagannathan (Department of Electrical and Computer Engineering, Missouri University of Science and Technology, USA)

In this paper, the stochastic optimal strategy for unknown linear networked control system (NCS) quadratic zero-sum games related to H∞ optimal control in the presence of random delays and packet losses is solved in a forward-in-time manner. This approach does not require knowledge of the system matrices since it uses Q-learning. The proposed stochastic optimal control approach, referred to as adaptive dynamic programming (ADP), involves solving the action-dependent Q-function Q(z, u, d) of the zero-sum game instead of solving the state-dependent value function J(z), which satisfies a corresponding Game Theoretic Riccati equation (GRE). An adaptive estimator (AE) is proposed to learn the Q-function online, and value and policy iterations are not needed, unlike in traditional ADP schemes. Update laws for tuning the unknown parameters of the adaptive estimator (AE) are derived. Lyapunov theory is used to show that all signals are asymptotically stable (AS) and that the approximated control and disturbance signals converge to the optimal control and disturbance inputs. Simulation results are included to show the effectiveness of the scheme.

Keywords: Delay; Optimal control; Games; Cost function; Game theory; Stochastic processes; Equations

Direct adaptive control of a flexible robot using reinforcement learning

International Conference on Industrial Electronics, Control and Robotics

Authors: Subudhi, Bidyadhar; Pradhan, Santanu Kumar

ISBN: (print) 9781424485468

This paper proposes a new adaptive control scheme using the concept of reinforcement learning to address adaptivity to varied payload conditions for a two-link flexible manipulator (TLFM). Reinforcement learning is implemented using a method called adaptive dynamic programming. Decentralized controllers for the decoupled system are also designed using the LQR technique. Reinforcement learning is then used to tune the gains of the optimal controller to adapt to different payloads at the manipulator end effector. Simulation results show that the proposed controller provides better end-point tracking than the fixed-gain LQR controller. © 2010 IEEE.

Keywords: reinforcement learning

Adaptive Fuzzy Control of Switched Objective Functions in Pursuit-Evasion Scenarios

49th IEEE Conference on Decision and Control (CDC)

Authors: Goode, Brian; Kurdila, Andrew; Roan, Mike (Virginia Polytech Inst & State Univ, Dept Mech Engn, Blacksburg, VA 24060, USA)

ISBN: (print) 9781424477463

In recent efforts, the authors have derived simple switched control schemes that qualitatively yield an attractive performance in two player pursuit-evasion games. A drawback of these methods is that detailed knowledge of an opponent's dynamics and strategy is required to implement the switching controller. Furthermore, an objective evaluated over a finite horizon may not guide an agent to the target set. To circumvent this potential shortcoming, a switching scheme is proposed where an adaptive fuzzy controller chooses the best objective function from a predefined library to increase the agent's reachability. The methodology we present builds on the common approximate dynamic programming reinforcement learning technique. We give conditions for showing when the controller is applicable and give an implementation example with the Homicidal Chauffeur problem.

Keywords: reinforcement learning

A hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming

2010 International Conference on Networking, Sensing and Control, ICNSC 2010

Authors: He, Haibo; Liu, Bo (Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI 02881, United States; Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, United States)

ISBN: (print) 9781424464531

In this paper we propose a hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming (ADP). The key idea of this architecture is to integrate a reference network that provides the internal reinforcement representation (secondary reinforcement signal) to interact with the operation of the learning system. Such a reference network plays an important role in building the internal goal representations. Furthermore, motivated by recent neurobiological and psychological research, the proposed ADP architecture can be designed in a hierarchical way, in which different levels of internal reinforcement signals can be developed to represent multi-level goals for the intelligent system. The detailed system-level architecture, learning and adaptation principle, and simulation results are presented to demonstrate the effectiveness of this work. © 2010 IEEE.

Keywords: Intelligent systems

Iterative Learning Control of A Class of Fractional Order Nonlinear Systems

IEEE International Conference on Control Applications, part of the 2010 IEEE Multi-Conference on Systems and Control

Authors: Li, Yan; Ahn, Hyo-Sung; Chen, YangQuan (Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Shandong, Peoples R China; Gwangju Inst Sci & Technol, 1 Oryong Dong, Gwangju, South Korea; Utah State Univ, Ctr Self Organizing & Intelligent Syst, Dept Elect & Comp Engn, Logan, UT 84322, USA)

This paper firstly addresses the convergence analysis of iterative learning control of a class of fractional order nonlinear systems using the generalized Gronwall-Bellman lemma. Detailed problem definition and conver...

ISBN: (print) 9781424453610

Keywords: Gronwall-Bellman lemma; adaptive control; convergence analysis; convergence of numerical methods; dynamic programming; fractional order nonlinear systems; iterative learning control; iterative methods; learning systems; nonlinear systems


235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
