Search Results - Inner Mongolia University Library

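Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (publication year, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
(subject term "library science" or "information science", author Fan Bingsi, years 1982-2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016
(publication "Computer Applications and Software", any field containing C++ or Basic, subject term not Visual, years 2011-2016)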
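The field codes and operators in the help text above can be combined programmatically. The following is a minimal sketch, assuming only the syntax shown in that help text; the helper names (`term`, `any_of`, `all_of`, `year_range`) are hypothetical and are not part of the library's system. It rebuilds the first search example as a string.

```python
# Field codes listed in the catalog's help text.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Build a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def any_of(*parts: str) -> str:
    """Join terms with OR (uppercase, one space on each side) and parenthesize."""
    return "(" + " OR ".join(parts) + ")"


def all_of(*parts: str) -> str:
    """Join terms with AND (uppercase, one space on each side)."""
    return " AND ".join(parts)


def year_range(first: int, last: int) -> str:
    """Build a Y=first-last year restriction."""
    return term("Y", f"{first}-{last}")


# Hypothetical usage: rebuild the first search example from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    year_range(1982, 2016),
)
print(query)
```

The resulting string would still need to be pasted into the advanced search box by hand; the sketch does not submit anything to the catalog.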

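Refine search results

Document type

746 papers: Conference
270 papers: Journal articles
4 volumes: Books

Holdings

1,020 papers: Electronic documents
1 title: Print holdings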

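Discipline classification

711 papers: Engineering
- 520 papers: Computer Science and Technology...
- 380 papers: Electrical Engineering
- 278 papers: Control Science and Engineering
- 153 papers: Software Engineering
- 79 papers: Information and Communication Engineering
- 40 papers: Transportation Engineering
- 23 papers: Instrument Science and Technology
- 20 papers: Mechanical Engineering
- 9 papers: Bioengineering
- 8 papers: Electronic Science and Technology (can...
- 7 papers: Mechanics (can confer engineering, ...
- 7 papers: Civil Engineering
- 6 papers: Power Engineering and Engineering Thermo...
- 6 papers: Petroleum and Natural Gas Engineering
- 4 papers: Biomedical Engineering (can confer...
- 3 papers: Materials Science and Engineering (can...
- 3 papers: Chemical Engineering and Technology
- 3 papers: Aeronautical and Astronautical Science and Tech...
- 3 papers: Safety Science and Engineering
118 papers: Science
- 98 papers: Mathematics
- 32 papers: Systems Science
- 22 papers: Statistics (can confer science, ...
- 10 papers: Biology
- 8 papers: Physics
- 4 papers: Chemistry
66 papers: Management
- 63 papers: Management Science and Engineering (can...
- 14 papers: Business Administration
- 5 papers: Library, Information and Archives Manag...
5 papers: Economics
- 4 papers: Applied Economics
3 papers: Law
- 3 papers: Sociology
2 papers: Medicine
1 paper: Education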

Subject

312 papers: reinforcement le...
216 papers: dynamic programm...
206 papers: optimal control
107 papers: adaptive dynamic...
104 papers: adaptive dynamic...
97 papers: learning
88 papers: neural networks
78 papers: heuristic algori...
68 papers: reinforcement le...
58 papers: learning (artifi...
54 papers: nonlinear system...
53 papers: convergence
51 papers: control systems
51 papers: mathematical mod...
48 papers: approximate dyna...
44 papers: approximation al...
43 papers: equations
42 papers: adaptive control
41 papers: artificial neura...
41 papers: cost function

Institution

41 papers: chinese acad sci...
27 papers: univ rhode isl d...
17 papers: tianjin univ sch...
16 papers: univ sci & techn...
16 papers: univ illinois de...
15 papers: northeastern uni...
14 papers: beijing normal u...
13 papers: northeastern uni...
13 papers: guangdong univ t...
12 papers: northeastern uni...
9 papers: natl univ def te...
8 papers: ieee
8 papers: univ chinese aca...
7 papers: univ chinese aca...
7 papers: cent south univ ...
7 papers: southern univ sc...
7 papers: beijing univ tec...
6 papers: chinese acad sci...
6 papers: missouri univ sc...
5 papers: nanjing univ pos...

Author

54 papers: liu derong
37 papers: wei qinglai
29 papers: he haibo
22 papers: wang ding
21 papers: xu xin
19 papers: jiang zhong-ping
17 papers: lewis frank l.
17 papers: yang xiong
17 papers: zhang huaguang
17 papers: ni zhen
16 papers: zhao bo
15 papers: gao weinan
14 papers: zhao dongbin
13 papers: zhong xiangnan
12 papers: si jennie
12 papers: derong liu
10 papers: jagannathan s.
10 papers: dongbin zhao
10 papers: song ruizhuo
9 papers: abouheaf mohamme...

Language

994 papers: English
20 papers: Other
6 papers: Chinese

Search query: "Any field = IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning"

1,020 records in total; showing records 651-660

Stable Iterative Optimal Control for Discrete-Time Nonlinear Systems Using Numerical Controller

IEEE International Conference on Vehicular Electronics and Safety (ICVES)

Authors: Wei, Qinglai; Liu, Derong
Affiliation: Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Inst Automat, Beijing 100190, Peoples R China

ISBN: (print) 9781479903801

This paper is concerned with a new iterative adaptive dynamic programming (ADP) algorithm to solve optimal control problems for infinite horizon discrete-time nonlinear systems using a numerical controller. The convergence conditions of the iterative ADP are developed considering the errors introduced by the numerical controller, and they show that the iterative performance index functions can converge to the greatest lower bound of all performance indices within a finite error bound. Neural networks and a digital computer are used to approximate the iterative performance index function and compute the numerically iterative control policy, respectively, for facilitating the implementation of the iterative ADP algorithm. Finally, a simulation example is given to illustrate the performance of the present method.

Keywords: adaptive critic designs; adaptive dynamic programming; approximate dynamic programming; nonlinear systems; optimal control; neural networks; reinforcement learning

Hierarchical Dynamic Power Management Using Model-Free Reinforcement Learning

14th International Symposium on Quality Electronic Design (ISQED)

Authors: Wang, Yanzhi; Triki, Maryam; Lin, Xue; Ammari, Ahmed C.; Pedram, Massoud
Affiliation: Univ So Calif, Dept Elect Engn, Los Angeles, CA 90089, USA

ISBN: (print) 9781467349529; 9781467349512

Model-free reinforcement learning (RL) has become a promising technique for designing a robust dynamic power management (DPM) framework that can cope with variations and uncertainties that emanate from hardware and application characteristics. Moreover, the potentially significant benefit of performing application-level scheduling as part of the system-level power management should be harnessed. This paper presents an architecture for hierarchical DPM in an embedded system composed of a processor chip and connected I/O devices (which are called system components). The goal is to facilitate saving in the system component power consumption, which tends to dominate the total power consumption. The proposed (online) adaptive DPM technique consists of two layers: an RL-based component-level local power manager (LPM) and a system-level global power manager (GPM). The LPM performs component power and latency optimization. It employs temporal difference learning on a semi-Markov decision process (SMDP) for model-free RL, and it is specifically optimized for an environment in which multiple (heterogeneous) types of applications can run in the embedded system. The GPM interacts with the CPU scheduler to perform effective application-level scheduling, thereby enabling the LPM to do even more component power optimizations. In this hierarchical DPM framework, power and latency tradeoffs of each type of application can be precisely controlled based on a user-defined parameter. Experiments show that the amount of average power saving is up to 31.1% compared to existing approaches.

Keywords: dynamic power management; reinforcement learning; Bayesian classification

A novel adaptive call admission control scheme for distributed reinforcement learning based dynamic spectrum access in cellular networks

10th IEEE International Symposium on Wireless Communication Systems 2013, ISWCS 2013

Authors: Morozs, Nils; Clarke, Tim; Grace, David
Affiliation: Department of Electronics, University of York, Heslington, York YO10 5DD, United Kingdom

ISBN: (print) 9783800735297

This paper introduces a novel Q-value based adaptive call admission control scheme (Q-CAC) for distributed reinforcement learning (RL) based dynamic spectrum access (DSA) in mobile cellular networks, which provides a good quality of service (QoS) without the need for spectrum sensing. A DSA algorithm has been developed in this paper using the stateless Q-learning algorithm with Win-or-Learn-Fast (WoLF) learning rates. Its performance was analysed using the spatial distribution of the probabilities of call blocking (BP) and dropping (DP) across the network and compared to that of a 100% accurate spectrum sensing based DSA scheme. The Q-CAC scheme demonstrated good controllability of the blocking probability using a Q-value based call admission threshold parameter. It significantly reduced spatial fluctuations in BP and DP, thus providing more cells with acceptable quality of service (QoS). © VDE Verlag GMBH.

Keywords: reinforcement learning

Adaptive Control for an HVDC Transmission Link with FACTS and a Wind Farm

Conference of the IEEE PES on Innovative Smart Grid Technologies (ISGT)

Authors: Tang, Yufei; He, Haibo; Wen, Jinyu
Affiliations: Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881, USA; Huazhong Univ Sci & Technol, Coll Elect Elect Engn, Wuhan 430074, Peoples R China

ISBN: (print) 9781467348966; 9781467348942

Due to the nonlinearity, uncertainty and complexity of the power system, it is a challenging task to design an effective control approach based on the exact model using traditional methods. In this paper, we investigate the application of a novel approximate dynamic programming (ADP) architecture, goal representation heuristic dynamic programming (GrHDP), to a large benchmark power system. Unlike traditional ADP design with an action network and a critic network, GrHDP integrates a third network, a goal network, into the actor-critic design (ACD) to automatically and adaptively build an internal reinforcement signal representation to facilitate learning and optimization. Then the GrHDP is employed to control the benchmark power system including a DFIG based wind farm and a STATCOM with HVDC transmission. Various power system states, including the voltage of STATCOM, current of DFIG and DC current of HVDC inverter, are provided to the GrHDP controller to generate three adaptive supplementary control signals. These adaptive supplementary control signals are then provided to the STATCOM controller, DFIG rotor side controller and HVDC master control, respectively. This control structure is validated in Matlab/Simulink to demonstrate its effectiveness in power system control.

Keywords: Power system stability control; smart grid control; adaptive dynamic control (ADP); wind farm; high voltage direct current (HVDC) transmission

Exploring the relationship of reward and punishment in reinforcement learning

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Robert Lowe; Tom Ziemke
Affiliation: Interaction Lab, University of Skövde, Skövde, Sweden

We present a reinforcement learning algorithm based on Dyna-Sarsa that utilizes separate representations of reward and punishment when guiding state-action value learning and action selection. The adoption of policy meta-learning optimized by a genetic algorithm is explored and results in the context of a two-armed bandit goal-navigation task in a simple grid world are presented. The findings argue for an important role for a genetic algorithm approach for constructing the foundations of autonomous reinforcement learning agents.

Keywords: Planning; Genetic algorithms; Learning (artificial intelligence); Navigation; Cost accounting; Optimization; Context

Optimal control for a class of nonlinear systems with state delay based on adaptive dynamic programming with ε-error bound

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Xiaofeng Lin; Nuyun Cao; Yuzhang Lin
Affiliations: School of Electrical Engineering, Guangxi University, Nanning, China; Department of Electrical Engineering, Tsinghua University, Beijing, China

In this paper, a finite-horizon ε-optimal control for a class of nonlinear systems with state delay is proposed using an adaptive dynamic programming (ADP) algorithm. First, the performance index function is defined and the Hamilton-Jacobi-Bellman (HJB) equation is obtained for the problem; the convergence of the iterative algorithm is also presented. Then, the ADP algorithm for finite-horizon optimal control is introduced with an ε-error bound so as to get the ε-optimal control, and a BP neural network is used to implement the ADP algorithm. Finally, an example is given to demonstrate the effectiveness of the proposed algorithm.

Keywords: Performance analysis; Optimal control; Delays; Heuristic algorithms; Nonlinear systems; Dynamic programming; Neural networks

Finite-horizon optimal control design for uncertain linear discrete-time systems

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Qiming Zhao; Hao Xu; S. Jagannathan
Affiliation: Department of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, MO, USA

In this paper, the finite-horizon optimal adaptive control design for linear discrete-time systems with unknown system dynamics by using adaptive dynamic programming (ADP) is presented. In the presence of full state feedback, the terminal state constraint is incorporated in solving the optimal feedback control via the Bellman equation. The optimal regulation of the uncertain linear system is solved in a forward-in-time and online manner without using value and/or policy iterations. Due to the nature of finite horizon, the stability of the closed-loop system is involved but verified by using Lyapunov theory. The effectiveness of the proposed method is verified by simulation results.

Keywords: Optimal control; Vectors; Equations; Dynamic programming; Mathematical model; Linear systems; Learning (artificial intelligence)

Exponential moving average Q-learning algorithm

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Mostafa D. Awheda; Howard M. Schwartz
Affiliation: Department of Systems and Computer Engineering, Carleton University, Ottawa, Canada

A multi-agent policy iteration learning algorithm is proposed in this work. The Exponential Moving Average (EMA) mechanism is used to update the policy for a Q-learning agent so that it converges to an optimal policy against the policies of the other agents. The proposed EMA Q-learning algorithm is examined on a variety of matrix and stochastic games. Simulation results show that the proposed algorithm converges in a wider variety of situations than state-of-the-art multi-agent reinforcement learning (MARL) algorithms.

Keywords: Games; Nash equilibrium; Learning (artificial intelligence); Heuristic algorithms; Probability distribution; Vectors; Markov processes

An integrated design for intensified direct heuristic dynamic programming

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Xiong Luo; Jennie Si; Yuchao Zhou
Affiliations: School of Computer and Communication Engineering, University of Science and Technology Beijing (USTB), Beijing, China; Arizona State University, Tempe, AZ, US

There has been a growing interest in the study of adaptive/approximate dynamic programming (ADP) in recent years. The ADP technique provides a powerful tool to understand and improve the principled technologies of machine intelligence systems. As one of the ADP algorithms based on adaptive critic neural networks (NNs), the direct heuristic dynamic programming (direct HDP) has demonstrated some successful applications in solving realistic engineering control problems. In this study, based on a three-network architecture in which the reinforcement signal is approximated by an additional NN, a novel integrated design method for intensified direct HDP is developed. The new design approach is implemented by using multiple PID neural networks (PIDNNs), which effectively take into account structural knowledge of system states and control that is usually present in a physical system. By using a Lyapunov stability approach, a uniform ultimate boundedness (UUB) result is proved for our PIDNNs-based intensified direct HDP learning controller. Furthermore, the learning and control performance of the proposed design is tested using the popular cart-pole example to illustrate the key ideas of this paper.

Keywords: Neural networks; Dynamic programming; Convergence; Lyapunov methods; Learning (artificial intelligence); Educational institutions; Algorithm design and analysis

A novel approach for constructing basis functions in approximate dynamic programming for feedback control

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Jian Wang; Zhenhua Huang; Xin Xu
Affiliation: College of Mechatronics and Automation, National University of Defense Tech, Changsha, P. R. China

This paper presents a novel approach for constructing basis functions in approximate dynamic programming (ADP) through the locally linear embedding (LLE) process. It considers the experience (sample) data as a high-dimensional space and the basis functions to be solved as a low-dimensional space. Through mapping the high-dimensional data into a single global coordinate system of lower dimensionality, the solved basis functions in low-dimensional space have the property that nearby experience data in the high dimensional space remain nearby and similarly co-located with respect to one in the low dimensional space. Thus, the obtained basis functions can precisely approximate the real value/action-value function. The simulation results show that the basis functions obtained by LLE can represent the final policy with a higher precision.

Keywords: Learning (artificial intelligence); Function approximation; Dynamic programming; Equations; Linear approximation; Vectors

Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
