Search Results - Inner Mongolia University Library

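Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = institution (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (publication year, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016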
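
The field codes and operator rules above can be sketched as a small query builder. This is only an illustration: the helper names `term`, `combine`, and `group` are hypothetical and are not part of the library's system. The sketch assumes only what the help text states, namely the listed field codes and uppercase operators padded by single spaces.

```python
# Hypothetical helpers (not an official API of the catalog): they compose an
# advanced-search expression following the field codes and operator rules
# given in the help text.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term after checking the field code."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str) -> str:
    """Join parts with an uppercase operator, one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {operator} ".join(parts)


def group(expression: str) -> str:
    """Wrap a sub-expression in parentheses."""
    return f"({expression})"


# Rebuilds Example 1 from the help text.
query = combine(
    "AND",
    group(combine("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Rejecting lowercase operators in `combine` mirrors the rule that operators must be uppercase; whether the catalog itself reports an error or silently treats a lowercase operator as a search word is not stated in the help text.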

Refine results

Document type

Conference papers (299)
Journal articles (8)

Holdings

Electronic documents (307)
Print holdings (0)

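Subject classification

Engineering (180)
- Computer Science and Technology... (158)
- Electrical Engineering (56)
- Software Engineering (48)
- Control Science and Engineering (47)
- Information and Communication Engineering (13)
- Mechanical Engineering (10)
- Instrument Science and Technology (6)
- Mechanics (may be conferred in Engineering, ... (4)
- Bioengineering (4)
- Power Engineering and Engineering Thermo... (3)
- Transportation Engineering (2)
- Nuclear Science and Technology (2)
- Biomedical Engineering (may be conferred... (2)
- Architecture (1)
- Chemical Engineering and Technology (1)
- Aeronautical and Astronautical Science and Tech... (1)
- Food Science and Engineering (may... (1)
Science (40)
- Mathematics (35)
- Systems Science (9)
- Statistics (may be conferred in Science, ... (8)
- Physics (4)
- Biology (4)
- Chemistry (1)
- Astronomy (1)
- Atmospheric Sciences (1)
- Geophysics (1)
- Geology (1)
Management (18)
- Management Science and Engineering (may... (17)
- Business Administration (7)
Economics (4)
- Applied Economics (4)
Medicine (1)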

Topic

dynamic programm... (115)
reinforcement le... (76)
learning (67)
optimal control (47)
neural networks (30)
control systems (27)
approximate dyna... (21)
approximation al... (21)
function approxi... (20)
equations (20)
convergence (17)
adaptive dynamic... (16)
state-space meth... (16)
heuristic algori... (16)
mathematical mod... (14)
stochastic proce... (13)
learning (artifi... (12)
adaptive control (12)
cost function (12)
algorithm design... (11)

Institution

arizona state un... (5)
department of el... (4)
school of inform... (4)
department of in... (4)
univ sci & techn... (4)
chinese acad sci... (4)
department of el... (4)
princeton univ d... (3)
northeastern uni... (3)
national science... (3)
robotics institu... (3)
univ illinois de... (3)
univ utrecht dep... (3)
univ groningen i... (2)
sharif univ tech... (2)
univ texas autom... (2)
pengcheng labora... (2)
guangxi univ sch... (2)
chinese acad sci... (2)
cemagref lisc au... (2)

Author

liu derong (14)
wei qinglai (9)
si jennie (8)
xu xin (7)
derong liu (5)
lewis frank l. (4)
martin riedmille... (4)
huaguang zhang (4)
jennie si (4)
marco a. wiering (4)
xin xu (4)
zhang huaguang (4)
dongbin zhao (4)
lei yang (4)
powell warren b. (4)
riedmiller marti... (4)
hado van hasselt (3)
van hasselt hado (3)
jagannathan s. (3)
munos remi (3)

Language

English (305)
Other (1)
Chinese (1)

Search query: "Any field = IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning"

307 records in total; showing records 151-160

Convergence of Model-Based Temporal Difference Learning for Control

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Hado van Hasselt, Marco A. Wiering (Department of Information and Computing Sciences, University of Utrecht, Utrecht, Netherlands)

A theoretical analysis of model-based temporal difference learning for control is given, leading to a proof of convergence. This work differs from earlier work on the convergence of temporal difference learning by proving convergence to the optimal value function. This means that not the values of the current policy are found, but instead the policy is updated in such a manner that ultimately the optimal policy is guaranteed to be reached.

Keywords: Convergence; Learning; Dynamic programming; Intelligent systems; Telephony; Stochastic processes

The Knowledge Gradient Policy for Offline Learning with Independent Normal Rewards

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Peter Frazier, Warren Powell (Department of Operations Research and Financial Engineering, Princeton University Engineering, Princeton, NJ, USA)

We define a new type of policy, the knowledge gradient policy, in the context of an offline learning problem. We show how to compute the knowledge gradient policy efficiently and demonstrate through Monte Carlo simula...

Keywords: Mirrors; Knowledge engineering; Bandwidth; Time measurement; Response surface methodology; Operations research; Bayesian methods; Performance evaluation; Dynamic programming; Learning

Toward effective combination of off-line and on-line training in ADP framework

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Author: Danil Prokhorov (Toyota Technical Center, Ann Arbor, MI, USA)

We are interested in finding the most effective combination between off-line and on-line/real-time training in approximate dynamic programming. We introduce our approach of combining proven off-line methods of training for robustness with a group of on-line methods. Training for robustness is carried out on reasonably accurate models with the multi-stream Kalman filter method (Feldkamp et al., 1998), whereas on-line adaptation is performed either with the help of a critic or by methods resembling reinforcement learning. We also illustrate the importance of using recurrent neural networks for both controller/actor and critic.

Keywords: Neurocontrollers; Robustness; Recurrent neural networks; Neural networks; Adaptive control; Dynamic programming; Robust control; Programmable control; Learning; Uncertainty

Reinforcement learning of LQR control policy by a double inverted-pendulum biomechanical model

2023 IEEE International Conference on Industrial Technology, ICIT 2023

Authors: Iqbal, Kamran; Haras, Muhammad (University of Arkansas at Little Rock, Little Rock, AR 72204, United States)

ISBN (print): 9798350336504

Optimal LQR feedback gains can be learned using a reinforcement learning (RL) framework for systems with unknown dynamics using policy iteration methods. However, policy iteration in the case of inherently unstable systems becomes challenging. In this study we establish reinforcement learning of optimal feedback gains in the case of a nonlinear double inverted-pendulum (DIP) biomechanical model. Using an admissible initial policy, the biomechanical model was simulated in MATLAB and trajectory data were recorded. The state variables were transformed to quadratic basis functions and used in approximate dynamic programming (ADP) to learn the solution to the algebraic Riccati equation (ARE) underlying the LQR problem. The RL results obtained in the case of an inherently unstable DIP system indicate relatively fast convergence and demonstrate the potential to apply RL techniques to more complex systems. © 2023 IEEE.

Keywords: Iterative methods

Chic: Experience-driven Scheduling in Machine Learning Clusters

IEEE/ACM International Symposium on Quality of Service (IWQoS)

Authors: Gong, Yifan; Li, Baochun; Liang, Ben; Zhan, Zheng (Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada; Syracuse Univ, Coll Engn & Comp Sci, Syracuse, NY 13244, USA)

ISBN (digital): 9781450367783

ISBN (print): 9781450367783

Large-scale machine learning (ML) models are routinely trained in a distributed fashion, due to their increasing complexity and data sizes. In a shared cluster handling multiple distributed learning workloads with a parameter server framework, it is important to determine the adequate number of concurrent workers and parameter servers for each ML workload over time, in order to minimize the average completion time and increase resource utilization. Existing schedulers for machine learning workloads involve meticulously designed heuristics. However, as the execution environment is highly complex and dynamic, it is challenging to construct an accurate model to make online decisions. In this paper, we design an experience-driven approach that learns to manage the cluster directly from experience rather than using a mathematical model. We propose Chic, a scheduler that is tailored for scheduling machine learning workloads in a cluster by leveraging deep reinforcement learning techniques. With our design of the state space, action space, and reward function, Chic trains a deep neural network with a modified version of the cross-entropy method to approximate the policy for assigning workers and parameter servers for future workloads based on the experience of the agent. Furthermore, a simplified version named Chic-Pair, with a shorter training time for the policy, is proposed by assigning workers and parameter servers in a pair. We compare Chic and Chic-Pair with state-of-the-art heuristics, and our results show that Chic and Chic-Pair are able to reduce the average training time significantly for machine learning workloads under a wide variety of conditions.

Keywords: Distributed machine learning; Deep reinforcement learning; Workload scheduling

A hierarchical learning control framework for tracking tasks, based on model-free principles

23rd International Conference on System Theory, Control and Computing (ICSTCC)

Authors: Radac, Mircea-Bogdan; Negru, Vlad; Precup, Radu-Emil (Politehn Univ Timisoara, PUT AAI Dept, Timisoara, Romania; PUT AAI Dept, Timisoara, Romania)

ISBN (print): 9781728106991

A hierarchical tracking learning framework is proposed in this work, by which an optimally learned tracking behavior is extrapolated to new unseen trajectories without the need for relearning. This intelligent behavior uses learned reference inputs controlled outputs pairs called primitives, over a feedback control system (CS). Learning is based on model-free Iterative Learning Control under linearity assumption of the underlying CS. The CS linearity is indirectly ensured at low-level through an output reference model (ORM) tracking neural network state-feedback controller learned with an iterative model-free approximate value iteration as a representative reinforcement learning approach. Learning uses a large amount of input-output process data collected with an underperforming linear controller, but designed from fewer input-output data, in a model-free VRFT approach. The higher-level primitive-based learning is validated on a coupled highly-dimensional nonlinear aerodynamic positioning system. It proves that the optimal tracking behavior is extendable to new unseen trajectories, without relearning.

Keywords: data-driven model-free control; learning control; neural networks; Virtual Reference Feedback Tuning; approximate dynamic programming; reinforcement learning; Iterative Learning Control; hierarchical control; primitive-based control

Proceedings of the 2006 IEEE International Symposium on Intelligent Control, ISIC 2006

2006 IEEE International Symposium on Intelligent Control, ISIC 2006

ISBN (print): 0780397983

The proceedings contain 94 papers. The topics discussed include: neural adaptive control of dynamic sandwich systems with hysteresis; radial basis function based iterative learning control for stochastic distribution systems; energy-efficient approaches to coverage holes detection in wireless sensor networks; optimal sensor placement for border perambulation; finite horizon discrete-time approximate dynamic programming; adaptive critic designs based coupled neurocontrollers for a static compensator; stability analysis and design for switched descriptor systems; a design of a partial sliding mode controller using duality to linear functional observer; stability of digital control systems with time delays; robust stabilization of nonlinear switched systems via switched output feedback; intermittent iterative learning control; and iterative learning control of perspective dynamic systems.

The Research of Quadrotor Flight Control Based on Reinforcement Learning and ADP

8th Annual International Conference on Network and Information Systems for Computers, ICNISC 2022

Authors: Li, Xueyuan; Xie, Wentao; Zhan, Wentao (Xi'an Aeronautics Computing Technique Research Institute, AVIC, Xi'an, China)

ISBN (print): 9781665453516

This paper studies the application of the lookup-table reinforcement learning method to continuous state space control of a quadrotor simulator and designs an attitude controller for the quadrotor simulator based on Q-learning. To address the difficulty of convergence and the low learning efficiency of Q-learning when faced with large-scale, continuous-space optimized decision problems, the method of kernel approximate dynamic programming is introduced, Kernel-based Least-Squares Policy Iteration (KLSPI) is proposed, and a controller for the quadrotor simulator is designed based on this algorithm. The experiment shows that the reinforcement learning control method has fast convergence speed, small steady-state error, strong adaptive ability and good control effect; when dealing with continuous state spaces, Least-Squares Policy Iteration can converge to better strategies with fewer training data than the traditional method of first discretizing the state space. © 2022 IEEE.

Keywords: Reinforcement learning

Optimization Control of Rectifier in HVDC System with ADHDP

8th International Symposium on Neural Networks

Authors: Song, Chunning; Zhou, Xiaohua; Lin, Xiaofeng; Song, Shaojian (Guangxi Univ, Coll Elect Engn, Guangxi, Nanning 530004, Peoples R China)

ISBN (print): 9783642211102

A novel nonlinear optimal controller for a rectifier in an HVDC transmission system, using artificial neural networks, is presented in this paper. Action dependent heuristic dynamic programming (ADHDP), a member of the adaptive critic designs family, is used for the design of the rectifier neurocontroller. This neurocontroller provides optimal control based on reinforcement learning and approximate dynamic programming (ADP). A series of simulations for a rectifier in a double-ended unipolar HVDC system model with the proposed neurocontroller and a conventional PI controller were carried out in the MATLAB/Simulink environment. Simulation results show that the proposed controller performs better than the conventional PI controller: the DC line current in the HVDC system with the proposed controller quickly tracks changes in the reference current, and collapse of the DC line current is prevented when large disturbances occur.

Keywords: Optimal control; rectifier; HVDC transmission system; approximate dynamic programming (ADP); action dependent heuristic dynamic programming (ADHDP)

Randomly Sampling Actions In Dynamic Programming

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Author: Christopher G. Atkeson (Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, USA)

We describe an approach towards reducing the curse of dimensionality for deterministic dynamic programming with continuous actions by randomly sampling actions while computing a steady state value function and policy. This approach results in globally optimized actions, without searching over a discretized multidimensional grid. We present results on finding time invariant control laws for two, four, and six dimensional deterministic swing up problems with up to 480 million discretized states.

Keywords: Sampling methods; Dynamic programming; Cost function; Computational efficiency; Interpolation; Steady-state; Learning; Robots; USA Councils; Multidimensional systems

31 pages in total

Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
