Search results - Inner Mongolia University Library

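Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (publication year, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016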
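
The search rules above can be sketched as a small query builder. This is only an illustration, assuming the syntax is exactly as the help text states (FIELD=value terms, uppercase operators, one space on each side). The helper names `term`, `group`, and `join` are hypothetical and are not part of the library's system.

```python
# Hypothetical helpers for composing a query string in the syntax
# described by the help text: FIELD=value terms joined by uppercase
# operators with a single space on each side.
OPERATORS = {"AND", "OR", "NOT"}


def term(field, value):
    """Format one field term, for example K=图书馆学."""
    return f"{field}={value}"


def group(expression):
    """Wrap a sub-expression in parentheses."""
    return f"({expression})"


def join(operator, *parts):
    """Join parts with one uppercase operator, spaced on both sides."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {operator} ".join(parts)


# Rebuild Example 1 from the help text.
query = join(
    "AND",
    group(join("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Rejecting lowercase operators in `join` mirrors the rule that operators must be uppercase. How the catalog itself treats a lowercase operator is not stated in the help text.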

Search query: any field = "IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning"

307 records in total; showing records 281-290

Self-learning path-tracking control of autonomous vehicles using kernel-based approximate dynamic programming

International Joint Conference on Neural Networks (IJCNN)

Authors: Xin Xu, Hongyu Zhang, Bin Dai, Han-gen He (Institute of Automation, College of Mechatronics and Automation, National University of Defense Technology, Changsha, China)

With the fast development of robotics and intelligent vehicles, there has been much research on modeling and motion control of autonomous vehicles. However, due to model complexity and unknown disturbances from dynamic environments, the motion control of autonomous vehicles is still a difficult problem. In this paper, a novel self-learning path-tracking control method is proposed for a car-like robotic vehicle, where kernel-based approximate dynamic programming (ADP) is used to optimize the controller performance with little prior knowledge of vehicle dynamics. The kernel-based ADP method is a recently developed reinforcement learning algorithm called kernel least-squares policy iteration (KLSPI), which uses kernel methods with automatic feature selection in policy evaluation to get better generalization performance and learning efficiency. By using KLSPI, the lateral control performance of the robotic vehicle can be optimized in a self-learning and data-driven style. Compared with previous learning control methods, the proposed method has advantages in learning efficiency and automatic feature selection. Simulation results show that the proposed method can obtain an optimized path-tracking control policy in only a few iterations, which will be very practical for real applications.

Keywords: Vehicles; Kernel; Mobile robots; Vehicle dynamics; Dictionaries; Gain; Algorithm design and analysis

Reinforcement learning of LQR control policy by a double inverted-pendulum biomechanical model

IEEE International Conference on Industrial Technology (ICIT)

Authors: Kamran Iqbal, Muhammad Haras (University of Arkansas at Little Rock, Little Rock, Arkansas)

Optimal LQR feedback gains can be learned using a reinforcement learning (RL) framework for systems with unknown dynamics using policy iteration methods. However, policy iteration in the case of inherently unstable systems becomes challenging. In this study we establish reinforcement learning of optimal feedback gains in the case of a nonlinear double inverted-pendulum (DIP) biomechanical model. Using an admissible initial policy, the biomechanical model was simulated in MATLAB and trajectory data were recorded. The state variables were transformed to quadratic basis functions and used in approximate dynamic programming (ADP) to learn the solution to the algebraic Riccati equation (ARE) underlying the LQR problem. The RL results obtained in the case of an inherently unstable DIP system indicate relatively fast convergence and demonstrate the potential to apply RL techniques to more complex systems.
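
The idea of learning the Riccati solution from trajectory data under an admissible initial policy can be sketched on a scalar toy problem. This is not the authors' method or their DIP model: the scalar system, the numeric values, and the function names below are all assumptions made for illustration. Policy evaluation fits P in V(x) = P*x^2 (a quadratic basis) by least squares on recorded data, while the improvement step uses the known model (a, b), which a fully data-driven scheme would avoid.

```python
def simulate(a, b, k, x0, steps):
    """Roll out x[t+1] = a*x[t] + b*u[t] under the policy u[t] = -k*x[t]."""
    xs, us = [x0], []
    for _ in range(steps):
        u = -k * xs[-1]
        us.append(u)
        xs.append(a * xs[-1] + b * u)
    return xs, us


def evaluate_policy(xs, us, q, r):
    """Least-squares fit of P from the Bellman equation
    P*(x[t]^2 - x[t+1]^2) = q*x[t]^2 + r*u[t]^2 along one trajectory."""
    num = den = 0.0
    for t, u in enumerate(us):
        phi = xs[t] ** 2 - xs[t + 1] ** 2
        cost = q * xs[t] ** 2 + r * u ** 2
        num += phi * cost
        den += phi * phi
    return num / den


def lqr_policy_iteration(a, b, q, r, k0, iters=20):
    """Alternate data-based evaluation and model-based improvement."""
    k = k0  # must be admissible: |a - b*k0| < 1
    p = 0.0
    for _ in range(iters):
        xs, us = simulate(a, b, k, x0=1.0, steps=30)
        p = evaluate_policy(xs, us, q, r)
        k = a * b * p / (r + b * b * p)  # greedy gain for the fitted P
    return p, k


# Open-loop unstable toy system (a > 1) with a stabilizing initial gain.
p, k = lqr_policy_iteration(a=1.2, b=1.0, q=1.0, r=1.0, k0=1.0)
print(p, k)
```

At convergence the fitted P satisfies the scalar algebraic Riccati equation P = q + a^2*P - (a*b*P)^2 / (r + b^2*P), which is the scalar counterpart of the ARE mentioned in the abstract.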

Ramp metering based on on-line ADHDP (λ) controller

International Joint Conference on Neural Networks (IJCNN)

Authors: Xuerui Bai, Dongbin Zhao, Jianqiang Yi, Jing Xu (Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Arizona, Tucson, USA)

Increasing dependence on car-based travel has led to the daily occurrence of freeway congestion around the world. To relieve the worsening traffic congestion and the problems it brings, an effective, fast, and robust method is needed. Ramp metering has been developed as a traffic management strategy to alleviate congestion on freeways, but it doesn't work well under uncertainty. In this paper, to address uncertain conditions, an on-line learning control method based on the fundamental principle of reinforcement learning is proposed. The method is ADP (adaptive dynamic programming), and to speed up learning, the concept of eligibility traces is introduced. Eligibility traces and ADP are then combined to form a new traffic-responsive control method, called action-dependent heuristic dynamic programming based on eligibility traces (ADHDP (λ)). ADHDP (λ) is an approximately optimal ramp metering method. Simulation studies on a hypothetical freeway indicate good control performance of the proposed real-time traffic controller.

Keywords: Traffic control; Artificial neural networks; Dynamic programming; Training; Vehicles; Equations; Learning
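
The eligibility-trace idea named in this abstract can be illustrated with generic tabular TD(λ) value estimation. This is a minimal sketch and not the ADHDP (λ) controller: the deterministic chain environment, the step size, and the function name are assumptions chosen so the result is easy to check.

```python
def td_lambda_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda) with accumulating traces on a deterministic chain.

    The agent moves from state s to s + 1. Leaving the last state ends the
    episode with reward 1; all other transitions give reward 0.
    """
    v = [0.0] * n_states
    for _ in range(episodes):
        trace = [0.0] * n_states  # eligibility traces reset each episode
        for s in range(n_states):
            terminal = s == n_states - 1
            reward = 1.0 if terminal else 0.0
            v_next = 0.0 if terminal else v[s + 1]
            delta = reward + gamma * v_next - v[s]  # TD error
            trace[s] += 1.0  # mark the visited state as eligible
            for i in range(n_states):
                # Every recently visited state shares in the TD error.
                v[i] += alpha * delta * trace[i]
                trace[i] *= gamma * lam  # decay the trace
    return v


values = td_lambda_chain()
print(values)
```

Because the trace spreads each TD error back over recently visited states, early states start learning from the terminal reward within the first episode instead of waiting for the value to propagate one state per episode.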

Optimal Control Applied to Wheeled Mobile Vehicles

IEEE International Symposium on Intelligent Signal Processing

Authors: M. Gomez, T. Martinez, S. Sanchez, D. Meziat (Departamento de Automática, Universidad de Alcalá, Spain; Departamento de Física Ingeniería de Sistemas y Teoría de la Señal, Universidad de Alcalá, Spain)

ISBN: (print) 1424408296;97

The goal of the work described in this paper is to develop an optimal control technique based on a Cell-Mapping technique in combination with the Q-learning reinforcement learning method to control wheeled mobile vehicles. This approach handles four state variables because a dynamic model is used instead of a kinematic model, which would need fewer variables. This new solution can be applied to non-linear continuous systems where reinforcement learning methods have multiple constraints. Emphasis is given to the new combination of techniques, which produces satisfactory results when applied to optimal control problems. The proposed algorithm is very robust to changes in the vehicle parameters because the vehicle model is estimated in real time from received experience.

Keywords: Optimal control; Vehicle dynamics; Learning; Remotely operated vehicles; Wheels; Kinematics; Dynamic programming; Trajectory; Intelligent vehicles; Path planning
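
Q-learning over a discretized state space can be sketched with a tabular example. This is only an illustration: the one-dimensional chain of cells stands in for a discretized state region, and it is not the paper's Cell-Mapping technique or its vehicle model. The environment, parameters, and function name are assumptions.

```python
import random


def q_learning_chain(n_cells=5, episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a chain of cells 0 .. n_cells - 1.

    Actions: 0 = move left, 1 = move right. Entering the last cell ends
    the episode with reward 1; every other transition has reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_cells)]
    goal = n_cells - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            a = rng.randrange(2)  # purely exploratory behaviour policy
            s_next = max(0, s - 1) if a == 0 else s + 1
            if s_next == goal:
                target = 1.0  # terminal transition: no bootstrap term
            else:
                target = gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])  # Q-learning update
            s = s_next
            if s == goal:
                break
    return q


q = q_learning_chain()
greedy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
print(greedy)
```

Q-learning is off-policy, so the greedy policy read from the table can differ from the random behaviour used to collect experience. Here the greedy action in every non-goal cell is to move right.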

A reinforcement learning approach to support setup decisions in distributed manufacturing systems

International Conference on Emerging Technologies and Factory Automation (ETFA)

Authors: P. McDonnell, S. Joshi (Industrial and Manufacturing Engineering, Pennsylvania State University, University Park, PA, USA)

A reinforcement learning approach to specifying payoffs for setup games is presented. Setup games are normal form, noncooperative games used by heterarchical machine controllers to evaluate reconfiguration decisions. While past work utilizing heuristic measures to approximate the effect of setup decisions has demonstrated promising performance, the lack of an accurate long-term model of system dynamics in these heuristic approaches limits their usefulness. The reinforcement learning approach iteratively learns the long term costs of setup decisions, accounting for both immediate decision effects and the effects of likely downstream decisions.

Keywords: Learning; Manufacturing systems; Control systems; Centralized control; Costs; Automatic control; Electrical equipment industry; Manufacturing industries; Fault tolerant systems; Dynamic programming

Adaptive critic based design of a fuzzy motor speed controller

IEEE International Symposium on Intelligent Control (ISIC)

Authors: T.T. Shannon, G.G. Lendaris (Northwest Computational Intelligence Laboratory and Systems Science, Portland State University, USA)

We show the applicability of the dual heuristic programming (DHP) method of approximate dynamic programming to the design of a fuzzy control system. The DHP and related techniques have been developed in the neurocontrol context but can be equally productive when used with fuzzy controllers or neuro-fuzzy hybrids. We demonstrate this technique on a speed controller for a brushless motor. In our example, we take advantage of the Takagi-Sugeno model framework to initialize the tunable parameters of our plant model with reasonable problem specific values, a practice difficult to perform when applying DHP to neurocontrol.

Keywords: Programmable control; Adaptive control; Fuzzy control; Dynamic programming; Neural networks; Fuzzy neural networks; Control systems; Learning; Computational intelligence; Laboratories

An analysis of gradient-based policy iteration

International Joint Conference on Neural Networks (IJCNN)

Authors: J. Dankert, Lei Yang, J. Si (Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA)

Recently, a system theoretic framework for learning and optimization has been developed that shows how many approximate dynamic programming paradigms such as perturbation analysis, Markov decision processes, and reinforcement learning are very closely related. Using this system theoretic framework, a new optimization technique called gradient-based policy iteration (GBPI) has been developed. In this paper, we show how GBPI can be extended to partially observable Markov decision processes (POMDPs). We also develop the value iteration analogue of GBPI and show that this new version of value iteration, extended to POMDPs, not only theoretically acts like value iteration but also does so numerically.

Keywords: Learning; Control systems; Terminology; Algorithm design and analysis; Poisson equations; Electronic mail; Dynamic programming; Operations research; Artificial intelligence; System performance

An adaptive clustering method for model-free reinforcement learning

IEEE International Conference on Multi Topic

Authors: A. Matt, G. Regensburger (Institute of Mathematics, University of Innsbruck, Austria)

Machine learning for real-world applications is a complex task due to the huge state and action sets they deal with and the a priori unknown dynamics of the environment involved. Reinforcement learning offers very efficient model-free methods which are often combined with approximation architectures to overcome these problems. We present a Q-learning implementation that uses a new adaptive clustering method to approximate state and action sets. Experimental results for an obstacle avoidance behavior with the mobile robot Khepera are given.

Keywords: Clustering methods; Equations; Mobile robots; Machine learning; Artificial intelligence; Decision making; Stochastic processes; Dynamic programming

A hybrid model for learning sequential navigation

IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA)

Authors: R. Sun, T. Peterson (University of Alabama, Tuscaloosa, AL, USA)

To deal with reactive sequential decision tasks, we present a learning model CLARION, which is a hybrid connectionist model consisting of both localist and distributed representations, based on the two-level approach proposed in Sun (1995). The model learns and utilizes procedural and declarative knowledge, tapping into the synergy of the two types of processes. It unifies neural, reinforcement, and symbolic methods to perform online, bottom-up learning. Experiments in various situations are reported that shed light on the working of the model.

Keywords: Navigation; Sun; Learning; Mediation; Robots; Dynamic programming; Humans

Nonparametric decentralized sequential detection

IEEE International Symposium on Information Theory

Authors: A. Kuh (Department of Electrical Engineering, University of Hawaii, Honolulu, HI, USA)

We consider a decentralized sequential detection problem with a set of sensors and a fusion center. Each sensor receives information from inputs and possibly other sensors at discrete times and transmits summary information to a fusion center which processes the summary information by performing a sequential test to make a decision on one of two hypotheses. The work discussed differs from previous work by Veeravalli, Basar and Poor (1993) in that the conditional densities given each hypothesis are unknown. Information about making good decisions is learned from observing real data and employing reinforcement learning procedures.

Keywords: Sensor fusion; Cost function; Sensor phenomena and characterization; Dynamic programming; Bayesian methods; Learning; Neural networks; Performance evaluation; Testing; Sensor systems
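
For context, a sequential test between two hypotheses can be sketched with Wald's classical sequential probability ratio test. This is the parametric baseline with known Gaussian densities at a single node, which is exactly the setting the abstract moves away from, so it is not the paper's nonparametric or decentralized method. The function name and parameter values are assumptions.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 against H1: mean = mu1.

    Assumes independent Gaussian samples with known standard deviation
    sigma. alpha and beta are the target false-alarm and miss rates.
    """
    upper = math.log((1 - beta) / alpha)  # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))  # accept H0 at or below this
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for one Gaussian sample.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(samples)


decision, used = sprt_gaussian([1.0] * 10, mu0=0.0, mu1=1.0, sigma=1.0)
print(decision, used)
```

The test stops as soon as the accumulated evidence crosses a threshold, so the number of samples used depends on the data rather than being fixed in advance.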

Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
