Search Results - Inner Mongolia University Library

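Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (publication year, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "excluding". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016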
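
The search rules above (field code, `=`, value, with uppercase operators padded by single spaces) can be sketched in Python. This is only an illustration of how such an expression is assembled: the helper names `term`, `any_of`, and `all_of` are hypothetical and not part of the library system, and the set of field codes is copied from the help text.

```python
# Field codes listed in the help text; the helper functions are hypothetical.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field, value):
    """Build one field-restricted term such as 'K=图书馆学'."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def any_of(*terms):
    """Join terms with OR (one space on each side), parenthesized if needed."""
    if not terms:
        raise ValueError("any_of needs at least one term")
    return "(" + " OR ".join(terms) + ")" if len(terms) > 1 else terms[0]


def all_of(*terms):
    """Join terms with AND (one space on each side)."""
    return " AND ".join(terms)


# Rebuilds the first search example from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Building the expression from small helpers keeps the operator casing and spacing in one place, which is the part of the rule that is easiest to get wrong when typing queries by hand.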

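Refine search results

Document type

299 records: conference papers
8 records: journal articles

Collection scope

307 records: electronic documents
0 titles: print holdings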

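Subject classification

180 records: Engineering
- 158 records: Computer Science and Technology...
- 56 records: Electrical Engineering
- 48 records: Software Engineering
- 47 records: Control Science and Engineering
- 13 records: Information and Communication Engineering
- 10 records: Mechanical Engineering
- 6 records: Instrument Science and Technology
- 4 records: Mechanics (may be awarded as engineering, sci...
- 4 records: Bioengineering
- 3 records: Power Engineering and Engineering Thermo...
- 2 records: Transportation Engineering
- 2 records: Nuclear Science and Technology
- 2 records: Biomedical Engineering (may be awarded as...
- 1 record: Architecture
- 1 record: Chemical Engineering and Technology
- 1 record: Aeronautical and Astronautical Science and Tech...
- 1 record: Food Science and Engineering (may...
40 records: Science
- 35 records: Mathematics
- 9 records: Systems Science
- 8 records: Statistics (may be awarded as science, ...
- 4 records: Physics
- 4 records: Biology
- 1 record: Chemistry
- 1 record: Astronomy
- 1 record: Atmospheric Sciences
- 1 record: Geophysics
- 1 record: Geology
18 records: Management
- 17 records: Management Science and Engineering (may...
- 7 records: Business Administration
4 records: Economics
- 4 records: Applied Economics
1 record: Medicine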

Topic

115 records: dynamic programm...
76 records: reinforcement le...
67 records: learning
47 records: optimal control
30 records: neural networks
27 records: control systems
21 records: approximate dyna...
21 records: approximation al...
20 records: function approxi...
20 records: equations
17 records: convergence
16 records: adaptive dynamic...
16 records: state-space meth...
16 records: heuristic algori...
14 records: mathematical mod...
13 records: stochastic proce...
12 records: learning (artifi...
12 records: adaptive control
12 records: cost function
11 records: algorithm design...

Institution

5 records: arizona state un...
4 records: department of el...
4 records: school of inform...
4 records: department of in...
4 records: univ sci & techn...
4 records: chinese acad sci...
4 records: department of el...
3 records: princeton univ d...
3 records: northeastern uni...
3 records: national science...
3 records: robotics institu...
3 records: univ illinois de...
3 records: univ utrecht dep...
2 records: univ groningen i...
2 records: sharif univ tech...
2 records: univ texas autom...
2 records: pengcheng labora...
2 records: guangxi univ sch...
2 records: chinese acad sci...
2 records: cemagref lisc au...

Author

14 records: liu derong
9 records: wei qinglai
8 records: si jennie
7 records: xu xin
5 records: derong liu
4 records: lewis frank l.
4 records: martin riedmille...
4 records: huaguang zhang
4 records: jennie si
4 records: marco a. wiering
4 records: xin xu
4 records: zhang huaguang
4 records: dongbin zhao
4 records: lei yang
4 records: powell warren b.
4 records: riedmiller marti...
3 records: hado van hasselt
3 records: van hasselt hado
3 records: jagannathan s.
3 records: munos remi

Language

305 records: English
1 record: other
1 record: Chinese

Search condition: "Any field=IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning"

307 records in total; showing records 261-270

Bayesian Sequential Optimal Experimental Design for Linear Regression with Reinforcement Learning

International Conference on Machine Learning and Applications (ICMLA)

Authors: Fadil Santosa, Loren Anderson
Affiliations: Dept. of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA; School of Mathematics, University of Minnesota Twin Cities, Minneapolis, MN, USA

We perform a comparison study on Bayesian sequential optimal experimental design algorithms applied to linear regression in two unknowns. We transform the Bayesian sequential optimal experimental design problem into a reinforcement learning problem to determine the power of deep reinforcement learning algorithms against baselines including batch design, greedy design, dynamic programming, and approximate dynamic programming. Using KL-divergence to measure information gain in the unknown parameters, we construct objectives for each algorithm to maximize information gain. This work showcases novel comparisons between the aforementioned algorithms and provides a new application of reinforcement learning to Bayesian sequential optimal experimental design for inverse problems in linear regression with multiple parameters.

Keywords: Machine learning algorithms; Inverse problems; Heuristic algorithms; Linear regression; Reinforcement learning; Transforms; Gain measurement

Fuzzy Q-learning: a new approach for fuzzy dynamic programming

IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)

Authors: H.R. Berenji
Affiliation: NASA Ames Research Center, Mountain View, CA, USA

Fuzzy reinforcement learning (FRL) involves "jump starting" reinforcement learning with fuzzy logic rules. By using FRL, prior domain knowledge, which may be very approximate and imprecise, can be expressed in terms of fuzzy rules and refined later through the learning process. In this paper, we develop a new algorithm called fuzzy Q-learning (or FQ-learning) which extends Watkins' Q-learning method. It can be used for decision processes in which the goals and/or the constraints, but not necessarily the system under control, are fuzzy in nature. An example of a fuzzy constraint is: "the weight of object A must not be substantially heavier than w" where w is a specified weight. Similarly, an example of a fuzzy goal is: "the robot must be in the vicinity of door k". We show that FQ-learning provides an alternative solution to this problem which is simpler than the Bellman-Zadeh fuzzy dynamic programming approach. We apply the algorithm to a multistage decision making problem and a navigation task.

Keywords: Dynamic programming; Fuzzy systems; Learning; Decision making; Intelligent systems; Artificial intelligence; Control systems; Navigation; NASA; Fuzzy logic

Advances in reinforcement learning and their implications for intelligent control

IEEE International Symposium on Intelligent Control (ISIC)

Authors: S.D. Whitehead, R.S. Sutton, D.H. Ballard
Affiliation: Department of Computer Sciences, University of Rochester, Rochester, NY, USA

The focus of this work is on control architectures that are based on reinforcement learning. A number of recent advances that have contributed to the viability of reinforcement learning approaches to intelligent control are surveyed. These advances include the formalization of the relationship between reinforcement learning and dynamic programming, the use of internal predictive models to improve learning rate, and the integration of reinforcement learning with active perception. On the basis of these advances and other results, it is concluded that control architectures based on reinforcement learning are now in a position to satisfy many of the criteria associated with intelligent control.

Keywords: Learning; Intelligent control; Control systems; Intelligent systems; Intelligent sensors; Optimal control; Problem-solving; Programmable control; Adaptive control; Computer science

A Budget-aware Incentive Mechanism for Vehicle-to-Grid via Reinforcement Learning

31st IEEE/ACM International Symposium on Quality of Service, IWQoS 2023

Authors: Zhu, Tianxiang; Zhang, Xiaoxi; Duan, Jingpu; Zhou, Zhi; Chen, Xu
Affiliations: Sun Yat-sen University, Guangzhou, China; Southern University of Science and Technology, Shenzhen, China; Pengcheng Laboratory, Shenzhen, China

ISBN: (print) 9798350399738

With the increasing penetration of renewable energy and electric vehicles (EVs), the charging and discharging behavior of EVs has shown great impact on the Micro Grid power load, motivating the development of Vehicle-to-Grid (V2G) technologies. However, the V2G market is still in its infancy, due to insufficient understanding of EV users' willingness and concerns. While many studies consider direct EV control, it is more realistic to indirectly affect users' behavior through monetary incentives. For better implementation flexibility, we advocate displaying, at charging piles, strategically chosen incentives that are combined with electricity prices. Technically, this is the first model-free learning algorithm that can optimize incentives under unknown EV user reactions, increase the load control effectiveness and users' quality-of-service (QoS) simultaneously under a long-term incentive budget, and provide theoretical performance guarantees. We first construct a bi-level optimization framework to model the time-dependencies across our solutions. We then integrate primal-dual theories and upper-confidence bounds into reinforcement learning to balance power control and incentive consumption. A dynamic programming based algorithm is also proposed to maximize the aggregate user QoS. Finally, we prove bounded sub-optimality of our learning algorithm through theoretical analysis and conduct trace-driven simulations to demonstrate the advantages of our bi-level framework. © 2023 IEEE.

Keywords: Reinforcement learning

Node Fault Prediction Assisted Small-World IoT Networks Using ML Frameworks: Towards Performance Improvement

18th IEEE International Conference on Advanced Networks and Telecommunications Systems, ANTS 2024

Authors: Sharma, Neha; Gupta, Aryaman; Deepak, Sivala; Pandey, Om Jee
Affiliation: Indian Institute of Technology BHU, Department of Electronics Engineering, Varanasi, India

ISBN: (print) 9798350391725

The rapid growth of Internet of Things (IoT) networks has led to the deployment of large-scale networks, enabling seamless connectivity and data exchange among various devices. To manage the complexity and ensure efficient communication in these expansive networks, adopting a suitable network architecture becomes crucial. Small-world networks, characterized by a high Average Clustering Coefficient (ACC) and low Average Path Length (APL), have emerged as a promising architecture for IoT systems due to their efficient communication. However, introducing the Small-World Characteristics (SWC) in an IoT network is challenging due to the need for strategic placement of long-range connections while maintaining low APL and high ACC and ensuring scalability and robustness. In this work, we introduce SWC into the network using an actor-critic reinforcement learning algorithm. Additionally, ensuring the reliability of sensor nodes is crucial to maintaining the overall network performance. Therefore, we propose a joint method for dynamic node fault prediction and data routing within small-world IoT networks using advanced Machine Learning (ML) frameworks. Several data routing experiments have been conducted to validate the effectiveness of the proposed approach using simulated small-world IoT networks. We analyzed major network parameters such as lifetime, latency, and throughput. We compared the proposed method with existing state-of-the-art approaches and observed promising results. © 2024 IEEE.

Keywords: APL (programming language)

Residual-gradient-based neural reinforcement learning for the optimal control of an acrobot

IEEE International Symposium on Intelligent Control (ISIC)

Authors: Xin Xu, Han-gen He
Affiliation: Department of Automatic Control, National University of Defense Technology, Changsha, China

Based on the idea of dynamic programming, reinforcement learning (RL) has become an important model-free method to solve difficult optimal control problems. In this paper, a novel neural RL method is proposed to solve the time-optimal control problem of a class of under-actuated robots, which is called the acrobot. The RL method uses a modified residual gradient reinforcement learning algorithm called RGNP (residual gradient with nonstationary policy). The RGNP algorithm not only has guaranteed convergence under certain conditions but also can ensure the performance of the approximated optimal policy, which is superior to the previous residual gradient algorithms. Simulation results of the learning control of the acrobot illustrate the effectiveness of the proposed method.

Keywords: Learning; Optimal control; Convergence; Function approximation; Approximation algorithms; Dynamic programming; Robots; Algorithm design and analysis; Helium; Electronic mail

Reinforcement-Learning-based Magneto-hydrodynamic Control of Hypersonic Flows

IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Authors: Nilesh V. Kulkarni, Minh Q. Phan
Affiliations: NASA Ames Research Center, QSS Group Inc., Moffett Field, CA, USA; Dartmouth College, Hanover, NH, USA

ISBN: (print) 1424407060

In this work, we design a policy-iteration-based Q-learning approach for on-line optimal control of ionized hypersonic flow at the inlet of a scramjet engine. Magneto-hydrodynamics (MHD) has been recently proposed as a means for flow control in various aerospace problems. This mechanism corresponds to applying external magnetic fields to ionized flows towards achieving desired flow behavior. The applications range from external flow control for producing forces and moments on the air-vehicle to internal flow control designs, which compress and extract electrical energy from the flow. The current work looks at the latter problem of internal flow control. The baseline controller and Q-function parameterizations are derived from an off-line mixed predictive-control and dynamic-programming-based design. The nominal optimal neural network Q-function and controller are updated on-line to handle modeling errors in the off-line design. The on-line implementation investigates key concerns regarding the conservativeness of the update methods. Value-iteration-based update methods have been shown to converge in a probabilistic sense. However, simulation results illustrate that realistic implementations of these methods face significant training difficulties, often failing to learn the optimal controller on-line. The present approach, therefore, uses a policy-iteration-based update, which has time-based convergence guarantees. Given the special finite-horizon nature of the problem, three novel on-line update algorithms are proposed. These algorithms incorporate different mixes of concepts, which include bootstrapping, and forward and backward dynamic programming update rules. Simulation results illustrate the success of the proposed update algorithms in re-optimizing the performance of the MHD generator during system operation.

Keywords: Optimal control; Engines; Magnetohydrodynamics; Aerospace control; Magnetic fields; Force control; Control design; Neural networks; Error correction; Convergence

A dynamic checkpointing scheme based on reinforcement learning

Pacific Rim International Symposium on Dependable Computing

Authors: H. Okamura, Y. Nishimura, T. Dohi
Affiliation: Graduate School of Engineering, Department of Information Engineering, Hiroshima University, Higashihiroshima, Japan

We develop a new checkpointing scheme for a uniprocess application. First, we model the checkpointing scheme by a semi-Markov decision process, and apply the reinforcement learning algorithm to estimate statistically the optimal checkpointing policy. More specifically, the representative reinforcement learning algorithm, called the Q-learning algorithm, is used to develop an adaptive checkpointing scheme. In simulation experiments, we examine the asymptotic behavior of the system overhead with adaptive checkpointing and show quantitatively that the proposed dynamic checkpoint algorithm is useful and robust under incomplete knowledge of the failure time distribution.

Keywords: Checkpointing; Learning; Availability; Dynamic programming; Adaptive systems; Heuristic algorithms; Robustness; Fault tolerant systems; Databases; Delay

LEARNING PROGRAMS FOR DECISION AND CONTROL

2001 International Conferences on Info-tech and Info-net

Authors: Jennie Si, Russell Enns, Yu-tsung Wang
Affiliation: Department of Electrical Engineering, Arizona State University

This paper introduces learning programs, an approximate dynamic programming (ADP), or otherwise named neural dynamic programming (NDP), algorithm developed and tested by the *** programs are particularly suited for learning-based decision and control applications in both discrete and continuous state spaces, as demonstrated by our extensive examinations of both real-life and artificial *** this paper, we first introduce the basic framework of our learning programs, the associated learning algorithms, and then extensive case studies to demonstrate the effectiveness of our learning *** is probably the first time that neural dynamic programming type of learning algorithms has been applied to complex, real-life continuous state problems. Until now, reinforcement learning (another learning approach for approximate dynamic programming) has been mostly successful in discrete state space *** the other hand, prior NDP-based approaches to controlling continuous state space systems have all been limited to smaller, or linearized, or decoupled *** the work presented here complements and advances the existing literature in the general area of learning approaches in approximate dynamic programming.

Keywords: net LEARNING PROGRAMS FOR DECISION AND CONTROL NDP

A performance gradient perspective on approximate dynamic programming and its application to partially observable Markov decision processes

IEEE International Conference on Computer-Aided Design

Authors: James Dankert, Lei Yang, Jennie Si
Affiliation: Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA

This paper shows an approach to integrating common approximate dynamic programming (ADP) algorithms into a theoretical framework to address both analytical characteristics and algorithmic features. Several important insights are gained from this analysis, including new approaches to the creation of algorithms. Built on this paradigm, ADP learning algorithms are further developed to address a broader class of problems: optimization with partial observability. This framework is based on an average cost formulation which makes use of the concepts of differential costs and performance gradients to describe learning and optimization algorithms. Numerical simulations are conducted including a queueing problem and a maze problem to illustrate and verify features of the proposed algorithms. Pathways for applying this analysis to adaptive critics are also shown.

Keywords: Dynamic programming; Function approximation; Algorithm design and analysis; Equations; Cost function; Optimization methods; Intelligent control; Heuristic algorithms; Performance analysis; Observability

Address: 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region; Postal code: 010021
