Self-adaptive systems continuously adapt to internal and external changes in their execution environment. In context-based self-adaptation, adaptations take place in response to the characteristics of the execution environment, captured as a context. In large-scale adaptive systems operating in dynamic environments, however, multiple contexts are often active at the same time, requiring the simultaneous execution of multiple adaptations. Complex interactions between such adaptations may not have been foreseen or accounted for at design time. For example, adaptations can partially overlap, requiring only partial execution of each, or they can conflict, requiring some of the adaptations not to be executed at all in order to preserve correct system execution. To ensure a correct composition of adaptations, we propose ComInA, a novel reinforcement learning-based approach that autonomously learns the interactions between adaptations as well as the most appropriate adaptation composition for each combination of active contexts as it arises. We present an initial evaluation of ComInA in an urban public transport network simulation, where multiple adaptations to buses, routes, and stations are required. Early results show that ComInA correctly identifies whether adaptations are compatible or conflicting and learns to execute the adaptations that maximize system performance. Further investigation is needed, however, into how best to utilize such identified relationships to optimize a wider range of metrics and to support more complex composition strategies.
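The abstract does not detail ComInA's learning machinery, so as a rough illustration of the problem shape only, here is a minimal tabular Q-learning sketch over (active-context set, adaptation-subset) pairs; the adaptation names, reward signal, and hyperparameters are all assumptions, not the paper's design.

```python
import itertools
import random
from collections import defaultdict

# Hypothetical adaptation names for a public-transport scenario.
ADAPTATIONS = ("reroute_bus", "add_bus", "close_station")
# An action is any subset of adaptations to execute together (a composition).
ACTIONS = [frozenset(c) for r in range(len(ADAPTATIONS) + 1)
           for c in itertools.combinations(ADAPTATIONS, r)]

Q = defaultdict(float)        # Q[(active_contexts, composition)] -> value
ALPHA, EPSILON = 0.1, 0.2

def choose_composition(active_contexts: frozenset) -> frozenset:
    """Epsilon-greedy pick of which adaptations to execute together."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(active_contexts, a)])

def learn(active_contexts: frozenset, composition: frozenset, reward: float) -> None:
    """Bandit-style update: move the estimate toward observed system performance."""
    key = (active_contexts, composition)
    Q[key] += ALPHA * (reward - Q[key])
```

Conflicting adaptations would surface in such a table as compositions whose learned value is consistently below that of their individual members.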
In robot-assisted rehabilitation, assist-as-needed (AAN) controllers have been proposed to promote subjects' active participation, which is thought to lead to better training outcomes. Most of these AAN controllers require patient-specific manual tuning of the parameters defining the underlying force field, which typically results in a tedious and time-consuming process. In this paper, we propose a reinforcement learning-based impedance controller that actively reshapes the stiffness of the force field according to the subject's performance, while providing assistance only when needed. This adaptability is made possible by correlating the subject's most recent performance with the ultimate control objective in real time. In addition, the proposed controller is built upon action-dependent heuristic dynamic programming with an actor-critic structure, and therefore does not require prior knowledge of the system model. The controller is experimentally validated with healthy subjects through a simulated ankle mobilization training session using a powered ankle-foot orthosis.
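As a loose illustration of the idea (not the paper's ADHDP formulation, which the abstract does not specify), the sketch below couples a linear critic on the tracking error with a stiffness update driven by the temporal-difference error; the features, gains, cost terms, and bounds are all assumed.

```python
import numpy as np

class AANStiffnessAdapter:
    """Toy actor-critic-style loop: a linear critic predicts cost from the
    tracking error, and the stiffness (the 'actor' output) is nudged by the
    temporal-difference error so assistance rises only when performance drops."""

    def __init__(self, k=40.0, lr_actor=2.0, lr_critic=0.05, gamma=0.9):
        self.k = k                               # force-field stiffness (assumed Nm/rad)
        self.lr_a, self.lr_c, self.gamma = lr_actor, lr_critic, gamma
        self.w = np.zeros(2)                     # linear critic weights

    @staticmethod
    def _phi(err):
        return np.array([abs(err), err * err])  # simple tracking-error features

    def step(self, err, err_next):
        # Assist-as-needed cost: penalize tracking error and assistance effort.
        cost = err_next ** 2 + 1e-4 * self.k ** 2
        td = cost + self.gamma * self.w @ self._phi(err_next) - self.w @ self._phi(err)
        self.w += self.lr_c * td * self._phi(err)            # critic update
        # Raise assistance when performance is worse than predicted, else relax,
        # keeping the stiffness within assumed safe bounds.
        self.k = float(np.clip(self.k + self.lr_a * np.sign(td) * abs(err), 0.0, 120.0))
        return self.k
```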
ISBN (print): 9781424477456
This paper presents an approximate/adaptive dynamic programming (ADP) algorithm that finds online the Nash equilibrium for two-player nonzero-sum differential games with linear dynamics and an infinite-horizon quadratic cost. Each of the game players uses the procedure of integral reinforcement learning (IRL) to calculate online the infinite-horizon value function associated with every given set of feedback control policies. It is shown that the online algorithm is mathematically equivalent to an offline iterative method, previously introduced in the literature, that solves the set of coupled algebraic Riccati equations (AREs) underlying the game problem using complete knowledge of the system dynamics. Here we show how the ADP techniques enhance the capabilities of the offline method, allowing an online solution without requiring complete knowledge of the system dynamics. The two participants in the continuous-time differential game compete in real time, and the feedback Nash control strategies are determined from data measured online from the system. The algorithm is built on the interplay between a learning phase, in which each player learns online the value it associates with a given set of play policies, and a policy update step, performed by each player to decrease its cost. The players learn concurrently. The feasibility of the ADP scheme is demonstrated in simulation.
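The offline iterative method the abstract refers to can be sketched as policy iteration on coupled Lyapunov equations; the toy dynamics and cost weights below are assumptions, and the paper's IRL contribution, replacing these model-based solves with online measured data, is not reproduced here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, inv

# Toy two-player linear-quadratic game (all matrices are illustrative).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])      # stable, so K=0 is admissible
B1 = np.array([[0.0], [1.0]])
B2 = np.array([[0.0], [0.5]])
Q1, Q2 = np.eye(2), 2 * np.eye(2)
R11 = R22 = np.eye(1)
R12 = R21 = 0.5 * np.eye(1)

K1 = np.zeros((1, 2))                          # initial stabilizing gains
K2 = np.zeros((1, 2))
for _ in range(30):
    Ac = A - B1 @ K1 - B2 @ K2                 # closed-loop matrix
    # Policy evaluation: Ac^T Pi + Pi Ac + Qi + Ki^T Rii Ki + Kj^T Rij Kj = 0
    P1 = solve_continuous_lyapunov(Ac.T, -(Q1 + K1.T @ R11 @ K1 + K2.T @ R12 @ K2))
    P2 = solve_continuous_lyapunov(Ac.T, -(Q2 + K2.T @ R22 @ K2 + K1.T @ R21 @ K1))
    K1 = inv(R11) @ B1.T @ P1                  # policy update, player 1
    K2 = inv(R22) @ B2.T @ P2                  # policy update, player 2

print("Approximate Nash feedback gains:", K1, K2)
```

Convergence of this iteration requires admissible (stabilizing) initial policies, which the trivially stable toy system satisfies with zero gains.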
ISBN (digital): 9798350327939
ISBN (print): 9798350327946
As the user count for video-related services continues to grow, ensuring a high quality of service (QoS) for them will become even more crucial in the future. Many studies have been conducted to enhance the quality of on-demand video streaming using adaptive bitrate (ABR) algorithms and artificial intelligence (AI). This study addresses a more complex challenge than on-demand video streaming: enhancing service quality in multi-party, full-duplex communication scenarios such as video conferences. We propose a deep reinforcement learning (DRL)-based video bitrate allocation framework for the media server in a video conferencing system. Our framework aims to increase overall QoS by applying an appropriate bitrate to each connection in a video conferencing call, considering the users' network conditions. We train the DRL model to maximize the aggregate QoS of users in a meeting by constructing a feedback loop between a media server and a DRL server. Our experimental results demonstrate that our framework can adaptively control the video bitrate according to changes in network conditions. As a result, it achieves higher video bitrates in the user application (approximately 5% under stable network conditions and 35% under highly dynamic network conditions) compared to an existing rule-based bandwidth allocation.
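As a hedged sketch of the media-server/agent feedback loop, the toy agent below picks a per-connection bitrate from a ladder and updates a value table from a QoS-style reward; the ladder, state discretization, and reward weights are assumptions, and the paper's actual model is a deep network rather than a table.

```python
import random

LADDER_KBPS = [150, 300, 600, 1200, 2500]   # assumed bitrate ladder

def qos_reward(bitrate, est_bandwidth, loss_rate):
    """Aggregate-QoS proxy: utility grows with bitrate, congestion and loss hurt."""
    util = bitrate / LADDER_KBPS[-1]
    congestion = max(0.0, bitrate - est_bandwidth) / LADDER_KBPS[-1]
    return util - 2.0 * congestion - 5.0 * loss_rate

class BitrateAgent:
    def __init__(self, eps=0.1, alpha=0.2):
        self.q = {}                          # (bandwidth bucket, bitrate) -> value
        self.eps, self.alpha = eps, alpha

    def _bucket(self, est_bandwidth):
        return min(int(est_bandwidth // 300), 10)   # coarse state discretization

    def act(self, est_bandwidth):
        """Epsilon-greedy bitrate choice for one connection."""
        s = self._bucket(est_bandwidth)
        if random.random() < self.eps:
            return random.choice(LADDER_KBPS)
        return max(LADDER_KBPS, key=lambda b: self.q.get((s, b), 0.0))

    def learn(self, est_bandwidth, bitrate, reward):
        key = (self._bucket(est_bandwidth), bitrate)
        self.q[key] = self.q.get(key, 0.0) + self.alpha * (reward - self.q.get(key, 0.0))
```

In a conference call, one such decision would be made per connection, with the per-user rewards summed to approximate the aggregate-QoS objective.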
ISBN (digital): 9798350351330
ISBN (print): 9798350351347
Utility-scale inverter-interfaced photovoltaic (PV) power plants integrated into weak grids experience several challenging issues, viz., poorly damped power oscillations, transient DC-bus over/under-voltage, and instability during low-voltage ride-through (LVRT), which ultimately result in the tripping of units. To overcome these challenges, a reinforcement learning (RL)-based adaptive optimal virtual inertia control (AOVIC) scheme is proposed in this paper. The implemented virtual inertia control (VIC) scheme consists of two parts, (1) a virtual governor and (2) virtual inertia, which are emulated using the stored energy of the DC-link capacitor. Small-signal modeling and stability analysis are also performed to demonstrate the effectiveness of such a VIC scheme for a weak grid-tied PV power plant under variations in grid strength. Thereafter, an adaptive dynamic programming (ADP) strategy is utilized to implement the RL-based AOVIC scheme, which optimally tunes the VIC loop in a model-free manner with the aim of enhancing power oscillation damping. Finally, the efficacy of the proposed AOVIC scheme is evaluated through numerical simulations on a detailed nonlinear model of a weak grid-tied PV power plant.
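The abstract does not give the ADP formulation, so the stand-in below is deliberately simpler: a model-free random-search loop that keeps gain perturbations which reduce a measured oscillation cost. It conveys only the "tune the VIC loop from data" idea, not the paper's ADP scheme; the cost function and step sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def oscillation_cost(power_trace, gains):
    """Assumed cost: energy of the measured power deviations plus an effort term."""
    return float(np.sum(np.square(power_trace)) + 1e-3 * np.sum(np.square(gains)))

def tune_vic(gains, run_and_measure, iters=50, step=0.05):
    """Model-free tuning loop. `run_and_measure(gains)` is assumed to run the
    plant (or a simulation) with the given virtual-governor/inertia gains and
    return the measured power-deviation trace; perturbations that lower the
    measured cost are kept."""
    gains = np.asarray(gains, dtype=float)
    best = oscillation_cost(run_and_measure(gains), gains)
    for _ in range(iters):
        trial = gains + step * rng.standard_normal(gains.shape)
        cost = oscillation_cost(run_and_measure(trial), trial)
        if cost < best:
            gains, best = trial, cost
    return gains
```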
In this paper, a new method for designing and implementing a coordinated wide-area controller architecture is presented and tested using real-time digital simulation on a benchmark two-area power system model for improved power system dynamic stability. The algorithm is an optimal Wide Area System-Centric Controller and Observer (WASCCO) based on reinforcement and temporal-difference learning, which allows the system to learn from interaction and predict future states. The controller design uses a powerful technique of the adaptive critic design (ACD) family called dual heuristic programming (DHP). The DHP controllers' training and testing are implemented on the Innovative Integration Picolo card, which carries a TMS320C28335 processor. The main advantage of this design is its ability to learn from the past using eligibility traces and to predict the optimal trajectory through temporal-difference learning in a receding horizon control (RHC) format. Results on the two-area system show a better response compared to conventional schemes.
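The eligibility-trace and temporal-difference machinery the design builds on can be illustrated with plain TD(lambda) value learning; the sketch below assumes linear features and caller-supplied episodes, and omits the DHP critic proper (which learns value gradients rather than values).

```python
import numpy as np

def td_lambda(episodes, n_features, alpha=0.05, gamma=0.95, lam=0.8):
    """TD(lambda) with accumulating eligibility traces over a linear value
    function. `episodes` yields (states, rewards) pairs, where `states` is a
    list of len(rewards)+1 feature vectors; all hyperparameters are assumed."""
    w = np.zeros(n_features)                 # value-function weights
    for states, rewards in episodes:
        e = np.zeros(n_features)             # eligibility trace
        for t in range(len(rewards)):
            phi, phi_next = states[t], states[t + 1]
            delta = rewards[t] + gamma * w @ phi_next - w @ phi   # TD error
            e = gamma * lam * e + phi        # extend credit to recent states
            w += alpha * delta * e           # update along the trace
    return w
```

The trace vector `e` is what lets the controller "learn from the past": a surprising outcome updates not just the current state's estimate but, with geometric decay, those of the states that led to it.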
Many traditional high-performance computing applications, including those that follow the Bulk Synchronous Parallel (BSP) communication paradigm, are increasingly being deployed in cloud-native, virtualized, multi-tenant container clusters. However, such a shared, virtualized platform limits the degree of control that BSP applications have over effectively allocating resources. This can adversely impact their performance, particularly when stragglers manifest in individual BSP supersteps. Existing BSP resource management solutions assume the same execution time for individual tasks at every superstep, which is not always the case. To address these limitations, we present a dynamic resource management middleware for cloud-native BSP applications comprising a heuristic algorithm that determines effective resource configurations across multiple supersteps while considering dynamic workloads per superstep and trading off performance improvements against reconfiguration costs. Moreover, we design dynamic programming and reinforcement learning approaches that can be used as pluggable strategies to determine whether and when to enforce a reconfiguration. Empirical evaluations of our solution show between 10% and 25% improvement in performance over a baseline static approach, even in the presence of a reconfiguration penalty.
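The whether/when-to-reconfigure decision lends itself to a small dynamic program over supersteps. The sketch below assumes predicted per-superstep execution costs for each candidate configuration and a fixed reconfiguration penalty, both inputs the middleware would have to estimate; it is a shape of the approach, not the paper's algorithm.

```python
import functools

def plan(exec_cost, penalty):
    """exec_cost[t][c]: predicted time of superstep t under configuration c;
    penalty: fixed cost of switching configurations between supersteps.
    Returns (total predicted time, configuration per superstep)."""
    T, C = len(exec_cost), len(exec_cost[0])

    @functools.lru_cache(maxsize=None)
    def best(t, cur):
        if t == T:
            return 0.0, ()
        options = []
        for nxt in range(C):
            switch = penalty if (cur is not None and nxt != cur) else 0.0
            tail, seq = best(t + 1, nxt)
            options.append((exec_cost[t][nxt] + switch + tail, (nxt,) + seq))
        return min(options)

    return best(0, None)   # first configuration is chosen freely

total, schedule = plan([[3.0, 5.0], [6.0, 2.0], [6.0, 2.0]], penalty=1.5)
# -> 8.5 with schedule (0, 1, 1): one paid switch beats staying put throughout.
```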
ISBN (digital): 9781665462808
ISBN (print): 9781665462815
Applications with autonomous robots play an important role in industry and in everyday life. Among them, the activities of manipulating and moving objects stand out for their wide variety of possible applications. In static and known environments, these activities can be implemented through logic planned by the developer, but this is not feasible in dynamic environments. Machine learning (ML) techniques such as reinforcement learning (RL) algorithms have sought to replace pre-defined programming by teaching the robot how to act. This paper presents the implementation of two RL algorithms, Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO), for orientation and position control of a 6-degree-of-freedom (6-DoF) robotic manipulator. The results demonstrate that DDPG converges faster on simpler tasks, but as the complexity of the problem increases it may fail to reach satisfactory behavior. PPO, on the other hand, can solve more complex problems, though it limits the rate of convergence toward the best result in order to avoid learning instability.
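Neither algorithm's training setup is given in the abstract, but both would optimize a pose-tracking reward of roughly the following shape; the weights, tolerances, and success bonus below are assumptions for illustration.

```python
import numpy as np

def pose_reward(pos, pos_goal, quat, quat_goal, w_ori=0.5, pos_tol=0.01, ori_tol=0.1):
    """Dense shaping reward for 6-DoF end-effector pose control.
    pos/pos_goal: 3-vectors (m); quat/quat_goal: unit quaternions."""
    pos_err = np.linalg.norm(np.asarray(pos) - np.asarray(pos_goal))
    # Quaternion geodesic distance; abs() handles the q/-q double cover.
    dot = np.clip(abs(np.dot(quat, quat_goal)), -1.0, 1.0)
    ori_err = 2.0 * np.arccos(dot)                   # radians
    reward = -(pos_err + w_ori * ori_err)            # dense penalty term
    if pos_err < pos_tol and ori_err < ori_tol:
        reward += 10.0                               # sparse success bonus
    return reward
```

The dense term gives both DDPG and PPO a gradient signal at every step, while the sparse bonus marks task completion; tuning their balance is typically where the convergence-versus-stability trade-off the paper observes shows up.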
Frequent charging and discharging seriously shortens battery life and thereby increases the power fluctuation in the distribution network. In this paper, a microgrid energy storage model combining superconducting magnetic energy storage (SMES) and battery energy storage technology is proposed. The energy storage efficiency and the application scenarios of superconducting energy storage are also analyzed. To optimize the performance of the proposed microgrid energy storage model, a reinforcement learning algorithm is used to solve for the operating strategy, and the feasibility of the energy storage model is verified through simulation analysis. The results show that the hybrid energy storage system is more conducive to the stable operation of the power grid.
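The complementary roles of the two stores can be sketched with a simple low-pass split of the net power demand: the SMES absorbs fast fluctuations and the battery handles the slow component, reducing battery cycling. This fixed split is a common baseline rather than the paper's RL-derived strategy; the smoothing factor alpha is an assumption.

```python
import numpy as np

def split_power(demand, alpha=0.1):
    """demand: net power the storage must absorb per time step (kW);
    alpha: smoothing factor of the exponential moving average."""
    demand = np.asarray(demand, dtype=float)
    battery = np.empty_like(demand)
    slow = demand[0]
    for i, p in enumerate(demand):
        slow = (1 - alpha) * slow + alpha * p   # slow trend -> battery
        battery[i] = slow
    smes = demand - battery                     # fast residual -> SMES
    return battery, smes
```

An RL layer like the paper's would, in effect, learn when and how much to shift this split in response to state of charge, prices, or grid conditions instead of holding alpha fixed.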
ISBN (digital): 9798350362244
ISBN (print): 9798350362251
In a vehicle edge computing network (VECN), how to deal with the shortage of computation and energy resources that roadside units (RSUs) encounter while performing delay-sensitive computation tasks is an important issue, especially during peak hours and when VECN conditions are dynamic. To complete the computation tasks on time at minimum expenditure, in this paper we investigate the problem of information-energy collaboration among RSUs, in which spectrum management is also involved. For the considered scenario, the RSUs' strategies for spectrum selection, computation task offloading, and energy sharing are derived from the formulated optimization problem. Since this problem is a highly complex mixed-integer nonlinear program and the strategies are coupled with each other, a multi-agent deep deterministic policy gradient (MADDPG)-based algorithm is proposed to find sub-optimal solutions quickly in a dynamic environment. Simulation results show that our approach is superior to existing schemes in terms of total system expenditure and spectral efficiency.
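A full MADDPG loop is beyond a sketch, but the joint per-RSU action and the expenditure each agent would minimize (the negative of its reward) can be outlined as below; every term, price, and penalty is an illustrative assumption, not the paper's formulation.

```python
from dataclasses import dataclass

@dataclass
class RSUAction:
    channel: int              # spectrum selection
    offload_fraction: float   # share of the task offloaded to neighbors [0, 1]
    energy_shared_kwh: float  # energy sold to (+) or received from neighbors

def expenditure(action, offload_price, energy_price, energy_bought_kwh,
                finish_time_s, deadline_s, late_penalty=10.0):
    """Per-RSU expenditure: offloading fees plus net energy cost, with a
    penalty when a delay-sensitive task misses its deadline."""
    cost = action.offload_fraction * offload_price
    cost += energy_price * (energy_bought_kwh - action.energy_shared_kwh)
    if finish_time_s > deadline_s:
        cost += late_penalty * (finish_time_s - deadline_s)
    return cost
```

In the multi-agent setting, each RSU's critic would also observe the other agents' actions during training, which is how MADDPG copes with the coupling between the spectrum, offloading, and energy-sharing strategies.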