ISBN (digital): 9798350362244
ISBN (print): 9798350362251
In the vehicle edge computing network (VECN), how to cope with the shortage of computation and energy resources that roadside units (RSUs) encounter when performing delay-sensitive computation tasks is an important issue, especially during peak hours and under dynamic network conditions. To complete the computation tasks on time with minimum expenditure, in this paper we investigate the problem of information-energy collaboration among RSUs, in which spectrum management is also involved. For the considered scenario, the RSUs' strategies of spectrum selection, computation task offloading and energy sharing are derived from the formulated optimization problem. Since the proposed problem is a highly complex mixed-integer nonlinear programming problem and the strategies are coupled with each other, a multi-agent deep deterministic policy gradient (MADDPG) based algorithm is proposed to find sub-optimal solutions quickly in a dynamic environment. The simulation results show that our approach is superior to existing schemes in terms of total system expenditure and spectral efficiency.
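The coupling between the RSUs' spectrum-selection and offloading strategies can be illustrated with a much-simplified sketch. This is not the paper's MADDPG formulation: instead of deep actor-critic networks, each RSU here is a tabular stateless learner over a tiny joint action space (spectrum band, offload target), and the collision/expenditure reward below is an illustrative assumption.

```python
import random

# Simplified stand-in for multi-agent learning among RSUs (NOT MADDPG):
# each agent picks (spectrum band, offload target) and all agents share a
# reward that penalizes spectrum collisions and offloading expenditure.
N_AGENTS, N_BANDS, N_TARGETS = 3, 2, 2
ACTIONS = [(b, t) for b in range(N_BANDS) for t in range(N_TARGETS)]

def reward(joint):
    # Illustrative reward: 2 units per spectrum collision, plus a
    # hypothetical per-target energy expenditure equal to the target index.
    bands = [a[0] for a in joint]
    collisions = len(bands) - len(set(bands))
    cost = sum(t for _, t in joint)
    return -(2 * collisions + cost)

q = [{a: 0.0 for a in ACTIONS} for _ in range(N_AGENTS)]
alpha, eps = 0.2, 0.1
random.seed(0)
for step in range(2000):
    # Epsilon-greedy joint action; bandit-style update on the shared reward.
    joint = [random.choice(ACTIONS) if random.random() < eps
             else max(q[i], key=q[i].get) for i in range(N_AGENTS)]
    r = reward(joint)
    for i, a in enumerate(joint):
        q[i][a] += alpha * (r - q[i][a])

greedy = [max(q[i], key=q[i].get) for i in range(N_AGENTS)]
```

With three agents and two bands, at least one collision is unavoidable, so the learners mainly trade off which agent shares a band while keeping the cheap offload target; the point is only that coupled strategies can still be learned from a shared expenditure signal.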
ISBN (digital): 9781728173061
ISBN (print): 9781728173207
Applying intelligence to a group of simple robots, known as swarm robots, has become an exciting technology for assisting or replacing humans in complex, dangerous and harsh missions. However, building a strategy for a swarm to thrive in a dynamic environment is challenging because of control decentralisation and interactions between agents. The decision-making process in a robotic task commonly takes place in sequential stages. By understanding the subsequent action-reaction process, a strategy for making optimal decisions in the respective environment can be learnt. Hence, using the concept of epigenetic inheritance, novel evolutionary-learning mechanisms for a swarm are discussed in this paper: Reinforcement Evolutionary Learning using Epigenetic Inheritance (RELEpi) is proposed. This method uses reward, temporal difference and epigenetic inheritance to approximate optimal action and behaviour policies. The proposed method opens the possibility of combining reward-based learning and evolutionary methods as a stacked process in which a histone value is used rather than a fitness function. The formulation consists of methylation and epigenetic mechanisms, inspired by epigenome studies. The methylation process accumulates the reward into the histone value of a gene. Epigenetic mechanisms give the ability to mate genetic information along with its histone value.
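The two mechanisms named in the abstract can be sketched directly: methylation accumulates reward into a per-gene histone value, and mating passes genes on together with their histone values. The bit-matching toy task, population size, and the 0.1 methylation rate below are illustrative assumptions, not the paper's formulation.

```python
import random

def methylate(histone, reward, rate=0.1):
    # Accumulate reward into the gene's histone value (moving-average style).
    return histone + rate * (reward - histone)

def mate(parent_a, parent_b):
    # Epigenetic crossover: per gene, the child inherits the gene AND its
    # histone value, biased toward the parent with the higher histone.
    return [(ga, ha) if ha >= hb else (gb, hb)
            for (ga, ha), (gb, hb) in zip(parent_a, parent_b)]

random.seed(1)
target = [1, 0, 1, 1]                 # toy task: match this bit string
pop = [[(random.randint(0, 1), 0.0) for _ in target] for _ in range(6)]

for gen in range(30):
    for genome in pop:
        for i, (g, h) in enumerate(genome):
            r = 1.0 if g == target[i] else -1.0   # per-gene reward signal
            genome[i] = (g, methylate(h, r))      # methylation step
    # Next generation mates random parents, carrying histone values along.
    pop = [mate(random.choice(pop), random.choice(pop)) for _ in pop]
```

Genes that earn reward build up positive histone values and therefore tend to survive crossover, which is the "stacked" reward-then-evolution process the abstract describes, with histone values replacing a fitness function.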
Controlling 6 Degrees-of-Freedom (DoF) robotic manipulators in an online, model-free manner poses significant challenges due to their complex coupling, non-linearities, and the need to account for unmodeled dynamics. This paper introduces a model-free adaptive approach for real-time control of a 6 DoF “EPSON” robotic manipulator, without requiring any prior knowledge of the manipulator’s dynamics. Initially, we lay out the framework for an optimal control solution. A performance index is introduced, leveraging error dynamics and correction control signals, offering the capability to incorporate high-order error dynamics without the need to explicitly derive error trajectories. The order of the error dynamics is determined by the chosen number of error samples. We assume a kernel-based solution structure aligned with the performance index, resulting in a temporal difference equation. This equation can be optimized to formulate a model-free control strategy. Subsequently, a reinforcement learning approach is adopted to approximate the underlying strategy. Infeasible exact solutions are overcome by employing a value iteration mechanism to adapt the actor-critic structures within an adaptive critics framework. To validate the proposed approach, it is compared against a conventional proportional-integral controller. A Unified Robot Description Format file is generated to facilitate importing the robotic manipulator into the MATLAB Simulink environment, enabling its control. Ultimately, the proposed method yields superior results in terms of the dynamic characteristics of the response, demonstrating its effectiveness over the conventional approach.
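The value iteration mechanism at the heart of adaptive critics can be illustrated in its simplest model-based form: for a scalar linear plant with quadratic cost, iterating the Bellman backup on a quadratic value function V(x) = P·x² reduces to the recursion below, and the greedy actor is a linear feedback gain. The paper's scheme approximates this same fixed point model-free via actor-critic structures; the plant numbers here are illustrative assumptions, not the EPSON manipulator.

```python
# Scalar plant x' = a*x + b*u with stage cost q*x^2 + r*u^2 (toy values).
a, b, q, r = 0.9, 0.5, 1.0, 1.0

P = 0.0                                   # value-iteration initialization
for _ in range(200):
    # V_{k+1}(x) = min_u [ q x^2 + r u^2 + V_k(a x + b u) ], with
    # V_k(x) = P x^2, collapses to this scalar Riccati recursion.
    P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)

K = a * b * P / (r + b * b * P)           # greedy (actor) feedback u = -K x
closed_loop_pole = a - b * K              # |pole| < 1 means stabilized
```

Starting from P = 0, the recursion converges monotonically to the stabilizing solution, which is the convergence behavior the value-iteration mechanism exploits when exact solutions are infeasible.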
ISBN (digital): 9798350362015
ISBN (print): 9798350362022
The Artificial Intelligence-Generated Content (AIGC) technique has gained significant popularity in creating diverse content. However, AIGC services are currently deployed in a centralized framework, leading to high response times. To address this issue, we propose a diffusion-based task scheduling method that integrates the diffusion model, Deep Reinforcement Learning (DRL), and the Mobile Edge Computing (MEC) technique to improve AIGC efficiency. This raises the challenge of efficient server selection without prior information in dynamic MEC systems. We formulate our problem as an online integer linear programming problem aiming to minimize task offloading delay. Furthermore, we propose a novel AIGC Task Scheduling (DDRL-ATS) algorithm based on Diffusion DRL (DDRL) that effectively addresses this problem. The DDRL-ATS algorithm achieves efficient AIGC task scheduling tailored to heterogeneous MEC environments. Additionally, an online Adaptive Multi-server Selection and Allocation (DDRL-AMSA) algorithm based on DDRL is proposed to further enhance AIGC efficiency. Moreover, our DDRL-AMSA algorithm achieves near-optimal solutions within approximately linear time complexity bounds. Finally, experimental results validate the effectiveness of our method, showing a reduction of at least 13.54% in task offloading delay compared to state-of-the-art methods.
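The online server-selection problem being minimized can be made concrete with a simple greedy baseline (not the paper's DDRL policy): each arriving AIGC task is assigned to the heterogeneous edge server that minimizes its own completion time, given current queue backlogs. Task workloads, server speeds, and simultaneous arrivals are illustrative assumptions.

```python
def schedule(tasks, speeds):
    # tasks: workloads arriving online, one at a time; speeds: per-server
    # compute speed of heterogeneous MEC servers.
    busy = [0.0] * len(speeds)            # time at which each server frees up
    finish = []
    for work in tasks:
        # Greedy online rule: minimize this task's completion time
        # (current backlog + processing time on that server).
        s = min(range(len(speeds)), key=lambda i: busy[i] + work / speeds[i])
        busy[s] += work / speeds[s]
        finish.append(busy[s])
    return finish

# Server 0 is twice as fast as server 1 in this toy instance.
delays = schedule([4.0, 2.0, 6.0, 2.0], speeds=[2.0, 1.0])
```

The greedy rule routes the second and fourth tasks to the slower, idle server rather than queueing on the fast one, which is exactly the backlog-versus-speed trade-off a learned scheduler must capture.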
ISBN (digital): 9781728192901
ISBN (print): 9781728192918
With the emergence of various types of applications such as delay-sensitive applications, future communication networks are expected to be increasingly complex and dynamic. Network Function Virtualization (NFV) provides the necessary support for efficient management of such complex networks by removing the dependency on hardware devices: network functions are virtualized and placed on shared data centres. However, one of the main challenges of the NFV paradigm is the resource allocation problem, known as NFV-Resource Allocation (NFV-RA). NFV-RA is a method of deploying software-based network functions on substrate nodes, subject to the constraints imposed by the underlying infrastructure and the agreed Service Level Agreement (SLA). This work investigates the potential of reinforcement learning (RL) as a fast yet accurate means (compared to integer linear programming) of deploying softwarized network functions onto substrate networks under several Quality of Service (QoS) constraints. In addition to the regular resource constraints and latency constraints, we introduce the concept of a complete outage of certain nodes in the network. This outage can be due either to a disaster or to the unavailability of network topology information because of proprietary and ownership issues. We have analyzed the network performance on different network topologies, different capacities of the nodes and the links, and different degrees of nodal outage. The computational time escalated with the increase in network density when seeking optimal solutions, because Q-learning is an iterative process that results in slow exploration. Our results also show that for certain topologies and a certain combination of resources, we can achieve a 70-90% service acceptance rate even with a 40% nodal outage.
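A minimal tabular Q-learning sketch of the NFV-RA setting, assuming a toy substrate: four nodes with unit capacities, one node in complete outage, and a chain of three VNFs to place one at a time. The topology, rewards, and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
import random

random.seed(0)
NODES = [0, 1, 2, 3]
CAPACITY = {0: 2, 1: 1, 2: 2, 3: 2}
OUTAGE = {3}                              # one node completely out
CHAIN_LEN = 3                             # VNFs to place, one unit each
Q = {}                                    # (stage, usage) -> {node: q-value}

def key(stage, used):
    return (stage, tuple(sorted(used.items())))

def step(used, node):
    # Placing on an outage or overloaded node is rejected with a penalty;
    # otherwise pay a unit deployment cost.
    if node in OUTAGE or used[node] >= CAPACITY[node]:
        return None, -10.0
    nxt = dict(used); nxt[node] += 1
    return nxt, -1.0

alpha, gamma, eps = 0.3, 0.9, 0.2
for episode in range(3000):
    used = {n: 0 for n in NODES}
    for stage in range(CHAIN_LEN):
        k = key(stage, used)
        Q.setdefault(k, {n: 0.0 for n in NODES})
        node = (random.choice(NODES) if random.random() < eps
                else max(Q[k], key=Q[k].get))
        nxt, r = step(used, node)
        if nxt is None:                   # rejected placement ends episode
            Q[k][node] += alpha * (r - Q[k][node])
            break
        k2 = key(stage + 1, nxt)
        Q.setdefault(k2, {n: 0.0 for n in NODES})
        future = gamma * max(Q[k2].values()) if stage + 1 < CHAIN_LEN else 0.0
        Q[k][node] += alpha * (r + future - Q[k][node])
        used = nxt

# Greedy rollout of the learned placement policy.
used = {n: 0 for n in NODES}
placement = []
for stage in range(CHAIN_LEN):
    node = max(Q.get(key(stage, used), {n: 0.0 for n in NODES}),
               key=lambda n: Q.get(key(stage, used), {n: 0.0})[n])
    placement.append(node)
    used[node] += 1
```

After training, the greedy policy routes all VNFs away from the outage node, mirroring the paper's observation that acceptance can stay high despite nodal outage, at the cost of the slow iterative exploration noted in the abstract.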
This paper concerns a novel generalized policy iteration (GPI) algorithm with approximation errors. Approximation errors are explicitly considered in the GPI algorithm, and the properties of the stable GPI algorithm with approximation errors are analyzed. The convergence of the developed algorithm is established, showing that the iterative value function converges to a finite neighborhood of the optimal performance index function. Finally, numerical examples and comparisons are presented.
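The "finite neighborhood" claim can be demonstrated numerically on a toy problem: if each Bellman backup is perturbed by an error bounded by eps, the iterates stay within eps/(1 − gamma) of the optimal value function. The two-state MDP, eps, and the alternating-sign error below are illustrative assumptions, not the paper's analysis.

```python
gamma, eps = 0.9, 0.05
# Two states x two actions: R[s][a] is the reward, T[s][a] the next state.
R = [[1.0, 0.0], [0.0, 2.0]]
T = [[0, 1], [0, 1]]

def bellman(V):
    # Exact Bellman optimality backup.
    return [max(R[s][a] + gamma * V[T[s][a]] for a in (0, 1))
            for s in (0, 1)]

# Error-free iteration converges to the optimal value function V*.
V = [0.0, 0.0]
for _ in range(500):
    V = bellman(V)
V_star = V

# Approximate iteration: every backup is perturbed, |error| <= eps.
V = [0.0, 0.0]
for i in range(500):
    V = [v + eps * (-1) ** i for v in bellman(V)]

gap = max(abs(V[s] - V_star[s]) for s in (0, 1))
bound = eps / (1 - gamma)                 # radius of the neighborhood
```

On this instance V* = (18, 20), and the perturbed iterates settle well inside the eps/(1 − gamma) = 0.5 neighborhood, illustrating the kind of convergence guarantee the abstract states.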
ISBN (digital): 9798400705854
ISBN (print): 9798350363838
Self-healing systems depend on following a set of predefined instructions to recover from a known failure state. Failure states are generally detected based on domain-specific specialized metrics, and failure fixes are applied at predefined application hooks that are not sufficiently expressive to manage different failure types. Self-healing is usually applied in the context of distributed systems, where the detection of failures is constrained to communication problems, and resolution strategies often consist of replacing complete components. However, current complex systems may reach failure states at a fine granularity not anticipated by developers (for example, value range changes for data streaming in IoT systems), making them unsuitable for existing self-healing techniques. To counter these problems, in this paper we propose a new self-healing framework that learns recovery strategies for healing fine-grained system behavior at run time. Our proposal targets complex reactive systems, defining monitors as predicates specifying satisfiability conditions of system properties. Such monitors are functionally expressive and can be defined at run time to detect failure states at any execution point. Once failure states are detected, we use a reinforcement learning-based technique to learn a recovery strategy based on users' corrective sequences. Finally, to execute the learned strategies, we extract them as Context-oriented programming variations that activate dynamically whenever the failure state is detected, overwriting the base system behavior with the recovery strategy for that state. We validate the feasibility and effectiveness of our framework through a prototypical reactive application for tracking mouse movements, and the DeltaIoT exemplar for self-healing systems. Our results demonstrate that with just the definition of monitors, the system is effective in detecting and recovering from failures in 55%-92% of the cases in the first application, and at pa...
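The monitor-then-learn loop described above can be sketched minimally: a monitor is a predicate over system state, and on violation a tabular learner selects corrective actions until the predicate holds again. The toy state (a sensor value that drifts out of range), the action set, and the reward shape are illustrative assumptions, not the framework's implementation.

```python
import random

random.seed(0)
LOW, HIGH = 0, 10
monitor = lambda v: LOW <= v <= HIGH      # satisfiability predicate

ACTIONS = {"dec": -3, "inc": +3, "noop": 0}   # hypothetical corrective ops
Q = {}

def recover(value, train=True, eps=0.3):
    # Apply corrective actions until the monitor is satisfied (bounded).
    steps = []
    for _ in range(10):
        if monitor(value):
            return value, steps
        s = "high" if value > HIGH else "low"   # coarse failure state
        Q.setdefault(s, {a: 0.0 for a in ACTIONS})
        a = (random.choice(list(ACTIONS)) if train and random.random() < eps
             else max(Q[s], key=Q[s].get))
        nxt = value + ACTIONS[a]
        r = 1.0 if monitor(nxt) else -0.1       # reward restoring the property
        if train:
            Q[s][a] += 0.5 * (r - Q[s][a])
        value, steps = nxt, steps + [a]
    return value, steps

for _ in range(500):                      # learn from injected failures
    recover(random.choice([-6, 15]))

v, strategy = recover(16, train=False)    # extracted recovery behavior
```

After training, the learned variation for the "high" failure state is a sequence of decrements, which stands in for the Context-oriented programming variation that would overwrite the base behavior when the monitor fires.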