The rapid growth of the Internet of Things (IoT) has led to the deployment of large-scale networks, enabling seamless connectivity and data exchange among various devices. To manage the complexity and ensure ...
Controlling 6 Degrees-of-Freedom (DoF) robotic manipulators in an online, model-free manner poses significant challenges due to their complex coupling, non-linearities, and the need to account for unmodeled dynamics. ...
ISBN (Print): 9781665405843
Irregularly structured registers are hard to abstract and allocate. Partitioned Boolean quadratic programming (PBQP) is a useful abstraction for representing complex register constraints, even those in the highly irregular processors of automated test equipment (ATE) for DRAM memory chips. The PBQP problem is NP-hard and normally calls for a heuristic solution. If no spill is allowed, however, as in ATE, we have to enumerate rather than approximate, since a spill means a total compilation failure. We propose solving the PBQP problem with deep reinforcement learning (Deep-RL), more specifically a model-based approach using Monte Carlo tree search and a deep neural network as in AlphaZero, a proven Deep-RL technology. Through elaborate training on random PBQP graphs, our Deep-RL solver cuts the search space sharply, making an enumeration-based solution far more affordable. Furthermore, by employing backtracking with a proper coloring order, Deep-RL can find a solution with modestly trained neural networks and even less search. Our experiments show that Deep-RL successfully finds a solution for 10 product-level ATE programs while searching far fewer states (e.g., 1/3,500) than the previous PBQP enumeration solver. Also, when applied to C programs in Byrn-test-suite for regular CPUs, it achieves performance competitive with the existing PBQP register allocator in LLVM.
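As a point of reference for the search problem described above, the following is a minimal sketch (an illustrative toy, not the paper's ATE instances or the LLVM allocator) of how a PBQP register-allocation instance can be represented and how a candidate assignment is scored; an infinite edge-cost entry marks the kind of infeasible combination (a spill) that the MCTS-guided enumeration must avoid.

import math

class PBQPGraph:
    """Toy PBQP instance: per-node choice costs plus pairwise edge cost matrices."""
    def __init__(self):
        self.node_costs = {}          # node id -> list of per-choice costs
        self.edge_costs = {}          # (u, v) -> 2-D cost matrix (list of lists)

    def add_node(self, nid, costs):
        self.node_costs[nid] = costs

    def add_edge(self, u, v, matrix):
        self.edge_costs[(u, v)] = matrix

    def assignment_cost(self, assignment):
        """Total cost of a complete assignment {node id -> chosen register index}."""
        total = 0.0
        for nid, choice in assignment.items():
            total += self.node_costs[nid][choice]
        for (u, v), m in self.edge_costs.items():
            total += m[assignment[u]][assignment[v]]
        return total

# Toy instance: two virtual registers, two physical registers, and one
# interference edge that forbids assigning both to physical register 0.
INF = math.inf
g = PBQPGraph()
g.add_node("a", [0.0, 1.0])
g.add_node("b", [0.0, 2.0])
g.add_edge("a", "b", [[INF, 0.0], [0.0, 0.0]])

print(g.assignment_cost({"a": 1, "b": 0}))   # feasible, cost 1.0
print(g.assignment_cost({"a": 0, "b": 0}))   # infeasible (inf), i.e., a failed allocation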
This paper studies the application of the lookup-table reinforcement learning method to continuous state space control of a quadrotor simulator and designs an attitude controller for the quadrotor simulator based on Q-...
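Since the abstract is truncated, the following is only a hedged sketch of the lookup-table idea it names: tabular Q-learning over a coarse discretization of a continuous attitude state. The grid resolution, action set (torque commands), and reward used here are illustrative assumptions, not the paper's design.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = [-1.0, 0.0, 1.0]                     # assumed torque commands
Q = defaultdict(float)                         # lookup table: (state, action) -> value

def discretize(angle, rate, bins=0.05):
    """Map a continuous (angle, angular-rate) pair onto a coarse grid cell."""
    return (round(angle / bins), round(rate / bins))

def choose_action(state):
    """Epsilon-greedy action selection from the lookup table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    """Standard tabular Q-learning update."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# One illustrative interaction step (reward penalizes attitude error).
s = discretize(0.12, -0.40)
a = choose_action(s)
q_update(s, a, reward=-abs(0.12), next_state=discretize(0.10, -0.35))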
Multi-robot task allocation has an important impact on the efficiency of multi-robot collaboration. For single-shot allocation without complicated constraints, some exact algorithms and heuristic algorithms can find t...
ISBN (Digital): 9798331506056
ISBN (Print): 9798331506063
The pursuit-evasion game of non-cooperative spacecraft under nonlinear dynamics is currently a hot topic in orbital gaming. We describe the pursuit-evasion game with differential game theory, transforming the gaming problem into a bilateral optimal control problem. Using elliptical-orbit line-of-sight (LOS) dynamics with simple field-of-view constraints as the system model, we solve for the Nash equilibrium of the two-body pursuit-evasion game under the assumption of complete information. Because analytical Nash equilibrium solutions are difficult to obtain, we adopt a reinforcement learning (RL)-based adaptive dynamic programming method. We eventually obtain an approximate Nash equilibrium solution with the RL method and provide a successful simulation example.
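The Nash-equilibrium fixed point that RL-based adaptive dynamic programming approximates can be illustrated on a much simpler problem. The sketch below is an assumption-only toy: a scalar discrete-time zero-sum pursuit-evasion game (not the elliptical-orbit LOS model of the paper), solved by iterating the saddle-point conditions until the quadratic value coefficient converges; all numbers are chosen only so that the iteration converges.

import numpy as np

a, b, c = 1.05, 1.0, 0.3          # assumed scalar dynamics x+ = a*x + b*u + c*w
q, ru, rw = 1.0, 1.0, 10.0        # assumed stage cost q*x^2 + ru*u^2 - rw*w^2

p = 0.0                           # quadratic value coefficient, V(x) = p*x^2
for _ in range(200):
    # Saddle-point (Nash) conditions for u = Ku*x and w = Kw*x given next-step value p
    M = np.array([[ru + p * b * b, p * b * c],
                  [p * b * c,      p * c * c - rw]])
    rhs = np.array([-p * a * b, -p * a * c])
    Ku, Kw = np.linalg.solve(M, rhs)
    closed = a + b * Ku + c * Kw                       # closed-loop coefficient
    p_next = q + ru * Ku**2 - rw * Kw**2 + p * closed**2
    if abs(p_next - p) < 1e-10:
        break
    p = p_next

print(f"value coefficient p = {p:.4f}, pursuer gain Ku = {Ku:.4f}, evader gain Kw = {Kw:.4f}")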
With the increasing penetration of renewable energy and electric vehicles (EVs), the charging and discharging behavior of EVs has shown a great impact on the microgrid power load, motivating the development of Veh...
ISBN (Print): 9781665480536
The proceedings contain 237 papers. The topics discussed include: robust device position and pose detection using visible light without model knowledge: a branch-structured residual learning method; access point clustering in cell-free massive MIMO using multi-agent reinforcement learning; joint data and model driven channel-free signal detection based learned factor graph; sum-rate maximization in RIS-aided wireless-powered D2D communication networks; physical layer security in spherical-wave channel using massive MIMO; spatially-coupled faster-than-Nyquist signaling; indoor localization with CSI fingerprint utilizing depthwise separable convolution neural network; a hybrid machine learning based model for congestion prediction in mobile networks; mobile traffic forecasting for network slices: a federated-learning approach; a vector-based dynamic programming approach for small cell placement in dense urban; spatiotemporal graph attention networks for urban traffic flow prediction; deep learning based minimum length scheduling for half duplex wireless powered communication networks; on the effectiveness of semantic addressing for wake-up radio-enabled wireless sensor networks; and physical layer authentication based on continuous channel polarization response in low SNR scenes.
ISBN (Print): 9783903176362
We study automated intrusion prevention using reinforcement learning. In a novel approach, we formulate the problem of intrusion prevention as an optimal stopping problem. This formulation gives us insight into the structure of the optimal policies, which turn out to be threshold-based. Since computing the optimal defender policy with dynamic programming is not feasible for practical cases, we approximate the optimal policy through reinforcement learning in a simulation environment. To define the dynamics of the simulation, we emulate the target infrastructure and collect measurements. Our evaluations show that the learned policies are close to optimal and that they indeed can be expressed using thresholds.
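To make the threshold structure concrete, here is a minimal sketch (with illustrative numbers, not the paper's emulated infrastructure or observation model) of the optimal-stopping view: dynamic programming over a discretized belief that an intrusion is underway, which yields a single stopping threshold on that belief.

import numpy as np

GAMMA = 0.95          # discount factor (assumed)
P_START = 0.05        # per-step probability an intrusion begins (assumed)
R_STOP = 10.0         # reward for stopping an ongoing intrusion
C_FALSE = 5.0         # cost of a false alarm (stopping with no intrusion)
C_INTRUSION = 1.0     # per-step cost while an intrusion is running

beliefs = np.linspace(0.0, 1.0, 201)          # discretized belief grid
V = np.zeros_like(beliefs)

def next_belief(b):
    """Without new observations, the prior belief that an intrusion has started only drifts up."""
    return b + (1.0 - b) * P_START

stop_value = beliefs * R_STOP - (1.0 - beliefs) * C_FALSE   # value of stopping now

for _ in range(2000):                                        # value iteration over the belief grid
    V_drift = np.interp(next_belief(beliefs), beliefs, V)
    continue_value = -beliefs * C_INTRUSION + GAMMA * V_drift
    V_new = np.maximum(stop_value, continue_value)
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# The optimal policy is a threshold: stop as soon as stopping beats continuing.
continue_value = -beliefs * C_INTRUSION + GAMMA * np.interp(next_belief(beliefs), beliefs, V)
threshold = beliefs[np.argmax(stop_value >= continue_value)]
print(f"stop whenever the intrusion belief exceeds roughly {threshold:.2f}")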
Optimal LQR feedback gains can be learned within a reinforcement learning (RL) framework for systems with unknown dynamics using policy iteration methods. However, policy iteration becomes challenging for inherently unstable systems. In this study we establish reinforcement learning of optimal feedback gains for a nonlinear double inverted-pendulum (DIP) biomechanical model. Starting from an admissible initial policy, the biomechanical model was simulated in MATLAB and trajectory data were recorded. The state variables were transformed to a quadratic basis and used in approximate dynamic programming (ADP) to learn the solution to the algebraic Riccati equation (ARE) underlying the LQR problem. The RL results obtained for this inherently unstable DIP system indicate relatively fast convergence and demonstrate the potential to apply RL techniques to more complex systems.
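A hedged sketch of the quadratic-basis ADP idea follows. It uses a generic 2-state discrete-time linear system rather than the DIP biomechanical model, and a Bradtke-style Q-function least-squares policy iteration as a stand-in for the continuous-time ARE learning described above; the system matrices, admissible initial gain, and noise levels are assumptions.

import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])     # unknown to the learner, used only to simulate
B = np.array([[0.0], [0.1]])
Qc, Rc = np.eye(2), np.array([[1.0]])      # quadratic stage cost weights

n, m = 2, 1
K = np.array([[-1.0, -1.5]])               # admissible (stabilizing) initial policy

def quad_basis(z):
    """Quadratic basis: features z_i*z_j for i <= j, so weights recover Q(z) = z'Hz."""
    return np.outer(z, z)[np.triu_indices(len(z))]

for it in range(10):                        # policy iteration
    Phi, targets = [], []
    x = rng.normal(size=n)
    for _ in range(400):                    # collect one exploratory trajectory
        u = K @ x + 0.5 * rng.normal(size=m)          # exploration noise for excitation
        x_next = A @ x + (B @ u).ravel()
        u_next = K @ x_next                            # on-policy action at next state
        z, z_next = np.concatenate([x, u]), np.concatenate([x_next, u_next])
        cost = x @ Qc @ x + u @ Rc @ u
        Phi.append(quad_basis(z) - quad_basis(z_next))  # temporal-difference regressor
        targets.append(cost)
        x = x_next
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(targets), rcond=None)

    # Rebuild the symmetric Q-function matrix H from the learned basis weights
    H = np.zeros((n + m, n + m))
    H[np.triu_indices(n + m)] = theta
    H = (H + H.T) / 2.0

    K_new = -np.linalg.solve(H[n:, n:], H[n:, :n])     # policy improvement step
    if np.linalg.norm(K_new - K) < 1e-6:
        K = K_new
        break
    K = K_new

print("learned feedback gain K ~", K)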