Search Results - Inner Mongolia University Library

22nd IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (IEEE WoWMoM)

Authors: Islam, Shafkat; Badsha, Shahriar; Sengupta, Shamik. Affiliation: Univ Nevada, Reno, NV 89557, USA

ISBN: (print) 9781665422635

Vehicular edge computing (VEC), a novel computing paradigm, promises to provide diverse vehicular edge services, both functional (e.g., charging route prediction, emergency messages, etc.) and infotainment (e.g., video gaming applications, featured movie series, etc.), at the network edge while satisfying application-specific QoS requirements. Vehicles usually send these service requests to the nearest roadside units (RSUs), which contain mobile edge servers, according to the functional requirements or the vehicle owner's preferences. However, the VEC server's virtual resources may fall short of the unbounded number of real-time service requests (infotainment/functional) during rush hours. This limitation can cause VEC servers to miss the stringent latency requirements, which may lead to malfunctions in the requesting vehicles while driving (if functional/critical service requests are delayed in processing). Moreover, the VEC environment's intrinsic properties, i.e., mobility, application-specific latency requirements, traffic congestion, and uncertain task arrival rate, make the VEC task scheduling problem a non-trivial one. In this paper, we propose an extreme reinforcement learning (ERL) based context-aware VEC task scheduler that can make online adaptive scheduling decisions to meet the application-specific latency requirements for both types of tasks (i.e., functional and infotainment). The scheduler can make scheduling decisions directly from its experience without prior knowledge of, or a model of, the VEC environment. Finally, we present extensive simulation results to confirm the efficacy of the proposed scheduler. Results show that, using the proposed scheduler, the VEC server can achieve a successful task completion rate (i.e., meeting QoS requirements) of above 96% for different task arrival rates (ranging from 10 to 50 arrivals/s). In the simulation, we also analyze the scheduling algorithm's scalability in response to the vertical expa

Keywords: Vehicular Edge Computing; Task Scheduling; Extreme Learning Machine; Reinforcement Learning; Markov Decision Process; QoS


Adaptive Q-learning-supported Resource Allocation Model in Vehicular Fogs


IEEE Symposium on Computers and Communications (ISCC)

Authors: Md Tahmid Hossain; Robson E. de Grande. Affiliation: Department of Computer Science, Brock University, Canada

ISBN: (digital) 9781665497923

ISBN: (print) 9781665497930

Vehicular Cloud Computing (VCC) exhibits many drawbacks given the demands of vehicular applications and intermittent network conditions. Vehicular Fog computing is a novel method for supporting and promoting the effective sharing of services and resources in urban areas. Diverse works on vehicular resource management have sought to handle the highly dynamic vehicular environment using various methods, such as policy-based greedy and stochastic techniques. Nevertheless, high vehicular mobility poses many issues that compromise service consistency, efficiency, and quality. Adaptive vehicular Fogs incorporating reinforcement learning can deal with mobility and correctly distribute services and resources across all Fogs. Thus, we introduce an adaptive resource management model that uses cloudlet dwell time for resource estimation, a mathematical formula for Fog selection, and reinforcement learning as an iterative review and feedback mechanism for generating an optimal resource allocation policy.

Keywords: Adaptation models; Q-learning; Computational modeling; Estimation; Dynamic scheduling; Mathematical models; Resource management
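The abstract above names Q-learning as the feedback mechanism but the listing gives no details of the model. As a rough illustration only, the following is a minimal tabular Q-learning sketch on a made-up toy problem. The environment function `toy_fog_step`, the state and action encoding, and all hyperparameters are assumptions introduced here and are not taken from the paper.

```python
import random


def q_learning(n_states, n_actions, step, episodes=2000, horizon=20,
               alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning. `step(state, action, rng)` is a hypothetical
    environment function returning (next_state, reward)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)
        for _ in range(horizon):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            next_state, reward = step(state, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def toy_fog_step(state, action, rng):
    """Hypothetical toy model (not from the paper): the state is the index
    of the fog the vehicle is currently nearest to. Choosing that fog gives
    reward 1, any other fog gives 0. The vehicle then moves to a random zone."""
    reward = 1.0 if action == state else 0.0
    next_state = rng.randrange(3)
    return next_state, reward


q_table = q_learning(n_states=3, n_actions=3, step=toy_fog_step)
# Greedy policy: for each zone, the fog with the highest learned value.
greedy_policy = [max(range(3), key=lambda a: q_table[s][a]) for s in range(3)]
```

In this toy setting the learned greedy policy should pick the nearest fog in every zone. A realistic model would replace the reward with a measured quantity such as service completion within the estimated dwell time.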


Mobile-Aware Online Task Offloading Based on Deep Reinforcement Learning in Mobile Edge Computing Networks


IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC)

Authors: Yuting Li; Yitong Liu; Xingcheng Liu; Qiang Tu; Yi Xie. Affiliations: School of Electronics and Information Technology, Sun Yat-sen University, Guangzhou, China; School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; Jiangsu Viscore Technologies Co. Ltd., Suzhou, China

Mobile Edge Computing (MEC) is one of the key enabling technologies for future 6G wireless networks that can provide lower latency service and more efficient resource utilization for future intelligent applications and the Internet of Things (IoT), while also reducing the energy consumption of end devices. In the intricate dynamic edge environment, the task offloading problem is entangled with several factors, such as the uncertainty of online tasks, the heterogeneity of edge servers, and the mobility of devices. In this paper, considering the randomness of online task arrivals, time-varying channels, and mobility of devices, a deep reinforcement learning-based online task offloading (DRL-OTO) algorithm is designed to minimize the energy consumption of all mobile devices. Specifically, by portraying the system model consisting of the communication model, energy consumption model, and node mobility model, the task offloading optimization problem is modeled as a mixed integer nonlinear programming (MINLP) problem. By decomposing this problem, each mobile device first determines the edge server to be offloaded, and then the DRL-OTO algorithm is designed by utilizing the DDPG method, in which each mobile device is able to determine the offloading rate. Simulation results show that the proposed DRL-OTO algorithm can achieve fast convergence and is able to reduce energy consumption, thus increasing the utility of all devices in the dynamic edge environment.


Adaptive Dynamic Programming for Decentralized Stabilization of Uncertain Nonlinear Large-Scale Systems With Mismatched Interconnections


IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, vol. 50, no. 8, pp. 2870-2882

Authors: Yang, Xiong; He, Haibo. Affiliations: Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China; Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881, USA

This paper presents a novel decentralized control strategy for a class of uncertain nonlinear large-scale systems with mismatched interconnections. First, it is shown that the decentralized controller for the overall system can be represented by an array of optimal control policies of auxiliary subsystems. Then, within the framework of adaptive dynamic programming, a simultaneous policy iteration (SPI) algorithm is developed to solve the Hamilton-Jacobi-Bellman equations associated with auxiliary subsystem optimal control policies. The convergence of the SPI algorithm is guaranteed by an equivalence relationship. To implement the present SPI algorithm, actor and critic neural networks are applied to approximate the optimal control policies and the optimal value functions, respectively. Meanwhile, both the least squares method and the Monte Carlo integration technique are employed to derive the unknown weight parameters. Furthermore, by using Lyapunov's direct method, the overall system with the obtained decentralized controller is proved to be asymptotically stable. Finally, the effectiveness of the proposed decentralized control scheme is illustrated via simulations for nonlinear plants and unstable power systems.

Keywords: Large-scale systems; Decentralized control; Optimal control; Dynamic programming; Robustness; Approximation algorithms; adaptive dynamic programming (ADP); decentralized control; large-scale systems; mismatched interconnections; reinforcement learning (RL)


Reinforcement-Learning-Based Risk-Sensitive Optimal Feedback Mechanisms of Biological Motor Control


IEEE Conference on Decision and Control

Authors: Leilei Cui; Bo Pang; Zhong-Ping Jiang. Affiliation: Department of Electrical and Computer Engineering, Control and Networks Lab, Tandon School of Engineering, New York University, Brooklyn, NY, USA

Risk sensitivity is a fundamental aspect of biological motor control that accounts for both the expectation and variability of movement cost in the face of uncertainty. However, most computational models of biological motor control rely on model-based risk-sensitive optimal control, which requires an accurate internal representation in the central neural system to predict the outcomes of motor commands. In reality, the dynamics of human-environment interaction are too complex to be accurately modeled, and noise further complicates system identification. To address this issue, this paper proposes a novel risk-sensitive computational mechanism for biological motor control based on reinforcement learning (RL) and adaptive dynamic programming (ADP). The proposed ADP-based mechanism suggests that humans can directly learn an approximation of the risk-sensitive optimal feedback controller from noisy sensory data without the need for system identification. Numerical validation of the proposed mechanism is conducted on the arm-reaching task under a divergent force field. The preliminary computational results align with experimental observations from the computational neuroscience literature.


MARS: An Adaptive Multi-Agent DRL-based Scheduler for Multipath QUIC in Dynamic Networks


International Workshop on Quality of Service

Authors: Xueqiang Han; Biao Han; Ruidong Li; Xiaolan Ji. Affiliations: College of Computer, National University of Defense Technology, Changsha, China; Graduate School of Natural Science and Technology, Kanazawa University, Kanazawa, Japan

The multipath extension of the Quick UDP Internet Connection (QUIC) protocol, also called MPQUIC, is attracting increasing attention from both industry and academia. The multipath scheduler of MPQUIC determines how to distribute packets onto different paths. However, our experimental results show that current multipath schedulers, when applied to MPQUIC, fail to adapt to various receive buffer sizes and Quality of Service (QoS) requirements because of the diversity of devices and applications. These problems are especially severe in heterogeneous and dynamic network environments. To tackle these problems, we propose MARS, a Multi-Agent Deep Reinforcement Learning (MADRL) based Multipath QUIC Scheduler, which is able to promptly adapt to dynamic network environments. It exploits the MADRL method to learn a neural network for each path and generate a scheduling policy. In addition, it introduces a novel multi-objective reward function that takes out-of-order (OFO) queue size and different QoS metrics into consideration to realize adaptive scheduling optimization. We implement MARS in an MPQUIC prototype and compare it with state-of-the-art multipath schedulers in both emulated and real-world networks. Experimental results show that MARS outperforms the other schedulers, adapting better to different receive buffer sizes and QoS requirements.


Enhancing 5G Network Throughput Using Reinforcement Learning


Communication, Computing and Signal Processing (IICCCS), IEEE International Conference on

Authors: Myasar Mundher Adnan; Muydinov Firuzjon Farkhodjonovich; Rohit Shrivastava; Anika Bhandari; AR Aravind; Sandeep Kumar S. Affiliations: Department Of Computers Techniques Engineering, College Of Technical Engineering, The Islamic University, Najaf, Iraq; Department Of Computers Techniques Engineering, College Of Technical Engineering, The Islamic University Of Al Diwaniyah, Al Diwaniyah, Iraq; Head Of Educational and Methodological Department, Fergana Medical Institute Of Public Health, Fergana, Uzbekistan; Department of Electronics & Communication Engineering, IES College of Technology, Bhopal, M.P., India; Department of Computer Application, Chandigarh Engineering College, Chandigarh Group of Colleges, Mohali, Punjab, India; Prince Shri Venkateshwara Padmavathy Engineering College, Chennai, India; Department of Computer and Communication Engineering, NMAM Institute of Technology (NITTE Deemed to be University), Udupi, Karnataka, India

ISBN: (digital) 9798350390759

ISBN: (print) 9798350390766

In this research, RL is proposed as a solution for dynamically optimizing 5G network throughput, which is hindered by the complexity of Intersystems environments. To this end, the optimization problem is formulated as a Markov Decision Process, and the study uses Deep Reinforcement Learning techniques to design adaptive algorithms that optimize various network parameters in real time. Training uses simulated raw 5G network data together with publicly available data, and the application is implemented in Python with TensorFlow. In terms of performance, the RL-optimized approach is superior to traditional methods and yields better throughput, latency, energy efficiency and user satisfaction in a number of scenarios. The study identifies the possibility of applying RL to give 5G networks a learning capability for supporting new services and optimizing network performance. These findings can benefit the advancement of 5G wireless communication technology by demonstrating the versatility and advantages of using RL in the network.

Keywords: Training; 5G mobile communication; Wireless networks; System performance; Signal processing; Throughput; Complexity theory; Telecommunication computing; Time factors; Optimization


Dynamic Resource Management for Cloud-native Bulk Synchronous Parallel Applications


International Symposium on Object-Oriented Real-Time Distributed Computing

Authors: Evan Wang; Yogesh Barve; Aniruddha Gokhale; Hongyang Sun. Affiliations: Dept of CS, Vanderbilt University, Nashville, TN, USA; Dept of EECS, University of Kansas, Lawrence, KS, USA

Many traditional high-performance computing applications including those that follow the Bulk Synchronous Parallel (BSP) communication paradigm are increasingly being deployed in cloud-native virtualized and multi-tenant container clusters. However, such a shared, virtualized platform limits the degree of control that BSP applications can have in effectively allocating resources. This can adversely impact their performance, particularly when stragglers manifest in individual BSP supersteps. Existing BSP resource management solutions assume the same execution time for individual tasks at every superstep, which is not always the case. To address these limitations, we present a dynamic resource management middleware for cloud-native BSP applications comprising a heuristics algorithm that determines effective resource configurations across multiple supersteps while considering dynamic workloads per superstep, and trading off performance improvements with reconfiguration costs. Moreover, we design dynamic programming and reinforcement learning approaches that can be used as pluggable strategies to determine whether and when to enforce a reconfiguration. Empirical evaluations of our solution show between 10% and 25% improvement in performance over a baseline static approach even in the presence of reconfiguration penalty.
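The abstract above mentions dynamic programming as one strategy for deciding whether and when to reconfigure, trading performance gains against reconfiguration cost. The listing does not give the formulation, so the following is only a generic sketch of that trade-off. The function name `plan_reconfigurations`, the cost table, and the single flat switching penalty are assumptions made for illustration, not the authors' middleware.

```python
def plan_reconfigurations(step_cost, switch_cost):
    """step_cost[t][c]: hypothetical execution time of superstep t under
    resource configuration c. switch_cost: penalty paid whenever the
    configuration changes between consecutive supersteps.
    Returns (minimum total cost, chosen configuration per superstep)."""
    n_steps = len(step_cost)
    n_cfg = len(step_cost[0])
    best = list(step_cost[0])   # best[c]: min cost of a plan ending in c
    back = []                   # back[t-1][c]: predecessor of c at step t
    for t in range(1, n_steps):
        new_best, pointers = [], []
        for c in range(n_cfg):
            # Either stay in c (no penalty) or switch from another config.
            prev = min(range(n_cfg),
                       key=lambda p: best[p] + (0 if p == c else switch_cost))
            pointers.append(prev)
            new_best.append(best[prev] + (0 if prev == c else switch_cost)
                            + step_cost[t][c])
        best = new_best
        back.append(pointers)
    # Trace the optimal configuration sequence backwards.
    last = min(range(n_cfg), key=lambda c: best[c])
    plan = [last]
    for pointers in reversed(back):
        plan.append(pointers[plan[-1]])
    plan.reverse()
    return best[last], plan


# Made-up workload: config 0 is faster early on, config 1 later.
costs = [[4, 6], [4, 6], [9, 3], [9, 3]]
cheap_switch = plan_reconfigurations(costs, switch_cost=2)
costly_switch = plan_reconfigurations(costs, switch_cost=10)
```

With a small penalty the plan switches configuration midway (`[0, 0, 1, 1]`, total cost 16). With a large penalty it stays in one configuration throughout (`[1, 1, 1, 1]`, total cost 18), showing how the penalty decides whether a reconfiguration pays off.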


Optimization control of UAVs based on self-learning adaptive dynamic programming


35th Youth Academic Annual Conference of Chinese Association of Automation (YAC)

Authors: Ye, Shuai; Zhou, Ying-Jiang; Jiang, Guo-Ping; Lin, Qiong. Affiliation: Nanjing Univ Posts & Telecommun, Dept Automat & Artificial Intelligence, Nanjing 210023, Peoples R China

ISBN: (print) 9781728176840

Optimal control of UAVs has attracted increasing attention. In this paper, a self-learning adaptive dynamic programming (ADP) architecture based on reinforcement learning (RL) is proposed to obtain optimal control for UAVs. The traditional ADP architecture includes two networks: one makes the policy and the other evaluates it. We propose adding a third network to replace external reward signals; that is, the agent can generate reward signals by itself and does not need to obtain them by interacting with the environment. The proposed self-learning ADP method can improve control performance through online learning while keeping the system state stable at the equilibrium point. Finally, the proposed control algorithm is applied to quadrotor UAVs, and the experimental results show the effectiveness of the algorithm.

Keywords: Adaptive dynamic programming; Self-learning adaptive dynamic programming; Reinforcement learning; UAVs; Online learning


Balancing Value Iteration and Policy Iteration for Discrete-Time Control


IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, vol. 50, no. 11, pp. 3948-3958

Authors: Luo, Biao; Yang, Yin; Wu, Huai-Ning; Huang, Tingwen. Affiliations: Cent South Univ, Sch Automat, Changsha 410083, Peoples R China; Hamad Bin Khalifa Univ, Coll Sci & Engn, Doha, Qatar; Beihang Univ, Sci & Technol Aircraft Control Lab, Beijing 100191, Peoples R China; Texas A&M Univ Qatar, Dept Sci, Doha, Qatar

The optimal control problem of discrete-time nonlinear systems depends on the solution of the Bellman equation. In this paper, an adaptive reinforcement learning (RL) method is developed to solve the complex Bellman equation, which balances value iteration (VI) and policy iteration (PI). By adding a balance parameter, the adaptive RL integrates VI and PI, which accelerates VI and avoids the need for an initial admissible control. The convergence of the adaptive RL is proved by showing that it converges to the Bellman equation. Subsequently, the adaptive RL is realized by using a neural network (NN) approximation of the value function, and a least-squares scheme is developed for updating the NN weights. Then, the convergence of the NN-based adaptive RL is proved while taking the NN approximation error into account. To further improve its performance, an adaptive rule is developed for tuning the balance parameter of the adaptive RL iteration by iteration. Finally, the effectiveness of the adaptive RL is validated with simulation studies.

Keywords: Convergence; Optimal control; Mathematical model; Artificial neural networks; Adaptive systems; Nonlinear systems; Reinforcement learning; Adaptive dynamic programming; Bellman equation; discrete-time; neural network (NN); optimal control
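The abstract above describes an adaptive RL that sits between value iteration and policy iteration via a balance parameter, but the listing does not give its update rule. As a loosely related illustration only, the sketch below uses the classical modified policy iteration scheme on a finite MDP, where the number of evaluation sweeps plays a comparable interpolating role. This is a swapped-in textbook technique, not the paper's method, and the two-state MDP is invented for the example.

```python
def modified_policy_iteration(P, R, gamma=0.9, eval_sweeps=1, iters=200):
    """P[s][a]: list of (prob, next_state); R[s][a]: immediate reward.
    eval_sweeps=1 behaves like value iteration; larger values move the
    scheme toward policy iteration."""
    n_states = len(P)
    V = [0.0] * n_states

    def q_value(s, a, values):
        return R[s][a] + gamma * sum(p * values[s2] for p, s2 in P[s][a])

    policy = [0] * n_states
    for _ in range(iters):
        # Policy improvement: act greedily with respect to the current V.
        policy = [max(range(len(P[s])), key=lambda a: q_value(s, a, V))
                  for s in range(n_states)]
        # Partial policy evaluation: a fixed number of backup sweeps.
        for _ in range(eval_sweeps):
            V = [q_value(s, policy[s], V) for s in range(n_states)]
    return V, policy


# Invented two-state MDP with deterministic transitions.
P = [[[(1.0, 0)], [(1.0, 1)]],   # state 0: stay, or move to state 1
     [[(1.0, 1)], [(1.0, 0)]]]   # state 1: stay, or move to state 0
R = [[0.0, 1.0],
     [2.0, 0.0]]
V_vi, pi_vi = modified_policy_iteration(P, R, gamma=0.5, eval_sweeps=1)
V_pi, pi_pi = modified_policy_iteration(P, R, gamma=0.5, eval_sweeps=5)
```

Both settings reach the same fixed point on this toy problem (values 3 and 4, policy `[1, 0]`). They differ only in how much evaluation work is done per improvement step, which is the kind of trade-off the abstract refers to.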

