Search Results - Inner Mongolia University Library

IEEE/IFIP Network Operations and Management Symposium (NOMS)

Authors: Zhang, Menglei; Huang, Haoqiu; Rui, LanLan; Hui, Guo; Wang, Ying; Qiu, Xuesong. Affiliations: Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing, Peoples R China; China Aerosp Sci & Ind Corp Ltd, Network Informat Dept, Beijing, Peoples R China

ISBN: (print) 9781728149738

Cloud computing technologies cannot satisfy the requirements of applications on mobile terminals because of their disadvantages in delay, link load and energy. Mobile Edge Computing (MEC) has therefore been proposed as a novel computing technology. As an important research direction of MEC, service migration methods are still limited in that they cannot learn migration paths or adapt to dynamic situations and user movement. In this paper, we propose a novel service migration policy method based on reinforcement learning. We first investigate user movement, four different edge network situations and traditional migration policies. Then we formulate the system requirements in Satisfiability Modulo Theories (SMT) logic to acquire the migration policy space. We further propose a dynamic-awareness deep Q-learning algorithm that iteratively selects paths from the policy space and uses dynamic awareness to adjust the learning rate adaptively. Meanwhile, the optimal convergence of our algorithm is proved theoretically. Finally, the experimental results highlight the effectiveness of our method in terms of migration success rate, service interruption time and load balance compared to the other solutions.

Keywords: Service Migration; Mobile Edge Computing; User Movement; Dynamic Awareness

Adaptive optimal output regulation for industrial hydrocracking process

Chinese Automation Congress (CAC)

Authors: Li, Zhongmei; Huang, Mengzhe; Xue, Dong; Du, Wenli. Affiliations: East China Univ Sci & Technol, Key Lab Adv Control & Optimizat Chem Proc, Shanghai 200237, Peoples R China; NYU, Tandon Sch Engn, Brooklyn, NY 11201, USA

ISBN: (print) 9781728176871

Hydrocracking is an important petrochemical process, in which the reactor temperature determines the final product distribution and quality. In this paper, a new data-driven optimal reactor temperature control method is proposed through the integration of reinforcement learning, adaptive dynamic programming (ADP) and output regulation theory. Different from the existing literature, the reactor temperature control problem is formulated as an output regulation problem, and a policy iteration (PI) based ADP algorithm is employed to find the adaptive optimal controller. The simulation results show that the actual nitrogen content in the R101 outlet stream can be regulated to the desired value, and the reaction temperature of each bed in R101 can be kept to a minimum, while disturbance rejection is achieved.

Keywords: Hydrocracking process; adaptive dynamic programming (ADP); optimal tracking
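The abstract above relies on policy iteration (PI) to find an optimal controller. As a rough illustration only, the sketch below runs model-based policy iteration on a small discrete-time linear quadratic regulator. It is not the paper's data-driven output-regulation algorithm: the system matrices are assumed known here, and the names `evaluate_policy`, `improve_policy` and `policy_iteration` are hypothetical.

```python
import numpy as np

def evaluate_policy(A, B, Q, R, K):
    """Policy evaluation: solve P = Acl^T P Acl + Q + K^T R K with Acl = A - B K."""
    n = A.shape[0]
    Acl = A - B @ K
    stage_cost = Q + K.T @ R @ K
    # Vectorised discrete Lyapunov equation (row-major vec):
    # (I - Acl^T kron Acl^T) vec(P) = vec(stage_cost)
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    return np.linalg.solve(lhs, stage_cost.reshape(-1)).reshape(n, n)

def improve_policy(A, B, R, P):
    """Policy improvement: K = (R + B^T P B)^{-1} B^T P A."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def policy_iteration(A, B, Q, R, K0, iterations=30):
    """Alternate evaluation and improvement, starting from a stabilising gain K0."""
    K = K0
    for _ in range(iterations):
        P = evaluate_policy(A, B, Q, R, K)
        K = improve_policy(A, B, R, P)
    return P, K

# Illustrative system (assumed values, not taken from the paper).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
# A is already stable, so the zero gain is a valid stabilising start.
P, K = policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))
print("feedback gain K:", K)
```

The control law is u = -K x. A data-driven variant would replace `evaluate_policy` with an estimate built from measured trajectories instead of from `A` and `B`.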

Parallel Optimal Tracking Control Schemes for Mode-Dependent Control of Coupled Markov Jump Systems via Integral RL Method

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2020, Vol. 17, No. 3, pp. 1332-1342

Authors: Zhang, Kun; Zhang, Hua-guang; Cai, Yuliang; Su, Rong. Affiliations: Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110819, Peoples R China; Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China; Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore

This article is concerned with the optimal tracking control problem of the coupled Markov jump system (CMJS) by using the reinforcement learning (RL) technique. Based on the conventional optimal tracking architecture, an offline tracking iteration algorithm is first designed to solve the coupled algebraic Riccati equation that can hardly be solved by mathematical methods directly. To overcome the crucial requirements and existing shortcomings in the offline tracking method, a novel integral RL (IRL) tracking algorithm is first proposed for CMJS, which develops a transition-probability-free optimal tracking control scheme with a reconstructed augmented system and discounted cost function. Both the requirements of transition probability pi(ij) and system matrix A(i) are avoided via the designed IRL algorithm. The stability and convergence of the novel schemes are proved by the Lyapunov theory, and the tracking objective is achieved as desired. Finally, we apply the designed algorithms in a fourth-order Markov jump control problem and the stochastic mass, spring, and damper system to track continuous sinusoidal waveforms, and the simulation results are provided to show the effectiveness and applicability. Note to Practitioners: In practical engineering systems, many useful signals and interference vary randomly. Therefore, the tracking control of stochastic systems and dynamics, such as the Markovian, Ito's, Wiener, and Martingale processes, plays an important role in modern industry. As a matter of fact, it is always desired to reduce the requirement of exact information and transition probability in the homogeneous Markovian process, for which accurate measurements are very difficult to obtain. One way is integrating the adaptive reinforcement learning (RL) technique into the Markovian systems to learn this implicit information. However, a major restriction of the RL technique is that the control policy should be related to the finite performance index, which gene

Keywords: adaptive dynamic programming (ADP); Markov jump system; optimal control; stochastic stability; tracking control

Self-adaptive Threshold-based Policy for Microservices Elasticity

28th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (IEEE MASCOTS)

Authors: Rossi, Fabiana; Cardellini, Valeria; Lo Presti, Francesco. Affiliation: Univ Roma Tor Vergata, DICII, Rome, Italy

ISBN: (print) 9781728192383

The microservice architecture structures an application as a collection of loosely coupled and distributed services. Since application workloads usually change over time, the number of replicas per microservice should be accordingly scaled at run-time. The most widely adopted scaling policy relies on statically defined thresholds, expressed in terms of system-oriented metrics. This policy might not be well-suited to scale multi-component and latency-sensitive applications, which express requirements in terms of response time. In this paper, we present a two-layered hierarchical solution for controlling the elasticity of microservice-based applications. The higher-level controller estimates the microservice contribution to the application performance, and informs the lower-level components. The latter accordingly scale the single microservices using a dynamic threshold-based policy. So, we propose MB Threshold and QL Threshold, two policies that employ respectively model-based and model-free reinforcement learning approaches to learn threshold update strategies. These policies can compute different thresholds for the different application components, according to the desired deployment objectives. A wide set of simulation results shows the benefits and flexibility of the proposed solution, emphasizing the advantages of using dynamic thresholds over the most adopted policy that uses static thresholds.

Keywords: Hierarchical Control; Elasticity; Self-adaptation; Microservice; Reinforcement learning
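The abstract above describes learning threshold update strategies with model-free reinforcement learning. The following is a minimal tabular Q-learning sketch of that general idea, not the paper's QL Threshold policy: the state (the current scale-out threshold), the action set, the threshold bounds and the reward passed in by the caller are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical actions: lower, keep, or raise the CPU-utilisation threshold.
ACTIONS = (-0.1, 0.0, 0.1)

def apply_action(threshold, action, low=0.5, high=0.9):
    """Move the threshold by `action`, clipped to the assumed range [low, high]."""
    return round(min(high, max(low, threshold + action)), 1)

def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the threshold adjustments."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update: Q += alpha * (r + gamma * max Q' - Q)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

# One illustrative learning step with a made-up reward.
q_table = defaultdict(float)
rng = random.Random(0)
threshold = 0.7
action = choose_action(q_table, threshold, epsilon=0.1, rng=rng)
next_threshold = apply_action(threshold, action)
q_update(q_table, threshold, action, reward=1.0, next_state=next_threshold)
```

In practice the reward would encode the deployment objectives, for example a penalty for response-time violations plus a cost per running replica.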

Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications

IEEE/CAA Journal of Automatica Sinica, 2024, No. 1

Authors: Ding Wang; Ning Gao; Derong Liu; Jinna Li; Frank L. Lewis

Self-Tuning Network Control Architectures

IEEE Conference on Decision and Control

Authors: Tyler Summers; Karthik Ganapathy; Iman Shames; Mathias Hudoba de Badyn. Affiliations: Control, Optimization, and Networks Lab, University of Texas at Dallas; Australian National University; Automatic Control Laboratory, ETH Zürich

ISBN: (digital) 9781665467612

ISBN: (print) 9781665467629

We formulate a general mathematical framework for self-tuning network control architecture design. This problem involves jointly adapting the locations of active sensors and actuators in the network and the feedback control policy to all available information about the time-varying network state and dynamics to optimize a performance criterion. We propose a general solution structure analogous to the classical self-tuning regulator from adaptive control. We show that a special case with full-state feedback can be solved in principle with dynamic programming, and in the linear quadratic setting the optimal cost functions and policies are piecewise quadratic and piecewise linear, respectively. For large networks where exhaustive architecture search is prohibitive, we describe a greedy heuristic for joint architecture-policy design. We demonstrate in numerical experiments that self-tuning architectures can provide dramatically improved performance over fixed architectures. Our general formulation provides an extremely rich and challenging problem space with opportunities to apply a wide variety of approximation methods from stochastic control, system identification, reinforcement learning, and static architecture design.

Keywords: Actuators; Reinforcement learning; Aerospace electronics; Cost function; Control systems; Sensors; Feedback control

Application of Reinforcement Learning to The Orientation and Position Control of A 6 Degrees of Freedom Robotic Manipulator

IEEE Latin American Robotics Symposium, LARS

Authors: Felipe Rigueira Campos; Aline Xavier Fidêncio; Jacó Domingues; Gustavo Pessin; Gustavo Freitas. Affiliations: Programa de Pós-Graduação em Instrumentação, Controle e Automação de Processos de Mineração, Universidade Federal de Ouro Preto e Instituto Tecnológico Vale, Ouro Preto, MG, Brazil; Laboratório de Robótica, Controle e Instrumentação, Instituto Tecnológico Vale, Ouro Preto, MG, Brazil; Faculty of Electrical Engineering and Information Technology, Ruhr-University Bochum, Germany; Departamento de Computação, Universidade Federal de Ouro Preto, Ouro Preto, MG, Brazil; Departamento de Engenharia Elétrica, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil

ISBN: (digital) 9781665462808

ISBN: (print) 9781665462815

Applications with autonomous robots play an important role in industry and in everyday life. Among them, the activities of manipulating and moving objects stand out for the wide variety of possible applications. In static and known environments, these activities can be implemented through logic planned by the developer, but this is not feasible in dynamic environments. Machine learning (ML) techniques such as reinforcement learning (RL) algorithms have sought to replace pre-defined programming by teaching the robot how to act. This paper presents the implementation of two RL algorithms, Deep Deterministic Policy Gradient (DDPG) and Proximal Policy Optimization (PPO), for orientation and position control of a 6-degree-of-freedom (6-DoF) robotic manipulator. The results demonstrated that DDPG has faster learning convergence in simpler activities, but if the complexity of the problem increases, it might not achieve satisfactory behavior. On the other hand, PPO can solve more complex problems, but it limits the convergence rate to the best result in order to avoid learning instability.

Keywords: Service robots; Position control; Reinforcement learning; Power system stability; Manipulators; Stability analysis; Behavioral sciences

Neural-Network-Based Robust Control Schemes for Nonlinear Multiplayer Systems With Uncertainties via Adaptive Dynamic Programming

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2019, Vol. 49, No. 3, pp. 579-588

Authors: Jiang, He; Zhang, Huaguang; Luo, Yanhong; Han, Ji. Affiliation: Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China

This paper investigates the robust control issues of nonlinear multiplayer systems by utilizing adaptive dynamic programming (ADP) methods and fills a gap in the ADP field, where actuator uncertainties for multiplayer systems are still not addressed. Two types of actuator uncertainties including bounded nonlinear perturbation and unknown constant actuator fault are taken into consideration. First, a data-driven reinforcement learning (RL) approach is derived to learn the optimal solutions of multiplayer nonzero-sum games. Then, based on the obtained optimal control policies, two robust control schemes are developed to handle these two different types of uncertainties, respectively, and the associated stability analysis is also provided. To implement the proposed iterative RL approach, a single neural network (NN) architecture with least-square-based updating law is given, which reduces the computation burden compared with the traditional dual NN architecture. Finally, two numerical examples are shown to test the feasibility of our proposed schemes.

Keywords: adaptive dynamic programming (ADP); approximate dynamic programming; neural network (NN); reinforcement learning (RL)

Deep Conservative Policy Iteration

34th AAAI Conference on Artificial Intelligence / 32nd Innovative Applications of Artificial Intelligence Conference / 10th AAAI Symposium on Educational Advances in Artificial Intelligence

Authors: Vieillard, Nino; Pietquin, Olivier; Geist, Matthieu. Affiliation: Google Res, Brain Team, Mountain View, CA 94043, USA

ISBN: (print) 9781577358350

Conservative Policy Iteration (CPI) is a founding algorithm of Approximate Dynamic Programming (ADP). Its core principle is to stabilize greediness through stochastic mixtures of consecutive policies. It comes with strong theoretical guarantees, and has inspired approaches in deep reinforcement learning (RL). However, CPI itself has rarely been implemented, never with neural networks, and has only been tested on toy problems. In this paper, we show how CPI can be practically combined with deep RL with discrete actions, in an off-policy manner. We also introduce adaptive mixture rates inspired by the theory. We thoroughly evaluate the resulting algorithm on the simple Cartpole problem, and validate the proposed method on a representative subset of Atari games. Overall, this work suggests that revisiting classic ADP may lead to improved and more stable deep RL algorithms.

Keywords: Reinforcement learning
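The core principle named in the abstract above, stabilizing greediness through stochastic mixtures of consecutive policies, can be illustrated with a tabular sketch. This is the classic CPI mixture step under assumed inputs (a small table of action values and a fixed mixture rate), not the deep off-policy algorithm of the paper; the function names are hypothetical.

```python
import numpy as np

def greedy_policy(q_values):
    """One-hot policy that picks argmax_a q(s, a) in every state."""
    n_states, n_actions = q_values.shape
    policy = np.zeros((n_states, n_actions))
    policy[np.arange(n_states), q_values.argmax(axis=1)] = 1.0
    return policy

def conservative_update(policy, q_values, mixture_rate):
    """CPI-style step: pi_new = (1 - alpha) * pi + alpha * greedy(q)."""
    if not 0.0 <= mixture_rate <= 1.0:
        raise ValueError("mixture_rate must lie in [0, 1]")
    return (1.0 - mixture_rate) * policy + mixture_rate * greedy_policy(q_values)

# Two states, two actions, starting from the uniform policy (assumed values).
uniform = np.full((2, 2), 0.5)
q = np.array([[1.0, 0.0],
              [0.0, 2.0]])
mixed = conservative_update(uniform, q, mixture_rate=0.2)
print(mixed)  # each row remains a probability distribution
```

A mixture rate of 1 recovers ordinary greedy policy iteration, while smaller rates move the policy only part of the way toward the greedy one.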

AutoScale: Energy Efficiency Optimization for Stochastic Edge Inference Using Reinforcement Learning

53rd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)

Authors: Kim, Young Geun; Wu, Carole-Jean. Affiliations: Arizona State Univ, Tempe, AZ 85287, USA; Facebook AI, Menlo Pk, CA, USA

ISBN: (print) 9781728173832

Deep learning inference is increasingly run at the edge. As the programming and system stack support becomes mature, it enables acceleration opportunities in a mobile system, where the system performance envelope is scaled up with a plethora of programmable co-processors. Thus, intelligent services designed for mobile users can choose between running inference on the CPU or any of the co-processors in the mobile system, and exploiting connected systems such as the cloud or a nearby, locally connected mobile system. By doing so, these services can scale out the performance and increase the energy efficiency of edge mobile systems. This gives rise to a new challenge: deciding when inference should run where. Such execution scaling decisions become more complicated with the stochastic nature of the mobile-cloud execution environment, where signal strength variation in the wireless networks and resource interference can affect real-time inference performance and system energy efficiency. To enable energy-efficient deep learning inference at the edge, this paper proposes AutoScale, an adaptive and lightweight execution scaling engine built on a custom-designed reinforcement learning algorithm. It continuously learns and selects the most energy-efficient inference execution target by considering characteristics of neural networks and available systems in the collaborative cloud-edge execution environment while adapting to stochastic runtime variance. Real system implementation and evaluation, considering realistic execution scenarios, demonstrate an average of 9.8x and 1.6x energy efficiency improvement over the baseline mobile CPU and cloud offloading, respectively, while meeting the real-time performance and accuracy requirements.

Keywords: Energy efficiency
