Search Results - Inner Mongolia University Library

43rd Chinese Control Conference (CCC)

Authors: Yuan, Zeqiang; Wang, Ding; Liu, Ao; Hu, Qinna; Zhou, Zihang. Affiliations: Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China; Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intellige, Beijing 100124, Peoples R China; Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China; Beijing Univ Technol, Beijing Lab Smart Environm Protect, Beijing 100124, Peoples R China

ISBN: (print) 9798350366907; 9789887581581

In this paper, we develop a decentralized tracking control (DTC) strategy to stabilize a class of continuous-time nonlinear interconnected large-scale systems in the presence of unmatched external disturbances. This strategy is derived from an online learning approach to optimal control. Initially, the DTC problem is formulated to ensure both tracking control of the isolated subsystems and the overall stability of the large-scale system. By combining the tracking error with the reference trajectory, augmented systems are constructed and the DTC design is transformed into an optimal control problem. To account for external disturbances, we formulate the problem as a two-player zero-sum game. Critic neural networks are utilized to solve the Hamilton-Jacobi-Isaacs equation. This approach allows the estimation of a Nash equilibrium solution that encompasses both the optimal control law and the worst-case disturbance law. Notably, a novel gradient descent strategy with momentum is established to tune the weights of the critic neural network so that it better approximates the cost function. Finally, a simulation experiment is provided to verify the effectiveness of the developed DTC scheme.

Keywords: adaptive dynamic programming; decentralized control; H-infinity control; reinforcement learning; tracking control
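
The abstract above mentions tuning critic weights by gradient descent with momentum. The following is a minimal sketch of a generic momentum (heavy-ball) update on a toy quadratic loss, not the paper's update rule: the loss, the learning rate `lr`, the momentum factor `beta`, and every function name are illustrative assumptions.

```python
def momentum_descent(grad, w0, lr=0.1, beta=0.9, steps=500):
    """Minimise a loss by gradient descent with a momentum (velocity) term."""
    w = list(w0)
    v = [0.0] * len(w)  # velocity accumulates past gradients
    for _ in range(steps):
        g = grad(w)
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        w = [wi - lr * vi for wi, vi in zip(w, v)]
    return w


def toy_grad(w):
    # Gradient of the assumed toy loss 0.5 * ((w[0] - 1)^2 + (w[1] + 2)^2).
    return [w[0] - 1.0, w[1] + 2.0]


weights = momentum_descent(toy_grad, [0.0, 0.0])
```

In a critic-learning setting, `toy_grad` would be replaced by the gradient of the squared residual of the approximate Hamilton-Jacobi-Isaacs equation with respect to the critic weights, which this sketch does not attempt to reproduce.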

Reinforcement Learning-Based Optimal Control for Continuous-Time Quantized Systems via Adaptive Dynamic Programming

IEEE SSD International Multi-Conference on Systems, Signals and Devices

Authors: Omar Qasem; Weinan Gao; Omar R. Daoud. Affiliations: Electrical and Computer Engineering Department, School of Engineering and Computing, American International University, Al-Jahra, Kuwait; State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang, Liaoning, China

ISBN: (digital) 9798331542726

ISBN: (print) 9798331542733

This paper presents a comprehensive investigation into the stability analysis and design conditions for quantizers to ensure the closed-loop stability of continuous-time linear quantized systems, under the conditions derived from the small-gain theorem. The study begins by deriving explicit bounds on the quantizer parameters required for maintaining system stability. Building on this foundation, an optimal controller is designed using the linear quadratic regulator (LQR) framework, providing an efficient data-driven control strategy. To further enhance the system's performance, an adaptive dynamic programming (ADP) algorithm, referred to as the hybrid iteration (HI) method, is developed. This algorithm learns the optimal control policy from the trajectories of the quantized states and inputs, thereby addressing the challenges posed by quantization constraints. The proposed HI approach combines the advantages of adaptive learning and optimization, making it well-suited for continuous-time systems with limited information. The simulation results confirm that the ADP approach with the provided conditions not only stabilizes the quantized system but also achieves optimal control performance under the specified quantization conditions. This study offers a methodological framework for addressing stability and control challenges, with insights that can be extended to continuous-time nonlinear quantized systems and potential applications in engineering domains such as networked systems, robotics, and autonomous systems.

Keywords: Quantization (signal); Adaptive systems; Regulators; Simulation; Heuristic algorithms; Optimal control; Stability analysis; Dynamic programming; Trajectory; Thermal stability
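
To make the LQR-with-quantization setting in the abstract above concrete, here is a minimal model-based sketch for a scalar system dx/dt = a x + b u with cost integral of q x^2 + r u^2. It is not the paper's data-driven hybrid iteration method: the system parameters, the uniform quantizer, its step size, and all function names are assumptions chosen for illustration, and only the measured state is quantized.

```python
import math


def scalar_lqr_gain(a, b, q, r):
    """Return the LQR gain k for dx/dt = a*x + b*u (scalar case).

    p is the positive root of the scalar Riccati equation
    2*a*p - (b**2 / r) * p**2 + q = 0, and k = b * p / r.
    """
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    return b * p / r


def quantize(x, step):
    """Uniform quantizer: round x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)


def simulate(a, b, k, step, x0=1.0, dt=1e-3, horizon=10.0):
    """Euler simulation of u = -k * quantize(x); returns the final state."""
    x = x0
    for _ in range(int(horizon / dt)):
        u = -k * quantize(x, step)
        x += dt * (a * x + b * u)
    return x


gain = scalar_lqr_gain(a=1.0, b=1.0, q=1.0, r=1.0)
final_state = simulate(a=1.0, b=1.0, k=gain, step=0.1)
```

With these assumed values the unquantized closed loop `a - b * gain` is negative, and under quantized feedback the state settles into a small neighbourhood of the origin rather than converging exactly, which is the kind of behaviour that bounds on quantizer parameters are meant to control.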

A Budget-aware Incentive Mechanism for Vehicle-to-Grid via Reinforcement Learning

31st IEEE/ACM International Symposium on Quality of Service, IWQoS 2023

Authors: Zhu, Tianxiang; Zhang, Xiaoxi; Duan, Jingpu; Zhou, Zhi; Chen, Xu. Affiliations: Sun Yat-sen University, Guangzhou, China; Southern University of Science and Technology, Shenzhen, China; Pengcheng Laboratory, Shenzhen, China

ISBN: (print) 9798350399738

With the increasing penetration of renewable energy and electric vehicles (EVs), EV charging and discharging behavior has shown a great impact on the Micro Grid power load, motivating the development of Vehicle-to-Grid (V2G) technologies. However, the V2G market is still in its infancy, due to insufficient understanding of EV users' willingness and concerns. While many studies consider direct EV control, it is more realistic to indirectly affect users' behavior through monetary incentives. For better implementation flexibility, we advocate displaying, at charging piles, strategically chosen incentives that are combined with electricity prices. Technically, this is the first model-free learning algorithm that can optimize incentives under unknown EV user reactions, increase the load control effectiveness and users' quality-of-service (QoS) simultaneously under a long-term incentive budget, and provide theoretical performance guarantees. We first construct a bi-level optimization framework to model the time-dependencies across our solutions. We then integrate primal-dual theories and upper-confidence bounds into reinforcement learning to balance power control and incentive consumption. A dynamic programming based algorithm is also proposed to maximize the aggregate user QoS. Finally, we prove bounded sub-optimality of our learning algorithm through theoretical analysis and conduct trace-driven simulations to demonstrate the advantages of our bi-level framework. © 2023 IEEE.

Keywords: reinforcement learning
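
The abstract above combines upper-confidence bounds with reinforcement learning to choose incentives under unknown user reactions. As a rough illustration of the upper-confidence idea only, the sketch below uses the standard UCB1 index to pick among a few candidate incentive levels. The incentive names, the statistics, and the exploration constant `c` are hypothetical, and the budget constraint and primal-dual machinery from the paper are deliberately left out.

```python
import math


def ucb_index(mean_reward, pulls, total_pulls, c=2.0):
    """UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # untried options are explored first
    return mean_reward + math.sqrt(c * math.log(total_pulls) / pulls)


def choose_incentive(stats):
    """stats maps an incentive label to (mean_reward, times_offered)."""
    total = sum(n for _, n in stats.values())
    return max(stats, key=lambda k: ucb_index(stats[k][0], stats[k][1], total))


# Hypothetical observations for three incentive levels.
observed = {"low": (0.2, 10), "mid": (0.5, 10), "high": (0.4, 0)}
first_choice = choose_incentive(observed)
```

Because the "high" level has never been offered, its index is infinite and it is selected first; once every level has been tried equally often, the level with the best empirical mean wins.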

Multi-Kernel Online Reinforcement Learning for Path Tracking Control of Intelligent Vehicles

IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, vol. 51, no. 11, pp. 6962-6975

Authors: Liu, Jiahang; Huang, Zhenhua; Xu, Xin; Zhang, Xinglong; Sun, Shiliang; Li, Dazi. Affiliations: Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Peoples R China; Rocket Engn Design & Res Inst, Beijing 100011, Peoples R China; East China Normal Univ, Dept Comp Sci & Technol, Shanghai 200241, Peoples R China; Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China

Path tracking control of intelligent vehicles has to deal with the difficulties of model uncertainties and nonlinearities. As a class of adaptive optimal control methods, reinforcement learning (RL) has received increasing attention in solving difficult control problems. However, feature representation and online learning ability are two major problems to be solved for learning control of uncertain dynamic systems. In this article, we propose a multi-kernel online RL approach for path tracking control of intelligent vehicles. In the proposed approach, a multiple kernel feature learning framework is designed for online learning control based on dual heuristic programming (DHP) and the new online learning control algorithm is called multi-kernel DHP (MKDHP). In MKDHP, instead of the expert knowledge for selecting and fine-tuning of a suitable kernel function, only a set of basic kernel functions is required to be predefined and the multi-kernel features can be learned for value function approximation in the critic. The simulation studies on path tracking control for intelligent vehicles have been conducted under S-curve and urban road conditions. The results demonstrated that compared with other typical path tracking controllers for intelligent vehicles, such as the linear quadratic regulator (LQR), the pure pursuit controller and the ribbon-based controller, the proposed multi-kernel learning controller can achieve better performance in terms of tracking precision and smoothness.

Keywords: Kernel; Intelligent vehicles; Vehicle dynamics; Adaptation models; Optimal control; Heuristic algorithms; Approximation algorithms; Intelligent vehicle; multiple kernel learning (MKL); path tracking; reinforcement learning (RL); steering control

Advancing RAN Slicing with Offline Reinforcement Learning

IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, DySPAN

Authors: Kun Yang; Shu-Ping Yeh; Menglei Zhang; Jerry Sydir; Jing Yang; Cong Shen. Affiliations: Department of Electrical and Computer Engineering, University of Virginia, USA; Intel Corporation, USA; School of Electrical Engineering and Computer Science, The Pennsylvania State University, USA

ISBN: (digital) 9798350317640

ISBN: (print) 9798350317657

Dynamic radio resource management (RRM) in wireless networks presents significant challenges, particularly in the context of Radio Access Network (RAN) slicing. This technology, crucial for catering to varying user requirements, often grapples with complex optimization scenarios. Existing reinforcement learning (RL) approaches, while achieving good performance in RAN slicing, typically rely on online algorithms or behavior cloning. These methods necessitate either continuous environmental interactions or access to high-quality datasets, hindering their practical deployment. To address these limitations, this paper introduces offline RL for solving the RAN slicing problem, marking a shift toward more feasible and adaptive RRM methods. We demonstrate how offline RL can learn near-optimal policies from sub-optimal datasets, an advancement over existing practices. Our research highlights the inherent flexibility of offline RL, showcasing its ability to adjust policy criteria without the need for additional environmental interactions. Furthermore, we present empirical evidence of the efficacy of offline RL in adapting to various service-level requirements, illustrating its potential in diverse RAN slicing scenarios.

Keywords: Wireless networks; Dynamic spectrum access; Cloning; Reinforcement learning; Dynamic scheduling; Resource management; Optimization

Fast Solution of Adaptive Dynamic Programming for Intelligent Vehicle Predictive Controller

2022 Chinese Automation Congress, CAC 2022

Authors: Wang, Yarong; He, Jianping. Affiliations: Shanghai Jiao Tong University, The Dept. of Automation, Shanghai, China

ISBN: (print) 9781665465335

In this paper, we propose an offline Deep Adaptive Dynamic Programming (DADP) algorithm to improve the real-time performance of trajectory tracking control. Model Predictive Control (MPC) is an effective method for solving nonlinear optimal control problems in trajectory tracking. However, as an online solution, MPC is too slow to meet the strict real-time requirements of autonomous driving motion control, especially with long prediction horizons. DADP consists of two parts, policy evaluation and policy improvement, in which the policy function and the evaluation function are approximated by neural networks that map the agent's state to the control output and to the value of the state, respectively. In addition, multi-agent parallel training is adopted to speed up neural network training. The optimal policy function is obtained once the iterations of policy evaluation and policy improvement reach equilibrium. Finally, double lane change tracking is performed on the CarSim digital twin platform, verifying the high real-time performance of DADP. © 2022 IEEE.

Keywords: reinforcement learning
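
The alternation between policy evaluation and policy improvement described in the abstract above can be illustrated on a tiny tabular problem. The sketch below is a textbook policy iteration loop, not DADP itself: the three-state chain, its rewards, the discount factor, and all names are assumptions, and lookup tables stand in for the neural network approximators.

```python
N_STATES = 3          # states 0, 1, 2; state 2 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy dynamics: reward 1 on first reaching state 2."""
    if state == 2:
        return 2, 0.0
    nxt = min(state + 1, 2) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == 2 else 0.0)


def policy_iteration(gamma=0.9, sweeps=100):
    policy = [0] * N_STATES
    values = [0.0] * N_STATES
    while True:
        # Policy evaluation: repeated Bellman backups under the fixed policy.
        for _ in range(sweeps):
            for s in range(N_STATES):
                nxt, reward = step(s, policy[s])
                values[s] = reward + gamma * values[nxt]
        # Policy improvement: act greedily with respect to the values.
        new_policy = []
        for s in range(N_STATES):
            q = [r + gamma * values[n] for n, r in (step(s, a) for a in ACTIONS)]
            new_policy.append(0 if q[0] >= q[1] else 1)
        if new_policy == policy:  # equilibrium reached
            return policy, values
        policy = new_policy


best_policy, state_values = policy_iteration()
```

The loop stops when improvement no longer changes the policy, which mirrors the equilibrium condition mentioned in the abstract.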

A Time-Varying Deep Reinforcement Model Predictive Control for DC Power Converter Systems

IEEE 12th International Symposium on Power Electronics for Distributed Generation Systems (PEDG)

Authors: Andalibi, Milad; Hajihosseini, Mojtaba; Teymoori, Sam; Kargar, Maryam; Gheisarnejad, Meysam. Affiliations: Shiraz Univ, Dept Elect & Comp Engn, Shiraz, Iran; Aarhus Univ, Dept Engn, Appl Formal Methods, Aarhus, Denmark; UCSC, Dept Elect & Comp Engn, Santa Cruz, CA, USA; Islamic Azad Univ, West Tehran Branch, Dept Elect & Comp Engn, Tehran, Iran; Islamic Azad Univ, Dept Elect Engn, Najafabad, Iran

ISBN: (print) 9781665404655

Today, power converters, especially DC/DC converters, are of great importance in power electronics applications such as DC micro-grids (MGs). However, they have limitations, such as the inability to handle constant power loads (CPLs), which results in instability problems in MGs. Thus, a controller with specific characteristics, including robustness and fast response to system dynamics, is vital to address this instability. In this paper, an adaptive model predictive controller (AMPC) based on deep reinforcement learning (DRL) is developed to tackle the destabilization problem. In the proposed AMPC, the control signal coefficient at each variable operating point is regarded as the adjustable controller parameter and is adaptively designed through the learning ability of the Deep Q-Network (DQN) strategy, leading to a robust control approach. We show that the proposed controller for DC/DC converters feeding CPLs is robust and fast in dynamic response.

Keywords: DC-DC Buck-Boost Converter; Constant Power Load (CPL); Adaptive Model Predictive Control (AMPC); Deep Reinforcement Learning (DRL)
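
The abstract above uses a Deep Q-Network to adapt a controller coefficient. As a simplified stand-in, the sketch below shows the tabular Q-learning update that a DQN approximates with a neural network. The state labels, the three coefficient-adjustment actions, the reward values, and the hyperparameters are all hypothetical and are not taken from the paper.

```python
def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One temporal-difference update of Q(state, action).

    Unseen (state, action) pairs default to a value of 0.0.
    """
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q_table[(state, action)]


# Hypothetical actions: decrease, hold, or increase the controller coefficient.
ADJUSTMENTS = ["decrease", "hold", "increase"]
q = {}
first = q_update(q, "op_point_0", "increase", 1.0, "op_point_1", ADJUSTMENTS)
q[("op_point_1", "hold")] = 2.0
second = q_update(q, "op_point_0", "increase", 1.0, "op_point_1", ADJUSTMENTS)
```

A DQN replaces the dictionary with a network that generalises across operating points, which matters when the operating point varies continuously.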

Deep-Reinforcement-Learning-based User-Preference-Aware Rate Adaptation for Video Streaming

23rd IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (IEEE WoWMoM)

Authors: Lu, Lingyun; Xiao, Jun; Ni, Wei; Du, Haifeng; Zhang, Dalin. Affiliations: Beijing Jiaotong Univ, Sch Software Engn, Beijing, Peoples R China; Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China; Commonwealth Sci & Ind Res Org CSIRO, Sydney, NSW, Australia; Beijing Sankuai Online Technol Co Ltd, Beijing, Peoples R China

ISBN: (digital) 9781665408769

ISBN: (print) 9781665408769

Online video is the most popular Internet application. As throughput changes frequently under different network conditions, it is important to adaptively select the proper bitrate and improve the user's quality of experience (QoE). In this paper, we propose a new DRL-based rate adaptation algorithm for video streaming, which holistically captures the user's preference for video content, the network throughput, and the buffer occupancy, and selects the proper bitrate to improve the QoE. Specifically, we use a 3D convolutional neural network (C3D) to learn spatio-temporal features and perform semantic analysis of videos. We also apply the Term Frequency-Inverse Document Frequency (TF-IDF) method to analyze the user's preference for different scene types according to the user's viewing history. Dynamic adaptive streaming is formulated as a Markov Decision Process (MDP) problem, and we use the Actor-Critic (A3C) algorithm to dynamically choose the optimal bitrate. As corroborated by simulations, our algorithm can accurately obtain the user's preference, keep the bitrate allocation consistent with that preference, and maintain video quality. Compared with the state-of-the-art Pensieve algorithm, our algorithm improves the average QoE by at least 12.5%. It also shows a significant improvement over other baseline methods.

Keywords: rate adaption; user preference; quality of experience (QoE); video analysis; deep reinforcement learning (DRL); Actor-Critic (A3C)
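
The abstract above applies TF-IDF to a user's viewing history to estimate scene-type preferences. The sketch below shows one common TF-IDF variant (relative term frequency times the natural log of inverse document frequency) applied to hypothetical viewing sessions. The scene labels and the choice of treating each session as a document are assumptions, not details from the paper.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf = count / document length, idf = ln(n_docs / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical viewing sessions, each a list of scene-type labels.
sessions = [
    ["sports", "sports", "news"],
    ["news", "drama"],
    ["sports", "news", "comedy"],
]
session_weights = tf_idf(sessions)
```

A scene type that appears in every session, such as "news" here, gets zero weight, so the scores emphasise what distinguishes one session from the others.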

An Adaptive Deep Q-Learning Strategy for Routing Schemes in SDN-Based Data Centre Networks

3rd International Symposium on Advances in Informatics, Electronics and Education, ISAIEE 2022

Authors: Li, Jian; Wang, Shuoyu; Huang, Yubo; Liao, Kai; Bi, Feili; Lou, Xia. Affiliations: College of Mathematics and Information Science, Guiyang University, Guiyang, China; College of Information Science and Technology, Tibet University, Lasa, China; College of Civil Engineering, Southwest Jiaotong University, Chengdu, China; College of Big Data and Information Engineering, Guizhou University, Guiyang, China; Guizhou Vocational College of Applied Technology, Fuquan, China

ISBN: (print) 9781665463577

The growing number of applications on the cloud has increased the need for reliable, high-performance network architectures in data centres. Software-Defined Networking (SDN) has improved the flexibility, delay, and throughput of networks in comparison with static configurations. To keep pace with the rapid development of cloud computing, big data, and other technologies, the combination of data centre routing and SDN is expected to make network management more convenient and flexible. With this benefit, routing strategies have been widely studied by researchers. However, the strategies in the controller mainly depend on manual design, and optimal solutions are hard to obtain in a dynamic network environment. Some routing algorithms rely on network topologies with redundant links, such as Fat-tree, to provide efficient load balancing, but the inability of these schemes to adapt to rapidly changing traffic behaviour limits their performance. This research proposes a deep-learning-based routing scheme for SDN. We use an adaptive deep Q-network (ADQN) to build the deep reinforcement learning routing scheme. We demonstrate the effectiveness of the proposed system through extensive simulations. Benefiting from continuous learning with a global view, the proposed system achieves lower flow completion time, improved throughput, better load balancing, and better robustness compared with OSPF. © 2022 IEEE.

Keywords: reinforcement learning

A Dynamic Consensus Scheme for Unlicensed Spectrum Sharing in Heterogeneous Networks

23rd IEEE International Conference on Communication Technology, ICCT 2023

Authors: Zhang, Hanwen; Leng, Supeng; Zhao, Pengcheng; He, Jianhua. Affiliations: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, China; School of Computer Science and Electronic Engineering, University of Essex, United Kingdom

ISBN: (print) 9798350325959

Exploiting and sharing unlicensed spectrum resources among cellular and WiFi networks is critical for fifth-generation (5G) and beyond networks due to the severe spectrum shortage and huge traffic demands. While distributed consensus with blockchain has been considered to realize fair and efficient spectrum sharing, the existing mechanism is not adaptive to wireless network traffic with diverse QoS requirements in dynamic environments, which can result in significant consensus overhead and low levels of QoS. To tackle these problems of the static consensus adopted by existing works, we propose a two-layer blockchain framework with an intelligent consensus scheme for distributed spectrum sharing. Specifically, we propose a two-layer blockchain architecture including a global blockchain and a local blockchain, and adopt a lightweight Proof of Strategy (PoG) consensus mechanism. The local blockchain is dedicated to making spectrum allocation strategies, while the global blockchain is responsible for the management and coordination of the local blockchain. A deep reinforcement learning model is designed for the global blockchain to learn the relationship between the consensus period of the local blockchain and the utilization of the allocated spectrum, and to maximize the throughput of local heterogeneous networks. Furthermore, we model and analyze the performance of PoG in complicated interference environments. The Lagrange method and the relaxation method are used to transform an NP-hard problem into a fractional programming problem that can be solved iteratively. Simulation results show that the proposed architecture and intelligent consensus mechanism can significantly improve system throughput and adapt to the dynamic environment with complicated interference. © 2023 IEEE.

Keywords: Blockchain; Dynamic Consensus; Spectrum Sharing
