ISBN (Print): 9781665478960
In this paper, an adaptive dynamic programming (ADP) algorithm with lifting technology is developed to solve the multi-rate optimal control problem for discrete-time linear systems. We use the lifting technique to convert the multi-rate sampled-data control problem into a single-rate problem over a uniform cycle, and propose a Q-learning-based approach that learns the optimal regulator through a value iteration (VI) algorithm. First, a class of continuous-time (CT) linear systems with multiple timescales is considered. Then, the convergence of the Q-learning-based algorithm is established: the iterative cost function is proven to converge precisely to the optimal value, and the control input likewise converges to the optimal input. Finally, a hardware-in-the-loop (HIL) system for a grinding process is presented to illustrate the effectiveness of the proposed method.
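As a concrete reference point, the recursion that such a Q-learning scheme approximates from data is the discrete-time Riccati value iteration. The sketch below runs it model-based on an illustrative two-state system; the matrices `A`, `B`, `Qc`, `R` are placeholders, not taken from the paper.

```python
import numpy as np

# Value iteration for the discrete-time LQR Riccati equation.
# The paper's Q-learning algorithm estimates this recursion from data;
# here the known model (A, B) is used purely for illustration.
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # hypothetical lifted system
B = np.array([[0.0], [0.1]])
Qc = np.eye(2)                            # state cost
R = np.eye(1)                             # input cost

P = np.zeros((2, 2))                      # VI starts from P0 = 0
for _ in range(1000):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # greedy gain
    P = Qc + A.T @ P @ (A - B @ K)        # VI Bellman update

# P converges to the stabilizing solution of the algebraic Riccati equation
residual = Qc + A.T @ P @ A \
    - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A) - P
print(np.linalg.norm(residual))           # close to zero after convergence
```

A practical appeal of the VI variant, as the abstract notes, is that the iteration starts from P = 0 and needs no initially stabilizing gain.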
ISBN (Digital): 9798350363203
ISBN (Print): 9798350363210
Traditional graph neural networks (GNNs) construct static graph structures, which cannot dynamically adapt to market changes. To address this challenge, this paper proposes an integrated prediction model combining GNNs and reinforcement learning (RL), enabling dynamic adjustment of the graph structure to enhance the model's adaptability and predictive ability in volatile markets. First, we use GNNs to extract complex inter-stock relationships from historical stock data, constructing a dynamic market structure. Second, through the strategies of the RL model, we continuously adjust trading decisions to maximize long-term returns and adapt to market changes. This combined model not only captures market trading information but also responds promptly to market dynamics, achieving more accurate predictions. Experimental results on multiple datasets of Taiwan 50 constituent stocks demonstrate that the proposed GNN-RL model outperforms traditional methods in accuracy and other metrics, proving its potential and superiority in practical applications.
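The abstract does not specify how the dynamic graph is built. One common construction, shown purely as an illustrative sketch, derives a time-varying adjacency matrix from rolling correlations of recent returns; all data below are synthetic placeholders.

```python
import numpy as np

# Dynamic graph construction from a rolling window of returns, a common
# proxy for inter-stock relationships (the paper's exact construction
# is not specified in the abstract).
rng = np.random.default_rng(0)
returns = rng.normal(size=(60, 5))        # 60 days x 5 hypothetical stocks

def dynamic_adjacency(window, threshold=0.2):
    """Edge between two stocks if their return correlation is strong."""
    corr = np.corrcoef(window.T)          # pairwise correlation matrix
    adj = (np.abs(corr) > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)            # no self-loops
    return adj

A_t = dynamic_adjacency(returns[-30:])    # rebuilt each step as data arrives
print(A_t.shape)                          # (5, 5)
```

Rebuilding `A_t` at every step is what makes the structure "dynamic"; an RL component could instead learn the threshold or the edge weights directly.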
ISBN (Print): 9781728181929
This paper investigates the speed-change response of scheduled Q-learning adaptive control of switched reluctance motor (SRM) drives. This novel algorithm includes a scheduling approach that permits controlling the nonlinear domain of an SRM using a set of Q-learning cores, each of which is a Q-learning controller at a local linear operating point, which together span the nonlinear surface of the system. Despite the effective tracking performance of this algorithm, the main issue with using this controller for SRM applications is that the motor speed appears inside the machine model, and hence the Q-cores are directly impacted by the speed. To cope with this issue, the Q-matrices must be retrained whenever the rotational speed changes, which slows the speed-change response because of the learning process. In this paper, simulation and experimental results illustrate the speed-change response of the SRM at different stages of the operating conditions.
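A minimal sketch of the scheduled Q-learning idea: one tabular Q-learning "core" per speed bin, with the active core selected by the measured speed. The states, actions, and reward used here are illustrative placeholders, not the paper's SRM model.

```python
import random

# One Q-table ("core") per local operating point, indexed by speed bin.
N_STATES, N_ACTIONS, N_BINS = 10, 3, 4
cores = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(N_BINS)]

def select_core(speed, max_speed=100.0):
    """Pick the Q-learning core for the current rotational speed."""
    return min(int(speed / max_speed * N_BINS), N_BINS - 1)

def update(core, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on the selected core."""
    core[s][a] += alpha * (r + gamma * max(core[s_next]) - core[s][a])

random.seed(0)
q = cores[select_core(37.0)]              # core for the current speed bin
for _ in range(1000):
    s, a = random.randrange(N_STATES), random.randrange(N_ACTIONS)
    r = -abs(s - 5)                        # toy tracking-error penalty
    update(q, s, a, r, random.randrange(N_STATES))
```

The retraining burden the abstract describes corresponds to re-running the update loop for a core whenever `select_core` switches bins.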
This study presents an adaptive railway traffic controller for real-time operations based on approximate dynamic programming (ADP). By assessing requirements and opportunities, the controller aims to limit the consecutive delays caused by trains that enter a control area behind schedule by sequencing them at a critical location in a timely manner, thus reflecting the practical requirements of railway operations. The approach depends on an approximation to the value function of dynamic programming after optimisation from a specified state, which is estimated dynamically from operational experience using reinforcement learning techniques. By using this approximation, the ADP controller avoids extensive explicit evaluation of performance and so reduces the computational burden substantially. In this investigation, we explore formulations of the approximation function and variants of the learning techniques used to estimate it. Evaluation of the ADP methods in a stochastic simulation environment shows considerable improvements in consecutive delays by comparison with the current industry practice of First-Come-First-Served sequencing. We also find that the estimated parameters of the approximate value function are similar across a range of test scenarios with different mean train entry delays.
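The core mechanism, estimating a parametric value function from operational experience, can be sketched as a linear approximation updated by temporal-difference learning. The features and rewards below are synthetic stand-ins, not the railway formulation.

```python
import numpy as np

# Linear value-function approximation V(s) = phi(s) @ w, updated by
# TD(0) from observed transitions, the generic mechanism behind
# experience-driven ADP controllers of the kind described above.
rng = np.random.default_rng(1)
n_features = 4
w = np.zeros(n_features)                  # value-function weights

def td_update(w, phi, reward, phi_next, alpha=0.05, gamma=0.95):
    delta = reward + gamma * phi_next @ w - phi @ w   # TD error
    return w + alpha * delta * phi

for _ in range(2000):
    phi = rng.normal(size=n_features)     # toy state features
    phi_next = rng.normal(size=n_features)
    reward = -abs(phi[0])                 # toy consecutive-delay penalty
    w = td_update(w, phi, reward, phi_next)
print(w)
```

The study's observation that learned parameters are similar across scenarios is exactly a statement about the stability of a weight vector like `w` under different traffic distributions.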
ISBN (Digital): 9798350354232
ISBN (Print): 9798350354249
Multi-access edge computing (MEC) is seen as a vital component of forthcoming 6G wireless networks, aiming to support emerging applications that demand high service reliability and low latency. However, ensuring the ultra-reliable and low-latency performance of MEC networks poses a significant challenge due to uncertainties associated with wireless links, constraints imposed by communication and computing resources, and the dynamic nature of network traffic. Enabling ultra-reliable and low-latency MEC mandates efficient load balancing jointly with resource allocation. In this paper, we investigate the joint optimization of offloading decisions and computation and communication resource allocation to minimize the expected weighted sum of delivery latency and energy consumption in a non-orthogonal multiple access (NOMA)-assisted MEC network. Since the formulated problem is a mixed-integer non-linear program (MINLP), a new multi-agent federated deep reinforcement learning (FDRL) solution based on the double deep Q-network (DDQN) is developed to efficiently optimize the offloading strategies across the MEC network while accelerating the learning process of the Internet-of-Things (IoT) devices. Simulation results show that the proposed FDRL scheme can effectively reduce the weighted sum of delivery latency and energy consumption of IoT devices in the MEC network and outperforms the baseline approaches.
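The DDQN target computation at the heart of such a scheme, where the online network selects the next action and the target network evaluates it, can be sketched as follows; the Q-values are random placeholders standing in for network outputs.

```python
import numpy as np

# Double DQN target: the online network selects the next action, the
# target network evaluates it -- the key difference from vanilla DQN,
# which both selects and evaluates with the same (target) network.
rng = np.random.default_rng(0)
gamma = 0.99
q_online_next = rng.normal(size=(32, 4))   # batch of 32, 4 actions
q_target_next = rng.normal(size=(32, 4))
rewards = rng.normal(size=32)
done = np.zeros(32)                        # 1.0 marks terminal transitions

a_star = q_online_next.argmax(axis=1)              # select with online net
q_eval = q_target_next[np.arange(32), a_star]      # evaluate with target net
targets = rewards + gamma * (1.0 - done) * q_eval  # Bellman targets
print(targets.shape)                               # (32,)
```

Decoupling selection from evaluation reduces the overestimation bias of the max operator, which is why DDQN is a common choice for noisy network-control problems like this one.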
In this paper, we propose a model-free solution to the linear quadratic regulation (LQR) problem of continuous-time systems based on reinforcement learning using dynamic output feedback. The design objective is to learn the optimal control parameters by using only the measurable input-output data, without requiring model information. A state parametrization scheme is presented that reconstructs the system state from the filtered input and output signals. Based on this parametrization, two new output feedback adaptive dynamic programming Bellman equations are derived for the LQR problem based on policy iteration and value iteration (VI). Unlike the existing output feedback methods for continuous-time systems, the need to apply discrete approximation is obviated. In contrast with static output feedback controllers, the proposed method can also handle systems that are state feedback stabilizable but not static output feedback stabilizable. An advantage of this scheme is that it stands immune to the exploration bias issue. Moreover, it does not require a discounted cost function and thus ensures the closed-loop stability and the optimality of the solution. Compared with earlier output feedback results, the proposed VI method does not require an initially stabilizing policy. We show that the estimates of the control parameters converge to those obtained by solving the LQR algebraic Riccati equation. A comprehensive simulation study is carried out to verify the proposed algorithms.
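For reference, the model-based counterpart of the policy-iteration variant is Kleinman's algorithm, which the data-driven scheme reproduces without knowing A and B. A minimal sketch on an illustrative stable plant (so the initial gain K = 0 is admissible); all matrices are placeholders.

```python
import numpy as np

# Kleinman policy iteration for continuous-time LQR: alternate between
# solving a Lyapunov equation (policy evaluation) and updating the gain
# (policy improvement). The Lyapunov solve uses row-major vectorization.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # illustrative stable plant
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
I2 = np.eye(2)

K = np.zeros((1, 2))                        # stabilizing since A is stable
for _ in range(30):
    Ak = A - B @ K
    M = Q + K.T @ R @ K
    # Solve Ak' P + P Ak = -M for P via vectorization
    L = np.kron(Ak.T, I2) + np.kron(I2, Ak.T)
    P = np.linalg.solve(L, -M.flatten()).reshape(2, 2)
    K = np.linalg.solve(R, B.T @ P)         # policy improvement

# P now satisfies the continuous-time algebraic Riccati equation
res = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T @ P)
print(np.linalg.norm(res))                  # small after convergence
```

The paper's VI variant removes the need for the stabilizing initial gain that this classical iteration assumes, which is one of its stated advantages.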
In this paper, we solve the optimal output regulation problem for discrete-time systems without precise knowledge of the system model. Drawing inspiration from reinforcement learning and adaptive dynamic programming, we develop a data-driven solution that enables asymptotic tracking and disturbance rejection. Notably, it is discovered that the proposed approach for discrete-time output regulation differs from the continuous-time approach in terms of the persistent excitation condition required for policy iteration to be unique and convergent. To address this issue, a new persistent excitation condition is introduced to ensure both uniqueness and convergence of the data-driven policy iteration. The efficacy of the proposed methodology is validated on an inverted pendulum on a cart.
ISBN (Digital): 9798350327939
ISBN (Print): 9798350327946
As the number of users of video-related services continues to grow, ensuring high quality of service (QoS) for them will become even more crucial in the future. Many studies have enhanced the quality of on-demand video streaming using adaptive bitrate (ABR) algorithms and artificial intelligence (AI). This study addresses a more complex challenge than on-demand streaming: enhancing service quality in multi-party, full-duplex communication scenarios such as video conferences. We propose a deep reinforcement learning (DRL)-based video bitrate allocation framework for the media server in a video conferencing system. Our framework aims to increase overall QoS by applying an appropriate bitrate to each connection in a video conferencing call, considering the users' network conditions. We train the DRL model to maximize the aggregate QoS of users in a meeting by constructing a feedback loop between a media server and a DRL server. Our experimental results demonstrate that our framework can adaptively control the video bitrate according to changes in network conditions. As a result, it achieves higher video bitrates in the user application (approximately 5% under stable network conditions and 35% under highly dynamic network conditions) compared to the existing rule-based bandwidth allocation.
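As a point of comparison, a rule-based allocator of the kind such a DRL policy is benchmarked against can be sketched in a few lines. The function name, the budget, and the bandwidth figures are illustrative, not values from the study.

```python
# Proportional bitrate allocation for a multi-party call: the media
# server splits a bitrate budget across connections according to each
# user's estimated available bandwidth, with a minimum per-connection
# floor. This is a simple rule-based stand-in for the DRL policy.
def allocate(budget_kbps, est_bandwidth_kbps, floor_kbps=100):
    total = sum(est_bandwidth_kbps)
    shares = [max(floor_kbps, budget_kbps * b / total)
              for b in est_bandwidth_kbps]
    scale = min(1.0, budget_kbps / sum(shares))   # never exceed the budget
    return [s * scale for s in shares]

print(allocate(3000, [1200, 800, 400]))   # → [1500.0, 1000.0, 500.0]
```

A learned policy improves on this by reacting to QoS feedback rather than allocating purely in proportion to instantaneous bandwidth estimates.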
The grid-connected inverter is a key energy conversion device for integrating new energy sources into the grid and is widely used in distributed power generation systems. However, the traditional control strategy has many limitations with respect to stability and the adjustment of system voltage and frequency, and a large amount of renewable energy connected to the grid may cause power and frequency oscillations that threaten the stable operation of the grid. To solve this problem, an adaptive optimal control method based on reinforcement learning and adaptive dynamic programming (ADP) is implemented for a VSG-based three-phase grid-connected inverter. The method establishes a mathematical model of the VSG-based grid-connected inverter and transforms it into a standard linear quadratic regulation (LQR) optimization problem. On the basis of VSG power-frequency control, the dynamic compensation term given by the ADP algorithm is introduced into the active power loop. During grid connection, the VSG output is optimally adjusted through the proposed adaptive optimal control strategy to reduce system frequency fluctuations. When the system dynamics are unknown in the complex environment of grid-connected inverters, the deep deterministic policy gradient (DDPG) is used in place of the ADP algorithm. Finally, the effectiveness of the method is verified on the Simulink platform.
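The way the compensation enters the loop can be sketched as follows: the nominal VSG command is augmented additively by a state-feedback correction u_adp = -Kx, where K would be the gain learned by ADP or DDPG. The gain and state values below are illustrative placeholders, not the paper's tuned design.

```python
import numpy as np

# Additive dynamic compensation in the active power loop: the nominal
# VSG droop command is corrected by a learned state-feedback term.
K = np.array([[2.0, 0.5]])                 # placeholder for the learned gain

def control(u_vsg, x):
    """Nominal VSG command plus the ADP compensation term."""
    u_adp = -(K @ x).item()                # dynamic compensation u_adp = -Kx
    return u_vsg + u_adp

x = np.array([0.1, -0.2])                  # e.g. frequency/power deviations
print(control(1.0, x))                     # nominal command minus correction
```

Because the compensation is additive, setting K = 0 recovers the conventional VSG power-frequency control, which makes the scheme easy to retrofit.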
ISBN (Digital): 9798350394665
ISBN (Print): 9798350394672
We present a blockchain-assisted mobile edge computing (MEC) architecture for adaptive resource distribution in wireless communication systems, where the blockchain acts as an overhead system that provides command and control functionalities. In this context, achieving consensus across nodes while also ensuring the functionality of both the MEC and blockchain systems is a significant challenge. Furthermore, resource distribution, frame size, and the number of sequential blocks generated by each contributor are important to blockchain-aided MEC functionality. As a result, a strategy for dynamic resource distribution and block creation is presented. To strengthen the efficiency of the overlapped blockchain system and enhance the quality of service (QoS) of clients in the MEC system, the spectrum allocation, frame size, and number of blocks generated by each contributor are framed as a joint optimization problem that takes into account time-varying communication channels and MEC server saturation. We use deep reinforcement learning (RAMBAN) to address this issue because standard approaches are ineffective. The simulation findings demonstrate the efficacy of the suggested strategy compared with various baseline approaches.