Time and wavelength-division multiplexed passive optical network (TWDM-PON) is expected to serve as the fronthaul network for 5G, which requires low latency and large bandwidth. In this paper, we propose a Q-learning-based dynamic wavelength and bandwidth allocation (DWBA) algorithm for TWDM-PON. Simulation results show that the proposed DWBA algorithm can reduce the number of active channels by 40% and provide larger bandwidth with the same number of wavelengths compared with existing algorithms, while satisfying the latency requirement of 5G fronthaul networks.
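The abstract does not give the paper's exact state, action, and reward design, but the tabular Q-learning core underlying such a DWBA scheme can be sketched as follows; the channel count, discretized load states, and reward signal here are illustrative assumptions, not the paper's formulation.

```python
import random

# Minimal tabular Q-learning sketch for wavelength (channel) assignment.
# State/action/reward definitions below are illustrative assumptions,
# not the paper's actual DWBA formulation.

N_CHANNELS = 4          # candidate wavelengths (assumed)
N_LOAD_LEVELS = 3       # discretized traffic-load states (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Q[s][a]: estimated value of assigning channel a under load state s.
Q = [[0.0] * N_CHANNELS for _ in range(N_LOAD_LEVELS)]

def choose_channel(load_state):
    """Epsilon-greedy selection over the Q-row for this load state."""
    if random.random() < EPS:
        return random.randrange(N_CHANNELS)
    row = Q[load_state]
    return row.index(max(row))

def update(load_state, channel, reward, next_state):
    """Standard Q-learning update: Q += alpha * (r + gamma * max Q' - Q)."""
    best_next = max(Q[next_state])
    Q[load_state][channel] += ALPHA * (reward + GAMMA * best_next
                                       - Q[load_state][channel])
```

In a real DWBA loop, the reward would encode latency and bandwidth utilization measured per allocation cycle, so that assignments keeping few channels active while meeting delay bounds accumulate higher Q-values.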
Path planning for wheeled mobile robots on partially known uneven terrain is an open challenge, since robot motions can be strongly influenced by terrain with incomplete environmental information, such as locally detected obstacles and impassable terrain areas. This paper proposes a hierarchical path planning approach for a wheeled robot moving over partially known uneven terrain. We first model the partially known uneven terrain environment with respect to its terrain features, including slope, step, and unevenness. Second, facilitated by the terrain model, we use the A* algorithm to plan a global path for the robot based on the partially known map. Finally, the Q-learning method is employed for local path planning to avoid locally detected obstacles at close range, as well as impassable terrain areas, while the robot tracks the global path. Simulation and experimental results show that the designed path planning approach provides satisfactory paths that avoid locally detected obstacles and impassable areas on partially known uneven terrain, compared with the classical A* algorithm and the artificial potential field method.
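The global-planning stage described above can be illustrated with a minimal grid A* search; the 2-D grid, uniform 4-connectivity, and per-cell cost here are simplifying assumptions, and the paper's terrain model (slope, step, unevenness) is not reproduced.

```python
import heapq

def astar(grid, start, goal):
    """Minimal A* on a 2-D grid. grid[r][c] is the cost of entering a cell,
    or None for an impassable cell (assumed encoding)."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan-distance heuristic (admissible for 4-connectivity)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start, [start])]  # (f, g, position, path)
    seen = set()
    while open_heap:
        f, g, pos, path = heapq.heappop(open_heap)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                ng = g + grid[nr][nc]
                heapq.heappush(open_heap,
                               (ng + h((nr, nc)), ng, (nr, nc), path + [(nr, nc)]))
    return None  # no feasible path

# Example: 3x3 terrain map with one impassable cell (None).
terrain = [[1, 1, 1],
           [1, None, 1],
           [1, 1, 1]]
path = astar(terrain, (0, 0), (2, 2))
```

In the hierarchical scheme, a terrain-aware cost (reflecting slope and unevenness) would replace the uniform costs, and the Q-learning local planner would take over whenever a cell along this global path turns out to be blocked.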
Axle temperature forecasting technology is important for monitoring the status of the train bogie and preventing hot-axle and other dangerous accidents. To achieve high-precision forecasting of axle temperature, a hybrid axle temperature time series forecasting model based on a decomposition preprocessing method, a parameter optimization method, and the back-propagation (BP) neural network is proposed in this study. The modeling process consists of three stages. In stage I, the empirical wavelet transform (EWT) method is used to preprocess the original axle temperature series by decomposing it into several subseries. In stage II, the Q-learning algorithm is used to optimize the initial weights and thresholds of the BP neural network. In stage III, the Q-BPNN network is used to build the forecasting model and predict all subseries. The final forecasting results are generated by combining the predictions of all subseries. By comparing the results of three case predictions, it can be concluded that: (a) the proposed Q-learning-based parameter optimization method is effective in improving the accuracy of the BP neural network and works better than traditional population-based optimization methods; (b) the proposed hybrid axle temperature forecasting model obtains accurate prediction results in all cases and provides the best accuracy among eight general models.
Scheduling efficient energy management system operations to respond to unstable customer demand, electricity prices, and weather increases the complexity of the control systems and requires a flexible and cost-effective control policy. This study develops an intelligent, real-time battery energy storage control based on a reinforcement learning model, focused on residential houses connected to the grid and equipped with solar photovoltaic panels and a battery energy storage system. Because reinforcement learning's performance depends heavily on the design of the underlying Markov decision process, a cyclic time-dependent Markov process is uniquely designed to capture the existing daily cyclic patterns in demand, electricity price, and solar energy. The Markov process is successfully used in the Q-learning algorithm, resulting in more efficient battery energy control and savings in electricity costs. The proposed Q-learning algorithm is compared with benchmark models of a deterministic equivalent solution and a one-step rollout algorithm. Numerical experiments show that the gap between the deterministic equivalent solution and the Q-learning approach for one-month electricity cost decreased from 7.99% to 3.63% for house 27 and from 6.91% to 3.26% for house 387 when the discretization size of demand, solar energy, price, and battery energy level is adjusted to 20. Accordingly, the better performance of the proposed Q-learning is demonstrated compared to the one-step rollout algorithm. Moreover, the effect of the discretization size of the state-space parameters on the adaptive Q-learning performance and computational time is investigated. Variations in the electricity price affect the Q-learning algorithm's performance significantly more than the other parameters.
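The key idea of a cyclic time-dependent Markov process, a state that includes the hour of day and wraps every 24 hours, can be sketched as below; the state-of-charge discretization, action set, and learning rates are illustrative assumptions, not the study's actual parameterization.

```python
# Sketch of a cyclic time-dependent Q-table for battery control: the time
# index wraps every 24 hours, so daily patterns in demand, price, and solar
# generation can be captured. Discretizations here are assumed, not the
# study's actual choices.

HOURS = 24
SOC_LEVELS = 5                      # discretized battery state of charge (assumed)
ACTIONS = ("charge", "idle", "discharge")
ALPHA, GAMMA = 0.1, 0.95

# One Q-row per (hour-of-day, SoC level) pair -- the cyclic state.
Q = {(h, s): [0.0] * len(ACTIONS)
     for h in range(HOURS) for s in range(SOC_LEVELS)}

def update(hour, soc, action_idx, reward, next_soc):
    """Q-learning update where the successor state wraps to the next hour."""
    next_hour = (hour + 1) % HOURS   # cyclic time transition
    best_next = max(Q[(next_hour, next_soc)])
    q = Q[(hour, soc)]
    q[action_idx] += ALPHA * (reward + GAMMA * best_next - q[action_idx])
```

Because hour 23 transitions back to hour 0, the learned policy can exploit the fact that, say, cheap overnight charging is followed every day by a morning demand peak.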
As the global demand for renewable energy continues to rise, wind energy has received widespread attention as an eco-friendly energy source. Wind power generation is regarded as one of the key means to reduce carbon emissions and achieve sustainable development. Usually, a large number of turbines work together to produce electricity in a wind farm. However, downstream turbines are inevitably influenced by the wake generated by upstream turbines, resulting in lost wind energy. To reduce the negative effects of the wake, maximize wind farm output power, and minimize wind farm cost, a teaching-learning-based optimization algorithm with reinforcement learning is proposed in this paper. The improvements of the proposed algorithm mainly include the following three points: (i) the original serial structure of the algorithm is changed to a parallel structure to accelerate convergence and improve the efficiency of the algorithm; (ii) a parameter F, adjusted by reinforcement learning, is proposed to select the updating phase required by the parallel structure; (iii) in the modified learner phase, an additional individual participates in the update, and a selection probability is introduced to improve the algorithm's ability to retain information from superior individuals. To study the performance of the modified algorithm, it was first tested against 10 other advanced algorithms on a benchmark testing suite. Numerical experiments were then run on four hypothetical wind farm cases under two simulated wind conditions. Finally, the experimental results demonstrate the superiority of the improved algorithm over the others and its effectiveness in addressing the wind farm layout problem.
Facility Layout Problems (FLPs) aim to efficiently allocate facilities within a given space, subject to various constraints such as minimizing transportation distances. These problems are commonly encountered in various types of advanced manufacturing systems, including Reconfigurable Manufacturing Systems (RMSs). RMSs enable easier layout changes to accommodate shifts in product mix, production volume, or process requirements, thanks to their modularity and changeability. Reinforcement learning (RL) has proven its efficiency in addressing decision-making problems. Therefore, this paper presents a comparative study of two RL algorithms for solving FLPs: Advantage Actor-Critic (A2C) and Q-learning.
A novel microgrid control strategy is presented in this paper. A resilient community microgrid model is considered, equipped with solar PV generation, electric vehicles (EVs), and an improved inverter control system. To fully exploit the capability of the community microgrid to operate in either grid-connected or islanded mode, and to achieve improved stability of the microgrid system, universal droop control, virtual inertia control, and a reinforcement learning-based control mechanism are combined in a cohesive manner, with adaptive control parameters determined online to tune the influence of the controllers. The microgrid model and control mechanisms are implemented in MATLAB/Simulink and set up in real-time simulation to test the feasibility and effectiveness of the proposed model. Experimental results demonstrate the controller's effectiveness in regulating frequency and voltage across various operating conditions and scenarios of the microgrid.
This study developed a forest management plan model using reinforcement learning (Q-learning) to optimize both the economic and ecological functions of forests. Management objectives for national forests were established, and forest conditions were analyzed using GIS spatial data and administrative records. A 60-year forest management plan was formulated to predict timber production and management performance across different regions and time periods. Our analysis revealed that Scenario 3 (Carbon Storage Priority) demonstrated the highest economic value, starting at approximately KRW 576.2 billion in the initial period and rising to KRW 775.7 billion over six 10-year periods (60 years in total). In addition to its economic performance, Scenario 3 effectively improved the forest age-class structure and ensured a stable timber supply, making it the most balanced approach for sustainable forest management. By focusing on carbon storage as a key management goal, this approach highlights the potential for achieving both economic and environmental benefits concurrently. These results suggest that reinforcement learning is a powerful tool for developing long-term forest management strategies that address multiple objectives, including economic viability, ecological sustainability, and resource optimization.
In view of the high coupling degree of regional integrated energy systems, a bilayer interaction strategy involving energy suppliers, distribution networks, and users is proposed. The game interaction strategy includes two aspects: scheduling and bidding. The independent system operator (ISO) coordinates all adjustable resources. Based on the quoted prices and the multi-energy load prediction, the ISO minimises the total energy cost, realising the complementarity of the multiple energy carriers in the cooperative game. Under the assumption of incomplete information and bounded rationality, this study designs bidding functions and pay-as-bid settlement protocols. On this basis, agents for the energy suppliers pursue maximum profit according to historical scheduling data and the units' characteristics. The non-cooperative bidding process in the multi-energy market is simulated using the Q-learning algorithm. Finally, the evolutionary process of the bilayer competitive game model is studied through a practical example, and the existence of a local Nash equilibrium of the strategy is also proven.
Cognitive radio technology is a promising solution to the imbalance between scarcity and underutilization of the spectrum. However, this technology is susceptible to both classical and advanced jamming attacks, which can prevent it from efficiently exploiting the free frequency bands. In this paper, we explain how a cognitive radio can exploit its dynamic spectrum access ability and its learning capabilities to avoid jammed channels. We begin by defining jamming attacks in cognitive radio networks and reviewing their potential countermeasures. Then, we model the cognitive radio's behavior in the suspicious environment as a Markov decision process. To solve this optimization problem, we implement the Q-learning algorithm in order to learn the jammer's strategy and to proactively avoid jammed channels. We present the limits of this algorithm in the cognitive radio context and propose a modified version to speed up learning a safe strategy. The effectiveness of this modified algorithm is evaluated by simulations and compared to the original Q-learning algorithm.
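A toy version of Q-learning for jammed-channel avoidance can be sketched as follows; the fixed single-channel jammer, the stateless (bandit-style) Q-table, and the +1/-1 reward are illustrative assumptions, not the paper's attacker model or modified algorithm.

```python
import random

# Toy sketch: the radio picks a channel each slot, earning +1 if the
# transmission succeeds (channel not jammed) and -1 otherwise. The fixed
# jammer below is an illustrative assumption, not the paper's attacker.

N_CHANNELS = 5
ALPHA, GAMMA, EPS = 0.2, 0.8, 0.1
Q = [0.0] * N_CHANNELS            # stateless, bandit-style Q-table

def jammer(t):
    """Assumed jammer: always jams channel 0."""
    return 0

random.seed(1)
for t in range(500):
    # Epsilon-greedy channel selection.
    if random.random() < EPS:
        ch = random.randrange(N_CHANNELS)
    else:
        ch = Q.index(max(Q))
    reward = -1.0 if ch == jammer(t) else 1.0
    Q[ch] += ALPHA * (reward + GAMMA * max(Q) - Q[ch])

best = Q.index(max(Q))            # learned policy settles on an unjammed channel
```

Against an adaptive or sweeping jammer the state would have to include recent channel observations, which is where plain Q-learning becomes slow and motivates the paper's modified version.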