Time and wavelength-division multiplexed passive optical network (TWDM-PON) is expected to serve as the fronthaul network for 5G, which requires low latency and large bandwidth. In this paper, we propose a Q-learning-based dynamic wavelength and bandwidth allocation (DWBA) algorithm for TWDM-PON. Simulation results show that the proposed DWBA algorithm can reduce the number of active channels by 40% and provide larger bandwidth with the same number of wavelengths compared with existing algorithms, while satisfying the latency requirement of 5G fronthaul networks.
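The abstract does not give the paper's exact state, action, and reward design, but the tabular Q-learning core underlying such a DWBA scheme can be sketched as follows; the channel count, discretized load states, and reward signal here are illustrative assumptions, not the paper's formulation.

```python
import random

# Minimal tabular Q-learning sketch for wavelength (channel) assignment.
# State/action/reward definitions below are illustrative assumptions,
# not the paper's actual DWBA formulation.

N_CHANNELS = 4          # candidate wavelengths (assumed)
N_LOAD_LEVELS = 3       # discretized traffic-load states (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Q[s][a]: estimated value of assigning channel a under load state s.
Q = [[0.0] * N_CHANNELS for _ in range(N_LOAD_LEVELS)]

def choose_channel(load_state):
    """Epsilon-greedy selection over the Q-row for this load state."""
    if random.random() < EPS:
        return random.randrange(N_CHANNELS)
    row = Q[load_state]
    return row.index(max(row))

def update(load_state, channel, reward, next_state):
    """Standard Q-learning update: Q += alpha * (r + gamma * max Q' - Q)."""
    best_next = max(Q[next_state])
    Q[load_state][channel] += ALPHA * (reward + GAMMA * best_next
                                       - Q[load_state][channel])
```

In a real DWBA loop, the reward would encode latency and bandwidth utilization measured per allocation cycle, so that assignments keeping few channels active while meeting delay bounds accumulate higher Q-values.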
Path planning for wheeled mobile robots on partially known uneven terrain is an open challenge, since robot motions can be strongly influenced by terrain with incomplete environmental information, such as locally detected obstacles and impassable terrain areas. This paper proposes a hierarchical path planning approach for a wheeled robot moving over partially known uneven terrain. We first model the partially known uneven terrain environment with respect to its terrain features, including slope, step, and unevenness. Second, facilitated by the terrain model, we use the A* algorithm to plan a global path for the robot based on the partially known map. Finally, the Q-learning method is employed for local path planning to avoid locally detected obstacles at close range, as well as impassable terrain areas, while the robot tracks the global path. Simulation and experimental results show that the designed path planning approach provides satisfactory paths that avoid locally detected obstacles and impassable areas on partially known uneven terrain, compared with the classical A* algorithm and the artificial potential field method.
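The global-planning stage described above can be illustrated with a minimal grid A* search; the 2-D grid, uniform 4-connectivity, and per-cell cost here are simplifying assumptions, and the paper's terrain model (slope, step, unevenness) is not reproduced.

```python
import heapq

def astar(grid, start, goal):
    """Minimal A* on a 2-D grid. grid[r][c] is the cost of entering a cell,
    or None for an impassable cell (assumed encoding)."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan-distance heuristic (admissible for 4-connectivity)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start, [start])]  # (f, g, position, path)
    seen = set()
    while open_heap:
        f, g, pos, path = heapq.heappop(open_heap)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        r, c = pos
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                ng = g + grid[nr][nc]
                heapq.heappush(open_heap,
                               (ng + h((nr, nc)), ng, (nr, nc), path + [(nr, nc)]))
    return None  # no feasible path

# Example: 3x3 terrain map with one impassable cell (None).
terrain = [[1, 1, 1],
           [1, None, 1],
           [1, 1, 1]]
path = astar(terrain, (0, 0), (2, 2))
```

In the hierarchical scheme, a terrain-aware cost (reflecting slope and unevenness) would replace the uniform costs, and the Q-learning local planner would take over whenever a cell along this global path turns out to be blocked.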
Axle temperature forecasting technology is important for monitoring the status of the train bogie and preventing hot-axle and other dangerous accidents. To achieve high-precision forecasting of axle temperature, a hybrid axle temperature time series forecasting model based on a decomposition preprocessing method, a parameter optimization method, and the back-propagation (BP) neural network is proposed in this study. The modeling process consists of three stages. In stage I, the empirical wavelet transform (EWT) method is used to preprocess the original axle temperature series by decomposing it into several subseries. In stage II, the Q-learning algorithm is used to optimize the initial weights and thresholds of the BP neural network. In stage III, the Q-BPNN network is used to build the forecasting model and predict all subseries. The final forecasting results are generated by combining the predictions of all subseries. By comparing the results of three case predictions, it can be concluded that: (a) the proposed Q-learning-based parameter optimization method is effective in improving the accuracy of the BP neural network and works better than traditional population-based optimization methods; (b) the proposed hybrid axle temperature forecasting model obtains accurate prediction results in all cases and provides the best accuracy among eight general models.
Scheduling efficient energy management system operations to respond to unstable customer demand, electricity prices, and weather increases the complexity of the control systems and requires a flexible and cost-effective control policy. This study develops an intelligent, real-time battery energy storage control based on a reinforcement learning model, focused on residential houses connected to the grid and equipped with solar photovoltaic panels and a battery energy storage system. Because reinforcement learning's performance depends heavily on the design of the underlying Markov decision process, a cyclic time-dependent Markov process is uniquely designed to capture the existing daily cyclic patterns in demand, electricity price, and solar energy. The Markov process is successfully used in the Q-learning algorithm, resulting in more efficient battery energy control and savings in electricity costs. The proposed Q-learning algorithm is compared with benchmark models of a deterministic equivalent solution and a one-step rollout algorithm. Numerical experiments show that the gap between the deterministic equivalent solution and the Q-learning approach for one-month electricity cost decreased from 7.99% to 3.63% for house 27 and from 6.91% to 3.26% for house 387 when the discretization size of demand, solar energy, price, and battery energy level is adjusted to 20. Accordingly, the better performance of the proposed Q-learning is demonstrated compared to the one-step rollout algorithm. Moreover, the effect of the discretization size of the state-space parameters on the adaptive Q-learning performance and computational time is investigated. Variations in the electricity price affect the Q-learning algorithm's performance significantly more than the other parameters.
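The key idea of a cyclic time-dependent Markov process, a state that includes the hour of day and wraps every 24 hours, can be sketched as below; the state-of-charge discretization, action set, and learning rates are illustrative assumptions, not the study's actual parameterization.

```python
# Sketch of a cyclic time-dependent Q-table for battery control: the time
# index wraps every 24 hours, so daily patterns in demand, price, and solar
# generation can be captured. Discretizations here are assumed, not the
# study's actual choices.

HOURS = 24
SOC_LEVELS = 5                      # discretized battery state of charge (assumed)
ACTIONS = ("charge", "idle", "discharge")
ALPHA, GAMMA = 0.1, 0.95

# One Q-row per (hour-of-day, SoC level) pair -- the cyclic state.
Q = {(h, s): [0.0] * len(ACTIONS)
     for h in range(HOURS) for s in range(SOC_LEVELS)}

def update(hour, soc, action_idx, reward, next_soc):
    """Q-learning update where the successor state wraps to the next hour."""
    next_hour = (hour + 1) % HOURS   # cyclic time transition
    best_next = max(Q[(next_hour, next_soc)])
    q = Q[(hour, soc)]
    q[action_idx] += ALPHA * (reward + GAMMA * best_next - q[action_idx])
```

Because hour 23 transitions back to hour 0, the learned policy can exploit the fact that, say, cheap overnight charging is followed every day by a morning demand peak.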
As the global demand for renewable energy continues to rise, wind energy has received widespread attention as an eco-friendly energy source. Wind power generation is regarded as one of the key means to reduce carbon emissions and achieve sustainable development. Usually, a large number of turbines work together to produce electricity in a wind farm. However, downstream turbines are inevitably influenced by the wake generated by upstream turbines, resulting in lost wind energy. To reduce the negative effects of the wake, maximize wind farm output power, and minimize wind farm cost, a teaching-learning-based optimization algorithm with reinforcement learning is proposed in this paper. The improvements of the proposed algorithm mainly include the following three points: (i) the original serial structure of the algorithm is changed to a parallel structure to accelerate convergence and improve the efficiency of the algorithm; (ii) a parameter F, adjusted by reinforcement learning, is proposed to select the updating phase required by the parallel structure; (iii) in the modified learner phase, an additional individual participates in the update, and a selection probability is introduced to improve the algorithm's ability to retain information from superior individuals. To study the performance of the modified algorithm, it was first tested against 10 other advanced algorithms on a benchmark testing suite. Numerical experiments were then run on four hypothetical wind farm cases under two simulated wind conditions. Finally, the experimental results demonstrate the superiority of the improved algorithm over the others and its effectiveness in addressing the wind farm layout problem.
Facility Layout Problems (FLPs) aim to efficiently allocate facilities within a given space, subject to various constraints such as minimizing transportation distances. These problems are commonly encountered in various types of advanced manufacturing systems, including Reconfigurable Manufacturing Systems (RMSs). RMSs enable easier layout changes to accommodate shifts in product mix, production volume, or process requirements, thanks to their modularity and changeability. Reinforcement learning (RL) has proven its efficiency in addressing decision-making problems. Therefore, this paper presents a comparative study of two RL algorithms for solving FLPs: Advantage Actor-Critic (A2C) and Q-learning.
A novel microgrid control strategy is presented in this paper. A resilient community microgrid model is considered, equipped with solar PV generation, electric vehicles (EVs), and an improved inverter control system. To fully exploit the capability of the community microgrid to operate in either grid-connected or islanded mode, and to achieve improved stability of the microgrid system, universal droop control, virtual inertia control, and a reinforcement learning-based control mechanism are combined in a cohesive manner, with adaptive control parameters determined online to tune the influence of the controllers. The microgrid model and control mechanisms are implemented in MATLAB/Simulink and set up in real-time simulation to test the feasibility and effectiveness of the proposed model. Experimental results demonstrate the controller's effectiveness in regulating frequency and voltage across various operating conditions and scenarios of the microgrid.
This study developed a forest management plan model using reinforcement learning (Q-learning) to optimize both the economic and ecological functions of forests. Management objectives for national forests were established, and forest conditions were analyzed using GIS spatial data and administrative records. A 60-year forest management plan was formulated to predict timber production and management performance across different regions and time periods. Our analysis revealed that Scenario 3 (Carbon Storage Priority) demonstrated the highest economic value, starting at approximately KRW 576.2 billion in the initial period and rising to KRW 775.7 billion over six 10-year periods (60 years in total). In addition to its economic performance, Scenario 3 effectively improved the forest age-class structure and ensured a stable timber supply, making it the most balanced approach for sustainable forest management. By focusing on carbon storage as a key management goal, this approach highlights the potential for achieving both economic and environmental benefits concurrently. These results suggest that reinforcement learning is a powerful tool for developing long-term forest management strategies that address multiple objectives, including economic viability, ecological sustainability, and resource optimization.
In view of the high coupling degree of regional integrated energy systems, a bilayer interaction strategy involving energy suppliers, distribution networks, and users is proposed. The game interaction strategy includes two aspects: scheduling and bidding. The independent system operator (ISO) coordinates all adjustable resources. Based on the quoted prices and the multi-energy load prediction, the ISO minimises the total energy cost, realising the complementarity of the multiple energy carriers in the cooperative game. Under the assumption of incomplete information and bounded rationality, this study designs bidding functions and pay-as-bid settlement protocols. On this basis, agents for the energy suppliers pursue maximum profit according to historical scheduling data and the units' characteristics. The non-cooperative bidding process in the multi-energy market is simulated using the Q-learning algorithm. Finally, the evolutionary process of the bilayer competitive game model is studied through a practical example, and the existence of a local Nash equilibrium of the strategy is also proven.
Cognitive radio technology is a promising solution to the imbalance between scarcity and underutilization of the spectrum. However, this technology is susceptible to both classical and advanced jamming attacks, which can prevent it from efficiently exploiting the free frequency bands. In this paper, we explain how a cognitive radio can exploit its dynamic spectrum access ability and its learning capabilities to avoid jammed channels. We begin by defining jamming attacks in cognitive radio networks and reviewing their potential countermeasures. Then, we model the cognitive radio's behavior in the suspicious environment as a Markov decision process. To solve this optimization problem, we implement the Q-learning algorithm in order to learn the jammer's strategy and to proactively avoid jammed channels. We present the limits of this algorithm in the cognitive radio context and propose a modified version to speed up learning a safe strategy. The effectiveness of this modified algorithm is evaluated by simulations and compared to the original Q-learning algorithm.
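A toy version of Q-learning for jammed-channel avoidance can be sketched as follows; the fixed single-channel jammer, the stateless (bandit-style) Q-table, and the +1/-1 reward are illustrative assumptions, not the paper's attacker model or modified algorithm.

```python
import random

# Toy sketch: the radio picks a channel each slot, earning +1 if the
# transmission succeeds (channel not jammed) and -1 otherwise. The fixed
# jammer below is an illustrative assumption, not the paper's attacker.

N_CHANNELS = 5
ALPHA, GAMMA, EPS = 0.2, 0.8, 0.1
Q = [0.0] * N_CHANNELS            # stateless, bandit-style Q-table

def jammer(t):
    """Assumed jammer: always jams channel 0."""
    return 0

random.seed(1)
for t in range(500):
    # Epsilon-greedy channel selection.
    if random.random() < EPS:
        ch = random.randrange(N_CHANNELS)
    else:
        ch = Q.index(max(Q))
    reward = -1.0 if ch == jammer(t) else 1.0
    Q[ch] += ALPHA * (reward + GAMMA * max(Q) - Q[ch])

best = Q.index(max(Q))            # learned policy settles on an unjammed channel
```

Against an adaptive or sweeping jammer the state would have to include recent channel observations, which is where plain Q-learning becomes slow and motivates the paper's modified version.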