In order to simulate the bending and torsional coupled vibration of solar panels under external disturbances, an experimental system of a fixed flexible hinged plate is established. The dynamic model is established by the finite element method (FEM), and the model parameters are accurately identified by wavelet transform and the simulated annealing (SA) optimization method. Two laser displacement sensors and binocular vision are combined to detect the bending and torsional vibrations and are used for feedback control. A combined piezoelectric (PZT) control scheme is designed to suppress the bending and torsional vibrations of the flexible hinged plate. A vibration control method based on divergence augmented policy optimization (DAPO) reinforcement learning (RL) for the fixed flexible hinged plate is developed to train the modal controllers. Simulations and experiments are carried out to verify the effectiveness of the applied vibration control scheme and algorithm. The simulation and experimental results show that DAPO RL modal controllers have a better vibration suppression effect than large gain proportional derivative (PD) controllers, especially for small amplitude residual vibration.
This article addresses the adaptive optimized tracking control problem for strict-feedback cyclic switched nonlinear output constrained systems with average cyclic dwell time (ACDT). Different from most of the existing results on optimized control of switched nonlinear systems, a new mode-dependent reinforcement learning (RL) algorithm with an identifier-critic-actor architecture is designed. Technically, with the help of a neural network (NN) switched state observer, the virtual and actual optimal controllers are developed by solving the Hamilton-Jacobi-Bellman (HJB) equation. Meanwhile, to reduce the impact of system switching on the overall optimization control, the information of the switching signal is incorporated into the optimal performance index functions. To handle the output constraints, a nonlinear output-dependent time-varying function is used to ensure that the system output never transgresses the prescribed regions. More importantly, by combining the improved ACDT method and the Lyapunov stability theorem, a novel adaptive optimized control scheme is put forward which ensures the boundedness of all signals in the closed-loop system. Finally, the effectiveness of the proposed optimized control algorithm is verified by numerical as well as practical simulations.
The multi-objective elevator group optimisation problem has attracted widespread attention due to its high practical significance. The elevator group control system (EGCS) is a typical multi-objective system designed to improve passenger service and reduce costs, such as energy consumption (EC). Based on the characteristics of the elevator group scheduling problem, this paper treats the EGCS as multiple reinforcement learning agents that cooperate with each other. Each agent controls the operation of an elevator according to the average reward reinforcement learning algorithm (RLA), aiming to minimise the average waiting time (AWT) of passengers. Additionally, a neural network is employed to store and update the state behaviour value functions while dynamically classifying the state of unknown environments. The simulation results show that the reinforcement learning-based scheduling algorithm has better scheduling effects than the static partition scheduling algorithm (SPSA) and the cultural algorithm (CA).
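To make the average-reward idea concrete, here is a minimal tabular sketch of an R-learning-style update. It is an illustration only: the abstract does not say which average-reward algorithm is used, and it stores the values in a neural network rather than a table. The function name, the step sizes `alpha` and `beta`, and the use of negative waiting time as the reward are all assumptions.

```python
def r_learning_update(q, rho, s, a, r, s_next, alpha=0.1, beta=0.01):
    """Apply one R-learning-style update in place and return the new average-reward estimate.

    q      -- list of lists, q[state][action] (tabular stand-in for the neural network)
    rho    -- current estimate of the average reward per step
    r      -- observed reward (assumed here: negative passenger waiting time)
    """
    # Record whether the chosen action was greedy before the table changes.
    was_greedy = q[s][a] == max(q[s])

    # The relative value is moved toward (reward - average reward + best next value).
    q[s][a] += alpha * (r - rho + max(q[s_next]) - q[s][a])

    # The average-reward estimate is only adjusted after greedy actions,
    # so that exploratory moves do not bias it.
    if was_greedy:
        rho += beta * (r - rho + max(q[s_next]) - max(q[s]))
    return rho
```

Unlike discounted Q-learning, there is no discount factor here: the agent is rewarded for doing better than its own long-run average, which suits a continuing task such as elevator dispatching.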
There has been a continual rise in the number of smart and autonomous automobiles in recent decades. The effectiveness of communication among vehicles in Vehicular Ad-hoc Networks (VANETs) is critical for ensuring the safety of drivers' lives. The primary objective of a VANET is to share critical information regarding life-threatening events, such as traffic jams and accident alerts, in a timely and accurate manner. Nevertheless, typical VANETs encounter several security issues involving threats to confidentiality, integrity, and availability. This paper proposes a new decentralized and tamper-resistant scheme for privacy preservation. We designed a new trust management system that utilizes blockchain technology. We strive to establish trust between vehicles and infrastructure and preserve privacy by guaranteeing the authenticity and integrity of the information exchanged in VANETs. Our proposal adopts the principles of reinforcement learning to dynamically evaluate and allocate trust scores to vehicles and infrastructure based on their behavior. The scheme's performance has been evaluated based on key metrics. The results show that our new system provides an effective behavior management technique while preserving vehicle privacy.
Generating scientific management strategies contributes to the sustainable development of the river ecological environment. In this study, a multi-objective coupled water and sediment regulation model aiming at minimizing sedimentation and inundation loss as well as maximizing ecological value in the lower Yellow River has been developed. A reinforcement Q-learning algorithm was used to obtain optimized strategies for the multiple objectives of sediment reduction, flood control and ecological restoration under different hydrological years. The results showed that the simulated channel sedimentation is very close to the measured value, which proves the applicability of the developed model. Under dry, normal and wet hydrological years, the effects of various regulation strategies on silt reduction, flood control and ecological restoration were obviously different. The regulation scheme with a discharge of 3700 m³/s was verified to be suitable for dry and wet years, and that with a discharge of 2600 m³/s was more suitable for normal years. Increasing the spacing of the beach area was better in normal and wet years. Our findings suggest optimized strategies to address environmental challenges of the lower Yellow River in different hydrological years. This paper provides a reliable reference for improving the management of the lower Yellow River.
ISBN: 9798400718212 (print)
This study aims to address the lack of scientific and systematic decision systems in the field of Human Resources Management (HRM). By designing an HRM decision support system based on Multi-Agent systems and reinforcement learning algorithms, effective tools are provided to HR managers to assist them in making more scientific and systematic HRM decisions. The research analyzes the current issues in HRM practices and proposes comprehensive solutions. Through the optimization of Multi-Agent reinforcement learning algorithms, experiments validate the effectiveness of the system in supporting decision-making in HRM. The results demonstrate that the improved algorithms outperform traditional methods, confirming the efficacy of the system's design and optimization. This HRM decision support system, based on Multi-Agent systems and reinforcement learning algorithms, holds the potential to drive organizational development and enhance the efficiency of HRM. However, further research and practical application are needed to refine and optimize the system to adapt to the constantly evolving HRM environment.
An intelligent energy management strategy (EMS) based on an improved reinforcement learning (RL) algorithm is developed to enhance the adaptability of the EMS and to further improve the fuel efficiency of a Plug-in Parallel Hybrid Electric Vehicle (PHEV). Both the numerical model and the energy management strategy of a plug-in PHEV are described. The improved RL with the Q-learning algorithm is implemented to acquire the optimal control strategies for improving fuel economy. A Markov chain is employed to calculate the Transition Probability Matrix of the required power. A Kullback-Leibler (KL) divergence rate is designed to activate the update of the EMS when a new corresponding driving cycle is expected. An Exploration Factor (FT) is proposed to overcome the disadvantages of the normal RL algorithm in convergence rate and reward cost evaluation. Diverse KL divergence rates are examined to seek optimal solutions. The normal RL strategy, rule-based strategy, and dynamic programming strategy are implemented as benchmark strategies to verify the effectiveness of the proposed strategy. The validation results indicate that the improved RL algorithm with the exploration factor yields an EMS capable of significantly improving the energy efficiency of a plug-in PHEV.
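The update trigger described above can be sketched as follows: estimate a transition probability matrix from a discretized power-demand sequence, then compare the matrices of two driving cycles with a KL-based measure. This is a simplified illustration, not the paper's formula. The unweighted mean of row-wise KL divergences, the additive smoothing constant, the function names, and the threshold in the usage note are all assumptions.

```python
import math

def transition_matrix(states, n_states, smoothing=1e-3):
    """Estimate a Markov transition probability matrix from a sequence of state indices.

    A small additive smoothing term (an assumption) keeps every entry positive,
    so the logarithms in the KL computation stay finite.
    """
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for s, s_next in zip(states, states[1:]):
        counts[s][s_next] += 1.0
    # Each row is normalized so that it sums to 1.
    return [[c / sum(row) for c in row] for row in counts]

def mean_row_kl(p, q):
    """Return the unweighted mean over rows of KL(p_row || q_row)."""
    total = 0.0
    for p_row, q_row in zip(p, q):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p_row, q_row))
    return total / len(p)
```

A controller could then re-run Q-learning only when `mean_row_kl(new_tpm, current_tpm)` exceeds a chosen threshold; the threshold value is a tuning parameter that would have to be examined experimentally.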
In view of the lack of personalization and interactivity in current oral English learning, a dialogue system based on reinforcement learning is introduced, aiming to improve users’ oral expression ability by dynamically adjusting the learning path and feedback mechanism. The system adopts a deep Q-learning algorithm, optimizes the voice interaction effect through a reward mechanism, and realizes real-time adaptive adjustment of the dialogue. First, a dialogue framework based on reinforcement learning is designed, in which the user’s voice input is converted into text and used as the input of the system. Then, the system uses a deep Q-learning algorithm to conduct feedback learning based on the user’s voice performance and grammatical errors, and adjusts the dialogue strategy and vocabulary recommendation in real time to improve the interactivity and accuracy of learning. Finally, the system trains multiple rounds of dialogues in a simulated environment to continuously optimize the speech recognition and dialogue response strategies. The whole process uses a reward mechanism to adjust the system behavior based on the actual performance of the user to ensure gradual improvement in the learning process. Learners with a medium foundation (intermediate) have a slight improvement in satisfaction scores, reaching 4.3 points, and their oral test improvement is +15 points. Learners with a high foundation (advanced) have a satisfaction score of 4.5 points in the use of the system, and their oral test improvement is +18 points, thanks to the detailed and accurate feedback provided by the system. By introducing a reinforcement learning algorithm, the designed English oral online dialogue system can adjust learning strategies in real time according to user feedback and improve learning effects. Future research will further optimize the reward mechanism and enhance the system’s adaptive ability and intelligent recommendation function to better serve the needs of different learners.
In order to improve the reliability and economy of decentralized trade economy dynamic scheduling on e-Commerce platforms and to shorten its running time, a decentralized trade economy dynamic scheduling method based on the reinforcement learning algorithm is proposed. In this paper, we analyze the basic theory of the reinforcement learning algorithm, study the Q-learning algorithm, build a neural network to fit the value model, and initialize the reinforcement learning algorithm. With the Markov decision process as the framework model, the optimal state behavior value function is updated by using the model-free discounted reward reinforcement learning algorithm Q-learning as the value iteration method. The Gibbs distribution is used to construct exploratory random strategies that select behaviors with probability. Using the reinforcement learning algorithm and a three-layer feedforward neural network as the approximator of the state behavior value function, this paper studies the generalization of the value function faced by the decentralized trade economy dynamic scheduling of e-Commerce platforms and realizes that scheduling. The experimental results show that the proposed method can effectively improve the reliability and economy of the decentralized trade economy dynamic scheduling of e-Commerce platforms.
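The two ingredients named above, Gibbs-distributed exploration and the discounted Q-learning update, can be sketched in a few lines. This is a minimal tabular illustration under stated assumptions: the abstract approximates the value function with a three-layer feedforward network, for which a plain table is substituted here, and the function names, `temperature`, `alpha` and `gamma` are illustrative choices rather than values from the paper.

```python
import math
import random

def gibbs_probabilities(q_values, temperature=1.0):
    """Return Gibbs (Boltzmann) selection probabilities for a list of action values."""
    # The maximum is subtracted before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def select_action(q_values, temperature=1.0, rng=random):
    """Sample an action index with probability given by the Gibbs distribution."""
    probs = gibbs_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]

def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """Apply one model-free discounted Q-learning update in place; return the new value."""
    # The target bootstraps from the best action value in the next state.
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]
```

A high temperature makes the selection close to uniform, while a low temperature makes it nearly greedy, so the temperature is typically lowered as learning progresses.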
In today's information-based education era, the computer-aided instruction system under the background of "Internet +" is ushering in unprecedented development opportunities. Based on the VARK model, this paper discusses the design and implementation of a computer-aided instruction system based on a reinforcement learning algorithm. Through in-depth investigation of the application status of computer-aided instruction systems and the characteristics of popular systems at home and abroad, combined with the principle and application of the reinforcement learning algorithm, this study builds a system architecture with a personalized education evaluation function. In terms of system design, with the help of a neural network model and an online test module, the study realizes intelligent analysis and evaluation of students' learning patterns and improves the pertinence and effect of teaching. Through strict functional and performance tests, the stability and reliability of the system are verified, and the networked and intelligent characteristics of the system are demonstrated in the learning process of students, which makes a positive contribution to raising the level of education informatization and optimizing teaching effect.