ISBN (digital): 9781728186719
ISBN (print): 9781728186719
Transfer learning has shown great potential to accelerate reinforcement learning (RL) by exploiting prior knowledge from related tasks learned in the past. Policy Reuse Q-learning (PRQL) is a general policy transfer framework that speeds up learning on the target task by probabilistically reusing source policies from a policy library. In this paper, we propose an improved PRQL method to achieve faster probabilistic policy reuse in deep reinforcement learning (DRL). First, we extend the basic PRQL algorithm to DRL, proposing a probabilistic policy reuse algorithm that builds on DRL to solve more complex problems. Second, PRQL algorithms usually use a metric based on the average gain to measure the similarity between tasks. However, this metric carries very limited information and can only be updated at the end of an episode, which is inefficient. Instead, we propose a new metric based on fitting the reward function, which lets the agent converge to the most suitable reuse policy more quickly and accurately. We demonstrate the detection accuracy, cumulative reward received, and convergence speed of our method on three complex Markov tasks. Experimental results show that our method consistently achieves efficient policy transfer in these tasks.
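For orientation, the following minimal sketch shows how the average-gain baseline mentioned in the abstract is commonly formulated in PRQL: a softmax over per-policy average gains, with each gain updated only after an episode has ended. The function names and the temperature parameter `tau` are illustrative assumptions. The sketch does not implement the reward-fitting metric proposed in the paper.

```python
import math
import random


def reuse_probabilities(avg_gains, tau):
    """Softmax over average gains: P(j) is proportional to exp(tau * W_j)."""
    m = max(avg_gains)  # subtract the max for numerical stability
    weights = [math.exp(tau * (w - m)) for w in avg_gains]
    total = sum(weights)
    return [w / total for w in weights]


def select_policy(avg_gains, tau, rng=random):
    """Sample the index of the policy to reuse in the next episode."""
    probs = reuse_probabilities(avg_gains, tau)
    return rng.choices(range(len(avg_gains)), weights=probs, k=1)[0]


def update_average_gain(avg, count, episode_return):
    """Incremental mean of episode returns; only possible once an episode ends."""
    count += 1
    avg += (episode_return - avg) / count
    return avg, count
```

Because `update_average_gain` needs a complete episode return, the similarity estimate changes only once per episode, which is the delay the abstract describes as inefficient.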
ISBN (digital): 9798350360240
ISBN (print): 9798350384161
A strategy for correcting PID parameters in real time using fuzzy control is proposed to address the difficulty of tuning traditional PID control parameters. First, the effect of the three PID control parameters on control quality is explored. Next, a fuzzy control strategy is introduced and a fuzzy controller is designed. The PID parameters are adjusted in real time through fuzzification, fuzzy inference, a knowledge base, and defuzzification. Finally, traditional and fuzzy PID are compared experimentally using Matlab simulations. Test results show that the fuzzy PID algorithm significantly reduces overshoot and response time compared with the traditional PID algorithm, with a 20% reduction in response time and a 10% reduction in overshoot. Thanks to its performance and adaptability, fuzzy PID is a more suitable and efficient control method for the control problems of complex systems.
ISBN (digital): 9798331504755
ISBN (print): 9798331504762
In this paper, an event-triggered decentralized stabilization method based on adaptive dynamic programming (ADP) is proposed for nonlinear interconnected systems with constant-valued state constraints. By introducing a barrier function for coordinate transformation, the original system with state constraints is transformed into an unconstrained form. Then, with the developed cost functions for auxiliary subsystems, the decentralized stabilization problem of interconnected systems is transformed into a series of optimal control problems. Thereafter, to obtain the event-triggered optimal control policies, a local policy iteration algorithm is investigated to solve the event-triggered Hamilton-Jacobi-Bellman equations (HJBEs) for the auxiliary subsystems. Local critic neural networks (NNs) are employed to approximate the cost functions. Furthermore, the closed-loop nonlinear interconnected system and the weight estimation errors of the local critic NNs are guaranteed to be uniformly ultimately bounded under the developed set of decentralized stabilizing policies. Finally, a numerical example is employed to validate the effectiveness of the proposed method.
ISBN (digital): 9798350375855
ISBN (print): 9798350375862
Unmanned Aerial Vehicles (UAVs) can use power series axis Permanent Magnet Synchronous Motors (PMSMs) to power onboard equipment, allowing the motors to operate under high-speed and low-carrier-ratio conditions. Under these conditions, problems such as fast response and complex system sensitivity arise, so many control methods designed for low speeds perform poorly. Currently, the conventional PI control method is commonly used. As an alternative to traditional PI control, this paper proposes applying Linear Active Disturbance Rejection Control (LADRC) to high-speed, low-carrier-ratio PMSMs. An LADRC controller equivalent to the PI controller is obtained through parameter derivation and calculation and introduced into the UAV power system. The proposed solution is assessed through simulation experiments, and the results demonstrate that it achieves performance equal to or better than that of the PI controller, providing a new control method for high-speed, low-carrier-ratio PMSMs.
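To make the LADRC idea concrete, here is a minimal sketch of a generic first-order LADRC loop: a linear extended state observer estimates the output and the lumped disturbance, and a proportional law cancels that disturbance. Everything here is an assumption for illustration: the toy plant `y' = -y + u`, the gains `wc` and `wo`, and the forward-Euler integration. It is not the PMSM model or the PI-equivalent parameterization derived in the paper.

```python
def simulate_ladrc(r=1.0, b0=1.0, wc=5.0, wo=20.0, dt=1e-3, t_end=5.0):
    """Track setpoint r on a toy first-order plant using first-order LADRC.

    z1 estimates the output y; z2 estimates the lumped disturbance f,
    where the plant is viewed as y' = f + b0 * u.
    """
    y = z1 = z2 = 0.0
    for _ in range(int(t_end / dt)):
        # Control law: proportional action on the estimated error,
        # minus the estimated disturbance, scaled by the input gain.
        u = (wc * (r - z1) - z2) / b0
        # Linear extended state observer with both poles placed at -wo.
        e = y - z1
        z1 += dt * (z2 + b0 * u + 2.0 * wo * e)
        z2 += dt * (wo * wo * e)
        # Toy plant (hypothetical): y' = -y + u, so f = -y here.
        y += dt * (-y + u)
    return y
```

With the default values the output settles at the setpoint; the observer bandwidth `wo` is chosen larger than the controller bandwidth `wc` so the disturbance estimate converges faster than the tracking loop.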
Accurate modeling and control of robotic systems are often critical to perform complex manipulation tasks. Artificial Intelligence (AI) based approaches have gained widespread popularity to deal with task complexity a...
The problem of blocking certain areas of network systems and its boundary case, the granulation of the network into a set of isolated zones, is investigated in the article. The losses that await the system due to...
ISBN (digital): 9798350377842
ISBN (print): 9798350377859
The artificial bee colony (ABC) algorithm is often challenged by slow convergence, poor accuracy, and premature convergence when handling complex medium-scale optimization problems, due to its biased search equation and the high assimilation rate of bees within the colony. To balance global exploration and local exploitation of complex problem landscapes in ABC, this paper proposes an enhanced ABC based on elimination history and elite correction (HeCABC). Given the bias effects of the superior solutions and of the eliminated historical inferior solutions on the search behavior of ABC, HeCABC separately formulates an exploration equation guided by the historically eliminated inferior solutions and an exploitation equation based on multi-elite information fusion, for employed bees and onlooker bees, to regulate their exploration and exploitation intensity in the solution space. Meanwhile, HeCABC couples an elite correction strategy that fine-tunes the quality of the elites based on their update signal within the colony. HeCABC is evaluated on various complex 30-dimensional CEC 2014 test functions. The experimental results show its superior performance over five state-of-the-art ABC variants and two advanced swarm optimizers.
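For context, the search equation of the standard ABC algorithm, which the abstract describes as biased, can be sketched as follows: a candidate perturbs one randomly chosen dimension of a food source toward or away from a random neighbour and is kept only if it improves the objective. The function names are illustrative assumptions, and this is the baseline equation, not the history-guided exploration or multi-elite exploitation equations of HeCABC.

```python
import random


def abc_candidate(colony, i, rng=random):
    """Standard ABC move: v_ij = x_ij + phi * (x_ij - x_kj).

    One dimension j and one neighbour k != i are chosen at random,
    and phi is drawn uniformly from [-1, 1].
    """
    x = colony[i]
    k = rng.choice([idx for idx in range(len(colony)) if idx != i])
    j = rng.randrange(len(x))
    phi = rng.uniform(-1.0, 1.0)
    v = list(x)  # copy so the original food source is left untouched
    v[j] = x[j] + phi * (x[j] - colony[k][j])
    return v


def greedy_select(x, v, objective):
    """Keep the candidate only if it lowers the (minimized) objective."""
    return v if objective(v) < objective(x) else x
```

Since each move changes a single coordinate relative to a random peer, progress can be slow on higher-dimensional problems, which is the kind of limitation that variants such as the one in the abstract aim to address.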
Safety in railway transportation is becoming urgent due to the large number of accidents and their severe socio-economic impact on a global scale. Currently, various smart train control systems are being developed and the...
ISBN (digital): 9798350384185
ISBN (print): 9798350384192
The more complex the nonlinear feedback in the controller design for multi-agent systems (MASs), the more likely chattering is to occur in the control input. However, it is difficult for existing methods to achieve fixed-time control through linear proportional feedback alone. To solve this problem, a novel explicit-time proportional control method is proposed for MASs that achieves practical conditionally fixed-time stability of the system. First, a practical conditionally fixed-time stable system is constructed using a special proportional function. Then, a nominal controller is designed based on this stable system, an approach referred to as practical explicit-time stabilization. Finally, the proposed method is applied to MASs, making the control system error converge to a predefined neighborhood of zero within an explicit time. Compared with existing predefined-time backstepping control methods, the smoothness of the control input is improved by resolving the problems of singularity and high gain. Theoretical analysis and simulation verify the main results.
To carry out detection in complex underwater environments during engineering construction, this paper designs the structure of an underwater exploration robot according to the actual requirements of th...