This article presents a proactive approach to resolving the conflict between safety and optimality for continuous-time (CT) safety-critical systems with unknown dynamics. The presented method guarantees safety and performance specifications by combining two controllers: a safe controller and an optimal controller. On the one hand, the safe controller is designed using only input and state data measurements, without the state derivative data that are typically required in data-driven control of CT systems. State derivative measurement is costly, and its approximation introduces noise to the system. On the other hand, the optimal controller is learned using a low-complexity one-shot optimization problem, which likewise relies on neither prior knowledge of the system dynamics nor state derivative data. Compared to existing optimal control learning methods for CT systems, which are typically iterative, a one-shot optimization is considerably more sample-efficient and computationally efficient. The share of the optimal and safe controllers in the overall control policy is obtained by solving, in a data-driven manner, a computationally efficient optimization problem involving a scalar variable. It is shown that the contribution of the safe controller dominates that of the optimal controller when the system's state is close to the safety boundaries, and that this domination drops as the system trajectories move away from the safety boundaries, in which case the optimal controller contributes more to the overall controller. The feasibility and stability of the proposed controller are shown. Finally, simulation results show the efficacy of the proposed approach.
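The blending behaviour described in this abstract can be illustrated with a minimal sketch. It is not the article's method: the article obtains the scalar share by solving a data-driven optimization problem, whereas the sketch below substitutes a hand-picked linear ramp. The names `blend_weight`, `blended_control`, the barrier value `h` (assumed zero on the safety boundary and positive inside the safe set), and the threshold `h_far` are all hypothetical.

```python
def blend_weight(h, h_far=1.0):
    """Hypothetical share of the safe controller for a barrier value h.

    Assumption: h == 0 on the safety boundary and h >= h_far means the
    state is far from it. The article instead solves a scalar
    optimization problem; this linear ramp is only a stand-in.
    """
    if h_far <= 0:
        raise ValueError("h_far must be positive")
    # Clamp to [0, 1]: weight 1 at the boundary, weight 0 far from it.
    return min(1.0, max(0.0, 1.0 - h / h_far))


def blended_control(u_safe, u_opt, h, h_far=1.0):
    """Convex combination of the safe and optimal control inputs."""
    a = blend_weight(h, h_far)
    return [a * us + (1.0 - a) * uo for us, uo in zip(u_safe, u_opt)]


if __name__ == "__main__":
    # Near the boundary the safe input dominates; far away the optimal one does.
    for h in (0.0, 0.5, 2.0):
        print(h, blended_control([1.0], [3.0], h))
```

With this ramp the safe controller's share falls from 1 to 0 as `h` grows, matching the qualitative behaviour the abstract reports; it carries none of the feasibility or stability guarantees shown in the article.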
ISBN (print): 9798350321050
Power systems face a large number of cyber-attacks, especially the false data injection attack (FDIA). This attack can bypass the traditional bad data detection mechanism (BDDM) and affect the operation of the power system. In this paper, to guarantee the reliable operation of the cyber-physical power system (CPPS), a novel FDIA detection model is developed based on a spatial-temporal graph neural network (STGNN). The STGNN can extract the temporal and spatial features of the CPPS measurement data simultaneously. Specifically, the spatial features are extracted by a graph neural network (GNN) and the temporal features by a recurrent neural network (RNN). Simulation results on the IEEE 14-bus system verify the performance of the proposed method.
ISBN (print): 9798350321050
Considering the overshoot and chatter of multi-input systems with unknown interference, this paper studies the adaptive robust optimal control of continuous-time two-input systems with an approximate dynamic programming (ADP) based Q-function scheme. A complex Hamilton-Jacobi-Isaacs (HJI) equation is obtained from the two-input system and zero-sum game theory, and a value function is constructed. Solving the HJI equation is a challenging task. Thus, an ADP-based Q-function with a neural network is constructed to learn the saddle point of the HJI equation. Simultaneously, an integral reinforcement signal is introduced for the critic networks so that the requirements on the system drift and input dynamics in the HJI equation are relaxed when solving for the otherwise intractable saddle point. Then, the adaptive robust optimal actor and the worst-case disturbance are approximated with another three networks. Finally, an F-16 aircraft plant is used to verify the proposed ADP-based Q-function.
ISBN (print): 9798350321050
Reinforcement learning, as an effective framework for solving continuous decision tasks in machine learning, has been widely used in manipulator decision control. However, for manipulator grasping tasks in complex environments, it is difficult for the agent to improve its performance, because exploration rarely yields high-quality interaction samples. In addition, trained reinforcement learning models usually lack task generalization and need to be relearned to adapt to task changes. To address these issues, researchers have proposed transfer learning, which uses external prior knowledge to help the target task and improve the reinforcement learning process. In this paper, the manipulator grasping source task is transferred to the grasping target task, based on the deep Q-network algorithm, by constructing lateral connections between fully convolutional neural networks using DenseNet. Experimental results in the CoppeliaSim simulation environment show that the method successfully achieves inter-task transfer through these lateral connections. The validated transfer reinforcement learning approach improves the effectiveness of task training while reducing the complexity of the network due to lateral connections.
ISBN (print): 9798350321050
With the rapid development of microgrid cluster operation, voltage regulation in the coordinated operation of multiple microgrids faces practical challenges. To address voltage regulation across multiple microgrids, this paper first establishes an optimization model of coordinated voltage regulation that considers the coordination of source, grid, load and storage. Because this optimization problem is difficult to solve, it is further reformulated as a Markov game. Then, a novel collaborative voltage regulation algorithm based on multi-agent deep reinforcement learning (MADRL) is proposed. To improve the scalability of the algorithm, an attention mechanism is introduced into the MADRL algorithm. The simulation results show that the proposed algorithm can coordinate multiple microgrids to regulate the voltage to a safe range.
ISBN (print): 9798350321050
As the uncertainties of intermittent energy and load in the integrated energy system gradually increase, traditional dispatch methods, which are limited to fixed physical models and parameter settings, can hardly respond to the random source-load fluctuations of the dynamic system. In this paper, a deep reinforcement learning-based dynamic dispatch method for the integrated energy system is proposed to address this problem. First, a data-driven deep reinforcement learning model is constructed for the integrated energy system. Through continuous interaction between the agent and the integrated energy system, the dispatch strategies are learned adaptively to reduce dependence on the physical models. Second, the variations of source-load uncertainties are characterized by adding random disturbances. Pivotal aspects of the deep reinforcement learning model, such as the state space, action space, reward mechanism, and training process, are improved according to the characteristics of the uncertainties. Then a proximal policy optimization algorithm is used to solve the problem, and the dynamic dispatch decisions of the integrated energy system are realized. Finally, simulation results verify the feasibility and effectiveness of the proposed method over different time scales and in uncertain environments.
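For readers unfamiliar with proximal policy optimization, the sketch below shows the standard clipped surrogate objective that the algorithm maximizes. It is a generic illustration, not the paper's dispatch model: the function name `ppo_clip_objective`, the sample values, and the clipping range `eps=0.2` are assumptions chosen for the example.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate objective of PPO.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping range (0.2 is a common default, assumed here)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    terms = []
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        terms.append(min(r * a, clipped * a))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    print(ppo_clip_objective([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

Taking the minimum removes any incentive to push the ratio outside the clipping range, which keeps each policy update close to the policy that collected the data.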
ISBN (print): 9798350321050
Driven by the increasing need for production safety, this paper proposes a fault detection method based on multi-sensor fusion with adaptive weight coefficients, which makes full use of the information from multiple measuring points. To this end, considering that different measuring points carry different information, the variance contribution rate (VCR) of the vibration signals is used to design adaptive weight coefficients for data fusion, so that the information contained in each vibration signal is fully utilized. On this basis, the fewest atoms containing time-domain and frequency-domain information are extracted with a dictionary sparse representation (DSR) algorithm to represent the feature information of the original signal, which weakens the influence of the curse of dimensionality. Finally, the K-nearest neighbor distance is used in the sparse residual space (SRS) for fault detection (K-SRS). The effectiveness of the proposed method is demonstrated on rolling bearing data, and the results show the advantage of the proposed approach.
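The fusion step can be sketched as follows. The abstract does not define the variance contribution rate, so the sketch assumes the simplest reading: each signal's weight is its variance divided by the total variance across all measuring points. The names `vcr_weights` and `fuse_signals`, and the equal-weight fallback for constant signals, are hypothetical.

```python
from statistics import pvariance


def vcr_weights(signals):
    """Assumed VCR weights: each signal's variance over the total variance."""
    variances = [pvariance(s) for s in signals]
    total = sum(variances)
    if total == 0:
        # All signals are constant: fall back to equal weights (an assumption).
        return [1.0 / len(signals)] * len(signals)
    return [v / total for v in variances]


def fuse_signals(signals):
    """Weighted sum of equally long signals, sample by sample."""
    length = len(signals[0])
    if any(len(s) != length for s in signals):
        raise ValueError("all signals must have the same length")
    weights = vcr_weights(signals)
    return [sum(w * s[t] for w, s in zip(weights, signals)) for t in range(length)]


if __name__ == "__main__":
    sensors = [[0.0, 2.0], [0.0, 4.0]]
    print(vcr_weights(sensors))
    print(fuse_signals(sensors))
```

Under this reading, measuring points with more energetic vibration signals contribute more to the fused signal; the paper's exact weighting may differ.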
Advanced train-to-train (T2T) communication technology, with which multiple high-speed trains (MHSTs) are equipped, has the potential to enable train groups to maintain a stable T2T distance and achieve consensus tracking of MHSTs, thereby enhancing operational safety and efficiency. This study uses a learning approach to address the data-driven distributed control of MHSTs under quantization effects and measurement bias. First, an equivalent linearization model of MHSTs and a transmission model accounting for sensor bias are constructed. Subsequently, a distributed model-free adaptive iterative learning control (MFAILC) scheme using quantized signals is proposed. We then prove that the tracking error under the quantizer-based MFAILC is uniformly ultimately bounded, and further investigate the impact of uniform quantizers. Finally, a series of tests conducted on the StarSim hardware-in-loop (HIL) semi-physical platform, using quantified indicators, verifies both the learning advantages of MFAILC and the influence of the quantization mechanism and measurement bias on MHSTs.
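As background for the uniform quantizers mentioned above, the following minimal sketch implements a generic mid-tread uniform quantizer and checks its error bound. It is not taken from the paper: the name `uniform_quantize` and the step size used in the example are assumptions.

```python
import math


def uniform_quantize(x, step):
    """Round x to the nearest multiple of step (mid-tread uniform quantizer)."""
    if step <= 0:
        raise ValueError("step must be positive")
    # floor(x / step + 0.5) rounds to the nearest integer level, ties upward.
    return step * math.floor(x / step + 0.5)


if __name__ == "__main__":
    step = 0.1  # hypothetical quantization step
    for x in (0.26, -0.74, 1.0):
        q = uniform_quantize(x, step)
        # The quantization error never exceeds half a step
        # (a small tolerance absorbs floating-point rounding).
        assert abs(q - x) <= step / 2 + 1e-12
        print(x, q)
```

The half-step bound on the quantization error is the kind of property that makes a bounded (rather than vanishing) tracking error the natural result to prove for control with quantized signals.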
In this paper, the problem of dissipative consensus iterative learning control (ILC) is studied for singular multiagent systems (MASs). Firstly, a novel ILC algorithm is designed for such singular MASs. Then, under a ...
ISBN (print): 9798350387780; 9798350387797
Motion control systems are widely used in many fields of industry. Conventional control schemes depend heavily on the model of the system being designed, and their performance degrades considerably when the system is subject to unknown disturbances or uncertainty. Some scholars have therefore pointed out that the dependency on system models can be eliminated by data-driven design schemes. These include reinforcement learning-based methods, which are gradually attracting attention. In this paper, the disturbance rejection problem for motion control systems is studied based on reinforcement learning. Considering the continuity of the state space and action space, a method based on deep reinforcement learning is proposed to reject periodic disturbances. The proposed deep deterministic policy gradient (DDPG) and twin delayed deep deterministic policy gradient (TD3) based algorithms are compared in simulation. The simulation results show that the periodic disturbances of the motion control systems can be rejected effectively with the proposed reinforcement learning controller.