In this paper, a novel iterative Q-learning algorithm, called the "policy iteration based deterministic Q-learning algorithm", is developed to solve optimal control problems for discrete-time deterministic nonlinear systems. The idea is to use an iterative adaptive dynamic programming (ADP) technique to construct the iterative control law that optimizes the iterative Q function. Once the optimal Q function is obtained, the optimal control law can be found by directly minimizing the optimal Q function, so a mathematical model of the system is not required. The convergence property is analyzed to show that the iterative Q function is monotonically non-increasing and converges to the solution of the optimality equation. It is also proven that each of the iterative control laws is a stable control law. Neural networks are employed to implement the policy iteration based deterministic Q-learning algorithm by approximating the iterative Q function and the iterative control law, respectively. Finally, two simulation examples are presented to illustrate the performance of the developed algorithm.
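The policy-iteration structure described in this abstract can be illustrated with a small tabular sketch. This is not the authors' neural-network implementation: the integer-valued system, the quadratic utility, the action set, and all function names below are assumptions made for illustration, and the `step` function only stands in for measured transition data. Starting from a stabilizing control law, each iteration evaluates Q for the current law and then improves the law by minimizing Q.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): integer states and controls.
states = np.arange(-3, 4)           # x in {-3, ..., 3}
actions = np.arange(-2, 3)          # u in {-2, ..., 2}

def step(x, u):
    """Deterministic transition; stands in for observed data x_{k+1}."""
    return int(np.clip(x + u, -3, 3))

def utility(x, u):
    """Assumed quadratic utility U(x, u) = x^2 + u^2."""
    return float(x * x + u * u)

def evaluate(policy, sweeps=50):
    """Policy evaluation: Q(x, u) = U(x, u) + Q(x', v(x')) for a fixed law v."""
    Q = np.zeros((len(states), len(actions)))
    for _ in range(sweeps):
        for i, x in enumerate(states):
            for j, u in enumerate(actions):
                ni = step(x, u) + 3              # index of the next state
                Q[i, j] = utility(x, u) + Q[ni, policy[ni]]
    return Q

def policy_iteration_q_learning(max_iters=10):
    # Initial stabilizing law: move one unit toward the origin (u = -sign(x)).
    policy = (2 - np.sign(states)).astype(int)   # action indices
    Qs = []
    for _ in range(max_iters):
        Q = evaluate(policy)
        Qs.append(Q)
        new_policy = np.argmin(Q, axis=1)        # policy improvement
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return Qs, policy

if __name__ == "__main__":
    Qs, policy = policy_iteration_q_learning()
    for x, a in zip(states, policy):
        print(f"x={x:+d}  u={actions[a]:+d}  Q={Qs[-1][x + 3, a]:.1f}")
```

In this toy setting the successive Q tables can be compared elementwise to observe the monotonically non-increasing behavior stated in the abstract; the evaluation step terminates only because the initial law drives every state to the origin, where the utility is zero.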
We develop an optimal tracking control method for chaotic systems with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to update synchronously. An augmented system is constructed from the tracking error and the reference dynamics, and the optimal tracking control problem is then defined. Policy iteration (PI) is introduced to solve the min-max optimization problem. An off-policy adaptive dynamic programming (ADP) algorithm is then proposed to find the solution of the tracking Hamilton-Jacobi-Isaacs (HJI) equation online, using only measured data and without any knowledge of the system dynamics. A critic neural network (CNN), an action neural network (ANN), and a disturbance neural network (DNN) are used to approximate the cost function, the control, and the disturbance, respectively. The weights of these networks compose the augmented weight matrix, which is proven to be uniformly ultimately bounded (UUB). The convergence of the tracking error system is also proven. Two examples are given to show the effectiveness of the proposed synchronous solution method for the chaotic system tracking problem.
This study proposes a learning impedance controller comprising a proportional feedback control term, a composite-learning-based uncertainty estimation term, and a robot-environment interaction control term. The impedance control problem is converted into a particular reference-trajectory tracking problem based on a generated reference trajectory. The proposed controller ensures the exponential convergence of the auxiliary tracking error and the uncertainty estimation error. The interaction control term improves the transient control performance through suppression/encouragement of the incorrect/correct robot *** composite-learning update law enhances the transient and steady-state control performance based on the exponential convergence of the uncertainty estimation error and auxiliary tracking error. Finally, the effectiveness and advantages of the proposed impedance controller are validated by theoretical analysis and simulations on a parallel robot.
We develop an online adaptive dynamic programming (ADP) based optimal control scheme for continuous-time chaotic systems. The idea is to use the ADP algorithm to obtain the optimal control input that makes the performance index function reach an optimum. The expression of the performance index function for the chaotic system is first presented. The online ADP algorithm is presented to achieve optimal control. In the ADP structure, neural networks are used to construct a critic network and an action network, which can obtain an approximate performance index function and the control input, respectively. It is proven that the critic parameter error dynamics and the closed-loop chaotic systems are uniformly ultimately bounded exponentially. Our simulation results illustrate the performance of the established optimal control method.
This paper proposes a long-term forecasting scheme and implementation method based on the interval type-2 fuzzy sets theory for traffic flow data. The type-2 fuzzy sets have advantages in modeling uncertainties becaus...
This letter proposes a robust stochastic differential equation approach for learning point-to-point motions in an adversarial way. The proposed stochastic dynamical model combines the advantages of the stochastic differential equation and a transformer-like function to achieve both robustness and accuracy in learning. An adversarial training method is proposed to simplify the updating of the model parameters. The state of the proposed stochastic dynamical system is mathematically proved to converge asymptotically in the mean square sense, and the approach has been experimentally validated on the LASA dataset and on a trajectory-programming task with the Franka Emika robot. The experimental results show that: (1) the adversarial training method helps the model achieve higher reproduction accuracy; (2) the trajectories generated by the proposed model achieve higher accuracy than state-of-the-art methods, in terms of similarity to the demonstration, in both the noise-free condition (by approximately 14.9%) and the noisy condition (by approximately 17.8%); and (3) the proposed approach can learn smoother trajectories even if the observations are contaminated by noise.
In this paper, we discuss a novel graph matching problem, namely the parameterized Koopmans-Beckmann's graph matching (KBGMw). KBGMw is defined as a weighted linear combination of a series of Koopmans-Beckmann's graph matching problems. First, we show that KBGMw can be taken as a special case of the parameterized Lawler's graph matching, subject to certain conditions. Second, based on structured SVM, we propose a supervised learning method for automatically estimating the parameters of KBGMw. Experimental results on both synthetic and real image matching data sets show that the proposed method achieves relatively better performance, even superior to some deep learning methods. (c) 2020 Elsevier B.V. All rights reserved.
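As a rough illustration of the "weighted linear combination" in this abstract, the sketch below scores a node assignment by summing weighted Koopmans-Beckmann terms of the form tr(AᵀXBXᵀ), where X is a permutation matrix. The exact objective used in the paper is not given here, so this form, the function names, the toy graphs, and the fixed weights are all assumptions; the structured-SVM learning of the weights is not shown, and the brute-force search is only practical for very small graphs.

```python
import itertools
import numpy as np

def kbgm_score(perm, pairs, w):
    """Weighted sum of Koopmans-Beckmann terms tr(A^T X B X^T).

    perm[i] is the node of graph 2 matched to node i of graph 1.
    pairs is a list of (A, B) attribute matrices, one pair per channel.
    """
    n = len(perm)
    X = np.zeros((n, n))
    X[np.arange(n), list(perm)] = 1.0            # permutation matrix
    return float(sum(wk * np.trace(A.T @ X @ B @ X.T)
                     for wk, (A, B) in zip(w, pairs)))

def best_matching(pairs, w):
    """Exhaustive search over all permutations (toy sizes only)."""
    n = pairs[0][0].shape[0]
    return max(itertools.permutations(range(n)),
               key=lambda p: kbgm_score(p, pairs, w))

def relabel(A, p):
    """Build B such that B[p[i], p[j]] = A[i, j]."""
    B = np.zeros_like(A)
    B[np.ix_(p, p)] = A
    return B

# Hypothetical 3-node example with two edge-attribute channels.
A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)   # adjacency
A2 = np.array([[0, 2, 0], [2, 0, 5], [0, 5, 0]], dtype=float)   # edge weights
true_perm = [2, 0, 1]
pairs = [(A1, relabel(A1, true_perm)), (A2, relabel(A2, true_perm))]
w = [1.0, 0.5]                                   # assumed, not learned

if __name__ == "__main__":
    p = best_matching(pairs, w)
    print(p, kbgm_score(p, pairs, w))
```

With adjacency alone, the path graph has two equally good matchings (it is symmetric under reversal); adding the second channel with a nonzero weight breaks the tie, which hints at why the choice of weights matters.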
This paper focuses on a gait planning method for the backward swimming of a bio-inspired robotic fish with an asymmetrical structure. Based on the differences between the anguilliform and carangiform swimming modes, a method for searching for backward-swimming gaits was proposed to plan the motion of the developed carangiform robotic fish. The body envelope of the European eel's backward swimming was mimicked according to the free-swimming model, which was proposed to analyze the propulsion produced by the undulation of the multi-link tail. Finally, simulations and experiments were conducted to demonstrate the gait-searching method for the bio-inspired carangiform robotic fish.
Text information contained in scene images is very helpful for high-level image understanding. In this study, the authors propose to learn the co-occurrence of local strokes for scene text recognition by using a spatiality embedded dictionary (SED). Unlike the spatial pyramid, which partitions images into grids to incorporate spatial information, the authors' SED associates every codeword with a particular response region and introduces more precise spatial information for robust character recognition. After localised soft coding and max pooling in the first layer, a sparse dictionary is learned to model the co-occurrence of several local strokes, which further improves classification performance. Experimental results on two scene character recognition datasets, ICDAR2003 and Chars74K, demonstrate that their character recognition method outperforms state-of-the-art methods. In addition, competitive word recognition results are reported for four benchmark word recognition datasets, ICDAR2003, ICDAR2011, ICDAR2013 and Street View Text, when their character recognition method is combined with a conditional random field language model.
Soft pressure sensors have recently attracted considerable attention because of their applications in human-machine interfaces, soft robotics, and prosthetics. However, challenges remain in achieving satisfactory performance (e.g., high sensitivity, wide sensing range, high stability) for soft pressure sensors. This article reports a photoelectric pressure sensor based on intentional blocking. Two different blocking methods are investigated: single-row-pyramid blocking and double-row-pyramid blocking. The sensor has a simple structure, consisting of a light-emitting diode, a photosensitive element, and a silicone sensor shell. Experiments demonstrate that the sensor has high sensitivity (the maximum sensitivity is 48.07 kPa⁻¹, and the minimum measurable pressure is 0.8 Pa), a large pressure-sensing range (up to 120 kPa), superior stability (a drift of about 0.4% over 12,130 repetitive cycles at 0-80 kPa), low drift (< ±0.2% across different 3-day tests), negligible hysteresis, and a high signal-to-noise ratio (over 55 dB). With the pressure sensor mounted at the end of a robotic arm, the robot can detect subtle collisions (such as touching a balloon through a pinpoint). In addition, this article fabricates a tactile glove based on the proposed pressure sensor and shows the application of this glove to music playing and object weighing. This study provides a new structure for photoelectric sensors that increases sensitivity and also provides a more convenient way to fabricate photoelectric pressure sensors.