Optimal feedback design of dynamical systems is a significant topic in the automatic control community and information *** for nonlinear systems, optimal control design always leads to coping with the nonlinear Hamilton-Jacobi-Bellman ***, it is intractable to acquire the analytic solution of the nonlinear Hamilton-Jacobi-Bellman equation for general nonlinear systems.
This paper addresses the problem of generating optimal paths for curvature-constrained unmanned aerial vehicles (UAVs) performing surveillance of multiple ground targets. UAVs are modeled as Dubins vehicles so that the constraint of the UAVs' minimal turning radius can be taken into account. In view of the effective surveillance range of the sensors equipped on the UAVs, the problem is formulated as a Dubins traveling salesman problem with neighborhoods (DTSPN). Because of its prohibitively high computational complexity, Dubins paths with relaxed terminal heading are introduced to simplify the calculation of the Dubins distance, and a boundary-based encoding scheme is proposed to determine the visiting point of every target neighborhood. An evolutionary algorithm is then used to derive the optimal Dubins tour. To further enhance the quality of the solutions, a local search strategy based on an approximate gradient is employed to improve the visiting points of the target neighborhoods. Finally, with a minor modification to the individual encoding, the algorithm is easily extended to two other, more sophisticated DTSPN variants (the multi-UAV scenario and the scenario with multiple groups of targets). The performance of the algorithm is demonstrated through comparative experiments with two other state-of-the-art DTSPN algorithms from the literature. Numerical simulations show that the proposed algorithm finds high-quality solutions to the DTSPN at lower computational cost and performs significantly better than the other algorithms.
A model-based offline policy iteration (PI) algorithm and a model-free online Q-learning algorithm are proposed for solving fully cooperative linear quadratic dynamic games. The PI-based adaptive Q-learning method can learn the feedback Nash equilibrium online using the state samples generated by behavior policies, without querying the system model. Unlike existing Q-learning methods, this novel Q-learning algorithm executes both policy evaluation and policy improvement in an adaptive *** prove the convergence of the offline PI algorithm by proving its equivalence to Newton's method for solving the game algebraic Riccati equation (GARE). Furthermore, we prove that the proposed Q-learning method converges to the Nash equilibrium under a small learning rate if the method satisfies certain persistence of excitation conditions, which can be easily met by suitable behavior policies. Our simulation results demonstrate the good performance of the proposed online adaptive Q-learning algorithm.
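As a rough illustration of offline policy iteration on a Riccati equation, the sketch below runs a Hewer-style iteration for a discrete-time linear quadratic problem in which two inputs share a single cost. The discrete-time setting, the stacking of both players into one input matrix, the example matrices, and all function names are assumptions made for illustration; the abstract does not state the paper's game formulation or its GARE, so this is not the authors' algorithm.

```python
import numpy as np

def solve_lyapunov(A_cl, Q_cl):
    """Solve P = A_cl^T P A_cl + Q_cl by vectorising the equation."""
    n = A_cl.shape[0]
    M = np.eye(n * n) - np.kron(A_cl.T, A_cl.T)
    return np.linalg.solve(M, Q_cl.reshape(-1)).reshape(n, n)

def policy_iteration_lq(A, B, Q, R, K0, iters=30):
    """Alternate policy evaluation and improvement for u = -K x.

    K0 must be stabilising (spectral radius of A - B K0 below 1).
    """
    K = K0
    for _ in range(iters):
        A_cl = A - B @ K
        # Policy evaluation: cost matrix of the current feedback gain.
        P = solve_lyapunov(A_cl, Q + K.T @ R @ K)
        # Policy improvement: gain that minimises the one-step cost-to-go.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

def dare_residual(A, B, Q, R, P):
    """Residual of the discrete-time algebraic Riccati equation at P."""
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return A.T @ P @ A - A.T @ P @ B @ G + Q - P

def example_system():
    """Hypothetical two-input system; each column of B is one player."""
    A = np.array([[0.9, 0.2], [0.0, 0.8]])
    B = np.array([[0.0, 1.0], [1.0, 0.0]])
    Q = np.eye(2)
    R = np.diag([1.0, 2.0])
    return A, B, Q, R

if __name__ == "__main__":
    A, B, Q, R = example_system()
    P, K = policy_iteration_lq(A, B, Q, R, K0=np.zeros((2, 2)))
    print("max |Riccati residual|:", np.abs(dare_residual(A, B, Q, R, P)).max())
```

Because the example matrix A is already stable, the zero gain is a valid starting policy; for an unstable A a stabilising initial gain would have to be supplied.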
With the development of human-robot interaction technologies, haptic interfaces are widely used in 3D applications to provide the sense of touch. These interfaces have been utilized in medical simulation, virtual assembly and remote manipulation tasks. However, haptic interface design and control remain critical problems in reproducing the highly sensitive human sense of touch. This paper presents the development and evaluation of a 7-DOF (degree of freedom) haptic interface based on a modified delta mechanism. First, both the kinematics and the dynamics of the modified mechanism are analyzed and presented. A novel gravity compensation algorithm based on the physical model is proposed and validated in simulation. A haptic controller is then proposed based on the forward kinematics and the gravity compensation algorithm. To evaluate the control performance of the haptic interface, a prototype has been implemented. Three kinds of experiments are performed: gravity compensation, static response and force tracking. The experimental results show that the mean error of the gravity compensation is less than 0.7 N and that the maximum continuous force along the axis can be up to 6 N. This demonstrates the good performance of the proposed haptic interface.
In this paper, an optimal tracking control scheme is proposed for a class of discrete-time chaotic systems using the approximation-error-based adaptive dynamic programming (ADP) algorithm. Via a system transformation, the optimal tracking problem is transformed into an optimal regulation problem, and then the novel optimal tracking control method is proposed. It is shown that, for the iterative ADP algorithm with finite approximation error, the iterative performance index functions converge to a finite neighborhood of the greatest lower bound of all performance index functions under some convergence conditions. Two examples are given to demonstrate the validity of the proposed optimal tracking control scheme for chaotic systems.
The broad learning system (BLS) has been proposed as an alternative to deep learning. In the BLS architecture, the input is randomly mapped into a series of feature spaces that form the feature nodes, the outputs of the feature nodes are expanded broadly to form the enhancement nodes, and the output weights of the network can then be determined analytically. The main advantage of BLS is that it can be learned incrementally, without a retraining process, when new input data or neural nodes arrive. It has been proven that BLS can overcome the inadequacies caused by training a large number of parameters in gradient-based deep learning algorithms. In this paper, a novel variant, the graph regularized broad learning system (GBLS), is proposed. Taking account of the locally invariant property of data, which means that similar images may share similar properties, manifold learning is incorporated into the objective function of the standard BLS. In GBLS, the output weights are constrained to learn more discriminative information, and the classification ability can be further enhanced. Several experiments are carried out to verify that the proposed GBLS model can outperform the standard BLS. Moreover, GBLS also performs better than other state-of-the-art image recognition methods on several image databases.
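The standard BLS pipeline described above (random feature nodes, broadly expanded enhancement nodes, analytically determined output weights) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions: the linear feature mapping, the tanh enhancement activation, the ridge-regression solve, the node counts, and the helper names `fit_bls`, `predict_bls` and `toy_blobs` are all illustrative choices, and the graph regularization term of GBLS is not included because its exact form is not given in the abstract.

```python
import numpy as np

def fit_bls(X, Y, n_feature=20, n_enhance=40, ridge=1e-3, seed=0):
    """Fit a basic broad learning system with a closed-form ridge solve."""
    rng = np.random.default_rng(seed)
    # Feature nodes: random linear mapping of the input (assumed form).
    Wf = rng.standard_normal((X.shape[1], n_feature))
    bf = rng.standard_normal(n_feature)
    Z = X @ Wf + bf
    # Enhancement nodes: random nonlinear expansion of the feature nodes.
    We = rng.standard_normal((n_feature, n_enhance))
    be = rng.standard_normal(n_enhance)
    H = np.tanh(Z @ We + be)
    # Output weights: ridge regression on the concatenated nodes.
    A = np.hstack([Z, H])
    W_out = np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ Y)
    return {"Wf": Wf, "bf": bf, "We": We, "be": be, "W_out": W_out}

def predict_bls(model, X):
    """Return class scores for the rows of X."""
    Z = X @ model["Wf"] + model["bf"]
    H = np.tanh(Z @ model["We"] + model["be"])
    return np.hstack([Z, H]) @ model["W_out"]

def toy_blobs(n_per_class=100, seed=1):
    """Hypothetical toy data: two well-separated 2-D Gaussian blobs."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=-2.0, scale=0.5, size=(n_per_class, 2))
    X1 = rng.normal(loc=2.0, scale=0.5, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    labels = np.repeat([0, 1], n_per_class)
    Y = np.eye(2)[labels]          # one-hot targets
    return X, Y

if __name__ == "__main__":
    X, Y = toy_blobs()
    model = fit_bls(X, Y)
    pred = predict_bls(model, X).argmax(axis=1)
    print("training accuracy:", (pred == Y.argmax(axis=1)).mean())
```

Only `W_out` is learned; the random mappings stay fixed, which is why no gradient-based training loop appears.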
In this paper, a novel iterative Q-learning algorithm, called the "policy iteration based deterministic Q-learning algorithm", is developed to solve optimal control problems for discrete-time deterministic nonlinear systems. The idea is to use an iterative adaptive dynamic programming (ADP) technique to construct the iterative control law that optimizes the iterative Q function. Once the optimal Q function is obtained, the optimal control law can be found by directly minimizing the optimal Q function, without requiring a mathematical model of the system. The convergence property is analyzed to show that the iterative Q function is monotonically non-increasing and converges to the solution of the optimality equation. It is also proven that each of the iterative control laws is a stable control law. Neural networks are employed to implement the policy iteration based deterministic Q-learning algorithm by approximating the iterative Q function and the iterative control law, respectively. Finally, two simulation examples are presented to illustrate the performance of the developed algorithm.
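The alternation between evaluating an iterative Q function and minimizing it to obtain the next control law can be illustrated on a tiny tabular problem. The sketch below is an assumption-laden toy, not the paper's method: the five-state system, the utility x² + u², the lookup tables standing in for neural networks, and the known `step` function standing in for measured transitions are all hypothetical.

```python
import numpy as np

STATES = np.arange(5)                    # x in {0, ..., 4}; x = 0 is the equilibrium
ACTIONS = np.array([-2, -1, 0, 1, 2])    # admissible controls

def step(x, u):
    """Toy deterministic dynamics: move by u, saturating at the ends."""
    return int(np.clip(x + u, STATES[0], STATES[-1]))

def utility(x, u):
    """Stage cost; zero only at the equilibrium with zero control."""
    return float(x ** 2 + u ** 2)

def evaluate_policy(policy, max_sweeps=1000):
    """Q table of a policy given as one action index per state."""
    v = np.zeros(len(STATES))
    for _ in range(max_sweeps):
        v_new = np.array([
            utility(x, ACTIONS[policy[x]]) + v[step(x, ACTIONS[policy[x]])]
            for x in STATES
        ])
        if np.array_equal(v_new, v):
            break
        v = v_new
    # Q(x, u) = U(x, u) + Q(x', policy(x')) with x' = step(x, u).
    return np.array([[utility(x, u) + v[step(x, u)] for u in ACTIONS]
                     for x in STATES])

def policy_iteration_q(policy0, max_iters=50):
    """Alternate Q evaluation and greedy improvement until the policy repeats."""
    policy = np.asarray(policy0)
    history = []
    for _ in range(max_iters):
        q = evaluate_policy(policy)
        history.append(q)
        improved = q.argmin(axis=1)
        if np.array_equal(improved, policy):
            break
        policy = improved
    return q, policy, history

if __name__ == "__main__":
    # Initial admissible law: u = -1 away from the origin, u = 0 at it.
    policy0 = np.array([2, 1, 1, 1, 1])
    q, policy, history = policy_iteration_q(policy0)
    print("control law:", ACTIONS[policy])
    print("optimal cost per state:", q.min(axis=1))
```

The initial law must drive the state to the origin with finite total cost; otherwise the undiscounted evaluation step would not settle.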
We develop an optimal tracking control method for chaotic systems with unknown dynamics and disturbances. The method allows the optimal cost function and the corresponding tracking control to be updated synchronously. The augmented system is constructed from the tracking error and the reference dynamics, and the optimal tracking control problem is then defined. Policy iteration (PI) is introduced to solve the min-max optimization problem. An off-policy adaptive dynamic programming (ADP) algorithm is then proposed to find the solution of the tracking Hamilton-Jacobi-Isaacs (HJI) equation online, using only measured data and without any knowledge of the system dynamics. A critic neural network (CNN), an action neural network (ANN), and a disturbance neural network (DNN) are used to approximate the cost function, the control, and the disturbance, respectively. The weights of these networks compose the augmented weight matrix, which is proven to be uniformly ultimately bounded (UUB). The convergence of the tracking error system is also proven. Two examples are given to show the effectiveness of the proposed synchronous solution method for the chaotic system tracking problem.
This study proposes a learning impedance controller comprising a proportional feedback control term, a composite-learning-based uncertainty estimation term, and a robot-environment interaction control term. The impedance control problem is converted into a particular reference-trajectory tracking problem based on a generated reference trajectory. The proposed controller ensures the exponential convergence of the auxiliary tracking error and the uncertainty estimation error. The interaction control term improves the transient control performance through suppression/encouragement of the incorrect/correct robot *** composite-learning update law enhances the transient and steady-state control performances based on the exponential convergence of the uncertainty estimation error and auxiliary tracking error. Finally, the effectiveness and advantages of the proposed impedance controller are validated by theoretical analysis and simulations on a parallel robot.
Road boundary detection is essential for autonomous vehicle localization and decision-making, especially under GPS signal loss and lane *** road boundary detection in structured environments, obstacle occlusions and large road curvature are two significant ***, an effective and fast solution for these problems has remained *** solve these problems, a speed and accuracy tradeoff method for LiDAR-based road boundary detection in structured environments is *** proposed method consists of three main stages: 1) a multi-feature based method is applied to extract feature points; 2) a road-segmentation-line-based method is proposed for classifying left and right feature points; 3) an iterative Gaussian Process Regression (GPR) is employed for filtering out false points and extracting boundary *** demonstrate the effectiveness of the proposed method, the KITTI dataset is used for comprehensive experiments, and the performance of our approach is tested under different road *** experiments show the road-segmentation-line-based method can classify left and right feature points on structured curved roads, and the proposed iterative Gaussian Process Regression can extract road boundary points on varied road shapes and traffic ***, the proposed road boundary detection method can achieve real-time performance with an average of 70.5 ms per frame.