In this paper, we solve zero-sum game problems for discrete-time affine nonlinear systems with known dynamics via an iterative adaptive dynamic programming algorithm. First, a greedy heuristic dynamic programming iteration algorithm is developed to solve the zero-sum game problems; it can be used to solve the Hamilton-Jacobi-Isaacs equation associated with H-infinity optimal regulation control problems. A convergence analysis in terms of the value function and control policy is provided. To facilitate the implementation of the algorithm, three neural networks are used to approximate the control policy, the disturbance policy, and the value function, respectively. Then, we extend the algorithm to H-infinity optimal tracking control problems through system transformation. Finally, two simulation examples are presented to demonstrate the effectiveness of the proposed scheme. (C) 2013 Elsevier B.V. All rights reserved.
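As a rough illustration of value iteration for a zero-sum game, the sketch below uses a scalar linear-quadratic special case rather than the affine nonlinear setting and neural-network approximators of the paper. The system x_{k+1} = a*x + b*u + d*w, the stage cost q*x^2 + r*u^2 - gamma^2*w^2, and all function names and parameter values are assumptions chosen for illustration. With V_i(x) = P_i*x^2, solving the inner saddle-point problem in closed form gives the recursion P_{i+1} = q + a^2*P_i / (1 + P_i*(b^2/r - d^2/gamma^2)).

```python
def lq_game_value_iteration(a, b, d, q, r, gamma, iters=200):
    """Value iteration for a scalar LQ zero-sum game (illustrative only).

    Returns P such that the value function is approximately V(x) = P * x**2.
    """
    p = 0.0  # start from V_0(x) = 0
    for _ in range(iters):
        # The maximization over the disturbance w is concave only if
        # gamma^2 - P * d^2 > 0; otherwise the game has no finite value.
        if gamma**2 - p * d**2 <= 0:
            raise ValueError("gamma is too small for this system")
        p = q + a**2 * p / (1.0 + p * (b**2 / r - d**2 / gamma**2))
    return p


def saddle_point_gains(p, a, b, d, r, gamma):
    """Feedback gains (u = k_u * x, w = k_w * x) at the saddle point for a given P."""
    # s is the closed-loop multiplier: x_{k+1} = s * x_k under both policies.
    s = a / (1.0 + p * (b**2 / r - d**2 / gamma**2))
    k_u = -p * b * s / r
    k_w = p * d * s / gamma**2
    return k_u, k_w


# Hypothetical parameter values, chosen only to make the example concrete.
P = lq_game_value_iteration(a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0)
K_u, K_w = saddle_point_gains(P, a=0.9, b=1.0, d=0.5, r=1.0, gamma=2.0)
```

In this special case the fixed point of the recursion is the solution of the scalar game Riccati equation, which plays the role that the Hamilton-Jacobi-Isaacs equation plays in the nonlinear problem.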
In this paper, a data-based method is developed for analyzing the controllability and observability of discrete-time linear systems in a noisy environment. This method uses measured data to estimate the controllability matrix and the observability matrix without identifying system models. The unbiasedness and consistency of this estimate in the presence of measurement noise and system noise are proven, respectively. Because the estimation error of the system parameters does not accumulate when calculating the controllability matrix and observability matrix, this method has higher precision than traditional methods, especially in high-dimensional state spaces. In the simulation, the advantages of the data-based method in accuracy and convergence are illustrated. (C) 2014 Elsevier Inc. All rights reserved.
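For reference, the two matrices being estimated can be written down directly when a model (A, B, C) is available. The sketch below shows this traditional model-based construction, not the data-based estimator of the paper; the function names and the example system are assumptions for illustration. Because each block multiplies the previous one by A, an error in an identified A is raised to successive powers, which is the accumulation effect the abstract refers to.

```python
import numpy as np


def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^(n-1) B] column-wise for x_{k+1} = A x_k + B u_k."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)


def observability_matrix(A, C):
    """Stack [C; CA; ...; C A^(n-1)] row-wise for y_k = C x_k."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)


def has_full_rank(M, n):
    """Rank test: the system is controllable/observable iff the rank equals n."""
    return np.linalg.matrix_rank(M) == n


# Hypothetical second-order example (a discrete double-integrator-like chain).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

ctrb = controllability_matrix(A, B)
obsv = observability_matrix(A, C)
controllable = has_full_rank(ctrb, A.shape[0])
observable = has_full_rank(obsv, A.shape[0])
```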
Welcome to the first issue of the IEEE Transactions on Computational Social Systems (TCSS) for 2018, and Happy New Year to everyone. According to the Chinese lunar calendar, this is the Year of the Dog, which in Chinese culture represents trust, loyalty, dedication, and energy. As such, I would like to take this opportunity to express my best wishes for a happy, healthy, and highly productive 2018 to each and every one of our readers, reviewers, and editors.
In this paper, the H-infinity optimal control problem for a class of continuous-time nonlinear systems is investigated using an event-triggered method. First, the H-infinity optimal control problem is formulated as a two-player zero-sum (ZS) differential game. Then, an adaptive triggering condition is derived for the ZS game with an event-triggered control policy and a time-triggered disturbance policy. The event-triggered controller is updated only when the triggering condition is not satisfied. Therefore, the communication between the plant and the controller is reduced. Furthermore, a positive lower bound on the minimal intersample time is provided to avoid Zeno behavior. For implementation purposes, an event-triggered concurrent learning algorithm is proposed, in which only one critic neural network (NN) is used to approximate the value function, the control policy, and the disturbance policy. During the learning process, the traditional persistence of excitation condition is relaxed by using recorded data and instantaneous data together. Meanwhile, the stability of the closed-loop system and the uniform ultimate boundedness (UUB) of the critic NN's parameters are proved using the Lyapunov technique. Finally, simulation results verify the feasibility of the method for the ZS game and the corresponding H-infinity control problem.
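The basic mechanism of event-triggered control can be sketched on a scalar linear plant. This is a minimal illustration and not the adaptive triggering condition or the critic-NN learning scheme of the paper: the plant dx/dt = a*x + b*u, the fixed gain k, the relative threshold sigma, the forward-Euler integration, and all names are assumptions. The controller holds the last transmitted state x_hat and receives a new sample only when the gap |x_hat - x| violates the condition |x_hat - x| <= sigma*|x|.

```python
def simulate_event_triggered(a=1.0, b=1.0, k=3.0, sigma=0.2,
                             x0=1.0, dt=1e-3, t_end=5.0):
    """Simulate dx/dt = a*x + b*u with u = -k * x_hat (illustrative only).

    x_hat is the last state sample sent to the controller. Returns the final
    state, the number of triggering events, and the number of Euler steps.
    """
    x = x0
    x_hat = x0
    events = 0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Triggering rule: transmit a new sample only when the condition
        # |x_hat - x| <= sigma * |x| is violated.
        if abs(x_hat - x) > sigma * abs(x):
            x_hat = x
            events += 1
        u = -k * x_hat              # control uses the held sample, not x
        x += dt * (a * x + b * u)   # forward-Euler step of the plant
    return x, events, steps


x_final, n_events, n_steps = simulate_event_triggered()
```

With these assumed values the state still decays while the controller is updated on only a small fraction of the integration steps, which is the communication saving the abstract describes. The gap between consecutive events stays bounded away from zero here, which is the property a Zeno-behavior analysis establishes in general.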
The ability to detect concept drift, i.e., a structural change in the acquired datastream, and react accordingly is a major achievement for intelligent sensing units. This ability allows the unit to actively tune the application in order to maintain high performance, change the operational strategy online, and detect and isolate possible faults, to name a few tasks. In this paper, we consider a just-in-time strategy for adaptation: the sensing unit reacts exactly when needed, i.e., when concept drift is detected. Change detection tests (CDTs), designed to inspect structural changes in industrial and environmental data, are coupled here with adaptive k-nearest neighbor and support vector machine classifiers, which are suitably retrained when a change is detected. The computational complexity and memory requirements of the CDT and the classifier are taken into account in the application design, because resources in embedded sensing are precious and limited. We show that a hierarchical CDT coupled with an adaptive resource-aware classifier is a suitable tool for processing and classifying sequential streams of data.
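The just-in-time pattern, detect first and then adapt, can be sketched with a much simpler detector than the hierarchical CDT of the paper. The example below swaps in a one-sided CUSUM test on a scalar stream and replaces classifier retraining with re-estimating a reference mean from post-change samples; the function names, the parameters `slack` and `threshold`, and the synthetic stream are all assumptions for illustration.

```python
def cusum_detect(stream, ref_mean, slack, threshold):
    """One-sided CUSUM: index of the first detected upward mean shift, or None."""
    g = 0.0
    for i, x in enumerate(stream):
        # Accumulate deviations above the reference, minus a slack term.
        g = max(0.0, g + (x - ref_mean) - slack)
        if g > threshold:
            return i
    return None


def just_in_time_detections(stream, train_len, slack, threshold):
    """Detect changes and re-estimate the reference only when one is found.

    The re-estimation step stands in for retraining a classifier on
    post-change data. Only upward shifts are detected by this sketch.
    """
    detections = []
    start = 0
    while start + train_len <= len(stream):
        window = stream[start:start + train_len]
        ref_mean = sum(window) / train_len
        hit = cusum_detect(stream[start + train_len:], ref_mean, slack, threshold)
        if hit is None:
            break
        idx = start + train_len + hit
        detections.append(idx)
        start = idx  # adapt: restart training from the detection point
    return detections


# Synthetic stream with a single mean shift at index 50 (assumed data).
stream = [0.0] * 50 + [1.0] * 50
found = just_in_time_detections(stream, train_len=10, slack=0.25, threshold=3.0)
```

The detector runs in constant memory per sample, which is in the spirit of the resource constraints mentioned in the abstract, although the actual CDTs used in the paper are not reproduced here.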
Graph matching is a fundamental problem in pattern recognition and computer vision. In this paper, we introduce a novel graph matching algorithm to find a specified number of best vertex assignments between two labeled weighted graphs. The problem is first explicitly formulated as the minimization of a quadratic objective function and then solved by an optimization algorithm based on the recently proposed graduated nonconvexity and concavity procedure (GNCCP). Simulations on both synthetic data and real-world images demonstrate the effectiveness of the proposed method. (C) 2015 Elsevier B.V. All rights reserved.
This paper proposes a novel leg orthosis for lower limb rehabilitation robots of the sitting/lying type. It consists of three joint mechanisms (hip, knee, and ankle) and two sets of links (thigh and crus). Each driving motor is located close to its associated joint, and the rotational axis of each joint mechanism is unique and stable. These features make it outperform similar mechanisms in stability and dynamic performance. Different forms of eccentric slider-crank mechanisms are applied in the three joint mechanisms so that they can be optimized independently. The optimization problems for the hip and knee joint mechanisms, which are strongly nonlinear, are formulated separately. Then, a particle swarm optimization algorithm is used to obtain the optimal solutions, which are subsequently validated by comprehensive comparisons. Moreover, the kinematics necessary for motion control and trajectory tracking are investigated; they describe the relationships between the displacements and velocities of the joint mechanisms, the lead screws, and the end effector. Finally, a simulation example illustrates the feasibility of applying the leg orthosis to actual rehabilitation exercises. (C) 2014 Elsevier Ltd. All rights reserved.
In this paper, the first probably approximately correct (PAC) algorithm for continuous deterministic systems that does not rely on any knowledge of the system dynamics is proposed. It combines the state aggregation technique with the efficient exploration principle and makes extensive use of online observed samples. We use a grid to partition the continuous state space into cells in order to save samples. A near-upper Q operator is defined to produce a near-upper Q function using the samples in each cell. The corresponding greedy policy effectively balances exploration and exploitation. Through rigorous analysis, we prove that the number of nonoptimal actions executed by our algorithm is polynomially bounded. After a finite number of steps, the final policy becomes near optimal in the PAC framework. The implementation requires no knowledge of the system and has lower computational complexity. Simulation studies confirm that it performs better than other similar PAC algorithms.
In this paper, we develop an integral reinforcement learning algorithm based on policy iteration to learn online the Nash equilibrium solution for a two-player zero-sum differential game with completely unknown linear continuous-time dynamics. This algorithm is a fully model-free method that solves the game algebraic Riccati equation forward in time. The developed algorithm updates the value function and the control and disturbance policies simultaneously. The convergence of the algorithm is demonstrated by showing its equivalence to Newton's method. To implement this algorithm, one critic network and two action networks are used to approximate the game value function and the control and disturbance policies, respectively, and the least squares method is used to estimate the unknown parameters. The effectiveness of the developed scheme is demonstrated in simulation by designing an H-infinity state feedback controller for a power system. Note to Practitioners: Noncooperative zero-sum differential games provide an ideal tool for studying multiplayer optimal decision and control problems. Existing approaches usually compute the Nash equilibrium solution by means of offline iterative computation and require exact knowledge of the system dynamics. However, it is difficult to obtain exact knowledge of the system dynamics for many real-world industrial systems. The algorithm developed in this paper is a fully model-free method that solves the zero-sum differential game problem forward in time by making use of online measured data. This method is not affected by errors between an identification model and the real system, and it responds quickly to changes in the system dynamics. Exploration signals are required to satisfy the persistence of excitation condition in order to update the value function and the policies, and these signals do not affect the convergence of the learning process. The least squares method is used to obtain the approximate solution for zero-sum games with unknown dynamics. The developed a
In this paper, we establish error bounds of adaptive dynamic programming algorithms for solving undiscounted infinite-horizon optimal control problems of discrete-time deterministic nonlinear systems. We consider approximation errors in the update equations of both value function and control policy. We utilize a new assumption instead of the contraction assumption in discounted optimal control problems. We establish the error bounds for approximate value iteration based on a new error condition. Furthermore, we also establish the error bounds for approximate policy iteration and approximate optimistic policy iteration algorithms. It is shown that the iterative approximate value function can converge to a finite neighborhood of the optimal value function under some conditions. To implement the developed algorithms, critic and action neural networks are used to approximate the value function and control policy, respectively. Finally, a simulation example is given to demonstrate the effectiveness of the developed algorithms.