The ability to detect concept drift, i.e., a structural change in the acquired datastream, and react accordingly is a major achievement for intelligent sensing units. This ability allows the unit to actively tune the application to maintain high performance, change its operational strategy online, and detect and isolate possible faults, to name a few tasks. In this paper, we consider a just-in-time strategy for adaptation: the sensing unit reacts exactly when needed, i.e., when concept drift is detected. Change detection tests (CDTs), designed to inspect structural changes in industrial and environmental data, are coupled here with adaptive k-nearest neighbor and support vector machine classifiers, which are suitably retrained when a change is detected. Because resources in embedded sensing are limited and precious, the computational complexity and memory requirements of the CDT and the classifier are taken into account in the application design. We show that a hierarchical CDT coupled with an adaptive resource-aware classifier is a suitable tool for processing and classifying sequential streams of data.
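To make the just-in-time idea concrete, the following is a minimal sketch of a change detection test on a scalar stream. It uses a generic two-sided CUSUM test as a stand-in; it is not the hierarchical CDT of the paper, and the function name and the parameters `n_train`, `drift`, and `threshold` are assumptions chosen for illustration.

```python
def cusum_detect(stream, n_train=50, drift=0.5, threshold=8.0):
    """Return the index at which a change is first flagged, or None.

    Generic two-sided CUSUM on standardized samples (an illustrative
    stand-in, not the CDT described in the abstract).
    """
    # Estimate the stationary mean and standard deviation on a training prefix.
    train = stream[:n_train]
    mean = sum(train) / n_train
    var = sum((x - mean) ** 2 for x in train) / (n_train - 1)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant prefix

    g_pos = g_neg = 0.0
    for t in range(n_train, len(stream)):
        z = (stream[t] - mean) / std
        # Accumulate evidence for an upward and a downward shift separately.
        g_pos = max(0.0, g_pos + z - drift)
        g_neg = max(0.0, g_neg - z - drift)
        if g_pos > threshold or g_neg > threshold:
            return t
    return None
```

In a just-in-time setup, the classifier would be left untouched while `cusum_detect` returns `None`, and retrained on samples collected from the flagged index onward once a change is reported.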
In this paper, the H-infinity optimal control problem for a class of continuous-time nonlinear systems is investigated using an event-triggered method. First, the H-infinity optimal control problem is formulated as a two-player zero-sum (ZS) differential game. Then, an adaptive triggering condition is derived for the ZS game with an event-triggered control policy and a time-triggered disturbance policy. The event-triggered controller is updated only when the triggering condition is not satisfied, which reduces the communication between the plant and the controller. Furthermore, a positive lower bound on the minimal intersample time is provided to avoid Zeno behavior. For implementation purposes, an event-triggered concurrent learning algorithm is proposed, in which a single critic neural network (NN) is used to approximate the value function, the control policy, and the disturbance policy. During the learning process, the traditional persistence of excitation condition is relaxed by using recorded data and instantaneous data together. Meanwhile, the stability of the closed-loop system and the uniform ultimate boundedness (UUB) of the critic NN's parameters are proved using the Lyapunov technique. Finally, simulation results verify the feasibility of the method for the ZS game and the corresponding H-infinity control problem.
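The event-triggered update rule can be illustrated with a small simulation. This is a minimal sketch under strong assumptions: a scalar linear plant, a fixed feedback gain, no disturbance, and a static relative-threshold triggering condition `|x_held - x| <= sigma * |x|`. None of these values or names come from the paper, whose triggering condition is adaptive and whose plant is nonlinear.

```python
def simulate_event_triggered(x0=1.0, a=1.0, b=1.0, k=3.0, sigma=0.2,
                             dt=1e-3, t_end=5.0):
    """Euler simulation of x' = a*x + b*u with u = -k * x_held.

    x_held is the state sample last sent to the controller. It is refreshed
    only when the triggering condition |x_held - x| <= sigma*|x| is violated.
    Returns (final state, number of controller updates, number of steps).
    """
    x = x0
    x_held = x0      # first sample is transmitted at t = 0
    updates = 1
    steps = int(round(t_end / dt))
    for _ in range(steps):
        # Triggering condition violated: transmit a fresh state sample.
        if abs(x_held - x) > sigma * abs(x):
            x_held = x
            updates += 1
        u = -k * x_held          # control stays constant between events
        x += dt * (a * x + b * u)
    return x, updates, steps
```

With the default values the closed loop remains stable, since the held-sample error is kept below `sigma * |x|`, while the controller is refreshed far less often than once per integration step.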
Graph matching is a fundamental problem in pattern recognition and computer vision. In this paper, we introduce a novel graph matching algorithm that finds a specified number of best vertex assignments between two labeled weighted graphs. The problem is first explicitly formulated as the minimization of a quadratic objective function and then solved by an optimization algorithm based on the recently proposed graduated nonconvexity and concavity procedure (GNCCP). Simulations on both synthetic data and real-world images demonstrate the effectiveness of the proposed method. (C) 2015 Elsevier B.V. All rights reserved.
In this paper, the first probably approximately correct (PAC) algorithm for continuous deterministic systems that does not rely on any system dynamics is proposed. It combines the state aggregation technique with the efficient exploration principle and makes efficient use of online observed samples. We use a grid to partition the continuous state space into cells in order to save samples. A near-upper Q operator is defined to produce a near-upper Q function using the samples in each cell. The corresponding greedy policy effectively balances exploration and exploitation. Through rigorous analysis, we prove that the time spent executing nonoptimal actions in our algorithm is polynomially bounded. After a finite number of steps, the final policy is near optimal in the PAC framework. The implementation requires no knowledge of the system and has lower computational complexity. Simulation studies confirm that it performs better than other similar PAC algorithms.
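The grid-based state aggregation can be sketched as follows. This is an illustrative fragment only: it maps a continuous state to a grid cell and keeps a Q table whose unvisited entries default to an optimistic upper value, so that a greedy policy is drawn toward untried actions. The near-upper Q operator of the paper is not reproduced here, and the names `cell_index`, `make_optimistic_q`, `greedy_action`, and `v_max` are assumptions.

```python
from collections import defaultdict


def cell_index(state, low, high, bins):
    """Map a continuous state to the index tuple of its grid cell."""
    idx = []
    for s, lo, hi in zip(state, low, high):
        i = int((s - lo) / (hi - lo) * bins)
        idx.append(min(max(i, 0), bins - 1))  # clamp states on or outside the bounds
    return tuple(idx)


def make_optimistic_q(v_max):
    """Q table keyed by (cell, action); unseen entries default to v_max."""
    return defaultdict(lambda: v_max)


def greedy_action(q, cell, actions):
    """Pick the action with the largest (possibly optimistic) Q value."""
    return max(actions, key=lambda a: q[(cell, a)])
```

Because unseen (cell, action) pairs start at `v_max`, the greedy choice explores them before settling on actions whose values have already been lowered by observed samples.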
Crosstalk is a primary defect affecting the image quality of stereoscopic three-dimensional (3-D) displays. Until now, crosstalk reduction methods have either required extra devices or needed tedious calibration procedures that demand precise measurement of each display device. We propose a new method of synthesizing images for lenticular 3-D displays, based on light field decomposition and optimization, to minimize crosstalk. The light field concept is introduced into lenticular 3-D display. Rays of the multiview light field are back-projected to the LCD plane to form a synthetic image with subpixel resolution. Each subpixel is assigned a weighted value that accounts for all arriving rays, which reduces crosstalk. We developed a new algorithm for ray merging and assignment that achieves a smooth fusion of different views and reduces crosstalk. Validation experiments demonstrate that the new method is capable of reducing the crosstalk on a synthetic graph. Compared with existing methods, the proposed method is simple and effective, and its implementation cost is low.
In this paper, we develop an integral reinforcement learning algorithm based on policy iteration to learn online the Nash equilibrium solution for a two-player zero-sum differential game with completely unknown linear continuous-time dynamics. This algorithm is a fully model-free method that solves the game algebraic Riccati equation forward in time. The developed algorithm updates the value function, the control policy, and the disturbance policy simultaneously. The algorithm is shown to be equivalent to Newton's method, from which its convergence follows. To implement this algorithm, one critic network and two action networks are used to approximate the game value function, the control policy, and the disturbance policy, respectively, and the least squares method is used to estimate the unknown parameters. The effectiveness of the developed scheme is demonstrated in simulation by designing an H-infinity state feedback controller for a power system. Note to Practitioners: The noncooperative zero-sum differential game provides an ideal tool for studying multiplayer optimal decision and control problems. Existing approaches usually compute the Nash equilibrium solution by means of offline iterative computation and require exact knowledge of the system dynamics. However, exact knowledge of the system dynamics is difficult to obtain for many real-world industrial systems. The algorithm developed in this paper is a fully model-free method that solves the zero-sum differential game problem forward in time by making use of online measured data. This method is not affected by errors between an identification model and the real system, and it responds quickly to changes in the system dynamics. Exploration signals are required to satisfy the persistence of excitation condition in order to update the value function and the policies, and these signals do not affect the convergence of the learning process. The least squares method is used to obtain the approximate solution for zero-sum games with unknown dynamics. The developed a
In this paper, we establish error bounds for adaptive dynamic programming algorithms that solve undiscounted infinite-horizon optimal control problems of discrete-time deterministic nonlinear systems. We consider approximation errors in the update equations of both the value function and the control policy. We utilize a new assumption in place of the contraction assumption used in discounted optimal control problems. We establish the error bounds for approximate value iteration based on a new error condition. Furthermore, we establish the error bounds for approximate policy iteration and approximate optimistic policy iteration algorithms. It is shown that the iterative approximate value function converges to a finite neighborhood of the optimal value function under some conditions. To implement the developed algorithms, critic and action neural networks are used to approximate the value function and the control policy, respectively. Finally, a simulation example is given to demonstrate the effectiveness of the developed algorithms.
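The effect of approximation errors on value iteration can be seen in a toy case. The sketch below assumes a scalar linear system x_{k+1} = a*x_k + b*u_k with undiscounted stage cost q*x^2 + r*u^2, for which the value function has the form V_i(x) = p_i * x^2 and the exact value iteration update reduces to a scalar Riccati recursion. The multiplicative error factor `(1 + eps)` is an assumption used only to mimic an approximation error; it is not the error condition of the paper, which treats nonlinear systems.

```python
def value_iteration(a, b, q, r, iters=50, eps=0.0):
    """Value iteration for V_i(x) = p_i * x**2, starting from p_0 = 0.

    Exact update:  p_{i+1} = q + a^2 * p_i * r / (r + b^2 * p_i),
    obtained by minimizing q*x^2 + r*u^2 + p_i*(a*x + b*u)^2 over u.
    Each iterate is scaled by (1 + eps) to imitate an approximation error.
    """
    p = 0.0
    for _ in range(iters):
        exact = q + (a * a * p * r) / (r + b * b * p)
        p = (1.0 + eps) * exact
    return p
```

For a = b = q = r = 1 and `eps = 0`, the iteration converges to the positive root of p^2 - p - 1 = 0. With a small positive `eps`, the iterates settle at a nearby but larger value, which illustrates convergence to a neighborhood of the optimal value function rather than to the optimum itself.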
Parallel-jaw grippers find wide application in various industrial sectors. In this paper, we focus on the problem of form closure caging grasps of polygons with a parallel-jaw gripper equipped with four fingers. The form closure caging grasp is helpful for finger placement and contact region selection in a pneumatic gripper, as it is less sensitive to finger misplacement. We first prove that there is always a path from a cage to a form closure grasp of the object that never breaks the cage, as long as the attractive region constructed in the configuration space has a local minimum. If such a minimum cannot be found, we further adjust the finger arrangement to produce the form closure grasp. We also develop an algorithm to compute the initial cage of the form closure grasp. Simulations of the grasping process demonstrate the validity of the analysis.
The optimal formation problem of multirobot systems is solved by a recurrent neural network in this paper. The desired formation is described by shape theory, which can generate a set of feasible formations that share the same relative relation among robots. Finding an optimal formation means selecting, from the feasible formation set, the formation that has the minimum distance to the initial formation of the multirobot system. The formation problem is thus transformed into an optimization problem. In addition, the orientation, scale, and admissible range of the formation can be included as constraints in the optimization problem. Furthermore, if all robots are identical, their positions in the system are exchangeable, so each robot does not necessarily have to move to one specific position in the formation. In this case, the optimal formation problem becomes a combinatorial optimization problem, whose optimal solution is very hard to obtain. Inspired by the penalty method, this combinatorial optimization problem can be approximately transformed into a convex optimization problem. Because the distance involves the Euclidean norm, the objective functions of these optimization problems are nonsmooth. To solve these nonsmooth optimization problems efficiently, a recurrent neural network approach is employed, owing to its parallel computation ability. Finally, simulations and experiments are given to validate the effectiveness and efficiency of the proposed optimal formation approach.
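The combinatorial nature of the identical-robot case can be shown with a brute-force sketch. It assumes a fixed target formation (no freedom in translation, orientation, or scale) and takes the cost to be the sum of Euclidean distances travelled, both of which are simplifications made for illustration. It enumerates every assignment, which is exactly the factorial-cost search that the penalty-based convex relaxation and recurrent neural network in the paper are meant to avoid; the function name is hypothetical.

```python
from itertools import permutations
from math import dist


def best_assignment(robots, targets):
    """Exhaustively assign identical robots to formation slots.

    Returns (perm, cost), where robot i moves to targets[perm[i]] and cost
    is the total Euclidean distance. Runs in O(n!) time, so it is only
    usable for a handful of robots.
    """
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(targets))):
        cost = sum(dist(robots[i], targets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost
```

For example, with robots at (0, 0) and (1, 0) and slots at (1, 0.1) and (0, 0.1), the search sends each robot to the slot directly above it rather than across the formation.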
In this study, a neural-network-based online learning algorithm is established to solve the finite horizon linear quadratic tracking (FHLQT) problem for partially unknown continuous-time systems. An augmented problem is constructed with an augmented state which consists of the system state and the reference trajectory. The authors obtain a solution for the augmented problem which is equivalent to the standard solution of the FHLQT problem. To solve the augmented problem with partially unknown system dynamics, they develop a time-varying Riccati equation. A critic neural network is used to approximate the value function and an online learning algorithm is established using the policy iteration technique to solve the time-varying Riccati equation. An integral policy iteration method and an online tuning law are used when the algorithm is implemented without the knowledge of the system drift dynamics and the command generator dynamics. A simulation example is given to show the effectiveness of the established algorithm.