This paper considers the value iteration algorithms of stochastic zero-sum linear quadratic games with unknown ***-policy and off-policy learning algorithms are developed to solve the stochastic zero-sum games, where the system dynamics is not *** analyzing the value function iterations, the convergence of the model-based algorithm is *** equivalence of several types of value iteration algorithms is *** effectiveness of model-free algorithms is demonstrated by a numerical example.
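To make the idea of value iteration for a zero-sum linear quadratic game concrete, the following is a minimal sketch only. It assumes a deterministic, scalar, model-based setting, which is a simplification and not the stochastic or model-free algorithms of the paper. The dynamics x+ = a*x + b*u + d*w, the stage cost q*x^2 + r*u^2 - gamma^2*w^2, and all names and numerical values are illustrative assumptions.

```python
def zero_sum_lq_value_iteration(a, b, d, q, r, gamma, iters=500, tol=1e-12):
    """Scalar value iteration for a deterministic zero-sum LQ game (illustrative).

    Assumed dynamics:   x+ = a*x + b*u + d*w
    Assumed stage cost: q*x^2 + r*u^2 - gamma^2*w^2
    The minimizer picks u, the maximizer picks w, and V_k(x) = p_k * x^2.
    """
    # Combined effect of both players on the scalar Riccati-type recursion.
    s = b * b / r - d * d / (gamma * gamma)
    p = 0.0
    for _ in range(iters):
        # The inner max over w is well posed only if d^2*p - gamma^2 < 0.
        if d * d * p - gamma * gamma >= 0.0:
            raise ValueError("saddle-point condition violated; increase gamma")
        p_next = q + a * a * p / (1.0 + s * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    # Saddle-point feedback gains for the final p: u = -k_u * x, w = k_w * x.
    k_u = p * b * a / (r * (1.0 + s * p))
    k_w = p * d * a / (gamma * gamma * (1.0 + s * p))
    return p, k_u, k_w


if __name__ == "__main__":
    p, k_u, k_w = zero_sum_lq_value_iteration(0.9, 1.0, 0.5, 1.0, 1.0, 2.0)
    print(p, k_u, k_w)
```

Starting from p = 0, the iterates increase toward a fixed point of the scalar game Riccati equation for these parameter values; a different choice of gamma may violate the saddle-point condition, in which case the sketch raises an error rather than returning a value.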
In this paper, the attitude tracking and load relief control problems against wind disturbances and uncertain aerodynamics as well as the engine thrust of launch vehicles are ***, a framework of Compensated Acceleration Feedback based Active Disturbance Rejection Control (CAF-ADRC) is established to achieve both desired attitude tracking and load relief performances. In particular, the total disturbance that includes the effects caused by both aerocoefficient perturbations and disturbances is estimated by constructing an Extended State Observer (ESO) to achieve attitude tracking. Furthermore, combined with the normal acceleration due to the engine thrust, the accelerometer measurement is also compensated to enhance the load relief ***, the quantitative analysis of the ESO and of the entire closed-loop system is studied. It can be concluded that the desired attitude tracking and load relief performances can be achieved simultaneously under the proposed approach. Besides, tuning laws of the proposed approach are systematically given, which are divided into ESO, Proportional Derivative (PD) and Compensated Acceleration Feedback (CAF) modules. Moreover, the performance under the CAF-ADRC approach can be better than that under the CAF-based PD (CAF-PD) approach by tuning load relief ***, the approach presented is applied to a typical control problem of launch vehicles with wind disturbances and parameter uncertainties.
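As a rough illustration of how an extended state observer can estimate a lumped "total disturbance", the sketch below simulates a linear ESO on a double-integrator plant with a constant unknown disturbance. This is not the CAF-ADRC design of the paper: the plant, the bandwidth-based gain choice (3w, 3w^2, w^3), the forward-Euler discretization, and every name and number are assumptions made for the example.

```python
def simulate_eso(f_true=2.0, b0=1.0, omega=20.0, dt=1e-3, steps=2000):
    """Estimate a constant total disturbance f_true with a linear ESO.

    Assumed plant:  x1' = x2,  x2' = f_true + b0*u,  y = x1
    Observer state: z1 ~ x1, z2 ~ x2, z3 ~ total disturbance.
    """
    # Bandwidth parameterization: all observer poles placed at -omega.
    beta1, beta2, beta3 = 3.0 * omega, 3.0 * omega ** 2, omega ** 3
    x1, x2 = 0.0, 0.0
    z1, z2, z3 = 0.0, 0.0, 0.0
    for _ in range(steps):
        u = 0.5          # fixed test input (assumption, no controller here)
        e = x1 - z1      # output estimation error
        # Forward-Euler update of the observer.
        z1, z2, z3 = (
            z1 + dt * (z2 + beta1 * e),
            z2 + dt * (z3 + b0 * u + beta2 * e),
            z3 + dt * (beta3 * e),
        )
        # Forward-Euler update of the plant.
        x1, x2 = x1 + dt * x2, x2 + dt * (f_true + b0 * u)
    return z3


if __name__ == "__main__":
    print(simulate_eso())  # estimate of the assumed disturbance f_true
```

In an ADRC loop the estimate z3 would typically be subtracted in the control law to cancel the disturbance; that step is omitted here to keep the observer itself in focus.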
We consider the sparse identification of multivariate ARX systems, i.e., the recovery of the zero elements of the unknown parameter matrix. We propose a two-step algorithm: in the first step, the stochastic gradient (SG) algorithm is applied to obtain initial estimates of the unknown parameter matrix; in the second step, an optimization criterion is introduced for the sparse identification of multivariate ARX systems. Under mild conditions, we prove that by minimizing the criterion function, the zero elements of the unknown parameter matrix can be recovered with a finite number of observations. The performance of the algorithm is verified through a simulation example.
PID (proportional-integral-derivative) control is recognized to be the most widely and successfully employed control strategy by ***, there are limited theoretical investigations explaining the rationale why PID can work so well when dealing with nonlinear uncertain *** paper continues the previous research towards establishing a theoretical foundation of PID control, by studying the regulation problem of PID control for nonaffine uncertain nonlinear stochastic *** be specific, a three-dimensional parameter set will be constructed explicitly based on some prior knowledge on bounds of partial derivatives of both the drift and diffusion *** will be shown that the closed-loop control system will achieve exponential stability in the mean square sense under PID control, whenever the controller parameters are chosen from the constructed parameter ***, similar results can also be obtained for PD (PI) control in some special cases. A numerical example will be provided to illustrate the theoretical results.
Tensor networks have been a powerful tool in simulating many-body physics and have recently gained recognition in the machine learning community due to their remarkable representation capabilities. However, using tensor networks to address the problem of clustering with an indeterminate number of clusters has yet to be explored.
Estimation and control problems with binary-valued observations exist widely in practical ***, most of the related works are devoted to finite impulse response (FIR) systems, and the theoretical problem of infinite impulse response (IIR) systems has been less *** study the estimation problems of IIR systems with binary-valued observations, the authors introduce a projected recursive estimation algorithm and analyse its global convergence properties, by using the stochastic Lyapunov function methods and the limit theory on double array *** is shown that the estimation algorithm has convergence results similar to those for FIR systems under a weakest possible non-persistent excitation ***, the upper bound for the accumulated regret of adaptive prediction is also established without resorting to any excitation condition.
This paper investigates positivity and stability problems of timescale-type delayed linear singular systems (LSSs). The existing results put an extremely strict constraint on the time-delay function. By introducing a novel function, this constraint is successfully removed, which generalizes the scope of the considered systems. Then, some necessary and sufficient criteria are proposed for the positivity of LSSs with bounded and infinite time-varying delays. Finally, the exponential (asymptotical) stability of LSSs with bounded (infinite) time-varying delays is analyzed. The derived results are also applicable to timescale-type differential-difference systems (DDSs). Compared with the existing stability criteria of DDSs with bounded time-varying delays, the strict limit on the parameter related to the convergence rate is eliminated. Hence, the conservatism of the existing results can be reduced. Moreover, when investigating stability of DDSs with infinite time-varying delays, this paper proposes a less conservative stability theorem. To illustrate the validity of the derived results, an example is presented regarding LSSs with bounded and infinite time-varying delays.
This paper considers a distributed resource allocation problem over time-varying networks. The objective of each agent in the network is to optimize the sum of separable convex functions subject to resource constraints by observing its local objective function and the information exchanged with its adjacent neighbors. Thus, the problem lies in a distributed framework. In existing literature dealing with similar problems, the measurement of the gradients/subgradients of the objective functions has been applied in the algorithm design. In this paper, by adding stochastic dithers to the local objective functions and constructing randomized differences, we propose a distributed gradient-free algorithm for solving the problem, and show that the algorithm is strongly convergent; that is, the estimates generated by each agent converge almost surely to the optimal resource allocation solution of the network. Finally, the effectiveness of the algorithm is validated by conducting numerical experiments.
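The idea of replacing gradient measurements with randomized differences can be sketched for a single agent as follows. This is only an illustration under stated assumptions, not the distributed algorithm of the paper: there is no network and no resource constraint, function evaluations are noise-free, the dither entries are random signs, and the step sizes a_k = 1/(k+1) and c_k = k^(-1/4) as well as all function names are choices made for the example.

```python
import random


def randomized_difference_gradient(f, x, c, rng):
    """Gradient surrogate from two function values along a random dither."""
    # Random sign dither: each component is +1 or -1 with equal probability.
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + c * di for xi, di in zip(x, delta)]
    x_minus = [xi - c * di for xi, di in zip(x, delta)]
    diff = (f(x_plus) - f(x_minus)) / (2.0 * c)
    # Divide the common difference by each dither component.
    return [diff / di for di in delta]


def gradient_free_minimize(f, x0, steps=2000, seed=0):
    """Stochastic-approximation descent using only function evaluations."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, steps + 1):
        a_k = 1.0 / (k + 1)      # decreasing step size (assumed schedule)
        c_k = 1.0 / k ** 0.25    # decreasing dither magnitude (assumed)
        g = randomized_difference_gradient(f, x, c_k, rng)
        x = [xi - a_k * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    # Hypothetical convex objective with minimizer (1, -2).
    objective = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
    print(gradient_free_minimize(objective, [0.0, 0.0]))
```

Only two function evaluations are needed per iteration regardless of the dimension of x, which is the practical appeal of randomized differences when gradients are unavailable.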
This paper studies the consensus of switched multi-agent systems (MAS) with binary-valued communications. Different from the existing studies on switched MAS considering precise observations, each agent studied in this research only receives binary-valued information with stochastic noises from its neighbors' states. Further, unlike the existing studies on MAS with binary-valued information in a fixed topology, in this paper, we consider jointly connected undirected graphs, each of which switches with non-zero probability. The consensus algorithm consists of two stages: first, the connected agents employ a recursive projection algorithm to estimate their neighbors' states based on the binary-valued communications; second, the control law of the connected agents is developed based on the estimations to upgrade their *** is proved that both the speed of the estimation convergence to the real states and the consensus speed of the states can achieve O(1/t) when the iteration step is given a proper value. Furthermore, the results indicate that the larger the lowest probability with which a graph emerges, the more easily the consensus can be achieved. Finally, a simulation is presented to demonstrate the theoretical analysis.
This paper studies a dynamical system that models the free recall dynamics of working *** model is an attractor neural network with n modules, named hypercolumns, and each module consists of m *** mild conditions on the connection weights between minicolumns, the authors investigate the long-term evolution behavior of the model, namely the existence and stability of equilibria and limit *** authors also give a critical value in which Hopf bifurcation ***, the authors give a sufficient condition under which this model has a globally asymptotically stable equilibrium consisting of synchronized minicolumn states in each hypercolumn, which implies that in this case recalling is *** simulations are provided to illustrate the proposed theoretical ***, a numerical example the authors give suggests that patterns can be stored in not only equilibria and limit cycles, but also strange attractors (or chaos).