Search Results - Inner Mongolia University Library

On-Policy and Off-Policy Value Iteration Algorithms for Stochastic Zero-Sum Dynamic Games

Journal of Systems Science & Complexity, 2025, Vol. 38, No. 1, pp. 421-435

Authors: GUO Liangyuan, WANG Bing-Chang, ZHANG Ji-Feng. Affiliations: School of Control Science and Engineering, Shandong University, Jinan 250063, China; School of Automation and Electrical Engineering, Zhongyuan University of Technology, Zhengzhou 450007, China; Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

This paper considers the value iteration algorithms of stochastic zero-sum linear quadratic games with unknown ***-policy and off-policy learning algorithms are developed to solve the stochastic zero-sum games, where the system dynamics is not *** analyzing the value function iterations, the convergence of the model-based algorithm is *** equivalence of several types of value iteration algorithms is *** effectiveness of model-free algorithms is demonstrated by a numerical example.

Keywords: Approximate dynamic programming; on-policy; off-policy; stochastic zero-sum games; value iteration

Compensated acceleration feedback based active disturbance rejection control for launch vehicles

Chinese Journal of Aeronautics, 2024, Vol. 37, No. 4, pp. 464-478

Authors: Xiaoyan ZHANG, Wenchao XUE, Zibo LIU, Ran ZHANG, Huifeng LI. Affiliations: Key Laboratory of Systems and Control, Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; School of Astronautics, Beihang University, Beijing 100191, China

In this paper, the attitude tracking and load relief control problems against wind disturbances and uncertain aerodynamics as well as the engine thrust of launch vehicles are ***, a framework of Compensated Acceleration Feedback based Active Disturbance Rejection Control (CAF-ADRC) is established to achieve both desired attitude tracking and load relief performances. In particular, the total disturbance that includes the effects caused by both aerocoefficient perturbations and disturbances is estimated by constructing an Extended State Observer (ESO) to achieve attitude tracking. Furthermore, combined with the normal acceleration due to the engine thrust, the accelerometer measurement is also compensated to enhance the load relief ***, the quantitative analysis of ESO and the entire closed-loop system are studied. It can be concluded that the desired attitude tracking and load relief performances can be achieved simultaneously under the proposed approach. Besides, tuning laws of the proposed approach are systematically given, which are divided into ESO, Proportional Derivative (PD) and Compensated Acceleration Feedback (CAF) modules. Moreover, the performances under the CAF-ADRC approach can be better than those under the CAF based PD (CAF-PD) approach by tuning load relief ***, the approach presented is applied to a typical control problem of launch vehicles with wind disturbances and parameter uncertainties.

Keywords: Launch vehicles; Uncertainty analysis; Active disturbance rejection control (ADRC); Load relief control; Extended state observer (ESO)

A stochastic gradient-based two-step sparse identification algorithm for multivariate ARX systems

Control Theory and Technology, 2024, Vol. 22, No. 2, pp. 213-221

Authors: Yanxin Fu, Wenxiao Zhao. Affiliations: Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China

We consider the sparse identification of multivariate ARX systems, i.e., recovering the zero elements of the unknown parameter matrix. We propose a two-step algorithm: in the first step, the stochastic gradient (SG) algorithm is applied to obtain initial estimates of the unknown parameter matrix, and in the second step, an optimization criterion is introduced for the sparse identification of multivariate ARX systems. Under mild conditions, we prove that by minimizing the criterion function, the zero elements of the unknown parameter matrix can be recovered with a finite number of observations. The performance of the algorithm is verified through a simulation example.

Keywords: ARX system; Stochastic gradient algorithm; Sparse identification; Support recovery; Parameter estimation; Strong consistency

On PID Control Theory for Nonaffine Uncertain Stochastic Systems

Journal of Systems Science & Complexity, 2023, Vol. 36, No. 1, pp. 165-186

Authors: ZHANG Jinke, ZHAO Cheng, GUO Lei. Affiliations: Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematical Science, University of Chinese Academy of Sciences, Beijing 100049, China

PID (proportional-integral-derivative) control is recognized to be the most widely and successfully employed control strategy by ***, there are limited theoretical investigations explaining the rationale why PID can work so well when dealing with nonlinear uncertain *** paper continues the previous researches towards establishing a theoretical foundation of PID control, by studying the regulation problem of PID control for nonaffine uncertain nonlinear stochastic *** be specific, a three-dimensional parameter set will be constructed explicitly based on some prior knowledge on bounds of partial derivatives of both the drift and diffusion *** will be shown that the closed-loop control system will achieve exponential stability in the mean square sense under PID control, whenever the controller parameters are chosen from the constructed parameter ***, similar results can also be obtained for PD (PI) control in some special cases. A numerical example will be provided to illustrate the theoretical results.

Keywords: Asymptotically regulation; global stability; nonaffine; PID control; stochastic systems; uncertain structure

Density peak clustering using tensor network

Science China (Information Sciences), 2024, Vol. 67, No. 3, pp. 321-322

Authors: Xiao SHI, Yun SHANG. Affiliations: Institute of Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; School of Mathematical Sciences, University of Chinese Academy of Sciences; National Center for Mathematics and Interdisciplinary Sciences; Key Laboratory of Management, Decision and Information Systems, Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Tensor networks have been a powerful tool in simulating many-body physics and have recently gained recognition in the machine learning community due to their remarkable representation capabilities. However, using tens...

Keywords: Tensors

Estimation of IIR Systems with Binary-Valued Observations

Chinese Annals of Mathematics, Series B, 2023, Vol. 44, No. 5, pp. 687-702

Authors: Ruifen DAI, Lei GUO. Affiliations: Data Science Institute, Shandong University, Jinan 250100, China; Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

Estimation and control problems with binary-valued observations exist widely in practical ***, most of the related works are devoted to finite impulse response (FIR for short) systems, and the theoretical problem of infinite impulse response (IIR for short) systems has been less *** study the estimation problems of IIR systems with binary-valued observations, the authors introduce a projected recursive estimation algorithm and analyse its global convergence properties, by using the stochastic Lyapunov function methods and the limit theory on double array *** is shown that the estimation algorithm has similar convergence results as those for FIR systems under a weakest possible non-persistent excitation ***, the upper bound for the accumulated regret of adaptive prediction is also established without resorting to any excitation condition.

Keywords: Binary-valued observations; Infinite impulse response; Adaptive estimation; Double array martingales; Adaptive prediction

Positivity and stability of timescale-type linear singular systems with time delays

Science China (Information Sciences), 2022, Vol. 65, No. 12, pp. 170-185

Authors: Xiaodong LU, Haitao LI, Xianfu ZHANG. Affiliations: School of Mathematics and Statistics, Shandong Normal University; Key Laboratory of Systems and Control, Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; School of Control Science and Engineering, Shandong University

This paper investigates positivity and stability problems of timescale-type delayed linear singular systems (LSSs). The existing results put an extremely strict constraint on the time-delay function. By introducing a novel function, this constraint is successfully removed, which broadens the scope of the considered systems. Then, some necessary and sufficient criteria are proposed for the positivity of LSSs with bounded and infinite time-varying delays. Finally, the exponential (asymptotic) stability of LSSs with bounded (infinite) time-varying delays is analyzed. The derived results are also applicable to timescale-type differential-difference systems (DDSs). Compared with the existing stability criteria of DDSs with bounded time-varying delays, the strict limit on the parameter related to the convergence rate is eliminated. Hence, the conservatism of the existing results can be reduced. Moreover, when investigating stability of DDSs with infinite time-varying delays, this paper proposes a less conservative stability theorem. To illustrate the validity of the derived results, an example is presented regarding LSSs with bounded and infinite time-varying delays.

Keywords: positivity; stability; time delay; linear singular systems; timescale-type systems

Randomized difference-based gradient-free algorithm for distributed resource allocation

Science China (Information Sciences), 2022, Vol. 65, No. 4, pp. 212-228

Authors: Xiaoxue GENG, Wenxiao ZHAO. Affiliations: Key Laboratory of Systems and Control, Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; School of Mathematical Sciences, University of Chinese Academy of Sciences

This paper considers a distributed resource allocation problem over time-varying networks. The objective of each agent in the network is to optimize the sum of separable convex functions subject to resource constraints by observing its local objective function and the information exchanged with its adjacent neighbors. Thus, the problem lies in a distributed framework. In existing literature dealing with similar problems, the measurement of the gradients/subgradients of the objective functions has been applied in the algorithm design. In this paper, by adding stochastic dithers to the local objective functions and constructing randomized differences, we propose a distributed gradient-free algorithm for solving the problem, and show that the algorithm is strongly convergent; that is, the estimates generated by each agent converge almost surely to the optimal resource allocation solution of the network. Finally, the effectiveness of the algorithm is validated by conducting numerical experiments.

Keywords: resource allocation; distributed algorithm; randomized difference
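
The randomized-difference idea mentioned in the abstract above can be illustrated with a minimal single-agent sketch. This is not the paper's distributed algorithm: the function names, the +/-1 dither distribution, the step size `c`, and the test function `sphere` are all assumptions chosen for illustration. The sketch shows only how a gradient can be estimated from two function evaluations, without measuring the gradient itself.

```python
import random


def randomized_difference_gradient(f, x, c, rng):
    """Return one randomized-difference estimate of the gradient of f at x.

    Hypothetical helper: the +/-1 dither and the symmetric difference
    are illustrative assumptions, not taken from the paper.
    """
    # Draw a random perturbation direction with independent +/-1 entries.
    delta = [rng.choice((-1.0, 1.0)) for _ in x]
    x_plus = [xi + c * di for xi, di in zip(x, delta)]
    x_minus = [xi - c * di for xi, di in zip(x, delta)]
    # Symmetric difference of two function values along the random direction.
    diff = (f(x_plus) - f(x_minus)) / (2.0 * c)
    # Dividing by each dither component yields a full gradient estimate
    # from only two function evaluations, whatever the dimension.
    return [diff / di for di in delta]


def sphere(x):
    """Assumed test objective: f(x) = sum of squares, gradient 2x."""
    return sum(xi * xi for xi in x)


rng = random.Random(0)
x = [1.0, -2.0, 0.5]
n = 20000
avg = [0.0] * len(x)
for _ in range(n):
    g = randomized_difference_gradient(sphere, x, 0.1, rng)
    avg = [a + gi / n for a, gi in zip(avg, g)]

# The average of many estimates approaches the true gradient [2.0, -4.0, 1.0].
print(avg)
```

A single estimate is noisy in more than one dimension, so a gradient-free method of this kind typically relies on decreasing step sizes or averaging to reach convergence.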

Consensus of switched multi-agent systems with binary-valued communications

Science China (Information Sciences), 2022, Vol. 65, No. 6, pp. 189-202

Authors: Min HU, Ting WANG, Yanlong ZHAO. Affiliations: The Key Laboratory of Systems and Control, Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; School of Mathematical Sciences, University of Chinese Academy of Sciences

This paper studies the consensus of switched multi-agent systems (MAS) with binary-valued communications. Different from the existing studies on switched MAS considering precise observations, each agent studied in this research only receives binary-valued information with stochastic noises from its neighbors' states. Further, unlike the existing studies on MAS with binary-valued information in a fixed topology, in this paper, we consider the jointly connected undirected graphs, each of which switches with non-zero probability. The consensus algorithm comprises two stages: first, the connected agents employ a recursive projection algorithm to estimate their neighbors' states based on the binary-valued communications; second, the control law of the connected agents is developed based on the estimations to upgrade their *** is proved that both the speed of the estimation convergence to the real states and the consensus speed of the states can achieve O(1/t) when the iteration step is given a proper value. Furthermore, the results indicate that the larger the value of the lowest probability that a graph emerges with, the more easily the consensus could be achieved. Finally, a simulation is presented to demonstrate the theoretical analysis.

Keywords: binary-valued system; switched multi-agent system; recursive projection algorithm; consensus; jointly connected undirected topologies

Steady State Behavior of the Free Recall Dynamics of Working Memory

Journal of Systems Science & Complexity, 2024, Vol. 37, No. 6, pp. 2424-2450

Authors: LI Tianhao, LIU Zhixin, LIU Lizheng, HU Xiaoming. Affiliations: Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100190, China; Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; Optimization and Systems Theory, KTH Royal Institute of Technology, Stockholm 10044, Sweden

This paper studies a dynamical system that models the free recall dynamics of working *** model is an attractor neural network with n modules, named hypercolumns, and each module consists of m *** mild conditions on the connection weights between minicolumns, the authors investigate the long-term evolution behavior of the model, namely the existence and stability of equilibria and limit *** authors also give a critical value in which Hopf bifurcation ***, the authors give a sufficient condition under which this model has a globally asymptotically stable equilibrium consisting of synchronized minicolumn states in each hypercolumn, which implies that in this case recalling is *** simulations are provided to illustrate the proposed theoretical ***, a numerical example the authors give suggests that patterns can be stored in not only equilibria and limit cycles, but also strange attractors (or chaos).

Keywords: Asymptotic stability; bifurcation; free recall; strange attractor; working memory
