Search Results - Inner Mongolia University Library

On-Policy and Off-Policy Value Iteration Algorithms for Stochastic Zero-Sum Dynamic Games

Journal of Systems Science & Complexity, 2025, Vol. 38, No. 1, pp. 421-435

Authors: GUO Liangyuan, WANG Bing-Chang, ZHANG Ji-Feng. Affiliations: School of Control Science and Engineering, Shandong University, Jinan 250063, China; School of Automation and Electrical Engineering, Zhongyuan University of Technology, Zhengzhou 450007, China; Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

This paper considers the value iteration algorithms of stochastic zero-sum linear quadratic games with unknown ***-policy and off-policy learning algorithms are developed to solve the stochastic zero-sum games, where the system dynamics is not *** analyzing the value function iterations, the convergence of the model-based algorithm is *** equivalence of several types of value iteration algorithms is *** effectiveness of model-free algorithms is demonstrated by a numerical example.

Keywords: Approximate dynamic programming; on-policy; off-policy; stochastic zero-sum games; value iteration

Multi-step state-based opacity for unambiguous weighted machines

Science China (Information Sciences), 2024, Vol. 67, No. 11, pp. 211-221

Authors: Zhipeng ZHANG, Chengyi XIA, Guoyuan QI, Jun FU. Affiliations: School of Artificial Intelligence, Tiangong University; School of Control Science and Engineering, Tiangong University; State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University

Opacity is a central concept in the issue of privacy security and has been studied extensively in fields such as finite automata, probabilistic automata, and stochastic automata. Here, we investigate the problem of validating multi-step opaque properties through unambiguous weighted machines from the perspective of cyber-physical systems. First, the notion of multi-step state-based opacity for unambiguous weighted machines is presented and defined. It includes two variants of delays with a finite K and infinite steps. Subsequently, the weighted state estimate with K(infinite)-step delay is established by abstracting the possible state set that the system could have through these weighted observations. Meanwhile, to keep the observable weighted sequence consistent between the bidirectional observers, the unobserved weights of the reverse weighted machine are assumed to be reserved. Subsequently, the existence conditions are developed, and the corresponding algorithms, termed the weighted bidirectional observer, are generalized to verify these properties. Finally, several numerical examples are illustrated to demonstrate the effectiveness of the proposed method. Taken together, the current approach will be conducive to a deep understanding of the security and privacy of cyber-physical systems.

Keywords: logical dynamical systems; weighted state machine; state estimation; opacity; cyber physical systems

Multi-UAV Energy-Efficient Detection Coverage Under Jamming Environment: A Hierarchical Collaborative Learning Approach

IEEE Transactions on Vehicular Technology, 2025, Vol. 74, No. 5, pp. 7351-7363

Authors: Fang, Chao; Feng, Yanxiang; Li, Xiaoling; Yang, Yikang. Affiliations: Xi'an Jiaotong University, School of Automation Science and Engineering, Xi'an 710049, China; Chang'an University, School of Electronics and Control Engineering, Xi'an 710064, China

Unmanned aerial vehicles (UAVs) can provide detection coverage service in many scenarios. The fair coverage is achieved by designing carefully UAVs' trajectories, which are established at each step by choosing the moving action and channel access for transmitting data. However, under the jamming environment, the problematic trajectories could lead to mutual interference and malicious jamming, such that the coverage service fails. The selections of moving action and channel access are usually coupled, and so far no work has addressed them jointly for planning trajectory. As such, this paper investigates the multi-UAV joint optimization of moving action and channel access for the energy-efficient detection coverage. To decouple the strongly-coupled moving action and channel access, we model the studied optimization problem as a hierarchical game, where the stochastic game (resp. potential game) is applied for selecting moving action (resp. channel access). Then, we propose a multi-agent hierarchical cooperative learning (MAHCL) algorithm to attain near-optimal solution for the joint optimization. It is proved that the proposed MAHCL algorithm can asymptotically converge to the near-optimal joint strategy with lower computational complexity. Finally, the simulation results show the higher energy efficiency of MAHCL algorithm compared with the benchmarks. © 1967-2012 IEEE.

Keywords: Stochastic systems

Event-triggered predefined-time control for full-state constrained nonlinear systems: A novel command filtering error compensation method

Science China (Technological Sciences), 2024, Vol. 67, No. 9, pp. 2867-2880

Authors: PAN YingNan, CHEN YiLin, LIANG HongJing. Affiliations: College of Control Science and Engineering, Bohai University, Jinzhou 121013, China; School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Laboratory of Electromagnetic Space Cognition and Intelligent Control, Beijing 100089, China

In this paper, a command filter-based adaptive fuzzy predefined-time event-triggered tracking control problem is investigated for uncertain nonlinear systems with time-varying full-state constraints. By designing a sliding mode differentiator, the inherent computational complexity problem within the predefined-time backstepping framework is solved. Unlike existing command filter-based finite-time and fixed-time control strategies, in which the convergence time of the filtering error is adjusted through the system initial value or numerous parameters, a novel command filtering error compensation method is presented, which tunes one control parameter to make the filtering error converge in the predefined time, thereby reducing the complexity of design and analysis of processing the filtering error. Then, an improved event-triggered mechanism (ETM) that builds upon the switching threshold strategy, in which an inverse cotangent function is designed to replace the residual term of the ETM, is proposed to gradually release the controller's dependence on the residual term with increasing time. Furthermore, a tan-type nonlinear mapping technique is applied to tackle the time-varying full-state constraints problem. By the predefined-time stability theory, all signals in the uncertain nonlinear systems exhibit predefined-time stability. Finally, the feasibility of the proposed algorithm is substantiated through two simulation results.

Keywords: predefined-time control; command filtering error compensation method; event-triggered mechanism; time-varying full-state constraints; uncertain nonlinear systems

Coalition formation problem: a capability-centric analysis and general model

Science China (Information Sciences), 2024, Vol. 67, No. 11, pp. 180-193

Authors: Jie CHEN, Miao GUO, Bin XIN, Qing WANG, Shengyu LU, Yipeng WANG, Yulong DING. Affiliations: School of Automation, Beijing Institute of Technology; National Key Lab of Autonomous Intelligent Unmanned Systems; Department of Control Science and Engineering, Tongji University

Coalition formation (CF) refers to reasonably organizing robots and/or humans to form coalitions that can satisfy mission requirements, attracting more and more attention in many fields such as multirobot collaboration and human-robot collaboration. However, the analysis on CF problems remains *** provide a valuable study reference for researchers interested in CF, this paper proposed a capability-centric analysis of the CF problem. The key problem elements of CF are firstly extracted by referencing the concepts of the 5W1H method. That is, objects (who) form coalitions (what) to accomplish missions (why) by aggregating capabilities (how) in a specific environment (where-when). Then, a multi-view analysis of these elements and their correlation in terms of capabilities is proposed through various logic diagrams, structure charts, etc. Finally, to facilitate a deeper understanding of capability-centric CF, a general mathematical model is constructed, demonstrating how the different concepts discussed in this analysis contribute to the overall model.

Keywords: coalition formation; capability aggregation; capability metric; mission requirement; environmental effect

An enhanced stochastic error modeling using multi-Gauss–Markov processes for GNSS/INS integration system

Journal of Engineering and Applied Science, 2024, Vol. 71, No. 1, p. 186

Authors: Wu, Youlong; Chen, Shuai. Affiliations: School of Intelligent Science and Control Engineering, Jinling Institute of Technology, Nanjing, China; School of Automation, Nanjing University of Science and Technology, Nanjing, China

Angular random walk (ARW), rate random walk (RRW), and bias instability (BI) are the main noise types in inertial measurement units (IMUs) and thus determine the navigation performance of IMUs. BI is the flicker noise, which determines the noise level of an inertial sensor. The traditional error modeling approach involves modeling the ARW and BI processes as RRW or Gauss–Markov (GM) processes, and this approach is applied as a suboptimal filter in the global navigation satellite system (GNSS)/inertial navigation system (INS) extended Kalman filter (EKF). In this paper, the random error identification processes for white noise and colored noise for inertial sensors are separated using the Allan variance and power spectral density methods and the equivalence of the stochastic process differential equations of bias instability and a combination of multiple first-order GM processes are derived. A colored noise compensation method is proposed based on the enhanced EKF model. Experimental results demonstrate that, compared to traditional error models, our proposed model reduces positional drift error in dynamic testing from 195 to 49 m, enhancing positional accuracy by 40.2%. These findings confirm the potential and superiority of our method in complex navigation environments. © The Author(s) 2024.

Keywords: Markov processes

Gait Recognition Under Different Clothing Conditions Via Deterministic Learning

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, No. 6, pp. 1530-1532

Authors: Muqing Deng, Cong Wang. Affiliations: School of Automation, Guangdong University of Technology, Guangzhou 510006, China; School of Control Science and Engineering, Shandong University, Jinan 250100, China

Dear Editor, This letter deals with the robustness problem of gait recognition method against maximum number of clothing *** selecting four kinds of time-varying silhouette features, gait dynamics underlying different individuals' gait features is effectively modeled by radial basis function (RBF) neural networks through deterministic *** kind of dynamics information has little sensitivity to the variance between gait patterns under different clothing *** order to eliminate the effect of clothing differences, the training patterns under different clothing conditions further constitute a uniform training dataset, containing all kinds of gait dynamics under different clothing conditions. A rapid recognition scheme is presented on published gait *** experiments demonstrate the efficacy of the proposed method.

Keywords: networks neural individual

Reinforcement learning-based unknown reference tracking control of HMASs with nonidentical communication delays

Science China (Information Sciences), 2023, Vol. 66, No. 7, pp. 46-57

Authors: Yong XU, Zheng-Guang WU, Wei-Wei CHE, Deyuan MENG. Affiliations: School of Automation, Beijing Institute of Technology; Institute of Cyber-Systems and Control, Zhejiang University; College of Mathematics and Computer Science, Zhejiang Normal University; Department of Automation, Qingdao University; School of Automation Science and Electrical Engineering, Beihang University (BUAA)

This paper focuses on the optimal output synchronization control problem of heterogeneous multiagent systems (HMASs) subject to nonidentical communication delays by a reinforcement learning *** with existing studies assuming that the precise model of the leader is globally or distributively accessible to all or some of the followers, the leader's precise dynamical model is entirely inaccessible to all the followers in this paper. A data-based learning algorithm is first proposed to reconstruct the leader's unknown system matrix online. A distributed predictor subject to communication delays is further devised to estimate the leader's state, where interaction delays are allowed to be nonidentical. Then, a learning-based local controller, together with a discounted performance function, is projected to reach the optimal output synchronization. Bellman equations and game algebraic Riccati equations are constructed to learn the optimal solution by developing a model-based reinforcement learning (RL) algorithm online without solving regulator equations, which is followed by a model-free off-policy RL algorithm to relax the requirement of all agents' dynamics faced by the model-based RL algorithm. The optimal tracking control of HMASs subject to unknown leader dynamics and communication delays is shown to be solvable under the proposed RL algorithms. Finally, the effectiveness of theoretical analysis is verified by numerical simulations.

Keywords: heterogeneous multiagent systems; HMAS; reinforcement learning; RL; optimal output synchronization; communication delays

Constrained Networked Predictive Control for Nonlinear Systems Using a High-Order Fully Actuated System Approach

IEEE/CAA Journal of Automatica Sinica, 2025, Vol. 12, No. 2, pp. 478-480

Authors: Yi Huang, Guo-Ping Liu, Yi Yu, Wenshan Hu. Affiliations: the School of Electrical Engineering and Automation, Wuhan University; IEEE; the Center for Control Science and Technology, Southern University of Science and Technology; the Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University

Dear Editor, In this letter, a constrained networked predictive control strategy is proposed for the optimal control problem of complex nonlinear high-order fully actuated (HOFA) systems with noises. The method can effectively deal with nonlinearities, constraints, and noises in the system, optimize the performance metric, and present an upper bound on the stable output of the system.

Model-Based Online Adaptive Inverse Noncooperative Linear-Quadratic Differential Games via Finite-Time Concurrent Learning

IEEE Transactions on Artificial Intelligence, 2024, Vol. 5, No. 8, pp. 4247-4257

Authors: Lin, Jie; Wu, Huai-Ning. Affiliations: Beihang University, School of Automation Science and Electrical Engineering, Beijing 100191, China; Beihang University, Science and Technology on Aircraft Control Laboratory, School of Automation Science and Electrical Engineering, Beijing 100191, China; Peng Cheng Laboratory, Shenzhen 518066, China

Noncooperative differential games provide a basis for the study of coordination, conflict, and control for a single dynamical system with multiple players. Within the linear-quadratic differential games (LQDGs), the optimal feedback gain matrix and the weighting matrices of individual cost function depict each player's control policy and the tradeoff of various objectives, respectively. In this article, we investigate the inverse problem of LQDG with partial state observation via a model-based online method, i.e., recover the cost function of each player. First, a state observer is used to estimate the system state. Then, the feedback gain matrices of players are learned using a finite-time concurrent learning (FTCL)-based adaptive law relaxing the persistent excitation (PE) condition which is required in traditional adaptive estimation methods. With the learned feedback gain matrices, a semidefinite programming (SDP) problem with the quadratic objective function can be set up for determining the weighting matrices of the cost function. The applicability and effectiveness of the proposed method are demonstrated with a numerical example and a shared steering control simulation. © 2020 IEEE.

Keywords: Cost functions
