We consider word-of-mouth social learning involving m Kalman filter agents that operate sequentially. The first Kalman filter receives the raw observations, while each subsequent Kalman filter receives a noisy measure...
Nonlinear differential equations are encountered as models of fluid flow, spiking neurons, and many other systems of interest in the real world. Common features of these systems are that their behaviors are difficult ...
In this paper, we propose a game theory framework to solve advanced persistent threat problems, especially considering two types of insider threats: malicious and *** this framework, we establish a unified three-player game model and derive Nash equilibria in response to different types of insider *** analyzing these Nash equilibria, we provide quantitative solutions to advanced persistent threat problems pertaining to insider ***, we have conducted a comparative assessment of the optimal defense strategy and corresponding defender's costs between two types of insider ***, our findings advocate a more proactive defense strategy against inadvertent insider threats in contrast to malicious ones, despite the latter imposing a higher burden on the *** theoretical results are substantiated by numerical results, which additionally include a detailed exploration of the conditions under which different insiders adopt risky *** conditions can serve as guiding indicators for the defender when calibrating their monitoring intensities and devising defensive strategies.
Offline safe reinforcement learning (RL) aims to train a constraint satisfaction policy from a fixed dataset. Current state-of-the-art approaches are based on supervised learning with a conditioned policy. However, these approaches fall short in real-world applications that involve complex tasks with rich temporal and logical structures. In this paper, we propose temporal logic Specification-conditioned Decision Transformer (SDT), a novel framework that harnesses the expressive power of signal temporal logic (STL) to specify complex temporal rules that an agent should follow and the sequential modeling capability of Decision Transformer (DT). Empirical evaluations on the DSRL benchmarks demonstrate the better capacity of SDT in learning safe and high-reward policies compared with existing approaches. In addition, SDT shows good alignment with respect to different desired degrees of satisfaction of the STL specification that it is conditioned on. Copyright 2024 by the author(s)
Dear Editor, This letter is concerned with the prescribed-time Nash equilibrium (PTNE) seeking problem in a pursuit-evasion game (PEG) involving agents with second-order *** order to achieve the prior-given and user-defined convergence time for the PEG, a PTNE seeking algorithm has been developed to facilitate collaboration among multiple pursuers for capturing the evader without the need for any global ***, it is theoretically proved that the prescribed-time convergence of the designed algorithm for achieving Nash equilibrium of ***, the effectiveness of the PTNE method was validated by numerical simulation results. A PEG consists of two groups of agents: evaders and *** pursuers aim to capture the evaders through cooperative efforts, while the evaders strive to evade *** is a classic noncooperative *** has attracted plenty of attention due to its wide application scenarios, such as smart grids [1], formation control [2], [3], and spacecraft rendezvous [4]. It is noteworthy that most previous research on seeking the Nash equilibrium of the game, where no agent has an incentive to change its actions, has focused on asymptotic and exponential convergence [5]-[7].
This paper studies the formation of final opinions for the Friedkin-Johnsen (FJ) model with a community of partially stubborn agents. The underlying network of the FJ model is symmetric and generated from a random gra...
Controlling an active distribution network (ADN) from a single PCC has been advantageous for improving the performance of coordinated Intermittent RESs (IRESs). Recent studies have proposed a constant PQ regulation approach at the PCC of ADNs using coordination of non-MPPT based ***, due to the intermittent nature of DGs coupled with PCC through uni-directional broadcast communication, the PCC becomes vulnerable to transient *** address this challenge, this study first presents a detailed mathematical model of an ADN from the perspective of PCC regulation to realize rigidness of PCC against ***, an H∞ controller is formulated and employed to achieve optimal performance against disturbances, consequently ensuring the least oscillations during transients at ***, an eigenvalue analysis is presented to analyze convergence speed limitations of the newly derived system ***, simulation results show the proposed method offers superior performance as compared to the state-of-the-art methods.
Formal verification of safety properties is critical in many application areas. In this paper a survey of the most common and efficient methods is given. The different methods are compared for some typical scalable ex...
Learning algorithms have become an integral component to modern engineering solutions. Examples range from self-driving cars and recommender systems to finance and even critical infrastructure, many of which are typically under the purview of control theory. While these algorithms have already shown tremendous promise in certain applications [1], there are considerable challenges, in particular, with respect to guaranteeing safety and gauging fundamental limits of operation. Thus, as we integrate tools from machine learning into our systems, we also require an integrated theoretical understanding of how they operate in the presence of dynamic and system-theoretic phenomena. Over the past few years, intense efforts toward this goal - an integrated theoretical understanding of learning, dynamics, and control - have been made. While much work remains to be done, a relatively clear and complete picture has begun to emerge for (fully observed) linear dynamical systems. These systems already allow for reasoning about concrete failure modes, thus helping to indicate a path forward. Moreover, while simple at a glance, these systems can be challenging to analyze. Recently, a host of methods from learning theory and high-dimensional statistics, not typically in the control-theoretic toolbox, have been introduced to our community. This tutorial survey serves as an introduction to these results for learning in the context of unknown linear dynamical systems (see 'Summary'). We review the current state of the art and emphasize which tools are needed to arrive at these results. Our focus is on characterizing the sample efficiency and fundamental limits of learning algorithms. Along the way, we also delineate a number of open problems. More concretely, this article is structured as follows. We begin by revisiting recent advances in the finite-sample analysis of system identification. Next, we discuss how these finite-sample bounds can be used downstream to give guaranteed performa
We introduce a novel differentially private algorithm for online federated learning that employs temporally correlated noise to enhance utility while ensuring privacy of continuously released models. To address challe...