A model-based offline policy iteration (PI) algorithm and a model-free online Q-learning algorithm are proposed for solving fully cooperative linear quadratic dynamic games. The PI-based adaptive Q-learning method can learn the feedback Nash equilibrium online from state samples generated by behavior policies, without querying the system model. Unlike existing Q-learning methods, this Q-learning algorithm executes both policy evaluation and policy improvement in an adaptive manner. We prove the convergence of the offline PI algorithm by proving its equivalence to Newton's method for solving the game algebraic Riccati equation (GARE). Furthermore, we prove that the proposed Q-learning method converges to the Nash equilibrium under a small learning rate if certain persistence of excitation conditions are satisfied, which can be easily met by suitable behavior policies. Simulation results demonstrate the good performance of the proposed online adaptive Q-learning algorithm.
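The link between policy iteration and Newton's method on a Riccati equation can be seen on a much simpler problem than the game treated in the abstract. The following is a minimal sketch, assuming a scalar, single-controller, discrete-time LQR problem with dynamics x[k+1] = a*x[k] + b*u[k] and stage cost q*x^2 + r*u^2. The function names and numerical values are hypothetical, and this is not the paper's game algorithm.

```python
def policy_evaluation(a, b, q, r, k):
    """Return p such that the policy u = -k*x has cost J(x0) = p * x0**2."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        raise ValueError("gain k is not stabilizing")
    # Scalar Lyapunov equation: p = q + r*k^2 + closed_loop^2 * p
    return (q + r * k * k) / (1.0 - closed_loop * closed_loop)


def policy_improvement(a, b, r, p):
    """Return the gain that is greedy with respect to the value p * x**2."""
    return a * b * p / (r + b * b * p)


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Alternate evaluation and improvement, starting from a stabilizing k0."""
    k = k0
    history = []
    for _ in range(iterations):
        p = policy_evaluation(a, b, q, r, k)
        k = policy_improvement(a, b, r, p)
        history.append(p)
    return p, k, history


def riccati_residual(a, b, q, r, p):
    """Residual of the scalar discrete-time algebraic Riccati equation."""
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p) - p


if __name__ == "__main__":
    # Assumed illustrative values; k0 = 0.5 gives a stable closed loop.
    p, k, history = policy_iteration(1.0, 1.0, 1.0, 1.0, k0=0.5)
    print("cost coefficient:", p)
    print("feedback gain:", k)
    print("Riccati residual:", riccati_residual(1.0, 1.0, 1.0, 1.0, p))
```

In this scalar setting the iterates decrease monotonically toward the Riccati solution and settle within a few iterations, which is the behavior expected of a Newton-type scheme started from a stabilizing gain.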
ISBN: 9789887581536 (digital)
ISBN: 9781665482561 (print)
In recent years, Siamese-based trackers have achieved excellent performance. Most of them calculate the similarity between each position in the search region and the object through a cross-correlation layer for tracking objects. To solve the problem that this approach neglects the correspondence of local information between the object and the search region and cannot adapt well to object deformation, we propose a Siamese network-based tracker with a position attention network (SiamPA). First, we use a Siamese backbone network to extract template and search region ***, we adopt a box-guided object feature selection strategy to avoid similarity calculations for background ***. In addition, we introduce the position attention network in place of the cross-correlation layer to learn the part-level relationship between the object and the search region ***, the classification-regression sub-network is used to decode the similarity response map obtained by the position attention network and predict the position of the *** contribution: one is a box-guided method for refining object features, and the other is a position attention network for information *** on three challenging benchmarks, GOT-10k, UAV123 and OTB-100, demonstrate that SiamPA achieves excellent tracking performance at real-time speed.
ISBN: 9781538629185 (print)
This paper presents a prediction model for three different objectives (the airflow rate, the carbonaceous biochemical oxygen demand (CBOD) of the effluent, and the total suspended solids (TSS) of the ***). The model is built with an MLP neural network. The accuracy of the MLP neural network's predictions is compared with that of a traditional stationary autoregressive (AR) model. The conclusion is that the MLP neural network prediction model is better than the AR *** on the evaluation indices of percentage error (PE), fractional deviation (FB), normalized mean square error (NMSE), mean absolute error (MAE) and mean square error (MSE). In other words, the prediction model based on the MLP neural network provides a reliable basis for reducing energy consumption in the activated sludge process of industrial wastewater treatment and further improving its treatment effect.
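The evaluation indices named above can be computed from paired observed and predicted values. The following is a minimal sketch under assumed, commonly used definitions (the abstract does not state the exact formulas, so the FB, NMSE and PE definitions here are assumptions, and the function names are hypothetical).

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def evaluation_indices(observed, predicted):
    """Compute MAE, MSE, NMSE, FB and PE for paired series.

    Assumed definitions:
      MAE  = mean(|o - p|)
      MSE  = mean((o - p)^2)
      NMSE = MSE / (mean(o) * mean(p))
      FB   = 2 * (mean(o) - mean(p)) / (mean(o) + mean(p))
      PE   = 100 * mean(|o - p| / |o|)   (percent; needs nonzero observations)
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_obs = mean(observed)
    mean_pred = mean(predicted)
    mse = mean([e * e for e in errors])
    return {
        "MAE": mean([abs(e) for e in errors]),
        "MSE": mse,
        "NMSE": mse / (mean_obs * mean_pred),
        "FB": 2.0 * (mean_obs - mean_pred) / (mean_obs + mean_pred),
        "PE": 100.0 * mean([abs(e) / abs(o) for e, o in zip(errors, observed)]),
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the paper.
    for name, value in evaluation_indices([2.0, 4.0], [1.0, 5.0]).items():
        print(name, value)
```

Under these definitions, lower values of every index indicate a closer fit, so two models such as an MLP and an AR model can be compared index by index on the same test series.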
This paper further investigates cloud control systems (CCSs). The benefits and challenges of CCSs are described. Both our new research results and some typical work by other researchers are presented. It is believed that CCSs can have a large and promising impact due to their potential advantages.
ISBN: 9781538629185 (print)
A cooperative multi-agent system enables independent agents to complete complex tasks through coordination and *** the dynamics of physical agents are so complex that the learning environment is indeed stochastic, the paper introduces a decentralized multi-agent reinforcement learning (MARL) algorithm, named Decentralized Concurrent Learning with Cooperative Policy Exploration (DCL-CPE), to solve cooperative learning within stochastic ***. To investigate its feasibility in practical multi-agent systems, a box-pushing test with DCL-CPE is designed with a group of two-wheel-driven robots acting as learning ***. Due to physical properties such as nonholonomic dynamics, rolling and sliding friction, unreliable sensing, and rigid-body collision, the cooperative learning is a highly stochastic learning ***. The simulation test in Webots shows that DCL-CPE is good at exploring the best cooperative policy in a decentralized way, even when state transitions and rewards are all stochastic.
This article studies the almost-sure and the mean-square consensus control problems of second-order stochastic discrete-time multi-agent systems with multiplicative ***, a control law based on the absolute velocity and relative position information is ***, considering the existence of multiplicative noises and nonlinear terms with Lipschitz constants, the consensus control problem is solved through the use of a degenerated Lyapunov ***, for the linear second-order multi-agent systems, some explicit consensus conditions are ***, two sets of numerical simulations are performed.
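A control law that combines absolute velocity feedback with relative position feedback can be illustrated by a small simulation. The following is a minimal sketch, assuming an Euler-type discrete-time double integrator, an undirected graph, and Gaussian multiplicative noise on each relative position measurement. The model, gains, step size and function names are all assumptions made for illustration and are not taken from the article.

```python
import random


def simulate_consensus(positions, velocities, adjacency, k_vel=2.0, k_pos=1.0,
                       step=0.1, sigma=0.0, steps=500, seed=0):
    """Simulate x[t+1] = x[t] + step*v[t], v[t+1] = v[t] + step*u[t].

    Assumed control law for agent i:
      u_i = -k_vel * v_i + k_pos * sum_j a_ij * (x_j - x_i) * (1 + sigma * xi)
    where xi is standard Gaussian noise drawn per measurement, so the noise
    intensity scales with the measured relative position (multiplicative).
    """
    rng = random.Random(seed)
    x = list(positions)
    v = list(velocities)
    n = len(x)
    for _ in range(steps):
        u = []
        for i in range(n):
            coupling = 0.0
            for j in range(n):
                if adjacency[i][j]:
                    noise = 1.0 + sigma * rng.gauss(0.0, 1.0)
                    coupling += adjacency[i][j] * (x[j] - x[i]) * noise
            u.append(-k_vel * v[i] + k_pos * coupling)
        # Update positions with the old velocities, then velocities with u.
        x = [x[i] + step * v[i] for i in range(n)]
        v = [v[i] + step * u[i] for i in range(n)]
    return x, v


if __name__ == "__main__":
    complete_graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    for sigma in (0.0, 0.1):
        x, v = simulate_consensus([0.0, 1.0, 4.0], [0.5, -0.2, 0.0],
                                  complete_graph, sigma=sigma)
        print("sigma =", sigma, "position spread =", max(x) - min(x))
```

Because the noise multiplies the relative positions, its effect shrinks as the agents approach agreement. This sketch only simulates that qualitative behavior and makes no claim about the consensus conditions derived in the article.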
A new kind of group coordination control problem, group hybrid coordination control, is investigated in this *** group hybrid coordination control means that in a whole multi-agent system (MAS) that consists of two subgroups with communications between them, agents in the two subgroups achieve consensus and containment, *** MASs with both time-delays and additive noises, two group control protocols are proposed to solve this problem for the containment-oriented case and consensus-oriented case, *** developing a new analysis idea, some sufficient conditions and necessary conditions related to the communication intensity between the two subgroups are obtained for the following two types of group hybrid coordination behavior: 1) agents in one subgroup and in another subgroup achieve weak consensus and containment, respectively; 2) agents in one subgroup and in another subgroup achieve strong consensus and containment, ***. It is revealed that the decay of the communication impact between the two subgroups is necessary for the consensus-oriented ***, the validity of the group control results is verified by several simulation examples.
In this paper, we propose a fully Soft Bionic Grasping Device (SBGD), which has advantages in automatically adjusting the grasping range, variable stiffness, and controllable bending ***. The device consists of soft gripper structures and a soft bionic bracket ***. We adopt a local thin-walled design in the soft gripper ***. This design improves the grippers' bending efficiency and imitates the human finger's segmental bending ***. In addition, this work also proposes a pneumatic soft bionic bracket structure, which not only fixes the grippers but also automatically adjusts the grasping space by imitating the opening and closing *** of adjacent human fingers. Due to the above advantages, the SBGD can grasp larger or smaller objects than the regular grasping ***, to grasp small objects reliably, we further present a new Pinching Grasping (PG) ***. The performance of the fully SBGD is verified by ***. This work will promote the innovative development of soft bionic grasping robots and serve applications that require dexterous grasping of multi-size and multi-shape objects.
This study presents a novel impact time and angle constrained guidance law for homing missiles. The guidance law is first developed with the prior-assumption of a stationary target, which is followed by the practical extension to a maneuvering target scenario. To derive the closed-form guidance law, the trajectory reshaping technique is utilized and it results in defining a specific polynomial function with two unknown coefficients. These coefficients are determined to satisfy the impact time and angle constraints as well as the zero miss distance. Furthermore, the proposed guidance law has three additional guidance gains as design parameters which make it possible to adjust the guided trajectory according to the operational conditions and missile's capability. Numerical simulations are presented to validate the effectiveness of the proposed guidance law. (C) 2016 Chinese Society of Aeronautics and Astronautics. Production and hosting by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license.
This paper investigates the containment problem of continuous-time multi-agent systems with multiplicative noises, where the first-order and second-order multi-agent systems are studied ***. Based on stochastic analysis tools, algebraic graph theory, and the Lyapunov function method, containment protocols based on relative state measurements with multiplicative noises are developed to guarantee the mean square and almost sure ***, the sufficient conditions and necessary conditions related to the control gains are derived for achieving mean square and almost sure ***. It is also shown that multiplicative noises may work positively for the almost sure containment of the first-order multi-agent *** examples are also introduced to illustrate the effectiveness of the theoretical results.