Finding Nash equilibria in non-cooperative games can be, in general, an exceptionally challenging task. This is due to various factors, including but not limited to nonconvex/nonconcave cost functions, players having limited information about one another, and computational complexity. The present tutorial draws motivation from this harsh reality and provides methods to approximate Nash or min-max equilibria in non-ideal settings using both optimization- and learning-based techniques. The tutorial acknowledges, however, that such techniques may not always converge and may instead lead to oscillations or even chaos. To address this, tools from passivity and dissipativity theory are provided, which can help explain these divergent behaviors. Finally, the tutorial highlights that, more often than commonly thought, the search for equilibrium policies is simply in vain; instead, bounded rationality and non-equilibrium policies can be more realistic to employ, because some players learn imperfectly or are relatively naive, i.e., "boundedly rational." The efficacy of such play is demonstrated in the context of autonomous driving systems, where it is explicitly shown that it can guarantee vehicle safety.
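The non-convergence mentioned in this abstract can be illustrated on a standard textbook case that is not taken from the tutorial itself: simultaneous gradient descent-ascent on the bilinear game f(x, y) = x·y, whose unique min-max equilibrium is (0, 0). The function name, step size, and starting point below are illustrative assumptions. For this particular game, each update multiplies the distance to the equilibrium by sqrt(1 + step²), so the iterates spiral outward rather than converge.

```python
import math


def simultaneous_gda(x, y, step=0.1, iters=100):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y.

    x is the minimizing player, y the maximizing player.
    Returns the final iterate and the distance to (0, 0) after each step.
    """
    norms = [math.hypot(x, y)]
    for _ in range(iters):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        # Both players update from the same (old) iterate.
        x, y = x - step * grad_x, y + step * grad_y
        norms.append(math.hypot(x, y))
    return x, y, norms


x_final, y_final, norms = simultaneous_gda(1.0, 1.0)
print(f"distance to equilibrium: start={norms[0]:.3f}, end={norms[-1]:.3f}")
```

The growth factor follows from (x − ηy)² + (y + ηx)² = (1 + η²)(x² + y²), which holds for any step size η, so shrinking the step only slows the divergence in this toy game.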
ISBN (digital): 9798350316339
ISBN (print): 9798350316346
In this paper, we propose a novel infrastructure-dependent ramp-metering control for the recently proposed METANET with service station (METANET-s) model, i.e., a second-order macroscopic traffic model that, compared to the classical METANET, incorporates the dynamics of service stations on highways. We study the effect of a ramp-metering control scheme on a highway stretch with a service station and show that it is capable of actively regulating the internal traffic demand attempting to exit the service station via its on-ramp, in addition to helping decrease traffic congestion on the mainstream. In fact, the proposed control scheme effectively prevents the backlog of vehicles attempting to merge back onto the mainstream. This dynamic control mechanism is further complemented by a route guidance control strategy that increases the share of vehicles stopping at the service station during mainstream congestion periods, e.g., via incentives. The combined effect of our control schemes makes it possible to take full advantage of the presence of service stations, reducing the overall traffic congestion. Simulation results demonstrate the effectiveness of the proposed control strategies.
EEG continues to find a multitude of uses in both neuroscience research and medical practice, and independent component analysis (ICA) continues to be an important tool for analyzing EEG. A multitude of ICA algorithms...
Quantum protocols including quantum key distribution and blind quantum computing often require the preparation of quantum states of known dimensions. Here, we show that, rather surprisingly, hidden multidimensional modulation is often performed by practical devices. This violates the dimensional assumption in quantum protocols, thus creating side channels and security loopholes. Our work has important impacts on the security of quantum cryptographic protocols.
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks based upon non-Euclidean contraction theory. The basic idea is to cast the robustness ...
Recently, self-supervised neural networks have shown excellent image denoising performance. However, current dataset-free methods are either computationally expensive, require a noise model, or have inadequate image q...
The application of automated, intelligent robotic arms to improve industrial production efficiency has become a popular research field worldwide. Earlier off-line path planning methods for robotic arms suffer from low efficiency and slow speed. Although deep reinforcement learning has achieved a great deal in manipulator path planning, problems such as long training time and low planning accuracy remain. To address these issues, we propose an improved Twin Delayed Deep Deterministic policy gradient (TD3) algorithm (Improved Cross-Entropy Method-TD3: ICEM-TD3) for the path planning of the robotic arm. First, this paper combines evolutionary strategies with TD3 to generate action networks. Then, the exploration of the TD3 algorithm in the action space is replaced by exploration in the parameter space. In addition, this paper designs a new reward function to reduce the redundancy of planning and accelerate the convergence of training. Finally, the Gazebo simulator is used to verify the proposed algorithm, and the results show that the proposed algorithm can greatly improve the accuracy of manipulator path planning using deep reinforcement learning.
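To make the idea of exploring in parameter space rather than action space concrete, here is a minimal sketch of a generic cross-entropy method (CEM) loop. It is not the authors' ICEM-TD3: there is no TD3 critic, no actor network, and no Gazebo environment. The names `cross_entropy_method` and `toy_score`, the population settings, and the quadratic stand-in for an episode return are all hypothetical choices made for illustration.

```python
import random


def cross_entropy_method(score, dim, pop_size=50, elite_frac=0.2,
                         iters=40, seed=0):
    """Generic CEM: sample parameter vectors, keep the best, refit a Gaussian.

    `score` maps a parameter vector (list of floats) to a number to maximize.
    Returns the mean of the final sampling distribution.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        # Exploration happens here, by perturbing parameters, not actions.
        population = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(pop_size)]
        elites = sorted(population, key=score, reverse=True)[:n_elite]
        # Refit a diagonal Gaussian to the elite samples.
        for j in range(dim):
            column = [e[j] for e in elites]
            mean[j] = sum(column) / n_elite
            var = sum((v - mean[j]) ** 2 for v in column) / n_elite
            std[j] = var ** 0.5 + 1e-3  # small floor keeps some exploration
    return mean


# Hypothetical stand-in for an episode return: higher is better near `target`.
target = [1.0, -2.0, 0.5]


def toy_score(params):
    return -sum((p - t) ** 2 for p, t in zip(params, target))


best = cross_entropy_method(toy_score, dim=3)
print("final mean:", [round(p, 3) for p in best])
```

In a reinforcement-learning setting, `score` would instead roll out a policy whose weights are the sampled parameter vector and return the episode reward; how ICEM-TD3 combines this population with TD3 updates is not specified in the abstract and is not reproduced here.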
We initiate the study of parameterized complexity of QMA problems in terms of the number of non-Clifford gates in the problem description. We show that for the problem of parameterized quantum circuit satisfiability, ...
ISBN (digital): 9783907144107
ISBN (print): 9798331540920
In this paper, we investigate the design of optimal spatially distributed controllers for a linear and spatially invariant reaction-diffusion process over the real line. The controller receives state measurements from different spatial locations with non-negligible delays. In this set-up, and for the class of proportional spatially invariant state feedback controllers, the optimal control synthesis problem is equivalent to a feedback gain optimization for a spatially distributed delay system. We show that the spatial locality of the optimal feedback gains is affected not only by the diffusion and reaction coefficients, but also by the parameter representing the communication time delay, which causes a sharp flattening of the control gains. In the expensive control regime, the optimal controller is obtained analytically, yielding some practical design guidelines.
Deep Q-learning (DQN) has shown recent success on a wide range of complicated sequential decision-making issues, especially in the classic control area. However, in most DQN training, the sampling policies, particular...