ISBN: (print) 9781612848006
This paper presents a reinforcement learning algorithm and provides conditions for global convergence to Nash equilibria. For several reinforcement learning schemes, including the ones proposed here, excluding convergence to action profiles which are not Nash equilibria may not be trivial, unless the step-size sequence is appropriately tailored to the specifics of the game. In this paper, we sidestep these issues by introducing a new class of reinforcement learning schemes where the strategy of each agent is perturbed by a state-dependent perturbation function. Contrary to prior work on equilibrium selection in games, where perturbation functions are globally state dependent, the perturbation function here is assumed to be local, i.e., it only depends on the strategy of each agent. We provide conditions under which the strategies of the agents will converge to an arbitrarily small neighborhood of the set of Nash equilibria almost surely. We further specialize the results to a class of potential games.
In this paper, we propose a method, motivated by the robotics field, for revealing global clusters over a fast, huge and volatile stream of robotic data. The stream comes from a mobile robot that autonomously navigates an unknown environment, perceiving it through its sensors. The sensor data arrives fast, is huge in volume, and evolves quickly over time as the robot explores the environment and observes new objects or new parts of already observed objects. To deal with the nature of the data, we propose a grid-based algorithm that updates the grid structure and adjusts the clusters built so far online. Our method is capable of detecting object formations over time based on the partial observations of the robot at each time point. Experiments on real data verify the usefulness and efficiency of our method.
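The abstract does not spell out the grid update or cluster adjustment rules, so the following is only a minimal sketch of the general grid-based idea, not the authors' algorithm. It assumes fixed-size cells, a hypothetical density threshold `min_points`, and clusters defined as face-connected groups of dense cells. As a simplification, the grid is updated incrementally per point but the clusters are recomputed on request rather than adjusted in place.

```python
from collections import defaultdict, deque


class GridClusterer:
    """Illustrative grid-based stream clusterer (hypothetical names and rules)."""

    def __init__(self, cell_size=1.0, min_points=3):
        self.cell_size = cell_size      # edge length of one grid cell (assumed)
        self.min_points = min_points    # a cell is "dense" at this count (assumed)
        self.counts = defaultdict(int)  # cell index -> number of points seen

    def _cell(self, point):
        # Map a coordinate tuple to the integer index of its grid cell.
        return tuple(int(c // self.cell_size) for c in point)

    def add(self, point):
        # Incremental grid update: one counter increment per arriving point.
        self.counts[self._cell(point)] += 1

    @staticmethod
    def _neighbors(cell):
        # Face-adjacent cells: change one coordinate by +1 or -1.
        for axis in range(len(cell)):
            for step in (-1, 1):
                yield cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]

    def clusters(self):
        # Group dense cells into connected components with breadth-first search.
        dense = {c for c, n in self.counts.items() if n >= self.min_points}
        seen, result = set(), []
        for start in dense:
            if start in seen:
                continue
            seen.add(start)
            queue, component = deque([start]), set()
            while queue:
                cell = queue.popleft()
                component.add(cell)
                for nb in self._neighbors(cell):
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
            result.append(component)
        return result
```

Because only per-cell counters are stored, memory grows with the number of occupied cells rather than with the number of points, which is the usual reason for choosing a grid summary on a fast stream.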
A novel subspace learning algorithm named neighborhood discriminant nearest feature line analysis (NDNFLA) is proposed in this paper. NDNFLA aims to find discriminant features of samples by maximizing the between-class feature line (FL) distances and minimizing the within-class FL distances. At the same time, the neighborhood is preserved in the feature space. Experimental results demonstrate the efficiency of the proposed algorithm.
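For readers unfamiliar with feature lines, the sketch below computes the standard FL distance: the distance from a sample to the straight line passing through two prototype samples of a class. It illustrates only this building block, under the assumption that the paper uses the usual definition; the NDNFLA objective and its optimization are not given in the abstract and are not reproduced here. The function name is hypothetical.

```python
import math


def feature_line_distance(x, a, b):
    """Distance from sample x to the feature line through prototypes a and b."""
    direction = [bi - ai for ai, bi in zip(a, b)]
    offset = [xi - ai for xi, ai in zip(x, a)]
    denom = sum(d * d for d in direction)
    if denom == 0.0:
        # Degenerate line (a == b): fall back to the point-to-point distance.
        return math.sqrt(sum(o * o for o in offset))
    # Position parameter of the orthogonal projection of x onto the line.
    t = sum(o * d for o, d in zip(offset, direction)) / denom
    # Component of the offset that is orthogonal to the line.
    residual = [o - t * d for o, d in zip(offset, direction)]
    return math.sqrt(sum(r * r for r in residual))
```

Note that the projection may fall outside the segment between the two prototypes; the feature line extends beyond both of them.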
To improve the classification accuracy of both the conventional Euclidean KNN algorithm and the improved KNN algorithm based on information entropy, this paper proposes an improved KNN algorithm based on multi-attribu...
We propose an algorithm for optimal input design in nonlinear stochastic dynamic systems. The approach relies on minimizing a function of the covariance of the parameter estimates of the system with respect to the input. The covariance matrix is approximated using a joint likelihood function of hidden states and measurements, and a combination of state filters and smoothers. The input is parametrized using an autoregressive model. The proposed approach is illustrated through a simulation example.
Telerobotics is one of the most traditional fields of robotics, and it has played a crucial role in the history of robotics and of mankind, especially in the areas of space and undersea exploration and of remote material handling. At the same time, teleoperation is still a very active research area and many problems remain open. In particular, the design of the control strategy for coupling the local and remote sites is of paramount importance for implementing telepresence, namely the feeling of interacting directly with the remote environment. The IEEE RAS Technical Committee on Telerobotics would like to propose a half-day tutorial illustrating several successful control strategies for implementing high-performance bilateral teleoperation systems.
With the virtues of accuracy, speed and convenience, the electronic truck scale is increasingly applied in weighing measurement systems across all walks of life. Since it is the very important measurement equipment in trade set...
A simulation approach was proposed to analyze the spectra of Doppler signals in local expansion artery, which can provide a useful guidance for detecting the formation and growth progress of the aneurysms using the Do...
Methods based on statistical learning, with AdaBoost as the representative, are the mainstream approach to human face detection and recognition. Detection rates have reached a high level, and can achieve real-time d...
ISBN: (print) 9787894631046
The chaotic suppression problem for a class of typical discrete-time hyperchaotic systems is investigated in this paper. A T-S fuzzy model-based control method is employed to eliminate hyperchaos in the considered discrete-time system. Firstly, the T-S fuzzy representation for the discrete-time hyperchaotic system is presented. Then, a state feedback controller is designed based on the parallel distributed compensation technique. According to the linear matrix inequality method and exact linearization technique, two kinds of approaches are given to solve the controller gain matrices. Finally, the validity of the proposed chaotic suppression method is illustrated by a numerical example.
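As a rough illustration of parallel distributed compensation, the sketch below blends one state-feedback gain per fuzzy rule using normalized rule firing strengths, giving u = -(sum_i h_i K_i) x. The membership values and gain matrices are placeholders supplied by the caller; the abstract does not give the hyperchaotic system, its T-S model, or the gains obtained from the linear matrix inequalities, so none of those are reproduced here. The function name and argument layout are assumptions.

```python
def pdc_control(x, memberships, gains):
    """Blend per-rule state-feedback gains: u = -(sum_i h_i * K_i) x.

    x           : state vector as a list of floats
    memberships : non-negative firing strength of each fuzzy rule
    gains       : one gain matrix K_i per rule, each a list of rows
    """
    total = sum(memberships)
    if total <= 0:
        raise ValueError("at least one rule must have positive firing strength")
    # Normalize so the weights h_i sum to one.
    weights = [m / total for m in memberships]
    n_inputs = len(gains[0])
    u = []
    for row in range(n_inputs):
        value = 0.0
        for w, K in zip(weights, gains):
            # Contribution of rule i to this control input: h_i * (K_i x)[row].
            value += w * sum(k * xi for k, xi in zip(K[row], x))
        u.append(-value)
    return u
```

Each rule shares the same premise variables as the corresponding local model, which is what lets the controller reuse the model's membership functions.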