Learning Automata (LA) can be regarded as the founding algorithms on which the field of Reinforcement Learning has been built. Among the families of LA, Estimator Algorithms (EAs) are certainly the fastest, and of these, the family of discretized algorithms has been proven to converge even faster than its continuous counterparts. However, it has recently been reported that the proofs of ε-optimality published over the past three decades for all the reported algorithms are flawed. We applaud the researchers who discovered this flaw, and who further proceeded to rectify the proof for the Continuous Pursuit Algorithm (CPA). The latter proof examines the monotonicity property of the probability of selecting the optimal action, and requires the learning parameter to be continuously changing. In this paper, we provide a new method to prove the ε-optimality of the Discretized Pursuit Algorithm (DPA) which does not require this constraint, by virtue of the fact that the DPA has, in and of itself, absorbing barriers to which the LA can jump in a discretized manner. Unlike the proof given (Zhang et al., Appl Intell 41:974-985, 3) for an absorbing version of the CPA, which utilizes the single-action Hoeffding's inequality, the current proof invokes what we shall refer to as the "multi-action" version of Hoeffding's inequality. We believe that our proof is both unique and pioneering. It can also form the basis for formally showing the ε-optimality of the other EAs that possess absorbing states.
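To make the idea of discretized jumps and absorbing barriers concrete, the following is a minimal sketch of a reward-inaction discretized pursuit automaton. It is not taken from the paper: the function name, the `resolution` parameter, the ten initial samples per action and the choice of the reward-inaction variant are all assumptions made for illustration.

```python
import random


def discretized_pursuit(reward_probs, resolution=50, max_steps=100_000, seed=0):
    """Illustrative reward-inaction discretized pursuit automaton.

    reward_probs: true reward probabilities of a stationary environment
                  (known to the simulator, hidden from the automaton).
    resolution:   hypothetical resolution parameter; the smallest
                  probability step is 1 / (r * resolution).
    Returns the final action probability vector and the number of steps used.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    total = r * resolution
    # Probabilities are stored as integer multiples of the smallest step,
    # so the vector always sums to exactly `total` units.
    units = [resolution] * r
    rewards = [0] * r
    pulls = [0] * r

    # Assumption: seed the reward estimates by sampling each action 10 times.
    for i in range(r):
        for _ in range(10):
            pulls[i] += 1
            rewards[i] += rng.random() < reward_probs[i]

    steps = 0
    while steps < max_steps and max(units) < total:
        steps += 1
        action = rng.choices(range(r), weights=units)[0]
        rewarded = rng.random() < reward_probs[action]
        pulls[action] += 1
        rewards[action] += rewarded
        if rewarded:
            # Pursue the action with the highest current reward estimate.
            best = max(range(r), key=lambda i: rewards[i] / pulls[i])
            for j in range(r):
                if j != best and units[j] > 0:
                    units[j] -= 1      # one discrete step down, floored at 0
                    units[best] += 1   # the pursued action gains that step
    return [u / total for u in units], steps


probs, steps = discretized_pursuit([0.9, 0.2, 0.1])
print(probs, steps)
```

The loop stops once one action holds all the probability mass, which is the absorbing barrier: from that point on the automaton can never select another action.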
The most difficult part in the design and analysis of Learning Automata (LA) consists of the formal proofs of their convergence accuracies. The mathematical techniques used for the different families (Fixed Structure, Variable Structure, Discretized, etc.) are quite distinct. Among the families of LA, Estimator Algorithms (EAs) are certainly the fastest, and within this family, the set of Pursuit algorithms have been considered to be the pioneering schemes. Informally, if the environment is stationary, their epsilon-optimality is defined as their ability to converge to the optimal action with an arbitrarily large probability, if the learning parameter is sufficiently small/large. The existing proofs of all the reported EAs follow the same fundamental principles, and to clarify this, in the interest of simplicity, we shall concentrate on the family of Pursuit algorithms. Recently, it has been reported (Ryan and Omkar, J Appl Probab 49(3):795-805, 2012) that the previous proofs for epsilon-optimality of all the reported EAs have a common flaw. The flaw lies in the condition which apparently supports the so-called "monotonicity" property of the probability of selecting the optimal action, which states that after some time instant t_0, the reward probability estimates will be ordered correctly forever. The authors of the various proofs have instead offered a proof for the fact that the reward probability estimates are ordered correctly at a single point of time after t_0, which, in turn, does not guarantee the ordering forever, rendering the previous proofs incorrect. While in Ryan and Omkar (J Appl Probab 49(3):795-805, 2012), a rectified proof was presented to prove the epsilon-optimality of the Continuous Pursuit Algorithm (CPA), which was the pioneering EA, in this paper, a new proof is provided for the Absorbing CPA (ACPA), i.e., an algorithm which follows the CPA paradigm but which artificially has absorbing states whenever any action probability is arbitrarily c
A technique for designing frames to use with vector selection algorithms, for example matching pursuits (MP), is presented. The design algorithm is iterative and requires a training set of signal vectors. An MP algorithm chooses frame vectors to approximate each training vector. Each vector in the frame is then adjusted by using the residuals for the training vectors which used that particular frame vector in their expansion. The frame design algorithm is applied to speech and electrocardiogram (ECG) signals, and the designed frames are tested on signals outside the training sets. Experiments demonstrate that the approximation capabilities, in terms of mean square error (MSE), of the optimized frames are significantly better than those obtained using frames designed by ad hoc techniques, with a typical reduction in MSE of 20-50%.
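The vector selection step mentioned above can be sketched with plain matching pursuit: at each iteration the frame vector most correlated with the current residual is chosen and its contribution is subtracted. This is a generic textbook version, not the frame design procedure of the paper; the function name and the assumption that every frame vector has unit norm are ours.

```python
def matching_pursuit(atoms, y, n_iter):
    """Greedy matching pursuit over a frame of unit-norm vectors.

    atoms:  list of frame vectors (each a list of floats, assumed unit norm).
    y:      signal vector to approximate.
    n_iter: maximum number of selection steps.
    Returns the coefficient of each frame vector and the final residual.
    """
    residual = list(y)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Inner product of every frame vector with the current residual.
        inner = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(inner[j]))
        if abs(inner[k]) < 1e-12:
            break  # nothing left that the frame can explain
        coeffs[k] += inner[k]
        # Remove the selected vector's contribution from the residual.
        residual = [r - inner[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


coeffs, residual = matching_pursuit([[1.0, 0.0], [0.0, 1.0]], [3.0, -2.0], n_iter=2)
print(coeffs, residual)
```

In a frame design loop of the kind the abstract describes, the residuals returned here would be the quantities used to adjust each selected frame vector.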
This paper investigates the applicability of the Pursuit Algorithm (PA), including the Classic Pursuit Algorithm in Circle (ClaPAIC) and the Cyclic Pursuit Algorithm (CyPA), to the field of space missions. The PA has been applied to a number of typical scenarios: formation replenishment, rendezvous and docking, and formation reconfiguration. Simulation results show the effectiveness of ClaPAIC and CyPA when they are accurately designed for the mission, and indicate that PA may be a promising tool in the design of spacecraft maneuvers. The small fuel cost in the case of TPF deployment and formation maintenance shows that PA control is also effective when ClaPAIC and CyPA are arranged properly and the control gain k_α is selected properly.
Compared with small vehicles, a bus has a longer body and a relatively shorter parking path. Errors in the path tracking process therefore carry the risk of crossing the parking line, so a bus parking system needs a high-precision path tracking algorithm. The pure pursuit algorithm is widely used for vehicle and robot path tracking, but its performance depends on the selection of the look-ahead distance, which in practice must be determined according to the scene. To meet the performance requirements of a bus parking system, this paper improves the pure pursuit algorithm by analyzing its stability and error sources. In the improved algorithm, speed is used as one of the parameters in the calculation of the look-ahead distance to improve the stability of the system, and PI (proportional-integral) control is added to the calculation of the steering angle to improve tracking accuracy in the curved stage of the path. Experimental results show that the proposed algorithm effectively improves the parking accuracy and stability of buses.
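The two modifications described above can be sketched as follows, using the standard bicycle-model pure pursuit steering law. The abstract does not give the paper's formulas, so the linear look-ahead rule, the gains, the use of lateral error as the PI input and all names below are assumptions for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class PurePursuitPI:
    wheelbase: float            # distance between axles, in metres
    k_speed: float = 0.5        # hypothetical gain: look-ahead metres per (m/s)
    min_lookahead: float = 1.0  # hypothetical lower bound on look-ahead, metres
    kp: float = 0.1             # hypothetical proportional gain on lateral error
    ki: float = 0.01            # hypothetical integral gain on lateral error
    integral: float = 0.0       # accumulated lateral error

    def lookahead(self, speed):
        """Speed-dependent look-ahead distance (assumed linear in speed)."""
        return self.min_lookahead + self.k_speed * abs(speed)

    def steer(self, pose, target, lateral_error, dt):
        """Steering angle towards `target`, plus a PI correction.

        pose:   (x, y, yaw) of the rear axle in the world frame.
        target: (x, y) path point, chosen by the caller at roughly
                `lookahead(speed)` metres ahead of the vehicle.
        """
        x, y, yaw = pose
        tx, ty = target
        distance = math.hypot(tx - x, ty - y)
        alpha = math.atan2(ty - y, tx - x) - yaw
        # Classic pure pursuit law: delta = atan(2 * L * sin(alpha) / l_d).
        base = math.atan2(2.0 * self.wheelbase * math.sin(alpha), distance)
        self.integral += lateral_error * dt
        return base + self.kp * lateral_error + self.ki * self.integral


controller = PurePursuitPI(wheelbase=6.0)
print(controller.lookahead(2.0))
print(controller.steer((0.0, 0.0, 0.0), (5.0, 5.0), lateral_error=0.0, dt=0.1))
```

A caller would pick the target point on the reference path using `lookahead(speed)` and pass the measured lateral error at each control step; the PI term then trims the steady offset that plain pure pursuit tends to leave on curves.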
This paper addresses the sparse representation (SR) problem within a general Bayesian framework. We show that the Lagrangian formulation of the standard SR problem, i.e., $x^\star = \arg\min_x \{\|y - Dx\|_2^2 + \lambda \|x\|_0\}$, can be regarded as a limit case of a general maximum a posteriori (MAP) problem involving Bernoulli-Gaussian variables. We then propose different tractable implementations of this MAP problem and explain several well-known pursuit algorithms (e.g., MP, OMP, StOMP, CoSaMP, SP) as particular cases of the proposed Bayesian formulation.
It has been proposed that the segmental spinal nervous system may organize movement using a collection of force-field primitives. The temporal organization of primitives has not been examined in detail. Recent data examining muscle activity underlying corrections of motor patterns suggested that primitives might be recruited into motor programs as waveforms with a constant duration. Here we test the idea that each primitive or premotor drive comprising part of the motor patterns might be expressed as the combination of a small number of time-frequency atoms from some orthonormal basis. We analyze the temporal organization of pre-motor drives extracted from the motor pattern by the Bell-Sejnowski algorithm for independent component analysis. We then use matching pursuit cosine packet analysis to examine the time series of the activation waveforms of each of the independent components. The analysis confirms that the motor pattern can be described as a combination of a small number of time-frequency atoms. These atoms combine to generate the temporal structure and activation of the individual components or premotor drives that generate individual muscle activity.
ISBN: 9781509045464 (print)
In video background modeling, ghosting occurs when an object that belongs to the background is assigned to the foreground. In the context of Principal Component Pursuit (PCP), this usually occurs when a moving object occludes a high-contrast background object, a moving object suddenly stops, or a stationary object suddenly starts moving. Based on a previously developed incremental PCP method, we propose a novel algorithm that uses two simultaneous background estimates based on observations over the previous n_1 and n_2 (n_1 ≪ n_2) frames in order to identify and diminish the ghosting effect. Our computational results show that the proposed method greatly improves both the subjective quality and the accuracy as determined by the F-measure.
ISBN: 9783319074559; 9783319074542 (print)
Learning Automata (LA) can be regarded as the founding algorithms on which the field of Reinforcement Learning has been built. Among the families of LA, Estimator Algorithms (EAs) are certainly the fastest, and of these, the family of Pursuit Algorithms (PAs) is the pioneering work. It has recently been reported that the previous proofs of ε-optimality for all the reported algorithms in the family of PAs are flawed. We applaud the researchers who discovered this flaw, and who further proceeded to rectify the proof for the Continuous Pursuit Algorithm (CPA). The latter proof, although it requires the learning parameter to be continuously changing, is, to the best of our knowledge, the current best and only way to prove the CPA's ε-optimality. However, for all the algorithms with absorbing states, for example, the Absorbing Continuous Pursuit Algorithm (ACPA) and the Discretized Pursuit Algorithm (DPA), the constraint of a continuously changing learning parameter can be removed. In this paper, we provide a new method to prove the ε-optimality of the Discretized Pursuit Algorithm which does not require this constraint. We believe that our proof is both unique and pioneering. It can also form the basis for formally showing the ε-optimality of the other EAs with absorbing states.
This paper presents a new efficient solution to the dynamic single source shortest path routing problem, using the principles of generalized pursuit learning. It involves finding the shortest path in a stochastic network, where there are continuous probabilistically based updates in link-costs. The algorithm has been rigorously experimentally evaluated and has been found to be a few orders of magnitude superior to the algorithms available in the literature. It can be used to find the shortest path within the "statistical" average network, which converges irrespective of whether there are new changes in link-costs or not. On the other hand, the existing algorithms would fail to exhibit such a behavior and would recalculate the affected shortest paths after each link-cost update.