In this work, we first establish exponential inequalities for the Robbins-Monro algorithm under ψ-mixing random errors. We then present a numerical application that uses the main result to approximate the theoretical solution of the objective function.
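For orientation, the basic Robbins-Monro recursion x_{n+1} = x_n - a_n Y_n can be sketched as follows. This is a minimal illustration, not the paper's experiment: the target function g(x) = x - 2, the step sizes a_n = a/n, and the independent Gaussian noise are all assumptions chosen for simplicity (the abstract's ψ-mixing errors are dependent, which this sketch does not model).

```python
import random


def robbins_monro(noisy_g, x0, n_steps, a=1.0):
    """Iterate x_{n+1} = x_n - (a / n) * Y_n, where Y_n is a noisy value of g(x_n)."""
    x = x0
    for n in range(1, n_steps + 1):
        x -= (a / n) * noisy_g(x)
    return x


rng = random.Random(0)


def noisy_g(x):
    # Hypothetical regression function g(x) = x - 2 (root at 2),
    # observed with i.i.d. Gaussian noise (an assumption, not psi-mixing).
    return (x - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_g, x0=0.0, n_steps=20000)
print(f"estimated root: {estimate:.3f}")
```

The decreasing steps a_n = a/n satisfy the usual conditions (their sum diverges, the sum of their squares converges), which is what lets the noise average out.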
An adaptive design for a clinical trial with prognostic factors and more than two treatments is described using a generalised urn model in a random environment. The evolution of the urn composition is expressed by a recurrence equation that fits the Robbins-Monro scheme of stochastic approximation. The Ordinary Differential Equation method is then used to obtain strong laws, and central limit theorems are also obtained. These results are useful for making inferences about the parameters of the clinical trial. (C) 2003 Elsevier B.V. All rights reserved.
We generalize the notion and some properties of the conic function introduced by Vincze and Nagy in 2012. We provide a stochastic algorithm for computing the global minimizer of generalized conic functions, and we prove almost sure and L^q convergence of this algorithm.
This paper investigates the superposition problem of two or more individual semi-Markov decision processes (SMDPs). The new sequential decision process superposed by individual SMDPs is no longer an SMDP and cannot be handled by routine iterative algorithms, but we can expand its state space to obtain a hybrid-state SMDP. Using this hybrid-state SMDP as an auxiliary and inspired by the Robbins-Monro algorithm underlying the reinforcement learning method, we propose an iterative algorithm based on a combination of dynamic programming and reinforcement learning to numerically solve the superposed sequential decision problem. As an illustrative example, we apply our superposition model and algorithm to solve the optimal maintenance problem of a two-component independent parallel system.
The stochastic approximation problem is to find a root or minimum of a nonlinear function in the presence of noisy measurements. The classical algorithm for the stochastic approximation problem is the Robbins-Monro (RM) algorithm, which uses the noisy negative gradient direction as the iterative direction. In order to accelerate the classical RM algorithm, this paper gives a new combined direction stochastic approximation algorithm which employs a weighted combination of the current noisy negative gradient and some former noisy negative gradient as the iterative direction. Both the almost sure convergence and the asymptotic rate of convergence of the new algorithm are established. Numerical experiments show that the new algorithm outperforms the classical RM algorithm.
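The idea of a combined direction can be sketched as below. The abstract does not give the weighting rule or which former gradient is used, so this sketch assumes a fixed weight and the immediately preceding noisy gradient; the objective f(x) = (x - 3)^2 / 2, the noise level, and the name `combined_direction_sa` are likewise hypothetical.

```python
import random


def combined_direction_sa(noisy_grad, x0, n_steps, weight=0.7, a=1.0):
    """Stochastic approximation with a weighted mix of current and previous noisy gradients.

    Assumed form: d_n = -(weight * g_n + (1 - weight) * g_{n-1}),
    x_{n+1} = x_n + (a / n) * d_n. The paper's actual weighting may differ.
    """
    x = x0
    prev = noisy_grad(x)  # seeds the "former" gradient for the first step
    for n in range(1, n_steps + 1):
        cur = noisy_grad(x)
        direction = -(weight * cur + (1.0 - weight) * prev)
        x += (a / n) * direction
        prev = cur
    return x


rng = random.Random(1)


def noisy_grad(x):
    # Gradient of the hypothetical objective f(x) = (x - 3)^2 / 2, plus Gaussian noise.
    return (x - 3.0) + rng.gauss(0.0, 0.5)


x_hat = combined_direction_sa(noisy_grad, x0=0.0, n_steps=20000)
print(f"estimated minimizer: {x_hat:.3f}")
```

Setting `weight=1.0` recovers the classical RM iteration, which makes the two easy to compare on the same noise stream.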
We present a general framework for applying simulation to optimize the behavior of discrete event systems. Our approach involves modeling the discrete event system under study as a general state space Markov chain whose distribution depends on the decision parameters. We then show how simulation and the likelihood ratio method can be used to evaluate the performance measure of interest and its gradient, and we present conditions that guarantee that the Robbins-Monro stochastic approximation algorithm will converge almost surely to the optimal values of the decision parameters. Both transient and steady-state performance measures are considered. For steady-state performance measures, we consider the case when the Markov chain of interest is regenerative in the standard sense, as well as the case when this Markov chain is Harris recurrent, and thereby regenerative in a wider sense.
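The combination of a likelihood ratio gradient estimate with a Robbins-Monro update can be illustrated on a deliberately tiny problem. Everything concrete here is an assumption made for the sketch: a single exponential random variable stands in for the Markov chain, the cost is J(θ) = E[X] + cθ with X ~ Exp(rate θ), and the iterate is projected onto a hypothetical interval [0.5, 3.0] to keep it bounded.

```python
import random


def lr_gradient_sample(theta, c, rng):
    """One likelihood ratio (score function) estimate of dJ/dtheta.

    J(theta) = E[X] + c * theta with X ~ Exp(rate=theta), so J(theta) = 1/theta + c*theta.
    The score is d/dtheta log(theta * exp(-theta * x)) = 1/theta - x.
    """
    x = rng.expovariate(theta)
    score = 1.0 / theta - x
    return x * score + c  # unbiased: E[x * score] = -1 / theta**2


def optimize(theta0, c, n_steps, rng, lo=0.5, hi=3.0):
    """Robbins-Monro descent on the estimated gradient, projected onto [lo, hi]."""
    theta = theta0
    for n in range(1, n_steps + 1):
        theta -= (1.0 / n) * lr_gradient_sample(theta, c, rng)
        theta = min(hi, max(lo, theta))
    return theta


theta_hat = optimize(theta0=2.0, c=1.0, n_steps=50000, rng=random.Random(2))
print(f"estimated optimum: {theta_hat:.3f}")  # analytic minimizer is 1 / sqrt(c)
```

The point of the score function form is that the gradient is estimated from the same simulated sample as the performance measure, without differentiating through the simulation.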
Non-orthogonal multiple access (NOMA) is one of the promising radio access techniques for next-generation wireless networks. Opportunistic multi-user scheduling is necessary to fully exploit multiplexing gains in NOMA systems, but compared with traditional scheduling, the inter-relations between users' throughputs induced by multi-user interference pose new challenges in the design of NOMA schedulers. A successful NOMA scheduler has to carefully balance three objectives: maximizing average system utility, satisfying desired fairness constraints among the users, and enabling real-time, low-computational-cost implementations. In this paper, scheduling for NOMA systems under temporal fairness constraints is considered. Temporal fair scheduling leads to communication systems with predictable latency, as opposed to utilitarian fair schedulers, for which latency can be highly variable. It is shown that under temporal fairness constraints, optimal system utility is achieved using a class of opportunistic scheduling schemes called threshold based strategies (TBS). One of the challenges in heterogeneous NOMA scenarios, where only specific users may be activated simultaneously, is to determine the set of feasible temporal shares. A variable elimination algorithm is proposed to accomplish this task. Furthermore, an online iterative algorithm based on the Robbins-Monro method is proposed to construct a TBS by finding the optimal thresholds for a given system utility metric. The algorithm does not require knowledge of the users' channel statistics. Rather, at each time slot, it has access to the channel realizations in the previous time slots. Various numerical simulations of practical scenarios are provided to illustrate the effectiveness of the proposed NOMA scheduling in static and mobile scenarios.
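A threshold-based strategy with online threshold learning can be sketched in a much simplified setting. This is not the paper's NOMA algorithm: the sketch assumes exactly one user is scheduled per slot (so no simultaneous activation), exponentially distributed per-slot utilities, and a hypothetical step size of 2 / t^0.7. It only shows how a Robbins-Monro update can steer each user's long-run share of slots toward a target without knowing the channel statistics.

```python
import random


def learn_thresholds(targets, sample_utilities, n_slots, rng):
    """Schedule argmax(utility + threshold) and adapt thresholds toward target shares."""
    k = len(targets)
    thresholds = [0.0] * k
    counts = [0] * k
    for t in range(1, n_slots + 1):
        utilities = sample_utilities(rng)
        chosen = max(range(k), key=lambda i: utilities[i] + thresholds[i])
        counts[chosen] += 1
        step = 2.0 / t ** 0.7  # assumed step-size schedule
        for i in range(k):
            served = 1.0 if i == chosen else 0.0
            # Raise the threshold of under-served users, lower it for over-served ones.
            thresholds[i] += step * (targets[i] - served)
    shares = [c / n_slots for c in counts]
    return thresholds, shares


def sample_utilities(rng):
    # Hypothetical channels: user 0 has mean utility 1, user 1 has mean utility 2.
    return [rng.expovariate(1.0), rng.expovariate(0.5)]


thresholds, shares = learn_thresholds([0.5, 0.5], sample_utilities, 20000, random.Random(3))
print("learned thresholds:", [round(v, 3) for v in thresholds])
print("empirical shares:", [round(s, 3) for s in shares])
```

Each update uses only the current slot's realized utilities and scheduling decision, which mirrors the abstract's point that channel statistics need not be known in advance.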
Cottrell-Fort's model and some self-organizing neural algorithms are analyzed in this paper. In the one-dimensional case, the almost sure convergence of Cottrell-Fort's algorithm is obtained when the probability distribution of the stimulus center is not uniform. A new algorithm is designed that combines Kohonen's and Cottrell-Fort's algorithms. In the two-dimensional case, another new training algorithm is provided which is different from Cottrell-Fort's. It also has almost sure convergence for rather general stimulus distributions. An interaction parameter is introduced to make the model more flexible. For any boundary condition, the algorithm is convergent when suitable stimulus distributions are applied. The self-organized map depends on the stimulus distribution and the boundary conditions but not on the initial map. This indicates a relation between the statistical distribution of the stimulus and the connection structure in the neural maps. The system is robust and reliable. An example in the one-dimensional case shows that even if some nodes have zero stimulus probability, the system can still form an ordered map from an unordered initial map.
A number of network applications require stable transport throughput for tasks such as control and coordination operations over wide-area networks. We present a window-based method that achieves stable throughput at a target level by utilizing a variation of the classical Robbins-Monro stochastic approximation algorithm. We analytically show the stability of this method under very mild conditions on the network, which are justified by Internet measurements. Our User Datagram Protocol (UDP)-based implementation provides stable throughput over the Internet under various traffic conditions.
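Stabilizing throughput at a target level is a root-finding problem, which is where a Robbins-Monro style update fits. The sketch below is a toy under stated assumptions: the saturating response `capacity * w / (w + k)`, its Gaussian measurement noise, and the gain schedule 2/n are all hypothetical stand-ins for a real network and are not taken from the paper, whose variation of the algorithm is not described in the abstract.

```python
import random


def measured_throughput(window, rng, capacity=100.0, k=20.0, noise=5.0):
    """Hypothetical network: throughput saturates with window size and is noisy."""
    return capacity * window / (window + k) + rng.gauss(0.0, noise)


def stabilize_window(target, n_rounds, rng, w0=1.0, gain=2.0):
    """Adjust the window so that mean throughput approaches the target level."""
    w = w0
    for n in range(1, n_rounds + 1):
        error = target - measured_throughput(w, rng)
        w += (gain / n) * error  # grow the window when below target, shrink when above
        w = max(1.0, w)          # a window must stay positive
    return w


window = stabilize_window(target=60.0, n_rounds=5000, rng=random.Random(4))
print(f"final window size: {window:.2f}")  # the noiseless model gives 60 at w = 30
```

Only throughput measurements are used in the update, so the sender never needs an explicit model of the response curve.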
In this paper, we present a method for on-line tuning of the design weighting polynomial parameters of a multivariable self-tuning controller that adapts to changes in the parameters of a higher-order nonminimum phase system with time delays and noise. The adaptation is achieved through the recursive least squares algorithm at the parameter estimation stage and through the Robbins-Monro algorithm at the stage of optimizing the design weighting polynomial parameters of the controller. The proposed method is simple and effective compared with the pole restriction method. Computer simulation results are presented for the adaptation of a higher-order multivariable system with nonminimum phase and changing system parameters. (C) 2002 Published by Elsevier Science Ltd.