Calculations of service availability of a High-Availability (HA) cluster are usually based on the assumption of load-independent machine availabilities. In this paper, we study the issues and show how the service availabilities can be calculated under the assumption that machine availabilities are load dependent. We present a Markov chain analysis to derive the steady-state service availabilities of an HA cluster with load-dependent machine availabilities. We show that with load-dependent machine availabilities, the attained service availability becomes policy dependent. After formulating the problem as a Markov Decision Process, we determine the optimal policy that achieves the maximum service availabilities using the method of policy iteration. Two greedy assignment algorithms are studied: least load and first derivative length (FDL) based, where least load corresponds to some load balancing algorithms. We carry out the analysis and simulations on two cases of load profiles: in the first profile, a single machine has the capacity to host all services in the HA cluster; in the second profile, a single machine does not have enough capacity to host all services. We show that the service availabilities achieved under the first load profile are the same, whereas those achieved under the second load profile are different. Since the service availabilities differ under the second load profile, we investigate how the distribution of service availabilities across the services can be controlled by adjusting the rewards vector.
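For reference, the following is a minimal sketch of policy iteration on a generic finite Markov Decision Process, the solution method named in this abstract. The transition tensor, reward matrix, and discount factor are illustrative placeholders and are not the HA-cluster availability model or rewards vector from the paper.

```python
# Minimal policy-iteration sketch for a generic finite MDP (discounted reward).
# P, R, and gamma below are illustrative placeholders, not the paper's model.
import numpy as np

def policy_iteration(P, R, gamma=0.95):
    """P[a, s, s'] = transition probability, R[s, a] = expected reward."""
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)           # start from an arbitrary policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, np.arange(n_states), :]
        R_pi = R[np.arange(n_states), policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to V.
        Q = R.T + gamma * P @ V                      # Q[a, s]
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_states, n_actions = 4, 2
    P = rng.random((n_actions, n_states, n_states))
    P /= P.sum(axis=2, keepdims=True)                # normalize rows to probabilities
    R = rng.random((n_states, n_actions))
    pi, V = policy_iteration(P, R)
    print("optimal policy:", pi)
```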
Author: Tadic, V.B. (Univ Sheffield, Dept Automat Control & Syst Engn, Sheffield S1 3JD, S Yorkshire, England)
The mean-square asymptotic behavior of temporal-difference learning algorithms with constant step-sizes and linear function approximation is analyzed in this paper. The analysis is carried out for the case of a discounted cost function associated with a Markov chain with a finite-dimensional state-space. Under mild conditions, an upper bound for the asymptotic mean-square error of these algorithms is determined as a function of the step-size. Moreover, under the same assumptions, it is also shown that this bound is linear in the step-size. The main results of the paper are illustrated with examples related to M/G/1 queues and nonlinear AR models with Markov switching.
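For concreteness, here is a minimal sketch of the class of algorithms analyzed: TD(0) with linear function approximation and a constant step-size, evaluating the discounted cost of a fixed Markov chain. The toy chain, feature matrix, and step-size are illustrative assumptions, not the paper's examples.

```python
# Sketch of TD(0) with linear function approximation and a constant step-size.
# The chain, features, and step-size are illustrative, not the paper's examples.
import numpy as np

def td0_linear(P, r, phi, gamma=0.9, alpha=0.01, n_steps=50_000, seed=0):
    """P: (S,S) transition matrix, r: (S,) one-stage costs, phi: (S,d) features."""
    rng = np.random.default_rng(seed)
    n_states, d = phi.shape
    theta = np.zeros(d)
    s = 0
    for _ in range(n_steps):
        s_next = rng.choice(n_states, p=P[s])
        # Temporal-difference error for the discounted cost function.
        delta = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
        theta += alpha * delta * phi[s]              # constant step-size update
        s = s_next
    return theta

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    S, d = 5, 3
    P = rng.random((S, S)); P /= P.sum(axis=1, keepdims=True)
    r = rng.random(S)
    phi = rng.random((S, d))
    print("learned weights:", td0_linear(P, r, phi))
```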
The problem addressed in this study is that of determining how to allocate the workstation processing and buffering capacity in a capacitated re-entrant line to the job instances competing for it, in order to maximize its long-run/steady-state throughput, while maintaining the logical correctness of the underlying material flow, i.e., deadlock-free operations. An approximation scheme for the optimal policy that is based on neuro-dynamic programming theory is proposed, and its performance is assessed through a numerical experiment. The derived results indicate that the proposed method holds considerable promise for providing a viable, computationally efficient approach to the problem and highlight directions for further investigation.
This paper reviews dynamic programming (DP), surveys approximate solution methods for it, and considers their applicability to process control problems. Reinforcement Learning (RL) and neuro-dynamic programming (NDP), which can be viewed as approximate DP techniques, are already established techniques for solving difficult multi-stage decision problems in the fields of operations research, computer science, and robotics. Owing to the significant disparity of problem formulations and objectives, however, the algorithms and techniques available from these fields are not directly applicable to process control problems, and reformulations based on an accurate understanding of these techniques are needed. We categorize the currently available approximate solution techniques for dynamic programming and identify those most suitable for process control problems. Several open issues are also identified and discussed.
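To make the baseline concrete, the following is a minimal sketch of exact tabular value iteration, i.e., the DP recursion that the approximate methods surveyed here (RL, NDP) seek to approximate when the state space is too large to enumerate. The toy MDP is illustrative only.

```python
# Minimal tabular value iteration: the exact Bellman recursion that approximate
# DP methods (RL, NDP) emulate. The toy MDP below is illustrative only.
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """P[a, s, s'] transition probabilities, R[s, a] expected one-stage rewards."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        Q = R.T + gamma * P @ V                      # Bellman backup, Q[a, s]
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    A, S = 2, 6
    P = rng.random((A, S, S)); P /= P.sum(axis=2, keepdims=True)
    R = rng.random((S, A))
    V, policy = value_iteration(P, R)
    print("greedy policy:", policy)
```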
After a brief review of recent developments in the pricing and hedging of American options, this paper modifies the basis function approach to adaptive control and neuro-dynamic programming, and applies it to develop: 1) nonparametric pricing formulas for actively traded American options and 2) simulation-based optimization strategies for complex over-the-counter options, whose optimal stopping problems are prohibitively difficult to solve numerically by standard backward induction algorithms because of the curse of dimensionality. An important issue in this approach is the choice of basis functions, for which some guidelines and their underlying theory are provided.
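As an illustration of the general idea of simulation plus regression on basis functions for an optimal stopping problem, here is a minimal sketch of pricing a Bermudan put by Monte Carlo with a polynomial basis (in the spirit of least-squares Monte Carlo). The geometric Brownian motion model, the cubic polynomial basis, and all parameters are placeholder assumptions, not the nonparametric formulas or basis-function guidelines developed in the paper.

```python
# Sketch of simulation-based pricing of a Bermudan put via regression on basis
# functions. The GBM dynamics, polynomial basis, and parameters are assumptions.
import numpy as np

def bermudan_put_lsmc(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
                      n_steps=50, n_paths=20_000, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Simulate geometric Brownian motion paths of the underlying price.
    z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1))
    payoff = np.maximum(K - S[:, -1], 0.0)           # value if held to maturity
    for t in range(n_steps - 2, -1, -1):
        payoff *= np.exp(-r * dt)                    # discount one step back
        St = S[:, t]
        itm = K - St > 0                             # regress only on in-the-money paths
        if itm.any():
            basis = np.vander(St[itm], 4)            # cubic polynomial basis functions
            coef, *_ = np.linalg.lstsq(basis, payoff[itm], rcond=None)
            continuation = basis @ coef
            exercise = K - St[itm]
            ex_now = exercise > continuation         # exercise where immediate payoff wins
            payoff[np.where(itm)[0][ex_now]] = exercise[ex_now]
    return np.exp(-r * dt) * payoff.mean()

if __name__ == "__main__":
    print("Bermudan put value ~", round(bermudan_put_lsmc(), 3))
```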
In this paper, we present a simulation-based dynamic programming method that learns the 'cost-to-go' function in an iterative manner. The method is intended to combat two important drawbacks of the conventional Model Predictive Control (MPC) formulation: the potentially exorbitant online computational requirement and the inability to consider the future interplay between uncertainty and estimation in the optimal control calculation. We use a nonlinear Van de Vusse reactor to investigate the efficacy of the proposed approach and identify further research issues.
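The following is a minimal sketch of the general recipe this abstract describes: simulate closed-loop trajectories under a suboptimal controller, fit an approximate cost-to-go from that data, and then solve only a one-step-ahead problem online. The scalar toy plant, quadratic stage cost, proportional controller, and polynomial approximator are placeholder assumptions, not the Van de Vusse reactor model.

```python
# Sketch: roll out a suboptimal controller, fit an approximate cost-to-go from
# the simulated data, then replace long-horizon MPC by a one-step-ahead search.
import numpy as np

def f(x, u):                                  # placeholder nonlinear plant model
    return 0.9 * x + 0.2 * np.sin(x) + u

def stage_cost(x, u):
    return x**2 + 0.1 * u**2

# 1) Generate closed-loop data under a simple suboptimal (proportional) controller.
rng = np.random.default_rng(0)
X, J = [], []
for _ in range(200):
    x0 = rng.uniform(-3, 3)
    cost, xk = 0.0, x0
    for k in range(50):                       # truncated-horizon cost as a cost-to-go sample
        u = -0.5 * xk                         # suboptimal feedback law
        cost += 0.98**k * stage_cost(xk, u)
        xk = f(xk, u)
    X.append(x0); J.append(cost)

# 2) Fit a polynomial cost-to-go approximation J_hat(x).
coef = np.polyfit(np.array(X), np.array(J), deg=4)
J_hat = lambda x: np.polyval(coef, x)

# 3) Online: replace the long-horizon MPC problem by a one-step-ahead search.
def one_step_controller(x, u_grid=np.linspace(-2, 2, 201)):
    q = [stage_cost(x, u) + 0.98 * J_hat(f(x, u)) for u in u_grid]
    return u_grid[int(np.argmin(q))]

print("u(1.5) =", one_step_controller(1.5))
```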
In this paper, we present how the approach of neuro-dynamic programming (NDP) can be used to combat two important deficiencies of the conventional Model Predictive Control (MPC) formulation: the sometimes exorbitant on-line computational requirement and the inability to consider the evolution of uncertainty in the optimal control calculation. We use a simple Van de Vusse reactor to investigate the feasibility of the proposed approach and identify further research issues.
We use a simulation-based approach to find the optimal feeding strategy for cloned invertase expression in Saccharomyces cerevisiae in a fed-batch bioreactor. The optimal strategy maximizes the productivity and minimizes the fermentation time. This procedure is motivated by the neuro-dynamic programming (NDP) literature, wherein the optimal solution is parameterized in the form of a cost-to-go or profit-to-go function. The proposed approach uses simulations from a heuristic feeding policy as a starting point to generate profit-to-go vs. state data. An artificial neural network is used to obtain profit-to-go as a function of the system state. Iterations of the Bellman equation are used to improve the profit function. The profit-to-go function thus obtained is then implemented in an online controller, which essentially converts the infinite-horizon problem into an equivalent one-step-ahead problem.
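As a sketch of the core approximation step, the code below fits a small neural network mapping the process state to profit-to-go, using data generated by simulating a heuristic feeding policy. The two-state toy "bioreactor" dynamics, the heuristic policy, and the crude profit definition are placeholder assumptions, not the cloned-invertase fermentation model from the paper.

```python
# Sketch: simulate a heuristic feeding policy, then fit a neural-network
# approximation of profit-to-go as a function of the state. Toy model only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def step(x, u):                       # toy fed-batch dynamics: biomass X and substrate S
    X, S = x
    growth = 0.4 * S / (0.5 + S) * X
    return np.array([X + 0.1 * growth, S + 0.1 * (u - growth)])

def heuristic_feed(x):                # simple heuristic feeding policy
    return 0.3 if x[1] < 1.0 else 0.0

states, profits = [], []
for _ in range(300):
    x = np.array([rng.uniform(0.1, 2.0), rng.uniform(0.1, 3.0)])
    xk, feed_used = x.copy(), 0.0
    for _ in range(60):               # simulate the remaining batch horizon
        u = heuristic_feed(xk)
        xk = step(xk, u)
        feed_used += u
    profit = xk[0] - x[0] - 0.05 * feed_used   # crude productivity-minus-feed proxy
    states.append(x); profits.append(profit)

# Neural-network approximation of profit-to-go as a function of the state.
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
net.fit(np.array(states), np.array(profits))
print("predicted profit-to-go at X=1, S=2:", net.predict([[1.0, 2.0]])[0])
```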
Optimal control of systems with complex nonlinear behaviour such as steady-state multiplicity results in a nonlinear optimization problem that needs to be solved online at each sample time. We present an approach based on simulation, function approximation and evolutionary improvement aimed at simplifying online optimization. Closed-loop data from a suboptimal control law, such as MPC based on successive linearization, are used to obtain an approximation of the 'cost-to-go' function, which is subsequently improved through iterations of the Bellman equation. Using this offline-computed cost approximation, an infinite-horizon problem is converted to an equivalent single-stage problem, substantially reducing the computational burden. This approach is tested on a continuous culture of microbes growing on a nutrient medium containing two substrates that exhibits steady-state multiplicity. Extrapolation of the cost-to-go function approximator can lead to deterioration of online performance. Some remedies to prevent such problems caused by extrapolation are proposed. Copyright (C) 2003 John Wiley & Sons, Ltd.
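The sketch below illustrates the offline improvement loop described above: starting from a cost-to-go fitted on closed-loop data from a suboptimal controller, repeatedly apply a one-step Bellman backup over sampled states and refit the approximator, so that the online problem becomes a single-stage search. The scalar toy system, quadratic cost, and polynomial approximator are placeholder assumptions, not the two-substrate continuous-culture model.

```python
# Sketch: Bellman-iteration improvement of a cost-to-go approximation fitted
# from closed-loop data under a suboptimal controller. Toy scalar system only.
import numpy as np

gamma = 0.98
u_grid = np.linspace(-2.0, 2.0, 101)

def f(x, u):                                    # placeholder nonlinear plant model
    return 0.9 * x + 0.2 * np.sin(x) + u

def stage_cost(x, u):
    return x**2 + 0.1 * u**2

rng = np.random.default_rng(0)
x_samples = rng.uniform(-3.0, 3.0, size=400)

# Initial cost-to-go: simulate a suboptimal proportional controller from each sample.
def rollout_cost(x, horizon=50):
    total = 0.0
    for k in range(horizon):
        u = -0.5 * x                             # suboptimal control law
        total += gamma**k * stage_cost(x, u)
        x = f(x, u)
    return total

coef = np.polyfit(x_samples, [rollout_cost(x) for x in x_samples], deg=4)

# Improvement: one-step Bellman backups over the sampled states, then refit.
for _ in range(20):
    J_hat = lambda x: np.polyval(coef, x)
    targets = [min(stage_cost(x, u) + gamma * J_hat(f(x, u)) for u in u_grid)
               for x in x_samples]
    coef = np.polyfit(x_samples, targets, deg=4)

# Online, the infinite-horizon problem reduces to this single-stage search.
def single_stage_controller(x):
    costs = [stage_cost(x, u) + gamma * np.polyval(coef, f(x, u)) for u in u_grid]
    return u_grid[int(np.argmin(costs))]

print("u(1.5) =", single_stage_controller(1.5))
```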
Markov decision processes (MDPs) may involve three types of delays. First, state information, rather than being available instantaneously, may arrive with a delay (observation delay). Second, an action may take effect at a later decision stage rather than immediately (action delay). Third, the cost induced by an action may be collected after a number of stages (cost delay). We derive two results, one for constant and one for random delays, for reducing an MDP with delays to an MDP without delays, which differs only in the size of the state space. The results are based on the intuition that costs may be collected asynchronously, i.e., at a stage other than the one in which they are induced, as long as they are discounted properly.
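To make the constant-delay reduction concrete, here is a minimal sketch of the standard state augmentation for a constant observation delay d: the delay-free MDP's state becomes the last observed state together with the d actions applied since it was observed. The wrapper and toy two-state chain below are illustrative assumptions, not the constructions or random-delay result from the paper.

```python
# Sketch: augment the state of a delayed MDP with the last observed state and
# the actions applied since, yielding an equivalent delay-free MDP. Toy example.
from collections import deque
import random

class DelayAugmentedMDP:
    """Wrap a delay-free simulator so the agent sees (s_{t-d}, a_{t-d}, ..., a_{t-1})."""

    def __init__(self, step_fn, init_state, init_action, delay):
        self.step_fn = step_fn
        self.state_buffer = deque([init_state] * (delay + 1), maxlen=delay + 1)
        self.action_buffer = deque([init_action] * delay, maxlen=delay)

    def augmented_state(self):
        # Oldest buffered state plus the actions applied since it was observed.
        return (self.state_buffer[0], tuple(self.action_buffer))

    def step(self, action):
        true_state = self.state_buffer[-1]
        next_state, cost = self.step_fn(true_state, action)
        self.state_buffer.append(next_state)
        self.action_buffer.append(action)
        return self.augmented_state(), cost

# Toy two-state chain with two actions, used only to exercise the wrapper.
def toy_step(s, a):
    s_next = s if random.random() < 0.7 else 1 - s
    return s_next, 1.0 if s_next != a else 0.0       # cost for "missing" the state

random.seed(0)
env = DelayAugmentedMDP(toy_step, init_state=0, init_action=0, delay=2)
for t in range(3):
    aug, cost = env.step(action=t % 2)
    print(f"t={t} augmented state={aug} cost={cost}")
```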