Search results - Inner Mongolia University Library

A simulation-and-regression approach for stochastic dynamic programs with endogenous state variables

COMPUTERS & OPERATIONS RESEARCH, 2013, vol. 40, no. 11, pp. 2760-2769

Authors: Denault, Michel; Simonato, Jean-Guy; Stentoft, Lars. Affiliations: HEC Montreal, Management Sci, Montreal, PQ, Canada; HEC Montreal, Finance, Montreal, PQ, Canada

We investigate the optimum control of a stochastic system, in the presence of both exogenous (control-independent) stochastic state variables and endogenous (control-dependent) state variables. Our solution approach relies on simulations and regressions with respect to the state variables, but also grafts the endogenous state variable into the simulation paths. That is, unlike most other simulation approaches found in the literature, no discretization of the endogenous variable is required. The approach is meant to handle several stochastic variables, offers a high level of flexibility in their modeling, and should be at its best in non-time-homogeneous cases, when the optimal policy structure changes with time. We provide numerical results for a dam-based hydropower application, where the exogenous variable is the stochastic spot price of power, and the endogenous variable is the water level in the reservoir. (C) 2013 Elsevier Ltd. All rights reserved.

Keywords: Stochastic control; approximate dynamic programming; Simulation and regression; Least-squares Monte Carlo; Hydropower management
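
The regression step that simulation-and-regression (least-squares Monte Carlo) methods rely on can be sketched as follows. This is a generic illustration, not the authors' algorithm: it does not graft an endogenous variable into the paths, and the function names `fit_linear` and `continuation_estimate`, the linear basis, and the `discount` parameter are all assumptions made for the example.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and variance (unnormalised; the factor cancels).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def continuation_estimate(states_now, values_next, discount=1.0):
    """Regress discounted next-step values on the current simulated states.

    Returns a function mapping a state to its estimated continuation value.
    A linear basis is used here only to keep the sketch short.
    """
    targets = [discount * v for v in values_next]
    a, b = fit_linear(states_now, targets)
    return lambda state: a + b * state
```

In a backward recursion, `states_now` would hold the simulated exogenous state (for example a spot price) at one time step across all paths, and `values_next` the values already computed for the following step.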

A switching robust model predictive control approach for nonlinear systems

JOURNAL OF PROCESS CONTROL, 2013, vol. 23, no. 6, pp. 852-860

Authors: Yang, Yu; Lee, Jong Min. Affiliations: Univ Alberta, Edmonton, AB T6G 2V4, Canada; Seoul Natl Univ, Inst Chem Proc, Sch Chem & Biol Engn, Seoul 151744, South Korea

This work considers enhancing the stability and improving the economic performance of nonlinear model predictive control in the presence of disturbances or model uncertainties. First, a robust control Lyapunov function (RCLF)-based predictive control strategy is proposed. Second, approximate dynamic programming (ADP) is employed to further improve regulation performance. Finally, ADP and RCLF-MPC are combined to provide a switching control scheme, which is illustrated on a CSTR example to show its effectiveness. (C) 2013 Elsevier Ltd. All rights reserved.

Keywords: Nonlinear model predictive control; Robust control Lyapunov function; approximate dynamic programming

Scenario-based, closed-loop model predictive control with application to emergency vehicle scheduling

INTERNATIONAL JOURNAL OF CONTROL, 2013, vol. 86, no. 8, pp. 1338-1348

Authors: Goodwin, Graham C.; Medioli, Adrian M. Affiliation: Univ Newcastle, Sch Elect Engn & Comp Sci, Callaghan, NSW 2308, Australia

Model predictive control has been a major success story in process control. More recently, the methodology has been used in other contexts, including automotive engine control, power electronics and telecommunications. Most applications focus on set-point tracking and use single-sequence optimisation. Here we consider an alternative class of problems motivated by the scheduling of emergency vehicles. Here disturbances are the dominant feature. We develop a novel closed-loop model predictive control strategy aimed at this class of problems. We motivate, and illustrate, the ideas via the problem of fluid deployment of ambulance resources.

Keywords: scenario-based MPC; stochastic control; emergency vehicles; re-deployment; approximate dynamic programming

A dynamic programming approximation for downlink channel allocation in cognitive femtocell networks

COMPUTER NETWORKS, 2013, vol. 57, no. 15, pp. 2976-2991

Authors: Xiang, Xudong; Wan, Jianxiong; Lin, Chuang; Chen, Xin. Affiliations: USTB, Dept Comp Sci & Technol, Beijing 100083, Peoples R China; Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China; Inner Mongolia Univ Technol, Coll Informat Engn, Hohhot 010080, Inner Mongolia, Peoples R China; Beijing Informat Sci & Technol Univ, Comp Sch, Beijing 100101, Peoples R China

Both femtocells and cognitive radio (CR) are envisioned as promising technologies for the NeXt Generation (xG) cellular networks. Cognitive femtocell networks (CogFem) incorporate CR technology into femtocell deployment to reduce its demand for more spectrum bands, thereby improving the spectrum utilization. In this paper, we focus on the channel allocation problem in CogFem, and formulate it as a stochastic dynamic programming (SDP) problem aiming at optimizing the long-term cumulative system throughput of individual femtocells. However, the multi-dimensional state variables resulting from complex exogenous stochastic information make the SDP problem computationally intractable using standard value iteration algorithms. To address this issue, we propose an approximate dynamic programming (ADP) algorithm in pursuit of an approximate solution to the SDP problem. The proposed ADP algorithm relies on an efficient value function approximation (VFA) architecture that we design and a stochastic gradient learning strategy, enabling each femtocell to learn and improve its own channel allocation policy. The algorithm is computationally attractive for large-scale downlink channel allocation problems in CogFem since its time complexity does not grow exponentially with the number of femtocells. Simulation results have shown that the proposed ADP algorithm exhibits great advantages: (1) it is feasible for online implementation with a fair rate of convergence and adaptability to both long-term and short-term network dynamics; and (2) it produces high-quality solutions fast, reaching approximately 80% of the upper bounds provided by optimal backward dynamic programming (DP) solutions to a set of deterministic counterparts of the formulated SDP problem. (C) 2013 Elsevier B.V. All rights reserved.

Keywords: Cognitive-radio; approximate dynamic programming; Femtocell; Channel allocation
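
A stochastic gradient update for a value function approximation, of the general kind the abstract above refers to, can be sketched as follows. This is a textbook-style illustration under stated assumptions, not the paper's VFA architecture: the linear-in-features form, the squared-error objective, and the names `sgd_update`, `theta`, `features`, `target` and `step_size` are all hypothetical.

```python
def sgd_update(theta, features, target, step_size):
    """One stochastic gradient step for a linear value function approximation.

    The approximation is V(s) = sum_i theta[i] * features[i], and the step
    reduces the squared error 0.5 * (V(s) - target)**2 for one sampled target.
    """
    prediction = sum(t * f for t, f in zip(theta, features))
    error = prediction - target
    # The gradient of the squared error with respect to theta[i]
    # is error * features[i].
    return [t - step_size * error * f for t, f in zip(theta, features)]
```

Here `target` stands for a sampled estimate of the value of the visited state, for example an observed reward plus the current approximation of the next state's value.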

Neuro-optimal control for a class of unknown nonlinear dynamic systems using SN-DHP technique

NEUROCOMPUTING, 2013, vol. 121, pp. 218-225

Authors: Wang, Ding; Liu, Derong. Affiliation: Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Inst Automat, Beijing 100190, Peoples R China

In this paper, the adaptive dynamic programming (ADP) approach is utilized to design a neural-network-based optimal controller for a class of unknown discrete-time nonlinear systems with a quadratic cost function. To begin with, a neural network identifier is constructed to learn the unknown dynamic system, with a stability proof. Then, the iterative ADP algorithm is developed to handle the nonlinear optimal control problem, with convergence analysis. Moreover, the single network dual heuristic dynamic programming (SN-DHP) technique, which eliminates the use of the action network, is introduced to implement the iterative ADP algorithm. Finally, two simulation examples are included to illustrate the effectiveness of the present approach. (C) 2013 Elsevier B.V. All rights reserved.

Keywords: Adaptive critic designs; Adaptive dynamic programming; approximate dynamic programming; Neural networks; Optimal control; Reinforcement learning

Adaptive Dynamic Programming Algorithm for Renewable Energy Scheduling and Battery Management

COGNITIVE COMPUTATION, 2013, vol. 5, no. 2, pp. 264-277

Authors: Boaro, Matteo; Fuselli, Danilo; De Angelis, Francesco; Liu, Derong; Wei, Qinglai; Piazza, Francesco. Affiliations: Univ Politecn Marche, Dipartimento Ingn Informaz, I-60131 Ancona, Italy; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

Intelligent energy management systems are likely to reduce consumption and thus save money for consumers. The residential load demand must be met, and some advantages can be obtained if specific optimization policies are adopted. With an efficient use of renewable sources and power imported from the grid, an intelligent and adaptive system that manages the battery is able to satisfy the load demand and minimize the total energy cost for the scenario under study. In this paper, an adaptive dynamic programming-based algorithm is presented to handle dynamic situations in which some environmental conditions or customer habits may vary with time, especially when using renewable energy. Based on the idea of the smart grid, we propose an intelligent management scheme for renewable resources combined with a battery, implemented with a faster and simpler dynamic programming scheme that considers only one critic network and some optimization policies in order to satisfy the load demand. Since this kind of problem makes it possible to avoid training an action network, the training loop between the two neural networks is removed and the training process is greatly simplified. Computer simulations confirm the effectiveness of this self-learning design in a typical residential scenario.

Keywords: Adaptive dynamic programming; approximate dynamic programming; Neural networks; Energy scheduling; Battery management

Assessing the Value of Dynamic Pricing in Network Revenue Management

INFORMS JOURNAL ON COMPUTING, 2013, vol. 25, no. 1, pp. 102-115

Authors: Zhang, Dan; Lu, Zhaosong. Affiliations: Univ Colorado, Leeds Sch Business, Boulder, CO 80309, USA; Simon Fraser Univ, Dept Math, Burnaby, BC V5A 1S6, Canada

Dynamic pricing for a network of resources over a finite selling horizon has received considerable attention in recent years, yet few papers provide effective computational approaches to solve the problem. We consider a resource decomposition approach to solve the problem and investigate the performance of the approach in a computational study. We compare the performance of the approach to static pricing and choice-based availability control. Our numerical results show that dynamic pricing policies from network resource decomposition can achieve significant revenue lift compared with choice-based availability control and static pricing, even when the latter is frequently resolved. As a by-product of our approach, network decomposition provides an upper bound on revenue, which is provably tighter than the well-known upper bound from a deterministic approximation.

Keywords: revenue management; dynamic pricing; approximate dynamic programming; choice models

A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems

AUTOMATICA, 2013, vol. 49, no. 1, pp. 82-92

Authors: Bhasin, S.; Kamalapurkar, R.; Johnson, M.; Vamvoudakis, K. G.; Lewis, F. L.; Dixon, W. E. Affiliations: Indian Inst Technol, Dept Elect Engn, Delhi, India; Univ Florida, Dept Mech & Aerosp Engn, Gainesville, FL, USA; Univ Calif Santa Barbara, Ctr Control Dynam Syst & Computat CCDC, Santa Barbara, CA 93106, USA; Univ Texas Arlington, Automat & Robot Res Inst, Ft Worth, TX 76118, USA

An online adaptive reinforcement learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor-critic-identifier (ACI) is proposed to approximate the Hamilton-Jacobi-Bellman equation using three neural network (NN) structures: actor and critic NNs approximate the optimal control and the optimal value function, respectively, and a robust dynamic neural network identifier asymptotically approximates the uncertain system dynamics. An advantage of using the ACI architecture is that learning by the actor, critic, and identifier is continuous and simultaneous, without requiring knowledge of system drift dynamics. Convergence of the algorithm is analyzed using Lyapunov-based adaptive control methods. A persistence of excitation condition is required to guarantee exponential convergence to a bounded region in the neighborhood of the optimal control and uniformly ultimately bounded (UUB) stability of the closed-loop system. Simulation results demonstrate the performance of the actor-critic-identifier method for approximate optimal control. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: Learning control; Adaptive control; Optimal control; approximate dynamic programming; Actor-critic-identifier

Analysis of cross-price effects on markdown policies by using function approximation techniques

KNOWLEDGE-BASED SYSTEMS, 2013, vol. 53 (Nov.), pp. 173-184

Authors: Cosgun, Ozlem; Kula, Ufuk; Kahraman, Cengiz. Affiliations: Fatih Univ, Dept Ind Engn, TR-34500 Istanbul, Turkey; Sakarya Univ, Dept Ind Engn, Sakarya, Turkey; Istanbul Tech Univ, Dept Ind Engn, TR-80626 Istanbul, Turkey

Markdown policies for product groups having significant cross-price elasticity among each other should be jointly determined. However, finding optimal policies for product groups becomes computationally intractable as the number of products increases. Therefore, we formulate the problem as a Markov decision process and use an approximate dynamic programming approach to solve it. Since the state space is multidimensional and very large, the number of iterations required to learn the state values is enormous. Therefore, we use aggregation and neural networks in order to approximate the value function and to determine the optimal markdown policies approximately. In a numerical study, we provide insights on the behavior of markdown policies when one product is expensive and the other is cheap, and when both have the same price. We also provide insights and compare the markdown policies for the case in which there is a substitution effect between products and the case in which the products are independent. (C) 2013 Elsevier B.V. All rights reserved.

Keywords: approximate dynamic programming; Artificial neural networks; Cross-price elasticity; Markdown optimization; dynamic pricing; Aggregation

Multi-objective optimal control for a class of nonlinear time-delay systems via adaptive dynamic programming

SOFT COMPUTING, 2013, vol. 17, no. 11, pp. 2109-2115

Authors: Song, Ruizhuo; Xiao, Wendong; Wei, Qinglai. Affiliations: Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China; Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

In this paper, a novel multi-objective adaptive dynamic programming (ADP) method is constructed to obtain the optimal controller for a class of nonlinear time-delay systems. Using the weighted-sum technique, the original multi-objective optimal control problem is transformed into a single-objective one. An ADP method is established for nonlinear time-delay systems to solve the optimal control problem. A convergence analysis is given to demonstrate that the presented iterative performance index function sequence is convergent and that the closed-loop system is asymptotically stable. Neural networks are used to obtain the approximate control policy and the approximate performance index function, respectively. Two simulation examples are presented to illustrate the performance of the presented optimal control method.

Keywords: Optimal control; Adaptive dynamic programming; Multi-objective; approximate dynamic programming; Time-delay
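
The weighted-sum step mentioned in the abstract above can be written generically as follows. The symbols $J_i$ (the individual performance indices), $w_i$ (the weights), $m$ (the number of objectives) and $u$ (the control policy) are assumptions chosen for illustration and are not taken from the paper.

```latex
J_w(u) = \sum_{i=1}^{m} w_i \, J_i(u),
\qquad w_i \ge 0, \qquad \sum_{i=1}^{m} w_i = 1 .
```

With strictly positive weights, any minimizer of $J_w$ is Pareto-optimal for the original objectives $J_1, \dots, J_m$, which is why a single-objective method can then be applied to the scalarized problem.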
