Search results - Inner Mongolia University Library

Neural-network-based synchronous iteration learning method for multi-player zero-sum games

NEUROCOMPUTING, 2017, vol. 242, pp. 73-82

Authors: Song, Ruizhuo; Wei, Qinglai; Song, Biao. Affiliations: Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China

In this paper, a synchronous solution method for multi-player zero-sum (ZS) games without system dynamics is established based on neural networks. The policy iteration (PI) algorithm is presented to solve the Hamilton-Jacobi-Bellman (HJB) equation. It is proven that the obtained iterative cost function converges to the optimal game value. To avoid the need for system dynamics, an off-policy learning method is given to obtain the iterative cost function, controls and disturbances based on PI. A critic neural network (CNN), action neural networks (ANNs) and disturbance neural networks (DNNs) are used to approximate the cost function, controls and disturbances. The weights of the neural networks compose the synchronous weight matrix, and the uniform ultimate boundedness (UUB) of the synchronous weight matrix is proven. Two examples are given to show the effectiveness of the proposed synchronous solution method for multi-player ZS games. (C) 2017 Elsevier B.V. All rights reserved.

Keywords: Adaptive dynamic programming; Approximate dynamic programming; Adaptive critic designs; Multi-player; Iteration learning; Neural network

Drift counteraction optimal control for deterministic systems and enhancing convergence of value iteration

AUTOMATICA, 2017, vol. 83, pp. 108-115

Authors: Zidek, Robert A. E.; Kolmanovsky, Ilya V. Affiliation: Univ Michigan, Dept Aerosp Engn, Ann Arbor, MI 48109, USA

The paper treats a class of optimal control problems for deterministic nonlinear discrete-time systems with the objective of maximizing the time or total yield until prescribed constraints are violated. Such problems are referred to as drift counteraction optimal control (DCOC) problems as the corresponding control policy may be viewed as optimally counteracting drift imposed by disturbances or system dynamics. We derive conditions for the existence of an optimal solution. The optimal control policy is characterized by the value function and a new algorithm based on proportional feedback is presented that obtains the value function faster than conventional dynamic programming algorithms. In addition, an approximate dynamic programming (ADP) approach using Gaussian process regression is formulated based on the new algorithm. Two numerical examples are reported, a time maximization problem for a van der Pol oscillator and a satellite life extension problem. (C) 2017 Elsevier Ltd. All rights reserved.

Keywords: Optimal control; Constrained nonlinear systems; Approximate dynamic programming; Aerospace applications
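As a rough illustration of the time-maximization problem described in this abstract, the Python sketch below runs conventional value iteration on a made-up toy system. It is the kind of baseline the abstract contrasts with, not the authors' proportional-feedback algorithm. The state `(x, fuel)`, the constant drift of +1 per step, the single thrust action, and the safe set `0 <= x <= x_max` are all assumptions invented for the example.

```python
def dcoc_value_iteration(x_max, fuel_max, max_sweeps=1000):
    """Value iteration for a hypothetical drift-counteraction toy problem.

    State: (x, fuel). Each step the position drifts by +1; thrusting (u = 1)
    uses one unit of fuel and shifts the position by -2, for a net move of -1.
    The constraint is 0 <= x <= x_max. V[(x, fuel)] is the largest number of
    steps that can be taken before the constraint is violated.
    """
    states = [(x, f) for x in range(x_max + 1) for f in range(fuel_max + 1)]
    V = {s: 0 for s in states}  # start from the trivial lower bound
    for _ in range(max_sweeps):
        changed = False
        for (x, f) in states:
            best = 0  # every action violates the constraint: no safe step left
            for u in (0, 1):
                if u > f:
                    continue  # cannot thrust without fuel
                x_next, f_next = x + 1 - 2 * u, f - u
                if 0 <= x_next <= x_max:
                    # one more safe step plus the best continuation
                    best = max(best, 1 + V[(x_next, f_next)])
            if best != V[(x, f)]:
                V[(x, f)] = best
                changed = True
        if not changed:
            break  # fixed point of the Bellman operator reached
    return V
```

In this toy model the fuel is finite, so every trajectory eventually leaves the safe set and the value function stays bounded. That is why the sweep terminates.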

Meeting Inelastic Demand in Systems With Storage and Renewable Sources

IEEE TRANSACTIONS ON SMART GRID, 2017, vol. 8, no. 4, pp. 1619-1629

Authors: Kwon, Soongeol; Xu, Yunjian; Gautam, Natarajan. Affiliations: Texas A&M Univ, Dept Ind & Syst Engn, College Stn, TX 77843, USA; Singapore Univ Technol & Design, Engn Syst & Design Pillar, Singapore 591401, Singapore

We consider a system where inelastic demand for electric power is met from three sources: 1) the grid; 2) in-house renewables such as solar panels; and 3) an in-house energy storage device. In our setting, energy demand, renewable power supply, and cost for grid power are all time-varying and stochastic. Furthermore, there are limits and inefficiency associated with charging and discharging the energy storage device. We formulate the storage operation problem as a dynamic program with parameters estimated from real-world demand, supply, and cost data. As the dynamic program is computationally intensive for large-scale problems, we explore algorithms based on approximate dynamic programming (ADP) and apply them to a test data set. Using the real-world test data, we numerically compare the performance of two ADP-based algorithms against Lyapunov optimization-based algorithms that require no statistical knowledge. Our results ascertain the value of storage and the value of installing a renewable source.

Keywords: approximate dynamic programming; energy storage; look-ahead policies; renewable generation; solar PV
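To make the "storage operation problem as a dynamic program" in this abstract more concrete, here is a minimal backward-induction sketch in Python. It is a deliberately simplified stand-in, not the authors' model: prices, demand and solar output are assumed to be known in advance (the paper treats them as stochastic), charging is lossless, surplus energy is curtailed rather than sold, and the function name and parameters are hypothetical.

```python
def storage_dp(prices, demand, solar, capacity, max_rate=1):
    """Backward induction for a deterministic toy storage problem.

    State: battery level in integer units (0..capacity).
    Action a: units charged (a > 0) or discharged (a < 0), |a| <= max_rate.
    Grid purchase in period t is max(0, demand[t] - solar[t] + a).
    Returns (V, policy) where V[t][s] is the minimal cost from period t
    onward starting at level s, and policy[t][s] is a minimizing action.
    """
    T = len(prices)
    V = [[0.0] * (capacity + 1) for _ in range(T + 1)]  # V[T][s] = 0
    policy = [[0] * (capacity + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for s in range(capacity + 1):
            best, best_a = float("inf"), 0
            for a in range(-max_rate, max_rate + 1):
                s_next = s + a
                if not 0 <= s_next <= capacity:
                    continue  # respect the storage limits
                # assumption: any surplus is curtailed, never sold back
                grid = max(0, demand[t] - solar[t] + a)
                cost = prices[t] * grid + V[t + 1][s_next]
                if cost < best:
                    best, best_a = cost, a
            V[t][s] = best
            policy[t][s] = best_a
    return V, policy
```

With prices `[1, 5]`, unit demand, no solar and one unit of storage, the sketch buys an extra unit in the cheap period and discharges it in the expensive one. Comparing `V[0][0]` with the `capacity=0` run gives a crude "value of storage" for this toy data.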

Data-driven approximate value iteration with optimality error bound analysis

AUTOMATICA, 2017, vol. 78, pp. 79-87

Authors: Li, Yongqiang; Hou, Zhongsheng; Feng, Yuanjing; Chi, Ronghu. Affiliations: Zhejiang Univ Technol, Coll Informat Engn, Hangzhou, Zhejiang, Peoples R China; Beijing Jiaotong Univ, Sch Elect & Informat Engn, Adv Control Syst Lab, Beijing, Peoples R China; Qingdao Univ Sci & Technol, Sch Automat & Elect Engn, Qingdao, Peoples R China

Features of the data-driven approximate value iteration (AVI) algorithm, proposed in Li et al. (2014) for dealing with the optimal stabilization problem, include that only process data is required and that the estimate of the domain of attraction for the closed loop is enlarged. However, the controller generated by the data-driven AVI algorithm is an approximate solution to the optimal control problem. In this work, a quantitative analysis of the error bound between the optimal cost and the cost under the designed controller is given. This error bound is determined by the approximation error of the estimate of the optimal cost and the approximation error of the controller function estimator. The first of these is in turn determined by the approximation error of the data-driven dynamic programming (DP) operator relative to the DP operator and the approximation error of the value function estimator. These three approximation errors are zero when the data set of the plant is sufficient and infinitely complete, and the number of samples in the state space of interest is infinite. This means that the cost under the designed controller equals the optimal cost when the number of iterations is infinite. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Data-driven control; Approximate dynamic programming; Domain of attraction; Asymptotic stabilization

Managing Patient Admissions in a Neurology Ward

OPERATIONS RESEARCH, 2017, vol. 65, no. 3, pp. 635-656

Authors: Samiedaluie, Saied; Kucukyazici, Beste; Verter, Vedat; Zhang, Dan. Affiliations: Univ Alberta, Alberta Sch Business, Edmonton, AB T6G 2R6, Canada; McGill Univ, Desautels Fac Management, Montreal, PQ H3A 1G5, Canada; Univ Colorado Boulder, Leeds Sch Business, Boulder, CO 80309, USA

We study patient admission policies in a neurology ward where there are multiple types of patients with different medical characteristics. Patients receive specialized care inside the neurology ward, and delays in admission to the ward have a negative impact on their health status. The level of this impact varies among patient types and depends on the severity of the patients' conditions. Patients also differ in terms of arrival rate and length of stay at the ward. The patients normally wait in the emergency department until a ward bed is assigned to them. We formulate this problem as an infinite-horizon average cost dynamic program and propose an efficient approximation scheme to solve large-scale problem instances. The computational results from applying our model to a neurology ward show that dynamic policies generated by our approach can reduce the overall deterioration in patients' health status compared to several alternative policies.

Keywords: patient admission; neurology ward; approximate dynamic programming

Mixed Iterative Adaptive Dynamic Programming for Optimal Battery Energy Control in Smart Residential Microgrids

IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2017, vol. 64, no. 5, pp. 4110-4120

Authors: Wei, Qinglai; Liu, Derong; Lewis, Frank L.; Liu, Yu; Zhang, Jie. Affiliations: Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; Univ Chinese Acad Sci, Beijing 100049, Peoples R China; Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China; Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118, USA; Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; Qingdao Acad Intelligent Ind, Qingdao 266000, Shandong, Peoples R China

In this paper, a novel mixed iterative adaptive dynamic programming (ADP) algorithm is developed to solve the optimal battery energy management and control problem in smart residential microgrid systems. Based on the data of the load and electricity rate, two iterations are constructed, which are P-iteration and V-iteration, respectively. The V-iteration is implemented based on value iteration, which aims to obtain the iterative control law sequence in each period. The P-iteration is implemented based on policy iteration, which updates the iterative value function according to the iterative control law sequence. Properties of the developed mixed iterative ADP algorithm are analyzed. It is shown that the iterative value function is monotonically nonincreasing and converges to the solution of the Bellman equation. In each iteration, it is proven that the performance index function is finite under the iterative control law sequence. Finally, numerical results and comparisons are given to illustrate the performance of the developed algorithm.

Keywords: Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; mixed iteration; optimal control; policy iteration; smart grid; value iteration

Discrete-Time Optimal Control via Local Policy Iteration Adaptive Dynamic Programming

IEEE TRANSACTIONS ON CYBERNETICS, 2017, vol. 47, no. 10, pp. 3367-3379

Authors: Wei, Qinglai; Liu, Derong; Lin, Qiao; Song, Ruizhuo. Affiliations: Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China

In this paper, a discrete-time optimal control scheme is developed via a novel local policy iteration adaptive dynamic programming algorithm. In the discrete-time local policy iteration algorithm, the iterative value function and iterative control law can be updated in a subset of the state space, where the computational burden is relaxed compared with the traditional policy iteration algorithm. Convergence properties of the local policy iteration algorithm are presented to show that the iterative value function is monotonically nonincreasing and converges to the optimum under some mild conditions. The admissibility of the iterative control law is proven, which shows that the control system can be stabilized under any of the iterative control laws, even if the iterative control law is updated in a subset of the state space. Finally, two simulation examples are given to illustrate the performance of the developed method.

Keywords: Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; local policy iteration; neuro-dynamic programming; nonlinear systems; optimal control
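The idea in this abstract of updating the control law only on a subset of the state space can be illustrated with a small tabular sketch in Python. This is not the authors' algorithm. It is ordinary policy iteration on a hypothetical chain of integer states in which each iteration improves the policy only on a chosen subset, while policy evaluation is still done over all states. The dynamics, the stage cost `x + u`, and every function name here are assumptions made for the example.

```python
ACTIONS = (1, 2)  # hypothetical controls: move 1 or 2 cells toward the origin


def step(x, u):
    """Deterministic toy dynamics: move toward state 0, which is absorbing."""
    return max(x - u, 0)


def stage_cost(x, u):
    """Hypothetical utility: zero at the origin, x + u elsewhere."""
    return 0 if x == 0 else x + u


def evaluate(policy, n_states):
    """Policy evaluation; one pass suffices because every move decreases x."""
    V = [0] * n_states
    for x in range(1, n_states):
        u = policy[x]
        V[x] = stage_cost(x, u) + V[step(x, u)]
    return V


def local_policy_iteration(n_states, subsets):
    """Policy iteration where each iteration improves only one subset of states."""
    policy = [1] * n_states  # admissible start: always move one cell
    V = evaluate(policy, n_states)
    history = [V]
    for subset in subsets:
        for x in subset:
            if x == 0:
                continue  # nothing to decide at the absorbing origin
            # greedy improvement against the value function of the old policy
            policy[x] = min(ACTIONS, key=lambda u: stage_cost(x, u) + V[step(x, u)])
        V = evaluate(policy, n_states)
        history.append(V)
    return policy, history
```

Calling `local_policy_iteration(5, [[3, 4], [1, 2]])` improves states 3 and 4 first and states 1 and 2 second. In this toy run the successive value functions in `history` never increase at any state, which mirrors the monotonicity property the abstract reports for the real method.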

Dynamic pricing and reservation for intelligent urban parking management

TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2017, vol. 77, pp. 226-244

Authors: Lei, Chao; Ouyang, Yanfeng. Affiliation: Univ Illinois, Dept Civil & Environm Engn, Urbana, IL 61801, USA

Despite rapid advances in information technologies for intelligent parking systems, it remains a challenge to optimally manage limited parking resources in busy urban neighborhoods. In this paper, we use dynamic location-dependent parking pricing and reservation to improve system-wide performance of an intelligent parking system. With this system, the parking agency is able to decide the spatial and temporal distribution of parking prices to achieve a variety of objectives, while drivers with different origins and destinations compete for limited parking spaces via online reservation. We develop a multi-period non-cooperative bi-level model to capture the complex interactions among the parking agency and multiple drivers, as well as a non-myopic approximate dynamic programming (ADP) approach to solve the model. It is shown with numerical examples that the ADP-based pricing policy consistently outperforms alternative policies in achieving greater performance of the parking system, and shows reliability in handling the spatial and temporal variations in parking demand. (C) 2017 Elsevier Ltd. All rights reserved.

Keywords: Parking management; Dynamic pricing; Approximate dynamic programming; Equilibrium; MPEC

Dynamic Patient Scheduling for Multi-Appointment Health Care Programs

PRODUCTION AND OPERATIONS MANAGEMENT, 2018, vol. 27, no. 1, pp. 58-79

Authors: Diamant, Adam; Milner, Joseph; Quereshy, Fayez. Affiliations: York Univ, Schulich Sch Business, 111 Ian Macdonald Blvd, Toronto, ON M3J 1P3, Canada; Univ Toronto, Rotman Sch Management, 105 St George St, Toronto, ON M5S 3E6, Canada; Toronto Western Hosp, Gen Surg, 399 Bathurst St, Toronto, ON M5T 2S8, Canada

We investigate the scheduling practices of a multidisciplinary, multistage, outpatient health care program. Patients undergo a series of assessments before being eligible for elective surgery. Such systems often suffer from high rates of attrition and appointment no-shows leading to capacity underutilization and treatment delays. We propose a new scheduling model where the clinic assigns patients to an appointment day but postpones the decision of which assessments patients undergo pending the observation of who arrives. In doing so, the clinic gains flexibility to improve system performance. We formulate the scheduling problem as a Markov decision process and use approximate dynamic programming to solve it. We apply our approach to a dataset collected from a bariatric surgery program at a large tertiary hospital in Toronto, Canada. We examine the quality of our solutions via structural results and compare them with heuristic scheduling practices using a discrete-event simulation. By allowing multiple assessments, delaying their scheduling, and by optimizing over an appointment book, we show significant improvements in patient throughput, clinic profit, use of overtime, and staff utilization.

Keywords: service systems; appointment scheduling; approximate dynamic programming; no-shows; multiple appointments

Natural gas storage valuation via least squares Monte Carlo and support vector regression

ENERGY SYSTEMS-OPTIMIZATION MODELING SIMULATION AND ECONOMIC ASPECTS, 2017, vol. 8, no. 4, pp. 815-855

Authors: Malyscheff, Alexander M.; Trafalis, Theodore B. Affiliations: Univ Oklahoma, Sch Elect & Comp Engn, 110 W Boyd, Norman, OK 73019, USA; Univ Oklahoma, Sch Ind & Syst Engn, 202 W Boyd, Norman, OK 73019, USA

Least squares Monte Carlo (LSMC) approaches represent a computationally efficient method for the valuation of natural gas storage facilities. LSMC methods are computationally tractable while they simultaneously allow for a decoupling of the price path simulation from the optimization of the decision vector. However, selecting the appropriate features using traditional regression techniques can be challenging, particularly when several factors of uncertainty are assumed to drive the price process. In this paper we analyze a natural gas storage contract using a two-factor forward model whose parameters can be easily calibrated. For a forward curve derived from monthly averages of the NBP day-ahead contract from 2004 to 2009 we compute storage values based on a collection of spot price paths and price paths of a daily forward contract with a time to maturity of 30 days. We study the impact of additional pricing information in the form of a forward contract on the value of a gas storage facility. A comparison to the corresponding one-factor model is also included in our experiments. Value function approximation is carried out by employing a kernel-based regression technique in the form of support vector machine regression (SVR). We report out-of-sample results by simulating the targets for the next stage. We also carry out a search in the space of SVR parameters to identify the appropriate parameters for our experiments. Applying a spot trading strategy we observe a higher storage value for the one-factor model when compared to the corresponding two-factor model. With respect to the two-factor model we report that an approximation of the value function over both a spot and a forward contract increases storage value compared to a value function that is computed over a spot contract only.

Keywords: Natural gas storage valuation; Least squares Monte Carlo; Approximate dynamic programming; Support vector machine regression; Kernel methods
