An approach to integrated water resources management based on neuro-dynamic programming (NDP), with an improved technique for speeding up its artificial neural network (ANN) training phase, is presented. When dealing with networks of water resources, stochastic dynamic programming provides an effective solution methodology but suffers from the so-called “curse of dimensionality”, which rapidly makes the problem intractable. NDP can appreciably mitigate this drawback by approximating the solution with ANNs. In real-world applications, however, NDP is considerably slowed by this very ANN training phase. To overcome this limitation, a new training architecture (SIEVE: Selective Improvement by Evolutionary Variance Extinction) has been developed. This paper introduces the new approach theoretically and presents some preliminary results obtained on a real-world case study.
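To make the “curse of dimensionality” concrete, the following minimal sketch counts the entries of a tabular value function when each reservoir's storage is discretized independently. The function name and the choice of 20 storage levels are illustrative assumptions, not taken from the paper.

```python
def tabular_sdp_size(levels_per_reservoir: int, n_reservoirs: int) -> int:
    """Number of joint storage states when every reservoir is discretized
    into the same number of levels (hypothetical helper for illustration)."""
    return levels_per_reservoir ** n_reservoirs


# The table grows exponentially with the number of reservoirs in the network.
for n in (1, 2, 5, 10):
    print(n, tabular_sdp_size(20, n))
```

Each of these states must be visited at every stage of the recursion, which is why an approximation of the value function (here, an ANN) becomes attractive as the network grows.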
A short product design cycle is critical for the success of companies in the era of time-based competition, especially when companies distribute design activities across regions to better penetrate local markets. Effectively managing distributed design activities, however, is extremely difficult because the intrinsic uncertainties of the design process are compounded by complicating factors such as the high re-design effort caused by starting activities early with preliminary information, and each organization's proprietary information and decision-making autonomy. This paper studies the scheduling and coordination of distributed design projects. The interrelationship between re-design effort and activities' start times is captured as the “progress dependent re-design probability” in a decentralized optimization model. A new stochastic dynamic programming algorithm is developed to handle the re-design probabilities under a two-level Lagrangian relaxation framework. Numerical results demonstrate that an appropriate tradeoff between early start and low re-design effort can be achieved, and that near-optimal solutions are obtained without accessing individual organizations' proprietary information or intruding on their decision-making autonomy.
An approximate dynamic programming (ADP) strategy for a dual adaptive control problem is presented. An optimal control policy for a dual adaptive control problem can be derived by solving a stochastic dynamic programming problem, which is computationally intractable using conventional solution methods that involve sampling of the complete hyperstate space. To solve the problem in a computationally amenable manner, we perform closed-loop simulations with different control policies to generate a data set that defines a subset of the hyperstate space within which the Bellman equation is iterated. A local approximator with a penalty function is designed for estimating cost-to-go values over the continuous hyperstate space. An integrating process with an unknown gain is used for illustration.
In this paper, we investigate an optimal investment strategy for a defined-contribution (DC) pension plan under a hybrid stochastic volatility (Heston-Hull-White) model, taking into account inflation risk and stochastic salary. The fund wealth is invested in a financial market consisting of a risk-free asset, an inflation-indexed bond and a stock following the hybrid Heston-Hull-White model. The goal of the pension fund manager is to maximize the expected utility of the terminal real wealth. We derive the Hamilton-Jacobi-Bellman (HJB) equation through the dynamic programming principle and, under the constant relative risk aversion (CRRA) utility function, obtain the optimal investment strategy. Finally, a numerical example is presented to characterize the impact of financial parameters on the optimal investment strategy.
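For orientation, a derivation of this kind usually takes the following generic shape. This is a sketch under assumed notation, not the paper's own equations: X denotes real wealth, Y collects the remaining state variables (for example variance, interest rate, inflation and salary), π is the investment strategy, γ is the relative risk aversion coefficient, and 𝓛^π is the infinitesimal generator of the controlled process (X, Y).

```latex
% CRRA utility of terminal real wealth (gamma is an assumed symbol)
U(x) = \frac{x^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0,\ \gamma \neq 1

% Value function: best expected utility attainable from state (t, x, y)
V(t, x, y) = \sup_{\pi} \, \mathbb{E}\left[ U\!\left(X_T^{\pi}\right) \,\middle|\, X_t = x,\ Y_t = y \right]

% HJB equation obtained from the dynamic programming principle
\sup_{\pi} \left\{ \partial_t V(t, x, y) + \mathcal{L}^{\pi} V(t, x, y) \right\} = 0,
\qquad V(T, x, y) = U(x)
```

The explicit form of 𝓛^π depends on the model's dynamics, which the abstract does not state, so it is left abstract here.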
Aquifer pumping represents, in many geographical locations, an alternative and/or a complementary source of water to surface water supplies. Several Catalonian coastal towns in the northeastern corner of Spain are in this situation. Also, since pumped water is used to supply drinking water, the main purpose in managing these water resources is to supply, no matter the cost, the amount needed at every moment. In other words, the managers of these aquifers attempt to optimize firm water yield. If we think of these aquifers as underground reservoirs with fixed storage capacity, most of the techniques which are applied to surface reservoirs can be implemented. In this paper we use a stochastic dynamic programming model to optimize the yield from the aquifer of the Ridaura River. The objective function in this model was chosen with the aim of maximizing the reliability of the target yields in each of four seasons.
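The kind of model described here can be sketched as a finite-horizon backward induction over seasons. The sketch below is a simplified illustration, not the paper's formulation: storage and recharge are integer units, the pumping decision is taken before the season's recharge is observed, and the objective counts the expected number of seasons in which the target yield is met. All names and numbers are hypothetical.

```python
def solve_sdp(capacity, targets, recharge_dists, n_years):
    """Backward induction for a seasonal storage model (illustrative only).

    capacity       -- maximum storage, in integer units
    targets        -- target yield for each season
    recharge_dists -- one dict {recharge_amount: probability} per season
    n_years        -- planning horizon in years

    Returns (value, policy): value[s] is the expected number of seasons in
    which the target is met when starting with storage s, and policy[k][s]
    is the amount to pump at stage k with storage s.
    """
    n_seasons = len(targets)
    horizon = n_years * n_seasons
    value = [0.0] * (capacity + 1)  # nothing left to earn after the last stage
    policy = []
    for stage in reversed(range(horizon)):
        season = stage % n_seasons
        new_value = [0.0] * (capacity + 1)
        stage_policy = [0] * (capacity + 1)
        for s in range(capacity + 1):
            best, best_r = -1.0, 0
            for r in range(s + 1):  # cannot pump more than is stored
                reward = 1.0 if r >= targets[season] else 0.0
                # Expected future value; storage above capacity is lost.
                future = sum(
                    p * value[min(capacity, s - r + q)]
                    for q, p in recharge_dists[season].items()
                )
                if reward + future > best + 1e-12:
                    best, best_r = reward + future, r
            new_value[s] = best
            stage_policy[s] = best_r
        value = new_value
        policy.append(stage_policy)
    policy.reverse()  # policy[0] now corresponds to the first stage
    return value, policy


# Hypothetical four-season example.
targets = [1, 1, 2, 1]
recharge = [{0: 0.2, 1: 0.5, 2: 0.3}, {0: 0.4, 1: 0.4, 2: 0.2},
            {0: 0.6, 1: 0.3, 2: 0.1}, {0: 0.3, 1: 0.5, 2: 0.2}]
value, policy = solve_sdp(capacity=4, targets=targets,
                          recharge_dists=recharge, n_years=5)
print("expected reliability from full storage:", value[4] / (5 * len(targets)))
```

Dividing the optimal value by the number of stages gives the expected fraction of seasons in which the target is met, which is one simple way to read "reliability" in this setting.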
A simplified inspection scenario is considered in which a Micro Air Vehicle with limited endurance is tasked with search and classification in a multi-target environment where false targets (clutter) are present. The sequential inspection operation, which includes a human operator for classification, is modelled, and a nonlinear discrete-time stochastic control problem is formulated. An analytic, closed-form, optimal control law is derived.
This paper deals with the joint ordering and pricing decision for a short-life-cycle product under stochastic multiplicative demand that depends on the selling price. Following the marketing practice in which retailers sell their products in different periods under different marketing policies, we formulate the joint decision problem as a stochastic dynamic programming model from the viewpoint of the centralized system. We then prove that the expected profit function is concave in each of the decision vectors, and develop a decision method for ordering and pricing. Lastly, we design an iterative search algorithm to find the optimal decision vectors.
Farm size and production costs are varied in a six-state-variable stochastic dynamic programming model that quantifies monthly hedging, storage, and cash cotton sale decisions for an Alabama cotton producer. The state variables considered are: (1) cash cotton price; (2) basis level; (3) before-tax income level; (4) cotton holdings; (5) futures position; and (6) value of futures position. Results indicate that when farm size and production cost level differ, marketing decisions diverge the most for cash cotton sales at the end of the tax year and in the lower ranges of the cash price (less than $.65/lb.), basis (less than -$.05/lb.), and before-tax income (less than $0.00) states.
Vehicle-based GLOSA (Green Light Optimal Speed Advisory) systems use information about the next switching time of the traffic lights to calculate fuel-efficient position and velocity profiles for connected vehicles, according to their current state (position and speed). A stochastic optimal control problem was recently proposed to address the GLOSA problem in cases where the next switching time is decided in real time and is therefore uncertain in advance. The corresponding numerical solution via SDP (stochastic dynamic programming) requires substantial computation time (a few minutes), which rules out solving the problem on the vehicle’s computer in real time. This work considers the same stochastic problem of optimal trajectory specification for vehicles approaching a signalized junction with traffic signals operated in real-time (adaptive) mode, due to which the next switching time is stochastic. However, a modified version of dynamic programming, known as Discrete Differential Dynamic Programming (DDDP), is used for the numerical solution of the stochastic optimal control problem. It is demonstrated, based on a realistic example, that the DDDP algorithm achieves results equivalent to those obtained with the ordinary SDP algorithm, but with significantly shorter computation time. Specifically, the solution is typically obtained in around 1 CPU-second, which is real-time feasible and would allow the DDDP calculations to be executed on the vehicle’s on-board computer.
The paper considers optimal control of vehicle speed when the vehicle is driven in a particular geographic region with specific terrain and traffic patterns. The vehicle route is assumed to be unknown in advance. The properties of the terrain and traffic flow are modeled stochastically. A method is proposed for constructing a control policy off-line that optimally prescribes the vehicle speed set-point as a function of current driving conditions, for the best on-average fuel economy and travel speed performance. A related method is proposed to evaluate the expected average fuel economy and travel speed performance of sub-optimal control policies, such as policies that use a constant speed offset relative to average traffic speed. The optimal control law that prescribes the vehicle speed set-point can be deployed in advanced vehicle cruise control systems or incorporated into a driver advisory function. In addition, the value function of optimal or sub-optimal control policies may be used as a terminal cost in a receding horizon optimization of vehicle speed over routes with known initial segments, or for fuel-efficient vehicle routing.