In this paper, we propose new integer optimization models for the lot-sizing and scheduling problem with sequence-dependent setups, based on the general lot-sizing and scheduling problem. To incorporate setup crossover and carryover, we first propose a standard model that straightforwardly adapts a formulation technique from the literature. Then, as the main contribution, we propose a novel optimization model that incorporates the notion of time flow. We derive a family of valid inequalities with which to compare the tightness of the models' linear programming relaxations. In addition, we provide an approximate dynamic programming algorithm that estimates the value of a state using its lower and upper bounds. Then, we conduct computational experiments to demonstrate the competitiveness of the proposed models and the solution algorithm. The test results show that the newly proposed time-flow model has considerable advantages compared with the standard model in terms of tightness and solvability. The proposed algorithm also shows computational benefits over the standard mixed integer programming solver.
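As a rough illustration of the bound-based value estimation described above, the sketch below scores a state by a convex combination of a lower and an upper bound and uses that estimate in a one-step greedy ADP decision. All helper functions (lower_bound, upper_bound, transition, stage_cost, feasible_actions) and the mixing weight are hypothetical placeholders, not the paper's exact construction.

```python
# Sketch: estimating a state's value from lower and upper bounds as a
# simple convex combination. The bound functions and the mixing weight
# are illustrative assumptions, not the paper's exact algorithm.

def estimate_state_value(state, lower_bound, upper_bound, weight=0.5):
    """Return a value estimate for `state` bracketed by its bounds.

    lower_bound(state): e.g. an LP-relaxation bound (assumed helper).
    upper_bound(state): e.g. the cost of a feasible completion found
                        by a greedy heuristic (assumed helper).
    weight:             mixing parameter in [0, 1], tuned empirically.
    """
    lb = lower_bound(state)
    ub = upper_bound(state)
    assert lb <= ub, "bounds must bracket the true value"
    return (1.0 - weight) * lb + weight * ub


def adp_step(state, feasible_actions, transition, stage_cost,
             lower_bound, upper_bound):
    """One greedy ADP step: pick the action minimising the stage cost
    plus the bound-based estimate of the successor state's value."""
    return min(
        feasible_actions(state),
        key=lambda a: stage_cost(state, a)
        + estimate_state_value(transition(state, a), lower_bound, upper_bound),
    )
```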
Emergency service providers are expected to locate ambulances such that, in case of an emergency, patients can be reached in a time-efficient manner. Two fundamental decisions need to be made in real time. First, immediately after a request emerges, an appropriate vehicle needs to be dispatched and sent to the request's site. After having served a request, the vehicle needs to be relocated to its next waiting location. We propose a model and solve the underlying optimization problem using approximate dynamic programming (ADP), an emerging and powerful tool for solving stochastic and dynamic problems typically arising in the field of operations research. Empirical tests based on real data from the city of Vienna indicate that by deviating from the classical dispatching rules the average response time can be decreased from 4.60 to 4.01 minutes, which corresponds to an improvement of 12.89%. Furthermore, we show that it is essential to explicitly consider time-dependent information such as travel times and changes in the request volume. Ignoring the current time and its consequences during modeling and optimization leads to suboptimal decisions. (C) 2011 Elsevier B.V. All rights reserved.
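A minimal sketch of the kind of ADP dispatching rule the abstract describes: rather than always sending the closest idle vehicle, the dispatcher trades off the immediate (time-dependent) response time against an approximate value of the post-decision state. The helpers travel_time and post_decision_value, and the ambulance/request attributes, are assumptions for illustration.

```python
# Sketch of an ADP-style dispatching rule: send the ambulance minimising
# immediate response time plus an approximate value of the post-decision
# state (the remaining coverage once this vehicle is committed).
# `travel_time` and `post_decision_value` are assumed, time-dependent helpers.

def dispatch(request, idle_ambulances, now, travel_time, post_decision_value):
    best, best_cost = None, float("inf")
    for amb in idle_ambulances:
        response = travel_time(amb.location, request.location, now)
        remaining = [a for a in idle_ambulances if a is not amb]
        # Future cost approximated by the learned value of the state that
        # results once `amb` has been committed to this request.
        cost = response + post_decision_value(remaining, now)
        if cost < best_cost:
            best, best_cost = amb, cost
    return best
```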
The programmability and the virtualisation of network resources are crucial to deploy scalable Information and Communications Technology (ICT) services. The increasing demand for cloud services, mainly devoted to storage and computing, requires a new functional element, the Cloud Management Broker (CMB), aimed at managing multiple cloud resources to meet the customers' requirements and, simultaneously, to optimise their usage. This paper proposes a multi-cloud resource allocation algorithm that manages the resource requests with the aim of maximising the CMB revenue over time. The algorithm is based on Markov decision process modelling and relies on reinforcement learning techniques to find an approximate solution online.
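The abstract does not specify the learning algorithm; the sketch below is a generic tabular Q-learning loop for online request allocation, assuming a discretised utilisation state, a small action set (target cloud or rejection), and revenue as the reward signal.

```python
# Minimal tabular Q-learning sketch for online request allocation: the CMB
# observes a (discretised) resource-utilisation state, chooses which cloud
# (or rejection) to allocate a request to, and learns from the revenue
# received. State/action encodings and the environment are assumptions.
import random
from collections import defaultdict

Q = defaultdict(float)          # (state, action) -> estimated value
alpha, gamma, eps = 0.1, 0.95, 0.1

def choose_action(state, actions):
    if random.random() < eps:                         # explore
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])  # exploit

def update(state, action, revenue, next_state, next_actions):
    target = revenue + gamma * max(Q[(next_state, a)] for a in next_actions)
    Q[(state, action)] += alpha * (target - Q[(state, action)])
```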
Military air battle managers face several challenges when directing operations during quickly evolving combat scenarios. These scenarios require rapid assignment decisions to engage moving targets having dynamic flight paths. In defensive operations, the success of a sequence of air battle management decisions is reflected by the friendly force's ability to maintain air superiority and defend friendly assets. We develop a Markov decision process (MDP) model of a stochastic dynamic assignment problem, named the Air Battle Management Problem (ABMP), wherein a set of unmanned combat aerial vehicles (UCAV) must defend an asset from cruise missiles arriving stochastically over time. Attaining an exact solution using traditional dynamic programming techniques is computationally intractable. Hence, we utilize an approximate dynamic programming (ADP) technique known as approximate policy iteration with least squares temporal differences (API-LSTD) learning to find high-quality solutions to the ABMP. We create a simulation environment in conjunction with a generic yet representative combat scenario to illustrate how the ADP solution compares in quality to a reasonable, closest-intercept benchmark policy. Our API-LSTD policy improves mean success rate by 2.8% compared to the benchmark policy and offers an 81.7% increase in the frequency with which the policy performs perfectly. Moreover, we find the increased success rate of the ADP policy is, on average, equivalent to the success rate attained by the benchmark policy when using a 20% faster UCAV. These results inform military force management and defense acquisition decisions and aid in the development of more effective tactics, techniques, and procedures. Published by Elsevier B.V.
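For reference, a compact sketch of the LSTD step at the core of API-LSTD: linear value-function weights are fit from simulated transitions by solving the least-squares TD system. The feature construction and regularisation constant are assumptions.

```python
# Least-squares temporal differences (LSTD) sketch: given sampled
# transitions (phi(s), r, phi(s')) collected under the current policy,
# fit linear value-function weights w by solving A w = b.
import numpy as np

def lstd_weights(transitions, gamma=0.99, ridge=1e-6):
    """transitions: list of (phi_s, reward, phi_s_next) numpy vectors."""
    k = len(transitions[0][0])
    A = ridge * np.eye(k)          # small ridge term keeps A invertible
    b = np.zeros(k)
    for phi_s, r, phi_next in transitions:
        A += np.outer(phi_s, phi_s - gamma * phi_next)
        b += phi_s * r
    return np.linalg.solve(A, b)

# In approximate policy iteration, these weights define the value estimate
# phi(s) @ w used to greedily improve the assignment policy, after which
# new transitions are simulated and the weights are re-fit.
```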
This paper investigates the use of lattice point sets as an efficient method to sample uniformly the state space of discrete-time dynamic systems for the solution of finite-horizon optimal control problems using approximate dynamic programming. Lattice point sets are a kind of discretization method, commonly employed for efficient numerical integration, providing a regular and balanced sampling of the state space based on the repetition of elementary unit cells. A convergence analysis of the approximate solution of the control problem to the optimal one is provided, pointing out that such sampling schemes allow one to efficiently exploit possible regularities of the cost-to-go functions. Furthermore, it is shown that a higher accuracy may be obtained through suitable transformations of the state vector of the dynamic system. Another advantage of lattice point sets over other sampling schemes is the possibility of evaluating a priori the goodness of a given set over another through the explicit computation of a specific parameter. Simulation results concerning the optimal control of a water reservoirs system are presented to show the effectiveness of the proposed approach.
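A minimal sketch of a rank-1 lattice point set, one common lattice construction, scaled to a box-shaped state space; the generating vector used here is purely illustrative and would in practice be chosen to optimise a quality criterion.

```python
# Sketch of a rank-1 lattice point set used to sample a bounded state
# space: x_i = frac(i * z / n), rescaled to the box [lo, hi]^d. The
# generating vector z below is illustrative; in practice it is chosen
# (e.g. by a component-by-component search) to minimise a quality figure.
import numpy as np

def rank1_lattice(n, z, lo, hi):
    z = np.asarray(z, dtype=float)
    i = np.arange(n).reshape(-1, 1)
    unit_points = np.mod(i * z / n, 1.0)        # points in [0, 1)^d
    return lo + unit_points * (hi - lo)         # rescale to the state box

# Example: 1021 points in a 3-dimensional state space (e.g. reservoir levels).
samples = rank1_lattice(1021, z=[1, 306, 388],
                        lo=np.array([0.0, 0.0, 0.0]),
                        hi=np.array([100.0, 80.0, 120.0]))
```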
This paper examines approximate dynamic programming algorithms for the single-vehicle routing problem with stochastic demands from a dynamic or reoptimization perspective. The methods extend the rollout algorithm by implementing different base sequences (i.e., a priori solutions), look-ahead policies, and pruning schemes. The paper also considers computing the cost-to-go with Monte Carlo simulation in addition to direct approaches. The best new method found is a two-step look-ahead rollout started with a stochastic base sequence. The routing cost is about 4.8% less than that of the one-step rollout algorithm started with a deterministic sequence. Results also show that Monte Carlo cost-to-go estimation reduces computation time by 65% in large instances with little or no loss in solution quality. Moreover, the paper compares results to the perfect-information case obtained by solving exact a posteriori solutions for sampled vehicle routing problems. The confidence interval for the overall mean difference is (3.56%, 4.11%). (C) 2008 Elsevier B.V. All rights reserved.
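A stripped-down sketch of the rollout step with Monte Carlo cost-to-go estimation: each candidate next customer is scored by averaging simulated completions of the route under a fixed base sequence. The route simulator and customer representation are assumptions.

```python
# Rollout sketch for the single-vehicle routing problem with stochastic
# demands: the cost-to-go of serving customer `c` next is estimated by
# Monte Carlo simulation of the remaining route under a fixed base
# sequence. `simulate_route` (which samples demands and adds any needed
# restocking trips) and the customer encoding are assumed helpers.

def rollout_next_customer(state, unserved, base_sequence, simulate_route,
                          n_samples=100):
    best, best_cost = None, float("inf")
    for c in unserved:
        remaining = [x for x in base_sequence if x in unserved and x != c]
        est = sum(simulate_route(state, [c] + remaining)
                  for _ in range(n_samples)) / n_samples
        if est < best_cost:
            best, best_cost = c, est
    return best
```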
Project scheduling problems with both resource constraints and uncertain task durations have applications in a variety of industries. While the existing research literature has focused on finding an a priori open-loop task sequence that minimizes the expected makespan, finding a dynamic and adaptive closed-loop policy has been regarded as computationally intractable. In this research, we develop effective and efficient approximate dynamic programming (ADP) algorithms based on the rollout policy for this category of stochastic scheduling problems. To enhance the performance of the rollout algorithm, we employ constraint programming (CP) to improve the base policy offered by a priority-rule heuristic. We further devise a hybrid ADP framework that integrates both the look-back and look-ahead approximation architectures, to simultaneously achieve both the quality of a rollout (look-ahead) policy that sequentially improves a task sequence, and the efficiency of a lookup-table (look-back) approach. Computational results on the benchmark instances show that our hybrid ADP algorithm obtains solutions competitive with the state-of-the-art algorithms in reasonable computational time. It performs particularly well for instances with non-symmetric probability distributions of task durations. (C) 2015 Elsevier B.V. and Association of European Operational Research Societies (EURO) within the International Federation of Operational Research Societies (IFORS). All rights reserved.
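One way to read the hybrid look-back/look-ahead idea is sketched below: rollout (look-ahead) estimates are cached in a lookup table (look-back) so that states encountered repeatedly are not re-simulated. The state encoding and rollout estimator are placeholders; the base policy would come from a priority-rule heuristic, possibly improved by CP.

```python
# Hybrid sketch: a look-back lookup table caches look-ahead rollout
# estimates (expected makespan under the base policy) keyed by an
# encoded scheduling state, avoiding repeated simulation of the same
# state. `encode` and `rollout_estimate` are assumed helpers.

value_cache = {}   # look-back table: encoded state -> estimated makespan

def hybrid_value(state, encode, rollout_estimate):
    key = encode(state)
    if key not in value_cache:          # run the look-ahead only when needed
        value_cache[key] = rollout_estimate(state)
    return value_cache[key]
```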
We describe a general method to transform a non-Markovian sequential decision problem into a supervised learning problem using a K-best-paths algorithm. We consider an application in financial portfolio management where we can train a controller to directly optimize a Sharpe Ratio (or other risk-averse, non-additive) utility function. We illustrate the approach by demonstrating experimental results using a kernel-based controller architecture that would not normally be considered in traditional reinforcement learning or approximate dynamic programming. We further show that using a non-additive criterion (the incremental Sharpe Ratio) yields a noisy K-best-paths extraction problem, which can give substantially improved performance.
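A small sketch of an incremental Sharpe-ratio criterion computed from running moments, the kind of non-additive reward increment that could serve as an edge weight in the K-best-paths formulation; the exact definition used in the paper may differ.

```python
# Sketch of an incremental Sharpe-ratio criterion: maintain running first
# and second moments of the returns and report, for each new return, the
# change in the Sharpe ratio. This is an illustrative assumption, not
# necessarily the paper's exact definition.
import math

class IncrementalSharpe:
    def __init__(self):
        self.n = 0
        self.sum_r = 0.0
        self.sum_r2 = 0.0
        self.current = 0.0

    def add(self, r):
        """Add return r and return the resulting change in the Sharpe ratio."""
        self.n += 1
        self.sum_r += r
        self.sum_r2 += r * r
        mean = self.sum_r / self.n
        var = self.sum_r2 / self.n - mean * mean
        new_sharpe = mean / math.sqrt(var) if var > 1e-12 else 0.0
        delta, self.current = new_sharpe - self.current, new_sharpe
        return delta
```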
This study presents an adaptive railway traffic controller for real-time operations based on approximate dynamic programming (ADP). By assessing requirements and opportunities, the controller aims to limit consecutive delays resulting from trains that entered a control area behind schedule by sequencing them at a critical location in a timely manner, thus representing the practical requirements of railway operations. This approach depends on an approximation to the value function of dynamic programming after optimisation from a specified state, which is estimated dynamically from operational experience using reinforcement learning techniques. By using this approximation, the ADP avoids extensive explicit evaluation of performance and so reduces the computational burden substantially. In this investigation, we explore formulations of the approximation function and variants of the learning techniques used to estimate it. Evaluation of the ADP methods in a stochastic simulation environment shows considerable improvements in consecutive delays by comparison with the current industry practice of First-Come-First-Served sequencing. We also found that estimates of parameters of the approximate value function are similar across a range of test scenarios with different mean train entry delays.
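The sketch below shows a generic TD(0) update for a linear value-function approximation of expected consecutive delay, one plausible instance of the reinforcement learning techniques mentioned; the feature extraction and step size are assumptions.

```python
# Sketch of the learning update behind such a controller: a linear
# approximation of the value function (here, expected future consecutive
# delay) is adjusted by temporal-difference learning after each observed
# transition. Feature extraction and step size are assumed.
import numpy as np

def td0_update(w, features, delay_cost, next_features, alpha=0.01, gamma=1.0):
    """One TD(0) step on linear weights w for features phi(state)."""
    td_error = delay_cost + gamma * next_features @ w - features @ w
    return w + alpha * td_error * features

# At decision time the controller sequences trains by choosing the order
# with the smallest immediate delay plus features(next_state) @ w.
```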
Governments and manufacturers are starting to enforce the European transport industry's transition to sustainable mobility. Meanwhile, transport companies have begun to set their own emissions goals. To achieve these goals sustainably, they must develop efficient policies to renew their fleets with alternative-fuel vehicles. However, since future trends in relevant parameters are highly uncertain, fleet managers struggle to make informed decisions. We formulate fleet renewal as a sequential optimization problem, considering multiple technologies and operational clusters. Vehicle purchase, sales, depreciation, fuel, carbon, and electric battery prices are modeled as stochastic variables. We propose approximate dynamic programming to calculate fleet renewal policies that achieve emissions goals while optimizing total costs of ownership. This approach is tested in a case study of a German logistics service provider. We investigate optimal timings of purchases and sales for a heavy-duty truck fleet, considering four drivetrain technologies. Our approach can guide decision making in various fleet renewal settings. By applying it to the case study, we derive important managerial implications. The mobility transition will significantly increase transport fleets' total cost of ownership. To minimize costs, companies should not move prematurely to low-emissions technologies, but should hold vehicles for as long as possible to benefit from fewer purchases and falling prices. The optimal policy depends on the distance driven. For short-distance operations, diesel trucks will remain the dominant technology in the coming years, but will be replaced by battery electric trucks in the medium term. In the far future, trucks powered by electricity and hydrogen will be equally important.
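As a loose illustration only, the sketch below scores candidate purchase/sale actions by their immediate cash flow plus the average discounted cost of continuing under sampled price scenarios; all helpers and encodings are hypothetical and not the paper's model.

```python
# Sketch of a sampled-scenario lookahead for a fleet renewal decision:
# each candidate action is evaluated under many sampled price paths
# (purchase, fuel, carbon, battery prices) and the cheapest expected
# total is chosen. All helper functions are assumed placeholders.
import random

def choose_renewal_action(state, candidate_actions, sample_price_path,
                          continuation_cost, immediate_cost,
                          n_scenarios=50, discount=0.95):
    def score(action):
        future = sum(
            continuation_cost(state, action, sample_price_path())
            for _ in range(n_scenarios)
        ) / n_scenarios
        return immediate_cost(state, action) + discount * future
    return min(candidate_actions, key=score)
```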