The flexibility of deployment strategies, combined with the low cost of individual sensor nodes, allows wireless sensor networks (WSNs) to be integrated into a variety of applications. Network operations degrade over time as sensors consume a finite power supply and begin to fail. In this work we address the selective maintenance of a WSN through a condition-based deployment policy (CBDP) in which sensors are deployed over a series of missions. The main contribution is a Markov decision process (MDP) model for maintaining a reliable WSN with respect to region coverage. Due to the resulting high-dimensional state and outcome spaces, we explore approximate dynamic programming (ADP) methodology in the search for high-quality CBDPs. Our model is one of the first to address the selective maintenance of a large-scale WSN through the repeated deployment of new sensor nodes with a reliability objective, and one of the first ADP applications for the maintenance of a complex WSN. Additionally, our methodology incorporates a destruction-spectrum reliability estimate, which has received significant attention in network reliability but whose value in a maintenance setting has not been widely explored. We conclude with a discussion of CBDPs in a range of test instances and compare their performance to alternative deployment strategies.
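To make the destruction-spectrum idea concrete, here is a minimal sketch of a Monte Carlo D-spectrum estimate for a toy coverage network; the grid of demand points, coverage radius, coverage threshold, and sample count are illustrative assumptions, not the paper's WSN model.

```python
# Sketch: Monte Carlo estimate of a destruction spectrum (D-spectrum) for a
# toy coverage network. Grid, radius, threshold, and sample count are
# illustrative assumptions, not the paper's model.
import random

def covers(sensor, point, radius=1.5):
    return (sensor[0] - point[0]) ** 2 + (sensor[1] - point[1]) ** 2 <= radius ** 2

def coverage_ok(active_sensors, grid_points, threshold=0.9):
    covered = sum(any(covers(s, p) for s in active_sensors) for p in grid_points)
    return covered / len(grid_points) >= threshold

def destruction_spectrum(sensors, grid_points, n_samples=2000):
    n = len(sensors)
    counts = [0] * (n + 1)
    for _ in range(n_samples):
        order = random.sample(range(n), n)          # random failure order
        alive = set(range(n))
        for k, idx in enumerate(order, start=1):
            alive.discard(idx)
            if not coverage_ok([sensors[i] for i in alive], grid_points):
                counts[k] += 1                      # network first fails at the k-th failure
                break
    return [c / n_samples for c in counts[1:]]      # empirical D-spectrum
```

The estimated spectrum gives, for each k, the probability that the network first loses adequate coverage at the k-th sensor failure; combined with component failure probabilities, it yields a network reliability estimate.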
We study a variant of a dynamic pickup-and-delivery crowd-shipping operation for delivering online orders within a few hours from a brick-and-mortar store. This crowd-shipping operation is subject to a high degree of uncertainty due to the stochastic arrival of online orders and crowd-shippers, which poses several challenges for efficiently matching orders to crowd-shippers. We formulate the problem as a Markov decision process and develop an approximate dynamic programming (ADP) policy using value function approximation to obtain a highly scalable, real-time matching strategy that accounts for temporal and spatial uncertainty in the arrivals of online orders and crowd-shippers. We incorporate several algorithmic enhancements into the ADP algorithm, which significantly improve convergence. We compare the ADP policy with an optimization-based myopic policy using various performance measures. Our numerical analysis with varying parameter settings shows that ADP policies can lead to up to 25.2% cost savings and a 9.8% increase in the number of served orders. Overall, we find that our proposed framework can guide crowd-shipping platforms toward efficient real-time matching decisions and enhance platform delivery capacity.
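As a rough illustration of a value-function-approximation-based matching policy (not the authors' exact formulation), the sketch below adjusts a myopic assignment by a learned estimate of the cost of postponing an order; the value-table key, cost function, and stepsize rule are assumptions for illustration only.

```python
# Sketch: VFA-adjusted myopic matching with a smoothing update of the
# postponement value. All data structures here are illustrative assumptions.
from collections import defaultdict

V = defaultdict(float)   # V[t]: estimated cost-to-go of postponing an order to period t

def match_orders(orders, shippers, t, cost):
    """Assign each order to the cheapest crowd-shipper unless postponing looks cheaper."""
    available, assignments, postponed = list(shippers), [], []
    for o in orders:
        s = min(available, key=lambda s_: cost(s_, o)) if available else None
        if s is not None and cost(s, o) <= V[t + 1]:
            assignments.append((o, s))
            available.remove(s)
        else:
            postponed.append(o)          # wait for future crowd-shipper arrivals
    return assignments, postponed

def update_value(t, observed_cost, iteration):
    """Smooth the estimate toward the cost actually incurred by postponed orders."""
    alpha = 1.0 / (1.0 + iteration)      # harmonic stepsize
    V[t] = (1 - alpha) * V[t] + alpha * observed_cost
```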
Urban metro systems continuously face high travel demand during rush hours, which brings excessive energy waste and high risk to passengers. To alleviate passenger congestion, improve train service levels, and reduce energy consumption, a nonlinear dynamic programming (DP) model for metro train timetabling and passenger flow control with stop-skipping is presented, which consists of state transition equations governing train traffic and passenger load. To overcome the curse of dimensionality, the formulated nonlinear DP problem is transformed into a discrete Markov decision process, and a novel approximate dynamic programming (ADP) approach is designed based on a lookahead policy and linear parametric value function approximation. Finally, the effectiveness of this method is verified through three groups of numerical experiments. Compared with Particle Swarm Optimization (PSO) and Simulated Annealing (SA), the designed ADP approach obtains high-quality solutions quickly, which makes it applicable to the practical implementation of metro operations.
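A minimal sketch of the two ingredients named above, a one-step lookahead policy and a linear parametric value function, is given below; the feature map, cost, and transition functions are placeholders rather than the paper's metro model.

```python
# Sketch: one-step lookahead with a linear parametric value function
# V(s) ~ theta . phi(s). Features, cost, and transition are placeholders.
import numpy as np

def lookahead_policy(state, actions, transition, cost, phi, theta):
    # choose the action minimizing immediate cost plus approximate cost-to-go
    def q(a):
        s_next = transition(state, a)
        return cost(state, a) + theta @ phi(s_next)
    return min(actions, key=q)

def td_update(theta, phi, s, s_next, c, alpha=0.01):
    # stochastic-gradient (TD(0)-style) update of the linear weights
    target = c + theta @ phi(s_next)
    return theta + alpha * (target - theta @ phi(s)) * phi(s)
```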
This paper proposes an innovative approximate dynamic programming (ADP) method for distributed energy resource (DER) coordination with the loss of life of the battery energy storage system (BESS) explicitly modeled. The dispatch policy is designed to account for both calendrical and cyclical aging effects on the BESS, explicitly modeling the impact of ambient temperature on BESS lifespan. The proposed ADP employs an adaptive critic method and an enhanced off-policy deterministic policy gradient (DPG) strategy, addressing the limitations of on-policy gradient-based ADP approaches, including inadequate exploration, low data usage, and computational complexity. In particular, a customized policy is proposed to guide the algorithm toward promising decisions, thereby improving exploration capability and learning efficiency compared to conventional DPG-based learning approaches, which may struggle to find a global optimum due to exploration via random noisy actions or require expert demonstrations obtained with extra effort. The proposed method is illustrated on the IEEE 123-node system and compared with existing ADP methods to verify solution accuracy and demonstrate the effects of incorporating degradation models into the control design. Case studies show that the proposed ADP effectively coordinates DERs with a 10-times-smaller optimization gap compared to existing methods, and that incorporating the BESS life-loss model into the proposed control ensures the expected lifespan and results in significant cost savings.
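For orientation, the following sketch shows a deterministic policy gradient step with a linear critic and a linear actor, the generic mechanism behind an adaptive-critic DPG scheme; the feature map, dimensions, and stepsizes are illustrative assumptions and do not reproduce the paper's DER/BESS formulation or its customized exploration policy.

```python
# Sketch: one off-policy DPG step with a linear critic Q(s, a) = w . features(s, a)
# and a linear deterministic actor a = theta . s. Dimensions and stepsizes are
# illustrative; r denotes a reward (the negative of a dispatch cost).
import numpy as np

def features(s, a):
    # critic features: state, action, and state-action interaction
    return np.concatenate([s, [a], s * a])

def critic_td_update(w, s, a, r, s_next, a_next, gamma=0.95, alpha=0.01):
    phi, phi_next = features(s, a), features(s_next, a_next)
    td_error = r + gamma * (w @ phi_next) - (w @ phi)
    return w + alpha * td_error * phi

def actor_dpg_update(theta, w, s, beta=0.005):
    n = len(s)
    dq_da = w[n] + w[n + 1:] @ s          # dQ/da for the linear critic above
    return theta + beta * dq_da * s       # chain rule: d(theta . s)/d(theta) = s
```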
The global containerised trade heavily relies on liner shipping services, which facilitate the worldwide movement of large cargo volumes along fixed routes and schedules. The profitability of shipping companies hinges on how efficiently they design their shipping network, a complex optimisation problem known as the liner shipping network design problem (LSNDP). In recent years, approximate dynamic programming (ADP), also known as reinforcement learning, has emerged as a promising approach for large-scale optimisation. This paper introduces a novel Markov decision process for the LSNDP and investigates the potential of ADP. We show that ADP methods based on value iteration produce optimal solutions to small instances, but their scalability is hindered by high memory demands. An ADP method based on a deep neural network requires less memory and successfully obtains feasible solutions. The quality of the solutions, however, declines for larger instances, possibly due to the discrete nature of the high-dimensional state and action spaces.
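The value-iteration baseline mentioned above can be sketched as follows for a small, explicitly enumerated MDP; the dictionary-based transition and reward format is an assumption for illustration, and the memory issue arises precisely because such tables cannot be stored for realistic LSNDP instances.

```python
# Sketch: tabular value iteration on an explicitly enumerated MDP.
# P[s][a] -> list of (prob, next_state); R[s][a] -> immediate reward.
def value_iteration(states, actions, P, R, gamma=0.99, tol=1e-6):
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                R[s][a] + gamma * sum(p * V[sn] for p, sn in P[s][a])
                for a in actions[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```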
At some point during transport, intermodal containers will be stored at a terminal, where they are typically stacked on top of each other. Stacking yields a higher utilization of the area but may lead to unproductive reshuffle moves when containers stored below others need to be retrieved. Preventing reshuffles has a financial benefit, as it not only avoids the cost of executing the reshuffle but also decreases the time needed to retrieve a container. Typically, researchers consider only the retrieval of containers and assume the retrieval order is fully known. In addition, existing studies do not consider the stacking restrictions imposed by a reach stacker, which is commonly used in smaller inland terminals. This research aims to design decision support for determining container stack allocations that are applicable in practice, so that the expected number of reshuffles is minimized. We propose a model that includes both arrivals and departures of containers as well as a certain level of uncertainty in the order thereof. The problem is modeled as a Markov decision process and solved using approximate dynamic programming (ADP). Through numerical experiments on real-life problem instances, we conclude that the ADP approach drastically outperforms a benchmark heuristic from the literature. All data used, as well as the source code, have been made publicly available. (c) 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).
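As a small illustration of the quantity the policy seeks to minimize in expectation, the sketch below counts reshuffles for a given stack configuration and retrieval sequence; the list-based stack representation and the rule of relocating blockers to the currently shortest other stack are assumptions, not the paper's model.

```python
# Sketch: count reshuffles for a given retrieval sequence. Stacks are lists
# ordered bottom-to-top; blockers are relocated to the shortest other stack.
def count_reshuffles(stacks, retrieval_order):
    stacks = [list(s) for s in stacks]
    reshuffles = 0
    for target in retrieval_order:
        src = next(i for i, s in enumerate(stacks) if target in s)
        while stacks[src][-1] != target:
            blocker = stacks[src].pop()
            reshuffles += 1
            dest = min((i for i in range(len(stacks)) if i != src),
                       key=lambda i: len(stacks[i]))
            stacks[dest].append(blocker)
        stacks[src].pop()                 # retrieve the target container
    return reshuffles

print(count_reshuffles([[1, 2], [3], [4]], [1, 2, 3, 4]))   # -> 1 reshuffle
```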
In this paper, we propose an approximate dynamic programming approach for an energy-efficient unrelated parallel machine scheduling problem. In this scheduling problem, jobs arrive at the system randomly, and each job's ready and processing times become available only when an order is placed. Therefore, we consider the online version of the problem. Our objective is to minimize a combination of the makespan and the total energy costs, where the energy costs include the cost of energy consumed by machines for switching on, processing, and idling. We propose a binary program to solve the optimization problem at each stage of the approximate dynamic program. We compare the results of the approximate dynamic programming approach against an integer linear programming formulation of the offline version of the scheduling problem and an existing heuristic method suitable for scheduling problems with ready times. The results show that the approximate dynamic programming algorithm outperforms the two offline methods in terms of solution quality and computational time. (c) 2021 Elsevier B.V. All rights reserved.
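To illustrate the per-stage decision, the sketch below solves a small stage subproblem by exhaustive enumeration as a stand-in for the paper's binary program; the job and machine data structures and the cost weights are illustrative assumptions.

```python
# Sketch: exhaustive stand-in for the per-stage assignment subproblem
# (the paper formulates it as a binary program). Data structures and
# cost weights are illustrative assumptions.
from itertools import product

def stage_assignment(jobs, machines, proc_time, energy_cost, w_makespan=1.0, w_energy=1.0):
    """proc_time[j][m] and energy_cost[j][m]: job j on machine m."""
    best_assign, best_obj = None, float("inf")
    for assign in product(machines, repeat=len(jobs)):   # one machine per job
        load = {m: 0.0 for m in machines}
        energy = 0.0
        for j, m in zip(jobs, assign):
            load[m] += proc_time[j][m]
            energy += energy_cost[j][m]
        obj = w_makespan * max(load.values()) + w_energy * energy
        if obj < best_obj:
            best_assign, best_obj = dict(zip(jobs, assign)), obj
    return best_assign, best_obj
```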
This article investigates the optimal control problem (OCP) for a class of discrete-time nonlinear systems with state constraints. First, to overcome the challenge posed by the constraints, the original constrained OCP is transformed into an unconstrained OCP using a system transformation technique. Second, a new cost function is designed to alleviate the effect of the system transformation on the optimality of the original system. Further, a novel off-policy deterministic approximate dynamic programming (ADP) scheme is developed to obtain a near-optimal solution to the transformed OCP. Compared to existing off-policy deterministic ADP schemes, the developed scheme relaxes the requirements on the learning data and reduces the computational cost of training the neural networks. Third, taking approximation errors into account, we analyze the convergence and stability of the developed ADP scheme. Finally, the developed ADP scheme with the designed cost function is tested on two numerical cases, and the simulation results confirm its effectiveness.
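One common way to realize such a system transformation (not necessarily the authors' exact map) is an invertible tanh-type change of coordinates that turns a box-constrained state into an unconstrained one, as sketched below.

```python
# Sketch: invertible tanh-type coordinate change for a box-constrained state
# x in (x_min, x_max). This is one common choice, not necessarily the paper's map.
import numpy as np

def to_unconstrained(x, x_min, x_max):
    # z ranges over all reals as x ranges over the open box
    u = (2.0 * x - (x_max + x_min)) / (x_max - x_min)
    return np.arctanh(u)

def to_constrained(z, x_min, x_max):
    # inverse map: any real z yields a state strictly inside (x_min, x_max)
    return 0.5 * (x_max + x_min) + 0.5 * (x_max - x_min) * np.tanh(z)
```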
A solution approach is proposed for the interday problem of assigning chemotherapy sessions at a network of treatment centres, with the goal of increasing the cost-efficiency of system-wide capacity use. This network-based scheduling procedure is subject to the condition that both the first and last sessions of a patient's treatment protocol are administered at the centre the patient was referred to by their oncologist; all intermediate sessions may be administered at other centres. The procedure provides a systematic way of identifying effective multi-appointment scheduling policies that exploit the total capacity of the networked system, allowing patients to be treated at centres other than their home centre. The problem is modelled as a Markov decision process, which is then solved approximately using techniques of approximate dynamic programming. The benefits of the approach are evaluated through simulation and compared with the existing manual scheduling procedures at two treatment centres in Santiago, Chile. The results suggest that the approach would achieve a 20% reduction in operating costs for the whole system and cut existing first-session waiting times in half. A key conclusion, however, is that a network-based scheduling procedure brings no real benefit unless it is implemented in conjunction with a proactive assignment policy like the one proposed in this paper.
Authors: Tohid Sardarmehni (Department of Mechanical Engineering, College of Engineering and Computer Science, California State University, Northridge, CA, USA); Xingyong Song (Departments of Engineering Technology and Industrial Distribution, Mechanical Engineering, and Electrical and Computer Engineering, College of Engineering, Texas A&M University, College Station, TX, USA)
This paper introduces a region-based approximation method for solving optimal control problems with approximate dynamic programming (ADP). The backbone of the proposed solution is partitioning the domain of training into smaller regions in which the value function varies slowly. Afterward, for each region, a Linear in Parameter Neural Network (LIPNN) is trained to capture the behaviour of the value function in that region. It is shown that the method improves the precision of the value function approximation, which leads to improved performance of the closed-loop system. Meanwhile, the possibility of expanding the domain of training in ADP solutions through region-based approximation is discussed. Finally, it is shown how the method can potentially eliminate the need for trial and error in selecting a proper neural network in classical ADP solutions.
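A minimal sketch of the region-based idea, fitting one linear-in-parameters approximator per region of the training domain, is given below; the grid partition, polynomial basis, and least-squares fit are illustrative choices rather than the paper's specific LIPNN design.

```python
# Sketch: region-based approximation with one linear-in-parameters model per
# region. Partition, basis, and least-squares fit are illustrative choices.
import numpy as np

def basis(x):
    x1, x2 = x
    return np.array([1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2])

def region_index(x, n_regions=4, lo=-1.0, hi=1.0):
    # partition the first state coordinate into equal slabs
    k = int((x[0] - lo) / (hi - lo) * n_regions)
    return min(max(k, 0), n_regions - 1)

def fit_regions(samples, values, n_regions=4):
    weights = []
    for r in range(n_regions):
        idx = [i for i, x in enumerate(samples) if region_index(x, n_regions) == r]
        if not idx:                                   # empty region: fall back to zeros
            weights.append(np.zeros(6))
            continue
        Phi = np.array([basis(samples[i]) for i in idx])
        y = np.array([values[i] for i in idx])
        w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        weights.append(w)
    return weights

def value(x, weights, n_regions=4):
    # evaluate by dispatching to the weights of the region containing x
    return weights[region_index(x, n_regions)] @ basis(x)
```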