Opaque selling, in which a seller offers opaque goods (OGs) in addition to physical goods, has been shown to be an effective strategy to segment a market and improve the seller's profit. This article studies opaque selling with stochastic demand and fixed initial inventories of multiple products, where the seller dynamically controls the product offers and determines the product assignment to fulfill the demand for OGs over time. The problem is formulated as a stochastic dynamic program. Due to the curse of dimensionality, we study the fluid control problem, which gives a time-based fluid policy and a stationary probabilistic fulfillment strategy. We show that the fluid policy is asymptotically optimal when the arrival rates and initial inventory levels are scaled up linearly. Furthermore, we propose a decomposition heuristic based on the corresponding fluid solution. The decomposition heuristic is shown to provide a tighter upper bound than the fluid control problem. A numerical study on a set of test instances illustrates the performance and efficacy of opaque selling.
In this paper, an event-triggered adaptive approximately optimal tracking control approach is proposed for a class of non-affine nonlinear single-input single-output (SISO) systems via output feedback. With the help of fuzzy logic systems (FLSs), a fuzzy state observer is designed to estimate the internal states by approximating unknown nonlinear functions, where a low-pass filter (LPF) is added to handle the algebraic loop problem. Then, the output feedback control approach, which combines the backstepping framework with an event-triggered mechanism, is presented. It contains the adaptive backstepping control and the approximately optimal control via the adaptive dynamic programming (ADP) technique, and the computation cost and transmission load are reduced while guaranteeing that the performance index of the system is approximately minimised. Finally, two simulation examples are provided to verify the effectiveness of the proposed approach.
We consider the patient-to-bed assignment problem that arises in hospitals. Both emergency patients who require hospital admission and elective patients who have had surgery need to be found a bed in the most appropriate ward. The patient-to-bed assignment problem arises when a bed request is made, but a bed in the most appropriate ward is unavailable. In this case, the next-best decision out of many alternatives has to be made, according to some suitable decision-making algorithm. We construct a Markov chain to model this problem, in which we consider the effect on the length of stay of a patient whose treatment and recovery consist of several stages and can be affected by stays in, or transfers to, less suitable wards. We formulate a dynamic programming recursion to optimise an objective function and calculate the optimal decision variables, and discuss simulation techniques that are useful when the problem is too large to solve exactly. We illustrate the theory with some numerical examples.
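As a rough illustration of what a dynamic programming recursion of this kind looks like, the sketch below runs finite-horizon backward induction on a two-state toy model. It is not the model from the paper: the states (1 = a bed is free in the preferred ward, 0 = none free), the actions, the costs and the transition probabilities are all hypothetical values chosen only to make the recursion concrete.

```python
from typing import Dict, List, Tuple

State = int
Action = str

# Hypothetical toy data (not taken from the paper).
STATES: List[State] = [0, 1]  # 1 = preferred ward has a free bed, 0 = it is full
ACTIONS: Dict[State, List[Action]] = {0: ["overflow"], 1: ["preferred", "overflow"]}
# One-period cost of each decision (e.g. expected extra length of stay).
COST: Dict[Tuple[State, Action], float] = {
    (0, "overflow"): 3.0,
    (1, "preferred"): 1.0,
    (1, "overflow"): 3.0,
}
# transition[(s, a)] = list of (probability, next_state).
TRANSITION: Dict[Tuple[State, Action], List[Tuple[float, State]]] = {
    (0, "overflow"): [(0.5, 0), (0.5, 1)],  # a preferred bed may free up
    (1, "preferred"): [(1.0, 0)],           # the free bed is now occupied
    (1, "overflow"): [(1.0, 1)],            # the free bed stays free
}


def backward_induction(
    states: List[State],
    actions: Dict[State, List[Action]],
    transition: Dict[Tuple[State, Action], List[Tuple[float, State]]],
    cost: Dict[Tuple[State, Action], float],
    horizon: int,
) -> Tuple[Dict[State, float], List[Dict[State, Action]]]:
    """Return the cost-to-go at time 0 and a decision rule for each period."""
    value = {s: 0.0 for s in states}  # terminal cost-to-go is zero
    policy: List[Dict[State, Action]] = []
    for _ in range(horizon):
        new_value: Dict[State, float] = {}
        decision: Dict[State, Action] = {}
        for s in states:
            best_action, best_q = None, float("inf")
            for a in actions[s]:
                # Bellman recursion: immediate cost plus expected future cost.
                q = cost[(s, a)] + sum(p * value[s2] for p, s2 in transition[(s, a)])
                if q < best_q:  # ties keep the first action listed
                    best_action, best_q = a, q
            new_value[s], decision[s] = best_q, best_action
        value = new_value
        policy.insert(0, decision)  # earlier periods go to the front
    return value, policy


if __name__ == "__main__":
    v, pi = backward_induction(STATES, ACTIONS, TRANSITION, COST, horizon=2)
    print(v, pi)
```

Enumerating every state in this way is exactly what becomes impractical as the number of wards and patient stages grows, which is where the simulation techniques mentioned in the abstract come in.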
We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural "projection" of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the "smoothed approximate linear program", is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages. First, we demonstrate bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. These bounds are, in general, no worse than those available for extant LP approaches and for specific problem instances can be shown to be arbitrarily stronger. Second, experiments with our approach on a pair of challenging problems (the game of Tetris and a queueing network control problem) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by a substantial margin.
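For orientation, the exact linear program and its "projection" onto a set of basis functions are commonly written as below for a discounted cost-minimisation problem. This is the standard textbook form, not the smoothed program proposed in the abstract, and the notation is assumed here rather than taken from the source: $\nu$ is a vector of positive state-relevance weights, $T$ is the Bellman operator with stage cost $g$, discount factor $\alpha$ and transition probabilities $P$, $\Phi$ is a matrix of basis functions, and $r$ is the weight vector.

```latex
\begin{align}
  (TJ)(x) &= \min_{a}\Big[\, g(x,a) + \alpha \sum_{x'} P(x' \mid x, a)\, J(x') \Big], \\
  \text{exact LP:}\quad & \max_{J}\ \nu^{\top} J
      \quad \text{s.t.}\quad J \le TJ, \\
  \text{approximate LP:}\quad & \max_{r}\ \nu^{\top} \Phi r
      \quad \text{s.t.}\quad \Phi r \le T\Phi r .
\end{align}
```

Because any $J$ with $J \le TJ$ satisfies $J \le J^{*}$, every feasible $\Phi r$ is a lower bound on the optimal cost-to-go function; this is the restriction that the abstract says the smoothed program relaxes.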
The thesis focuses on a model that addresses the patient scheduling step of the surgical scheduling process, which determines the number of surgeries to perform on a given day. Specifically, given a master schedule that provides a cyclic breakdown of total OR availability into specific daily allocations to each surgical specialty, we seek a scheduling policy for all surgeries that minimizes a combination of the lead time between patient request and surgery date, overtime in the ORs, and congestion in the wards. We cast the problem of generating optimal control strategies in the framework of a Markov decision process (MDP). An approximate dynamic programming (ADP) approach is employed to solve the model, which would otherwise be intractable due to the size of the state space. We assess the performance and quality of the resulting policy through simulation, and we provide policy insights and conclusions.
In this letter, we discuss the problem of optimal control for affine systems in the context of data-driven linear programming. First, we introduce a unified framework for the fixed point characterization of the value function, Q-function and relaxed Bellman operators. Then, in a model-free setting, we show how to synthesize and estimate Bellman inequalities from a small but sufficiently rich dataset. To guarantee exploration richness, we complete the extension of Willems' fundamental lemma to affine systems.
Authors: Cire, Andre A.; Diamant, Adam
Affiliations: Univ Toronto Scarborough & Rotman Sch Management, Dept Management, Toronto, ON, Canada; York Univ, Schulich Sch Business, 111 Ian Macdonald Blvd, Toronto, ON M3J 1P3, Canada
Home care provides personalized medical care and social support to patients within their own homes. Our work proposes a dynamic scheduling framework to assist in the assignment of health practitioners (HPs) to patients who arrive stochastically over time and are heterogeneous with respect to their health requirements, service duration, and region of residence. We model the decision of which patients to assign to HPs as a discrete-time, rolling-horizon, infinite-stage Markov decision process. Due to the curse of dimensionality and the combinatorial structure associated with an HP's travel, we propose an approximate dynamic programming (ADP) approach based on a one-step policy improvement heuristic. Four policies are investigated: the first two prioritize HP fairness by balancing service and travel times, respectively, while the other two are based on fluid approximations of the system. We show that the first fluid model is optimal if the number of patient arrivals is sufficiently large, while the second performs better experimentally; both approaches leverage pricing and decomposition strategies. We compare our framework to more commonly implemented policies (constrained versions of the classical vehicle routing problem) in a simulation study using data collected from a Canadian home care provider. We show that, in contrast to these approaches, accounting for future uncertainty yields substantial cost savings while fewer referrals are rejected. We also find that well-performing policies assign patients to HPs operating within a small set of adjacent regions while considering the number of periods for which a patient requires care. Otherwise, HP workload may not be appropriately balanced over the long term even if travel time is minimized.
We consider a problem where different classes of customers can book different types of service in advance and the service company has to respond immediately to each booking request by confirming or rejecting it. The objective of the service company is to maximize profit, which consists of class- and type-specific revenues, refunds for cancellations or no-shows, and the cost of overtime. Calculating the latter requires information on the underlying appointment schedule. In contrast to most models in the literature, we assume that the service time of clients is stochastic and that clients might be unpunctual. Throughout the paper we relate the problem to capacity allocation in radiology services. The problem is modeled as a continuous-time Markov decision process and solved using simulation-based approximate dynamic programming (ADP) combined with a discrete event simulation of the service period. We employ an adapted heuristic ADP algorithm from the literature and investigate the benefits of applying ADP to this type of problem. First, we study a simplified problem with deterministic service times and punctual arrival of clients and compare the solution from the ADP algorithm to the optimal solution. We find that the heuristic ADP algorithm performs very well in terms of objective function value, solution time, and memory requirements. Second, we study the problem with stochastic service times and unpunctuality. It is then shown that the resulting policy constitutes a large improvement over an "optimal" policy that is deduced using restrictive, simplifying assumptions. (C) 2011 Elsevier B.V. All rights reserved.
In controlled ovarian hyperstimulation (COH) treatment, clinicians monitor the patients' physiological responses to gonadotropin administration to trade off pregnancy probability against the risk of ovarian hyperstimulation syndrome (OHSS). We formulate the dosage control problem in COH treatment as a stochastic dynamic program and design approximate dynamic programming (ADP) algorithms to overcome the well-known curses of dimensionality in Markov decision processes (MDPs). Our numerical experiments indicate that the piecewise linear (PWL) approximation ADP algorithms can obtain policies that are very close to the one obtained by the MDP benchmark with significantly less solution time. (c) 2012 Elsevier B.V. All rights reserved.
Approximate linear programs (ALPs) are well-known models for computing value function approximations (VFAs) of intractable Markov decision processes (MDPs). VFAs from ALPs have desirable theoretical properties, define an operating policy, and provide a lower bound on the optimal policy cost. However, solving ALPs near-optimally remains challenging, for example, when approximating MDPs with nonlinear cost functions and transition dynamics or when rich basis functions are required to obtain a good VFA. We address this tension between theory and solvability by proposing a convex saddle-point reformulation of an ALP that includes as primal and dual variables, respectively, a vector of basis function weights and a constraint violation density function over the state-action space. To solve this reformulation, we develop a proximal stochastic mirror descent (PSMD) method that learns regions of high ALP constraint violation via its dual update. We establish that PSMD returns a near-optimal ALP solution and a lower bound on the optimal policy cost in a finite number of iterations with high probability. We numerically compare PSMD with several benchmarks on inventory control and energy storage applications. We find that the PSMD lower bound is tighter than a perfect information bound. In contrast, the constraint-sampling approach to solving ALPs may not provide a lower bound, and applying row generation to tackle ALPs is not computationally viable. PSMD policies outperform problem-specific heuristics and are comparable to or better than the policies obtained using constraint sampling. Overall, our ALP reformulation and solution approach broaden the applicability of approximate linear programming.