Ye [2011] showed recently that the simplex method with Dantzig's pivoting rule, as well as Howard's policy iteration algorithm, solve discounted Markov decision processes (MDPs), with a constant discount facto...
详细信息
Ye [2011] showed recently that the simplex method with Dantzig's pivoting rule, as well as Howard's policy iteration algorithm, solve discounted Markov decision processes (MDPs), with a constant discount factor, in stronglypolynomial time. More precisely, Ye showed that both algorithms terminate after at most O (mn/1-gamma log n/1-gamma) iterations, where n is the number of states, m is the total number of actions in the MDP, and 0 < gamma < 1 is the discount factor. We improve Ye's analysis in two respects. First, we improve the bound given by Ye and show that Howard's policy iteration algorithm actually terminates after at most O(m/1-gamma log n/1-gamma) iterations. Second, and more importantly, we show that the same bound applies to the number of iterations performed by the strategy iteration (or strategy improvement) algorithm, a generalization of Howard's policy iteration algorithm used for solving 2-player turn-based stochastic games with discounted zero-sum rewards. This provides the first stronglypolynomial algorithm for solving these games, solving a long standing open problem. Combined with other recent results, this provides a complete characterization of the complexity the standard strategy iteration algorithm for 2-player turn-based stochastic games;it is stronglypolynomial for a fixed discount factor, and exponential otherwise.
Fruit and vegetable supply chains require tight integration due to freshness-keeping requirements during the post-harvest, production, and distribution stages. This paper addresses the scheduling problem of a two -sta...
详细信息
Fruit and vegetable supply chains require tight integration due to freshness-keeping requirements during the post-harvest, production, and distribution stages. This paper addresses the scheduling problem of a two -stage flow shop in the fruit and vegetable supply chain involving procurement, production, distribution, and order assignment. The supply chain consists of contract growers, processors, and demand points. Growers and processors maintain limited capacity. Growers have processing lines to sort and grade fruits and vegetables, which are then shipped to processors for further processing. The first stage of the problem involves determining the production and distribution schedules of growers to processors. The second stage concerns assigning demand points to processors, scheduling production at processors, and distributing from processors to demand points. This paper considers its two sub-problems and proposes a mixed-integer linear programming problem: growers' and processors' production and distribution scheduling with order assignment. Also, we present two respective strongly polynomial algorithms for solving the two sub-problems. Besides, we develop a two-phase heuristic algorithm for the integrated problem by solving sub-problems together. We conduct computational experiments to justify the proposed algorithms and integration value. Computational results reveal that the proposed algorithms outperform the CPLEX on large-scale problem instances. Finally, we analyze the impacts of waste reduction, unit processing time and cost, and inbound and outbound shipping rates on supply chain performance, along with managerial insights.
暂无评论