This paper first presents a convergence analysis of the particle swarm optimization (PSO) system by treating it as a discrete-time linear time-variant system. Then, based on the resulting convergence conditions, dynamic optimal control of a deterministic PSO system for parameter optimization is studied using dynamic programming, and an approximate dynamic programming algorithm, swarm-based approximate dynamic programming (swarm-ADP), is proposed. Finally, numerical simulations demonstrate the validity of the presented dynamic optimization method.
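To make the "PSO as a discrete-time linear system" view concrete, the sketch below looks at a single deterministic particle with a fixed attractor. It is a time-invariant simplification, not the paper's time-variant analysis or its swarm-ADP algorithm. The names `w` (inertia weight), `phi` (combined acceleration coefficient) and `p` (attractor) are assumptions introduced here. Under these assumptions the position obeys x[t+1] = (1 + w - phi) x[t] - w x[t-1] + phi p, and the iteration converges when both roots of the characteristic polynomial lie inside the unit circle.

```python
import cmath


def spectral_radius(w, phi):
    """Largest root magnitude of lambda^2 - (1 + w - phi) * lambda + w = 0."""
    a = 1.0 + w - phi
    disc = cmath.sqrt(a * a - 4.0 * w)  # complex sqrt handles oscillatory roots
    return max(abs((a + disc) / 2.0), abs((a - disc) / 2.0))


def simulate(w, phi, p=1.0, x0=5.0, v0=0.0, steps=200):
    """Iterate the deterministic velocity/position update toward attractor p."""
    x, v = x0, v0
    for _ in range(steps):
        v = w * v + phi * (p - x)  # velocity update with fixed coefficients
        x = x + v                  # position update
    return x


if __name__ == "__main__":
    for w, phi in [(0.7, 1.5), (0.7, 3.6)]:
        rho = spectral_radius(w, phi)
        print(f"w={w}, phi={phi}: spectral radius {rho:.3f}, "
              f"{'convergent' if rho < 1 else 'divergent'}")
```

For this simplified model the spectral-radius test is equivalent to |w| < 1 and 0 < phi < 2(1 + w). A time-variant analysis has to handle coefficients that change from step to step, which this sketch does not attempt.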
A new approach for engine calibration and control is proposed. In this paper, we present our research results on the implementation of adaptive critic designs for self-learning control of automotive engines. A class of adaptive critic designs that can be classified as (model-free) action-dependent heuristic dynamic programming is used in this research project. The goals of the present learning control design for automotive engines include improved performance, reduced emissions, and maintained optimum performance under various operating conditions. Using data from a test vehicle with a V8 engine, we developed a neural network model of the engine and neural network controllers based on the idea of approximate dynamic programming to achieve optimal control. We have developed and simulated self-learning neural network controllers for both engine torque (TRQ) and exhaust air-fuel ratio (AFR) control. The goal of TRQ control and AFR control is to track the commanded values. For both control problems, excellent neural network controller transient performance has been achieved.
The flow of packages and documents in collective groups, called splits, of an express package carrier consists of picking up the packages by a courier at customers' locations and bringing them to a station for sorting. Next the splits are transported, either in bulk or containerized conveyances, to a major regional sorting facility called the ramp. In this work we focus on the afternoon and evening operations concerned with stations and the ramp. We deal with the sorting decisions at the stations and the ramp, as well as the transportation decisions among these locations. We model these processes by means of a dynamic program where time periods represent time slices in the afternoon and evening. The resulting myopic problem is a linear mixed-integer program. The overall model is solved by approximate dynamic programming where the value function is approximated by a linear function. Further strategies are developed to speed up the algorithm and decrease the time needed to find feasible solutions. The methodology is tested on several instances from an international express package carrier. Our solutions are substantially better than the current best practice.
We present a method to dynamically schedule patients with different priorities to a diagnostic facility in a public health-care setting. Rather than maximizing revenue, the challenge facing the resource manager is to dynamically allocate available capacity to incoming demand to achieve wait-time targets in a cost-effective manner. We model the scheduling process as a Markov decision process. Because the state space is too large for a direct solution, we solve the equivalent linear program through approximate dynamic programming. For a broad range of cost parameter values, we present analytical results that give the form of the optimal linear value function approximation and the resulting policy. We investigate the practical implications and the quality of the policy through simulation.
We consider the problem of estimating the value of a multiattribute resource, where the attributes are categorical or discrete in nature and the number of potential attribute vectors is very large. The problem arises in approximate dynamic programming when we need to estimate the value of a multiattribute resource from estimates based on Monte Carlo simulation. These problems have traditionally been solved using aggregation, but choosing the right level of aggregation requires resolving the classic tradeoff between aggregation error and sampling error. We propose a method that estimates the value of a resource at different levels of aggregation simultaneously, and then uses a weighted combination of the estimates. Using the optimal weights, which minimize the variance of the estimate while accounting for correlations between the estimates, is computationally too expensive for practical applications. We have found that a simple inverse variance formula (adjusted for bias), which effectively assumes the estimates are independent, produces near-optimal estimates. We use the setting of two levels of aggregation to explain why this approximation works so well.
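As an illustration of the weighting idea, the sketch below combines estimates from several aggregation levels with weights proportional to the inverse of (variance + bias squared). This is one common reading of "inverse variance adjusted for bias", not necessarily the exact formula of the paper. The function name `combine_estimates` and the sample numbers are hypothetical, and the variances and biases are taken as given rather than estimated from data.

```python
def combine_estimates(estimates, variances, biases):
    """Combine per-level estimates, treating them as if they were independent.

    Each level g gets weight proportional to 1 / (variance_g + bias_g ** 2).
    """
    inverse_errors = [1.0 / (var + bias * bias)
                      for var, bias in zip(variances, biases)]
    total = sum(inverse_errors)
    weights = [x / total for x in inverse_errors]  # normalized to sum to 1
    combined = sum(w * e for w, e in zip(weights, estimates))
    return combined, weights


if __name__ == "__main__":
    # Hypothetical two-level setting: a noisy but unbiased disaggregate
    # estimate and a smoother but biased aggregate estimate.
    combined, weights = combine_estimates(
        estimates=[10.0, 14.0], variances=[4.0, 1.0], biases=[0.0, 1.0]
    )
    print(combined, weights)
```

The disaggregate level carries little weight while it has few observations and high variance. As its variance shrinks with more samples, the weight shifts toward it and away from the biased aggregate estimate.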
An autonomous wheeled mobile robot (WMR) must implement velocity and path tracking control subject to complex dynamical constraints. Conventionally, the control design is obtained by analysis and synthesis or by a domain expert building control rules. This paper presents an adaptive critic motion control design, which enables the WMR to acquire its control ability autonomously by learning through trials. The design consists of an adaptive critic velocity control loop and a self-learning posture control loop. The neural networks in the velocity neuro-controller (VNC) are corrected with the dual heuristic programming (DHP) adaptive critic method. The designer simply expresses the control objective by specifying the primary utility function; the VNC then attempts to fulfill it through incremental optimization. The posture neuro-controller (PNC) learns by approximating the specialized inverse velocity model of the WMR so as to map planned positions to suitable velocity commands. Supervised drive supplies varying velocity commands for the PNC and VNC to set up their neural weights. During autonomous drive, the PNC halts learning while the VNC keeps correcting its neural weights to optimize control performance. The proposed design is evaluated on an experimental WMR. The results show that the DHP adaptive critic design is a useful basis for autonomous control. (C) 2008 Elsevier Ltd. All rights reserved.
We consider a broad class of stochastic dynamic programming problems that are amenable to relaxation via decomposition. These problems comprise multiple subproblems that are independent of each other except for a collection of coupling constraints on the action space. We fit an additively separable value function approximation using two techniques, namely, Lagrangian relaxation and the linear programming (LP) approach to approximate dynamic programming. We prove various results comparing the relaxations to each other and to the optimal problem value. We also provide a column generation algorithm for solving the LP-based relaxation to any desired optimality tolerance, and we report on numerical experiments on bandit-like problems. Our results provide insight into the complexity versus quality trade-off when choosing which of these relaxations to implement.
We consider a scenario in which a large equipment manufacturer wishes to outsource the work involved in repairing purchased goods while under warranty. Several external service vendors are available for this work. We develop models and analyses to support decisions concerning how responsibility for the warranty population should be divided between them. These also allow the manufacturer to resolve related questions concerning, for example, whether the service capacities of the contracted vendors are sufficient to deliver an effective post-sales service. Static allocation models yield information concerning the proportions of the warranty population for which the vendors should be responsible overall. Dynamic allocation models enable consideration of how such overall workloads might be delivered to the vendors over time in a way that avoids excessive variability in the repair burden. We apply dynamic programming policy improvement to develop an effective dynamic allocation heuristic. This is evaluated numerically and is also used as a yardstick to assess two simple allocation heuristics suggested by static models. A dynamic greedy allocation heuristic is found to perform well. Dividing the workload equally among vendors with different service capacities can lead to serious losses.
Both gain scheduling and multiple-model-based control are considered practical approaches for the control of industrial nonlinear processes. However, the former ignores system dynamics, and the latter is specific to the type of controller design and limited in its scope of application as practiced in industry. This paper proposes a value function-based strategy for switching among local controllers, thereby providing an effective global control policy for the entire operating region. The suggested method selects the best of a set of available control policies at each time step by evaluating the "value" function associated with the successor state when the control action prescribed by a candidate policy is taken for a given state. The value function, which maps a state to its associated discounted infinite-horizon cost-to-go, is obtained by solving the dynamic program approximately using closed-loop simulation or operational data and a function approximator. The proposed approach has the advantages that the candidate controllers are general and that the switching is performed not by a fixed heuristic rule but rigorously via dynamic programming. From the viewpoint of dynamic programming, the approach helps alleviate the curse of dimensionality with respect to the state space and action space. Optimal or approximately optimal switching rules can be learned without a model that defines the state transition rule. The approach is demonstrated on several different nonlinear control examples. (C) 2007 Elsevier Ltd. All rights reserved.
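A minimal sketch of the switching rule described above, under several assumptions that are not from the paper: a known one-step model `step` is used to predict the successor state (the abstract notes the rule can also be learned without a model), the score adds a stage cost to the discounted successor value, and the scalar system, the two controllers and the hand-written value functions are all hypothetical stand-ins for a learned function approximator.

```python
def make_switching_policy(controllers, step, stage_cost, value, gamma=0.95):
    """Return a policy that picks, at each state, the candidate controller
    whose action minimizes stage cost plus discounted successor value."""
    def policy(state):
        best_index, best_action, best_score = None, None, float("inf")
        for index, controller in enumerate(controllers):
            action = controller(state)
            successor = step(state, action)  # predicted next state
            score = stage_cost(state, action) + gamma * value(successor)
            if score < best_score:
                best_index, best_action, best_score = index, action, score
        return best_index, best_action
    return policy


# Hypothetical toy problem: scalar integrator with two linear controllers.
def toy_step(x, u):
    return x + u


def toy_cost(x, u):
    return x * x + u * u


toy_controllers = [lambda x: -0.9 * x,  # aggressive local controller
                   lambda x: -0.1 * x]  # gentle local controller

if __name__ == "__main__":
    policy = make_switching_policy(
        toy_controllers, toy_step, toy_cost, value=lambda x: 2.0 * x * x
    )
    print(policy(1.0))
```

Changing only the value function changes which controller is selected at the same state. The switching decision is therefore driven by the estimated cost-to-go, not by a fixed scheduling rule.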
Accurate, reliable, and timely traffic information is critical for the deployment and operation of intelligent transportation systems (ITSs). Traffic forecasting for travelers and traffic operators should become at least as useful and convenient as weather reports. In the US, the Federal Highway Administration (FHWA) has envisioned a real-time traffic estimation and prediction system (TrEPS) as an ITS support platform that resides at traffic management centers (TMCs) for dynamic route assignment (DRA) and other transportation operations. To enable ITS deployment for urban traffic control and management in China, in 1999 the Chinese Academy of Sciences outlined a research agenda to develop related intelligent systems and technology.(1) A central component of this agenda was a TrEPS called DynaCAS (dynamic traffic assignment based on complex adaptive systems). Here, we briefly introduce DynaCAS and its open source counterpart DynaChina, emphasizing how they differ from other TrEPS projects.