ISBN: (print) 9780878492459
To address the position-tracking control problem in electro-hydraulic systems caused by nonlinear friction torque disturbance, a model-free algorithm for adaptive identification and compensation of the friction torque is proposed. The algorithm is based on applied mathematics and a matching-and-following principle, and it can accommodate any variation in the friction torque (force). Simulation results indicate that the algorithm effectively suppresses the friction torque (force) disturbance and improves the system's low-speed characteristics and tracking performance.
In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent chooses the policy to maximize total reward in the finite horizon as well as satisfy constraints at each epoch with probability 1. We propose a model-free algorithm that converts PCMDP problem to an unconstrained problem and a Q-learning based approach is applied. We define the concept of probably approximately correct (PAC) to the proposed PCMDP problem. The proposed algorithm is proved to achieve an $(\epsilon, p)$-PAC policy when the episode $K \geq \Omega\left(\frac{I^2 H^6 S A \ell}{\epsilon^2}\right)$, where S and A are the number of states and actions, respectively. H is the number of epochs per episode. I is the number of constraint functions, and $\ell = \log\left(\frac{SAT}{p}\right)$. We note that this is the first result on PAC kind of analysis for PCMDP with peak constraints, where the transition dynamics are not known apriori. We demonstrate the proposed algorithm on an energy harvesting problem and a single machine scheduling problem, where it performs close to the theoretical upper bound of the studied optimization problem.
In the optimization of dynamic systems, the variables typically have constraints. Such problems can be modeled as a Constrained Markov Decision Process (CMDP). This paper considers the peak Constrained Markov Decision Process (PCMDP), where the agent chooses a policy that maximizes total reward over a finite horizon while satisfying the constraints at each epoch with probability 1. We propose a model-free algorithm that converts the PCMDP problem into an unconstrained problem, to which a Q-learning based approach is applied. We define the concept of probably approximately correct (PAC) for the proposed PCMDP problem. The proposed algorithm is proved to achieve an $(\epsilon, p)$-PAC policy when the number of episodes satisfies $K\geq\Omega(\frac{I^2H^6SA\ell}{\epsilon^2})$, where S and A are the numbers of states and actions, respectively, H is the number of epochs per episode, I is the number of constraint functions, and $\ell=\log(\frac{SAT}{p})$. We note that this is the first PAC-type analysis for PCMDP with peak constraints where the transition dynamics are not known a priori. We demonstrate the proposed algorithm on an energy harvesting problem and a single machine scheduling problem, where it performs close to the theoretical upper bound of the studied optimization problem.
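To get a feel for how the episode bound above scales, the following minimal sketch evaluates the expression $\frac{I^2H^6SA\ell}{\epsilon^2}$ with $\ell=\log(\frac{SAT}{p})$. The function name `pac_episode_bound` and the constant `c` are hypothetical: the $\Omega(\cdot)$ notation hides a constant that the abstract does not give, and T is not defined in the abstract, so it is simply taken as an input here.

```python
import math


def pac_episode_bound(I, H, S, A, eps, p, T, c=1.0):
    """Evaluate c * I^2 * H^6 * S * A * ell / eps^2 with ell = log(S*A*T/p).

    c stands in for the unknown constant hidden by the Omega notation
    (an assumption), and T is passed in because the abstract leaves it
    undefined.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    ell = math.log(S * A * T / p)
    return c * I**2 * H**6 * S * A * ell / eps**2
```

The expression is quadratic in the number of constraints I and in 1/ε, so doubling I or halving ε multiplies the value by four, while the horizon H enters with a sixth power.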
Grasping deformable objects remains a challenging operational task for robots in diverse industrial applications. The different characteristics of the deformable objects to be gripped need to be considered in the mechanical design of the gripper. Mechanical grippers often rely on sensors and appropriate control strategies to grasp deformable objects. This study classifies deformable objects, grippers and gripper manufacturers, and their corresponding gripping strategies. Among control strategies, model-based algorithms are often ineffective because the rigidity and other morphological characteristics of the objects to be gripped are frequently unknown. In contrast, model-free algorithms do not need parametric information about the objects, as only input-output signals are required. This allows model-free controlled grippers to adapt to diverse and unstructured environments. Finally, the advantages and disadvantages of current deformable-object grasping techniques are discussed and summarized, and the challenges and future directions of robotic grasping of deformable objects are pointed out.
In this article, we study the consensus problem of multiagent systems (MASs) without any information about the system model by using the reinforcement learning (RL) method and an event-based control strategy. First, we design an adaptive event-based consensus control protocol using the local sampled state information so that the consensus errors of all agents are uniformly ultimately bounded. The validity of this event-triggered adaptive control protocol is confirmed by excluding the Zeno behavior within finite time. Then, based on the RL approach, we present a model-free algorithm to obtain the feedback gain matrix, and construct the adaptive event-triggered control strategy without knowledge of the model. In contrast to existing related works, this RL-based event-triggered adaptive control algorithm relies only on the local sampled state information and requires neither model information nor global network information. Finally, we provide some examples to demonstrate the validity of the adaptive event-based consensus algorithm.
In this study, a novel algorithm has been developed to solve a trajectory optimization problem of a model-free black box dynamical system. The proposed algorithm does not need an explicit dynamic model of the system but computes partial derivatives of the dynamic function numerically from the time series data of observation to estimate the adjoint variable and the Hamiltonian. The additional necessary condition for optimality, a constant Hamiltonian over the time span, is used as the tracking condition to find an optimal trajectory. A candidate optimal trajectory is searched for by the Legendre transformation, which interprets the geometric information of the current control trajectory on the Lagrangian surface. The implication of this approach is the elimination of the need for a dynamic model or a system identification process, as only the necessary partial derivatives are derived from current observations. This makes it possible to find a near-optimal trajectory quickly without an explicit dynamic model or a full system identification process. The estimated Hamiltonian approach is verified first with several problems whose dynamic models are known. The model-free algorithm is then applied to several problems where the dynamics are still unclear. The first case is real-world applications where the observation data is obtained by experiments or from historical records. These applications include a manufacturing process of recent interest called Field Assisted Sintering Technology (FAST) and a socio-economic policy problem of water usage management by price controls. In this case, approximated dynamic models based on collected empirical data are used for the simulated iterations to validate the effectiveness of the proposed algorithm. The proposed algorithm uses only the observation output and shows an iterative candidate search history that converges toward an exact solution or toward a certain trajectory with decreasing total cost. Second case is a simulated feedback control algorithm call
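The idea of estimating partial derivatives of the dynamics directly from observed time series can be illustrated with a small sketch. It is not the algorithm of the study: it assumes, purely for illustration, a scalar system whose dynamics are locally well described by dx/dt ≈ a·x + b·u, so that a and b play the role of the partial derivatives with respect to the state and the control. The function name `estimate_partials` is hypothetical.

```python
def estimate_partials(t, x, u):
    """Least-squares fit of dx/dt ~ a*x + b*u from sampled data.

    t, x, u are equal-length sequences of times, observed states and
    applied controls. Returns (a, b), which under the assumed linear
    model approximate the partial derivatives of the dynamics with
    respect to x and u.
    """
    n = len(t)
    if n < 4 or len(x) != n or len(u) != n:
        raise ValueError("need at least 4 samples of equal length")
    # Central differences approximate dx/dt at the interior samples.
    xdot = [(x[i + 1] - x[i - 1]) / (t[i + 1] - t[i - 1]) for i in range(1, n - 1)]
    xs, us = x[1:-1], u[1:-1]
    # Build and solve the 2x2 normal equations by Cramer's rule.
    sxx = sum(xi * xi for xi in xs)
    sxu = sum(xi * ui for xi, ui in zip(xs, us))
    suu = sum(ui * ui for ui in us)
    sxd = sum(xi * di for xi, di in zip(xs, xdot))
    sud = sum(ui * di for ui, di in zip(us, xdot))
    det = sxx * suu - sxu * sxu
    if det == 0:
        raise ValueError("state and control samples are collinear")
    a = (sxd * suu - sxu * sud) / det
    b = (sxx * sud - sxu * sxd) / det
    return a, b
```

The fit needs the state and control histories to be linearly independent; if the control is always proportional to the state, the two partial derivatives cannot be separated from the data.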
In this article, a distributed adaptive model-free control algorithm is proposed for consensus and formation-tracking problems in a network of agents with completely unknown nonlinear dynamic systems. The specification of the communication graph in the network is incorporated in the adaptive laws for estimation of the unknown linear and nonlinear terms, and in the online updating of the elements in the main controller gain matrix. The decentralized control signal at each agent in the network requires information about the states of the leader agent, as well as the desired formation variables of the agents in a local coordinate frame. These two sets of variables are provided at each agent by utilizing two recently proposed distributed observers. It is shown that a spanning tree rooted at the leader agent is sufficient for the convergence and stability of the proposed cooperative control and observer algorithms. Two simulation studies are provided to evaluate the performance of the proposed algorithm in comparison with two state-of-the-art distributed model-free control algorithms. The same level of consensus error is achieved with lower control effort and less offline gain tuning. Finally, the application of the proposed solution is studied in the formation-tracking control of a team of autonomous aerial mobile robots via simulation results.
The application of optimal control theory in practical engineering is often limited by the modeling cost and complexity of the mathematical model of the controlled plant, and by various constraints. To bridge the gap between theory and practice, this paper proposes a model-free direct method based on the sequential sampling and updating of a surrogate model, and extends the ability of the direct method to solve model-free optimal control problems with general constraints. The algorithm selects sample points from the current actual trajectory data to update the surrogate model of the controlled plant, and solves the optimal control problem of the constantly refined surrogate model until the result converges. The presented initial and subsequent sampling strategies eliminate the dependence on the model. Furthermore, the new stopping criteria ensure that the final actual and planned trajectories overlap. Several examples illustrate that the presented algorithm can obtain constrained solutions with greater accuracy while requiring fewer samples.
In this paper, the H-infinity control problem is investigated by an off-policy integral reinforcement learning (IRL) method for nonlinear systems with completely unknown dynamics, disturbances, and constrained input. First, starting from a model-based policy iteration (PI) algorithm, a model-free algorithm is proposed based on the derived iterative equation, and the equivalence of the model-based PI algorithm and the model-free algorithm is proven. Then, the model-free algorithm is implemented by the off-policy IRL technique to solve the Hamilton-Jacobi-Isaacs (HJI) equation from the collected system data by a least-squares approach, where three neural networks (NNs) are constructed to approximate the value function, the control, and the disturbance. Finally, the proposed method is applied to stabilize an autonomous third-order Chua's chaotic circuit system and a non-autonomous second-order memristive chaotic circuit system to illustrate its efficiency.
Wind turbine blade failure can be catastrophic and lead to unexpected power interruptions. In this paper, a Structural Health Monitoring (SHM) algorithm is presented for wireless monitoring of wind turbine blades. The SHM algorithm utilizes accumulated strain energy data, such as would be acquired by piezoelectric materials. The SHM algorithm compares the accumulated strain energy at the same position on the three blades. This exploits the inherent triple redundancy of the blades and avoids the need for a structural model of the blade. The performance of the algorithm is evaluated using probabilistic metrics such as detection probability (True Positive) and false alarm rate (False Positive). The decision time is chosen to be sufficiently long that a particular damage level can be detected even in the presence of system sensor noise and wind variations. Finally, the proposed algorithm is evaluated with a case study of a utility-scale turbine. The noise level is based on measurements acquired from strain sensors mounted on the blades of a Clipper Liberty C96 turbine. Strain energy changes associated with damage from matrix cracking and delamination are simulated with a finite element model. The case study demonstrates that the proposed algorithm can detect damage with a high probability based on a decision time period of approximately 50-200 days. Copyright (c) 2016 John Wiley & Sons, Ltd.
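The three-blade comparison described above can be sketched as follows. This is only an illustration of the redundancy idea, not the decision rule of the paper: the function name `flag_damaged_blades`, the use of the median as the reference, and the relative threshold are all assumptions made here.

```python
from statistics import median


def flag_damaged_blades(energies, rel_threshold=0.1):
    """Flag blades whose accumulated strain energy deviates from the others.

    energies holds one accumulated strain energy value per blade, all
    measured at the same position. A blade is flagged when its value
    differs from the median of the three by more than rel_threshold
    (a hypothetical tuning parameter) in relative terms.
    """
    if len(energies) != 3:
        raise ValueError("expected one value for each of the three blades")
    reference = median(energies)
    if reference <= 0:
        raise ValueError("accumulated strain energy must be positive")
    return [abs(e - reference) / reference > rel_threshold for e in energies]
```

Because the blades serve as references for each other, no structural model is needed; the cost is that damage developing identically on all three blades would go unnoticed by such a comparison.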