Search Results - Inner Mongolia University Library

Generalized Policy Iteration Adaptive Dynamic Programming for Discrete-Time Nonlinear Systems

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2015, Vol. 45, No. 12, pp. 1577-1591

Authors: Liu, Derong; Wei, Qinglai; Yan, Pengfei (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

This paper is concerned with a novel generalized policy iteration algorithm for solving optimal control problems for discrete-time nonlinear systems. The idea is to use an iterative adaptive dynamic programming algorithm to obtain iterative control laws that make the iterative value functions converge to the optimum. It is shown that, when initialized by an admissible control law, the iterative value functions are monotonically nonincreasing and converge to the optimal solution of the Hamilton-Jacobi-Bellman equation, under the assumption that a perfect function approximation is employed. The admissibility property is analyzed, which shows that any of the iterative control laws can stabilize the nonlinear system. Neural networks are utilized to implement the generalized policy iteration algorithm, by approximating the iterative value function and computing the iterative control law, respectively, to achieve approximate optimal control. Finally, numerical examples are presented to verify the effectiveness of the present generalized policy iteration algorithm.

Keywords: Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; generalized policy iteration; neural networks; neuro-dynamic programming; nonlinear systems; optimal control; reinforcement learning

Vision-based initial weld point positioning using the geometric relationship between two seams

INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2013, Vol. 66, No. 9-12, pp. 1535-1543

Authors: Fang, Zaojun; Xu, De; Tan, Min (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

This paper presents a novel vision-based initial weld point positioning method for the welding systems used in container manufacture. The new method is based on the geometric relationship between the two seams at the two stages of the whole welding task, namely the initialization stage and the welding stage. At the first stage, the torch is aligned with the initial weld point manually, and the image feature and the parameters of the seam line are computed. At the second stage, the target image feature of the seam line is first computed using the geometric relationship, and then the alignment of the torch is automated based on the difference between the target and the current image features. The geometric relationship between the two seams is analyzed, and the realization of the new method, including the image processing, the computation of the seam line parameters, and the control system design, is given in detail. Finally, experiments are conducted to verify the effectiveness of the proposed initial weld point positioning method.

Keywords: Vision; Vision control; Welding robot; Initial weld point positioning; Geometric relationship

Adaptive Dynamic Programming for Optimal Tracking Control of Unknown Nonlinear Systems With Application to Coal Gasification

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2014, Vol. 11, No. 4, pp. 1020-1036

Authors: Wei, Qinglai; Liu, Derong (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

In this paper, we establish a new data-based iterative optimal learning control scheme for discrete-time nonlinear systems using an iterative adaptive dynamic programming (ADP) approach and apply the developed control scheme to solve a coal gasification optimal tracking control problem. According to the system data, neural networks (NNs) are used to construct the dynamics of the coal gasification process, the coal quality, and the reference control, respectively, so that a mathematical model of the system is unnecessary. The approximation errors from the neural network construction of the disturbance and the controls are both considered. Via system transformation, the optimal tracking control problem with approximation errors and disturbances is effectively transformed into a two-person zero-sum optimal control problem. A new iterative ADP algorithm is then developed to obtain the optimal control laws for the transformed system. A convergence property is developed to guarantee that the performance index function converges to a finite neighborhood of the optimal performance index function, and the convergence criterion is also obtained. Finally, numerical results are given to illustrate the performance of the present method.

Keywords: Adaptive dynamic programming; coal gasification; data-based control; finite approximation errors; neural networks; optimal tracking control

Scene text detection using graph model built upon maximally stable extremal regions

PATTERN RECOGNITION LETTERS, 2013, Vol. 34, No. 2, pp. 107-116

Authors: Shi, Cunzhao; Wang, Chunheng; Xiao, Baihua; Zhang, Yang; Gao, Song (Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Inst Automat, Beijing 100190, Peoples R China)

Scene text detection can be formulated as a bi-label (text and non-text regions) segmentation problem. However, due to the high degree of intraclass variation of scene characters as well as the limited number of training samples, a single information source or classifier is not enough to segment text from non-text background. Thus, in this paper, we propose a novel scene text detection approach using a graph model built upon Maximally Stable Extremal Regions (MSERs) to incorporate various information sources into one framework. Concretely, after detecting MSERs in the original image, an irregular graph whose nodes are MSERs is constructed to label MSERs as text regions or non-text ones. Carefully designed features contribute to the unary potential to assess the individual penalties for labeling an MSER node as text or non-text, and color and geometric features are used to define the pairwise potential to punish the likely discontinuities. By minimizing the cost function via the graph cut algorithm, the different information carried by the cost function can be optimally balanced to get the final MSER labeling result. The proposed method is naturally context-relevant and scale-insensitive. Experimental results on the ICDAR 2011 competition dataset show that the proposed approach outperforms state-of-the-art methods in both recall and precision. (C) 2012 Elsevier B.V. All rights reserved.

Keywords: Scene text detection; MSER; Graph model; Cost function; Graph cut

Data-Driven Neuro-Optimal Temperature Control of Water-Gas Shift Reaction Using Stable Iterative Adaptive Dynamic Programming

IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2014, Vol. 61, No. 11, pp. 6399-6408

Authors: Wei, Qinglai; Liu, Derong (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

In this paper, a novel data-driven stable iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal temperature control problems for water-gas shift (WGS) reaction systems. According to the system data, neural networks (NNs) are used to construct the dynamics of the WGS system and solve the reference control, respectively, where the mathematical model of the WGS system is unnecessary. Considering the reconstruction errors of NNs and the disturbances of the system and control input, a new stable iterative ADP algorithm is developed to obtain the optimal control law. The convergence property is developed to guarantee that the iterative performance index function converges to a finite neighborhood of the optimal performance index function. The stability property is developed to guarantee that each of the iterative control laws can make the tracking error uniformly ultimately bounded (UUB). NNs are developed to implement the stable iterative ADP algorithm. Finally, numerical results are given to illustrate the effectiveness of the developed method.

Keywords: Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; approximation errors; data-driven control; neural networks (NNs); optimal control; reinforcement learning; water-gas shift (WGS)

A Novel Dual Iterative Q-Learning Method for Optimal Battery Management in Smart Residential Environments

IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2015, Vol. 62, No. 4, pp. 2509-2518

Authors: Wei, Qinglai; Liu, Derong; Shi, Guang (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

In this paper, a novel iterative Q-learning method called the "dual iterative Q-learning algorithm" is developed to solve the optimal battery management and control problem in smart residential environments. The developed algorithm introduces two iterations, internal and external: the internal iteration minimizes the total cost of power loads in each period, and the external iteration makes the iterative Q-function converge to the optimum. Based on the dual iterative Q-learning algorithm, the convergence property of the iterative Q-learning method for the optimal battery management and control problem is proven for the first time, which guarantees that both the iterative Q-function and the iterative control law reach the optimum. The algorithm is implemented by neural networks, and numerical results and comparisons are given to illustrate its performance.

Keywords: Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; neural networks; optimal control; Q-learning; smart grid

Infinite Horizon Self-Learning Optimal Control of Nonaffine Discrete-Time Nonlinear Systems

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, Vol. 26, No. 4, pp. 866-879

Authors: Wei, Qinglai; Liu, Derong; Yang, Xiong (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

In this paper, a novel iterative adaptive dynamic programming (ADP)-based infinite horizon self-learning optimal control algorithm, called the generalized policy iteration algorithm, is developed for nonaffine discrete-time (DT) nonlinear systems. The generalized policy iteration algorithm is a general idea of interacting policy and value iteration algorithms of ADP. The developed algorithm permits an arbitrary positive semidefinite function to initialize the algorithm, where two iteration indices are used for policy improvement and policy evaluation, respectively. This is the first time that the convergence, admissibility, and optimality properties of the generalized policy iteration algorithm for DT nonlinear systems are analyzed. Neural networks are used to implement the developed algorithm. Finally, numerical examples are presented to illustrate the performance of the developed algorithm.

Keywords: Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; generalized policy iteration; neural networks (NNs); neurodynamic programming; nonlinear systems; optimal control; reinforcement learning
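
Illustrative note: the two iteration indices mentioned in the abstract above (one for policy improvement, one for policy evaluation) can be sketched on a toy problem. The Python snippet below is only a minimal sketch under assumptions that do not come from the paper: a scalar linear system x_{k+1} = a*x_k + b*u_k, a quadratic stage cost q*x^2 + r*u^2, a quadratic value function V(x) = p*x^2, and hypothetical names such as `generalized_policy_iteration` and `eval_steps`. The paper itself treats nonaffine nonlinear systems with neural-network approximators.

```python
import math


def generalized_policy_iteration(a, b, q, r, p0=0.0, eval_steps=3, outer_iters=50):
    """Toy generalized policy iteration for a scalar linear-quadratic problem.

    Assumed model (illustration only): x_{k+1} = a*x_k + b*u_k,
    stage cost q*x^2 + r*u^2, value function V(x) = p*x^2, control u = -k*x.
    p0 >= 0 plays the role of the positive semidefinite initial function.
    """
    p = p0
    k = 0.0
    for _ in range(outer_iters):
        # Policy improvement: u = -k*x minimizes
        # q*x^2 + r*u^2 + p*(a*x + b*u)^2 for the current p.
        k = a * b * p / (r + b * b * p)
        # Policy evaluation: a finite number of backups under the fixed gain k.
        for _ in range(eval_steps):
            p = q + r * k * k + (a - b * k) ** 2 * p
    return p, k


if __name__ == "__main__":
    p, k = generalized_policy_iteration(1.0, 1.0, 1.0, 1.0)
    print(p, k, (1 + math.sqrt(5)) / 2)
```

With `eval_steps=1` the loop reduces to value iteration, and larger values of `eval_steps` move it toward policy iteration. For a = b = q = r = 1, the value-function coefficient approaches (1 + sqrt(5))/2, the positive solution of the scalar Riccati equation p^2 - p - 1 = 0.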

Algorithm for camera parameter adjustment in multicamera systems

OPTICAL ENGINEERING, 2015, Vol. 54, No. 10

Authors: Liu, Jianran; Fang, Zaojun; Zhang, Kun; Tan, Min (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

Multicamera systems have many advantages and are widely used. However, many situations require camera parameters that are more accurate than those that are currently available. A new algorithm is proposed to improve the accuracy and consistency of these systems by adjusting the camera parameters. The algorithm assumes that the distribution of the measured point positions follows the Gaussian mixture model. Based on this model, point positions in space are estimated, and new camera parameters are computed from the estimation. A metric is defined to describe the difference between the newly computed and precalibrated camera parameters, following which the parameters are adjusted by minimizing this difference. Finally, the validity of the algorithm is confirmed by conducting experiments. Two indicators that describe the accuracy and consistency are defined and applied to analyze the experimental data. (C) 2015 Society of Photo-Optical Instrumentation Engineers (SPIE)

Keywords: multicamera system; camera calibration; computer vision; camera parameter adjustment

Neural-network-based online optimal control for uncertain non-linear continuous-time systems with control constraints

IET CONTROL THEORY AND APPLICATIONS, 2013, Vol. 7, No. 17, pp. 2037-2047

Authors: Yang, Xiong; Liu, Derong; Huang, Yuzhu (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

In this study, an online adaptive optimal control scheme is developed for solving the infinite-horizon optimal control problem of uncertain non-linear continuous-time systems with the control policy having saturation constraints. A novel identifier-critic architecture is presented to approximate the Hamilton-Jacobi-Bellman equation using two neural networks (NNs): an identifier NN is used to estimate the uncertain system dynamics and a critic NN is utilised to derive the optimal control instead of typical action-critic dual networks employed in reinforcement learning. Based on the developed architecture, the identifier NN and the critic NN are tuned simultaneously. Meanwhile, unlike initial stabilising control indispensable in policy iteration, there is no special requirement imposed on the initial control. Moreover, by using Lyapunov's direct method, the weights of the identifier NN and the critic NN are guaranteed to be uniformly ultimately bounded, while keeping the closed-loop system stable. Finally, an example is provided to demonstrate the effectiveness of the present approach.

Keywords: adaptive control; approximation theory; closed loop systems; continuous time systems; Lyapunov methods; neurocontrollers; nonlinear control systems; optimal control; robust control; uncertain systems; neural network-based online adaptive optimal control; uncertain nonlinear continuous-time systems; control constraints; infinite-horizon optimal control problem; control policy; saturation constraints; identifier-critic architecture; Hamilton-Jacobi-Bellman equation approximation; uncertain system dynamics; critic NN; action-critic dual networks; reinforcement learning; identifier NN; policy iteration; Lyapunov's direct method; closed loop system stability

Development of an Underwater Manipulator and Its Free-Floating Autonomous Operation

IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2016, Vol. 21, No. 2, pp. 815-824

Authors: Wang, Yu; Wang, Shuo; Wei, Qingping; Tan, Min; Zhou, Chao; Yu, Junzhi (Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China)

This paper addresses the novel design of an underwater manipulator with a lightweight multilink structure and its free-floating autonomous operation. The concept design reduces the coupling between the manipulator and the vehicle efficiently, even in the case where the vehicle weight in air is not significantly greater than the manipulator weight. The specific implementation of the mechanical structure is elaborated. Moreover, a closed-loop control system based on binocular vision is proposed for underwater manipulation. In the end, experimental results demonstrate that the conceived underwater manipulator can accomplish the autonomous operation quickly.

Keywords: Autonomous operation; free-floating manipulation; underwater manipulator; UVMS
