Search results - Inner Mongolia University Library

ADAPTIVE VECTOR QUANTIZATION FOR REINFORCEMENT LEARNING

IFAC Proceedings Volumes, 2002, vol. 35, no. 1, pp. 493-498

Authors: H.Y.K. Lau, K.L. Mak, I.S.K. Lee. Department of Industrial and Manufacturing Systems Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong

Dynamic programming methods are capable of solving reinforcement learning problems, in which an agent must improve its behavior through trial-and-error interactions with a dynamic environment. However, these computational algorithms suffer from the curse of dimensionality (Bellman, 1957), in that the number of computational operations increases exponentially with the cardinality of the state space. In practice, this usually results in a very long training time, and applications in continuous domains are far from trivial. In order to ease this problem, we propose the use of vector quantization to adaptively partition the state space based on the recent estimate of the action-value function. In particular, this state-space partitioning operation is performed incrementally to reflect the experience accumulated by the agent as it explores the underlying environment.

Keywords: learning algorithm; intelligent control; vector quantization; robot navigation; automated guided vehicles
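As a rough illustration of the idea in the abstract above, the following sketch keeps a codebook of prototype states, maps each continuous state to its nearest codeword, and stores one row of action values per codeword. It is not the authors' algorithm, whose details are not given on this page. The class name `VQTable`, the threshold `grow_radius`, and the step sizes are hypothetical, and the rule for adding a cell here is purely distance-based, whereas the paper refines the partition from the action-value estimate.

```python
import math


class VQTable:
    """Action values indexed by the nearest codeword (illustrative sketch only)."""

    def __init__(self, n_actions, grow_radius=0.5, lr_code=0.1):
        self.n_actions = n_actions
        self.grow_radius = grow_radius  # assumed threshold for adding a new cell
        self.lr_code = lr_code          # assumed step size for moving a codeword
        self.codebook = []              # prototype states, one per cell
        self.q = []                     # q[i][a]: value of action a in cell i

    def quantize(self, state):
        """Return the index of the nearest codeword, adding one if none is close."""
        best, best_d = None, math.inf
        for i, c in enumerate(self.codebook):
            d = math.dist(c, state)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.grow_radius:
            # No nearby cell: grow the partition incrementally.
            self.codebook.append(list(state))
            self.q.append([0.0] * self.n_actions)
            return len(self.codebook) - 1
        # Online vector quantization step: move the winner toward the sample.
        c = self.codebook[best]
        for k in range(len(c)):
            c[k] += self.lr_code * (state[k] - c[k])
        return best

    def update(self, s, a, r, s_next, alpha=0.5, gamma=0.9):
        """One Q-learning update on the quantized states; returns the cell of s."""
        i = self.quantize(s)
        j = self.quantize(s_next)
        target = r + gamma * max(self.q[j])
        self.q[i][a] += alpha * (target - self.q[i][a])
        return i
```

Calling `VQTable(n_actions=2).update((0.0, 0.0), 1, 1.0, (1.0, 1.0))` creates two cells, because the two states are farther apart than `grow_radius`, and moves the value of action 1 in the first cell toward the reward.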


REINFORCEMENT LEARNING OF FUZZY LOGIC CONTROLLERS FOR QUADRUPED WALKING ROBOTS

IFAC Proceedings Volumes, 2002, vol. 35, no. 1, pp. 91-96

Authors: Dongbing Gu, Huosheng Hu. Department of Computer Science, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK

This paper presents a fuzzy logic controller (FLC) for the implementation of some behaviours of Sony legged robots. Adaptive heuristic critic (AHC) reinforcement learning is employed to refine the FLC. The actor part of the AHC is a conventional FLC in which the parameters of the input membership functions are learned from an immediate internal reinforcement signal. This internal reinforcement signal comes from a prediction of the evaluation value of a policy and the external reinforcement signal. The evaluation value of a policy is learned by temporal difference (TD) learning in the critic part, which is also represented by an FLC. A genetic algorithm (GA) is employed for learning the internal reinforcement of the actor part because it is more efficient in searching than other trial-and-error search approaches.

Keywords: fuzzy logic controller; learning algorithm; robot control


Training trajectories by continuous recurrent multilayer networks

IEEE TRANSACTIONS ON NEURAL NETWORKS, 2002, vol. 13, no. 2, pp. 283-291

Authors: Leistritz, L; Galicki, M; Witte, H; Kochs, E. Friedrich Schiller Univ Jena, Inst Med Stat Comp Sci & Documentat, D-07740 Jena, Germany; Tech Univ, Dept Anesthesiol, Munich, Germany

This paper addresses the problem of training trajectories by means of continuous recurrent neural networks whose feedforward parts are multilayer perceptrons. Such networks can approximate a general nonlinear dynamic system with arbitrary accuracy. The learning process is transformed into an optimal control framework where the weights are the controls to be determined. A training algorithm based upon a variational formulation of Pontryagin's maximum principle is proposed for such networks. Computer examples demonstrating the efficiency of the given approach are also presented.

Keywords: approximation; learning algorithm; multilayer perceptron; recurrent network; target trajectories


Using feedback for coherent control of quantum systems

JOURNAL OF OPTICS B-QUANTUM AND SEMICLASSICAL OPTICS, 2002, vol. 4, no. 3, pp. R35-R52

Authors: Weinacht, TC; Bucksbaum, PH. Univ Michigan, Ann Arbor, MI 48109, USA

A longstanding goal in chemical physics has been the control of atoms and molecules using coherent light fields. This paper provides a brief overview of the field and discusses experiments that use a programmable pulse shaper to control the quantum state of electronic wavepackets in Rydberg atoms and electronic and nuclear dynamics in molecular liquids. The shape of Rydberg wavepackets was controlled by using tailored ultrafast pulses to excite a beam of caesium atoms. The quantum state of these atoms was measured using holographic techniques borrowed from optics. The experiments with molecular liquids involved the construction of an automated learning machine. A genetic algorithm directed the choice of shaped pulses which interacted with the molecular system inside a learning control loop. Analysis of successful pulse shapes that were found by using the genetic algorithm yields insight into the systems being controlled.

Keywords: coherent control; feedback; ultrafast; quantum state preparation; learning algorithm


GAM: A general auto-associative memory model

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2002, vol. E85D, no. 7, pp. 1153-1164

Authors: Shi, H; Zhao, YX; Zhuang, XH; Ren, FJ. Univ Missouri, Dept Comp Sci & Comp Engn, Columbia, MO 65211, USA; Univ Tokushima, Fac Engn, Tokushima 7708506, Japan

This paper attempts to establish a theory for a general auto-associative memory model. We start by defining a new concept called the supporting function to replace the concept of the energy function. As is known, the energy function relies on the assumption of symmetric interconnection weights, which is used in the conventional Hopfield auto-associative memory but is not evidenced in any biological memories. We then formulate the information retrieving process as a dynamic system by making use of the supporting function, and derive the attraction or asymptotic stability condition and the condition for convergence of an arbitrary state to a desired state. The latter represents a key condition for an associative memory to have a capability of learning from variant samples. Finally, we develop an algorithm to learn the asymptotic stability condition and an algorithm to train the system to recover desired states from their variant samples. The latter, called the sample learning algorithm, is the first of its kind ever discovered for associative memories. Both the recalling and learning processes are of finite convergence, a must-have feature for associative memories by analogy to normal human memory. The effectiveness of the recalling and learning algorithms is experimentally demonstrated.

Keywords: bidirectional associative memory; support function; learning algorithm; recalling algorithm


Learning algorithm for nonlinear support vector machines suited for digital VLSI

ELECTRONICS LETTERS, 1999, vol. 35, no. 16, pp. 1349-1350

Authors: Anguita, D; Boni, A; Ridella, S. Univ Genoa, Dept Biophys & Elect Engn, I-16145 Genoa, Italy

A learning algorithm for radial basis function support vector machines (RBF-SVMs) that can be easily implemented in digital VLSI is proposed. It is shown that, as opposed to traditional artificial neural networks, learning in SVMs is very robust with respect to quantisation effects deriving from the finite precision of computations.

Keywords: radial basis function networks; learning (artificial intelligence); digital VLSI; artificial neural network; nonlinear support vector machine; Neural net devices; VLSI; radial basis function network; Neural nets (circuit implementations); digital integrated circuits; quantisation; learning algorithm; neural chips


A novel learning algorithm which improves the partial fault tolerance of multilayer neural networks

NEURAL NETWORKS, 1999, vol. 12, no. 1, pp. 91-106

Authors: Cavalieri, S; Mirabella, O. Univ Catania, Fac Engn, Inst Informat & Telecommun, I-95125 Catania, Italy

The paper deals with the problem of fault tolerance in a multilayer perceptron network. Although such a network already possesses a reasonable fault tolerance capability, it may be insufficient in particularly critical applications. Studies carried out by the authors have shown that the traditional backpropagation learning algorithm may entail the presence of a certain number of weights with a much higher absolute value than the others. Further studies have shown that faults in these weights are the main cause of deterioration in the performance of the neural network. In other words, the main cause of incorrect network functioning on the occurrence of a fault is the non-uniform distribution of absolute values of weights in each layer. The paper proposes a learning algorithm which updates the weights, distributing their absolute values as uniformly as possible in each layer. Tests performed on benchmark test sets have shown the considerable increase in fault tolerance obtainable with the proposed approach as compared with the traditional backpropagation algorithm and with some of the most efficient fault tolerance approaches to be found in the literature. (C) 1999 Elsevier Science Ltd. All rights reserved.

Keywords: partial fault tolerance; multilayer perceptron; neural network; learning algorithm; classification problems; mapping problems
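The abstract's central observation, that faults in large-magnitude weights do the most damage, can be illustrated with a single linear unit. The sketch below assumes a stuck-at-zero fault model and made-up weight values. `worst_single_fault` is a hypothetical helper written for this illustration, not part of the paper's learning algorithm.

```python
def worst_single_fault(weights, inputs):
    """Largest change in a linear unit's output when one weight is stuck at zero.

    Zeroing weight w_i removes the term w_i * x_i from the weighted sum,
    so the output shifts by |w_i * x_i|; the worst case is the largest term.
    """
    return max(abs(w * x) for w, x in zip(weights, inputs))


inputs = [1.0, 1.0, 1.0, 1.0]
skewed = [3.7, 0.1, 0.1, 0.1]    # one dominant weight (assumed values)
uniform = [1.0, 1.0, 1.0, 1.0]   # similar total, spread evenly (assumed values)

print(worst_single_fault(skewed, inputs))
print(worst_single_fault(uniform, inputs))
```

With these values the skewed layer loses far more of its output to a single fault than the uniform one, which is the motivation the abstract gives for spreading absolute weight values evenly within each layer.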


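Application of neural networks to exchange rate forecasting

The 19th National Database Conference (第十九届全国数据库学术会议)

Authors: Zhang Zhizheng (张志政), Mao Yuguang (毛宇光), Han Bo (韩波), Wu Liyun (邬丽云). College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics; State Key Laboratory for Novel Software Technology, Nanjing University

1 Introduction. Exchange rates play an important role in international financial markets. For financial institutions and individuals, the key to earning high profits from foreign exchange trading is a correct grasp of exchange rate trends. Exchange rate fluctuations are influenced by political, economic, psychological, and many other factors, and economic indicators are often complex, volatile, and noisy, all of which makes it very difficult for classical mathematical methods ...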

Keywords: Artificial neural network; learning algorithm; Back propagated network


Reinforcement learning control of nonlinear multi-link system

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2001, vol. 14, no. 5, pp. 563-575

Authors: Bucak, IO; Zohdy, MA. Oakland Univ, Dept Elect & Syst Engn, Rochester, MI 48309, USA

In this paper, the effects of basic parameters in reinforcement learning control, such as eligibility, action and critic network constrained weights, system nonlinearities, gradient information, state-space partitioning, and variance of exploration, are studied in detail. The aim is to increase feasibility for practical applications and implementation, improve learning efficiency, and enhance performance. Also, a novel adaptive grid algorithm is proposed to overcome the difficulty in partitioning the input space to achieve better performance. Reinforcement learning is applied to the control of nonlinear one- and two-link robots. This problem dictates that the learning is performed on-line, based on a binary or real-valued reinforcement signal from a critic network, without knowing the system model or nonlinearity. (C) 2002 Elsevier Science Ltd. All rights reserved.

Keywords: reinforcement learning; learning control; nonlinear control; robotics; learning algorithm


Model-based update in task-level feedforward control using on-line approximation

AUTOMATICA, 2001, vol. 37, no. 3, pp. 391-400

Authors: Gorinevsky, D; Vukovich, G. Honeywell Technol Ctr, Cupertino, CA 95014, USA; Canadian Space Agcy, St Hubert, PQ J3Y 8Y9, Canada

This paper proposes and studies an algorithm for task-level control based on a radial basis function network approximation of the optimal task input vector on parameters of the task. A learning update scheme is proposed for on-line compensation for the inaccuracy of the model used in the controller design. The update approximates the Jacobian of the task input-output mapping using an off-line design model. Deadzone convergence of this learning scheme in the presence of modeling errors is proved and constructive estimates of the convergence robustness parameters are obtained. An application of the proposed algorithm to feedforward vibration compensation for flexible spacecraft slewing complements the theoretical analysis. Simulations demonstrate practically acceptable performance of the algorithms in this difficult problem. (C) 2001 Elsevier Science Ltd. All rights reserved.

Keywords: feedforward control; learning algorithm; convergence; neural network approximation; flexible spacecraft
