For many industrial processes it is important to model the cure kinetics of phenol-formaldehyde resoles, yet the applicability of common model-free kinetic algorithms to the cure of phenolic resins is not known. In this study the abilities of the Friedman, Vyazovkin and Kissinger-Akahira-Sunose (KAS) model-free-kinetics algorithms to model and predict the cure kinetics of commercial resoles are compared. The Friedman and Vyazovkin methods yield activation energy dependences on conversion that are consistent with each other, and this dependence has a higher amplitude than the one obtained with the KAS method. Hence, the Friedman and Vyazovkin methods are better suited for revealing the cure steps of commercial PF resoles. Conversely, the KAS algorithm lends itself more readily to dynamic cure predictions than the Friedman and Vyazovkin methods, while isothermal cure is equally well predicted by all three. As a result, the KAS algorithm is the method of choice for modeling and predicting the cure kinetics of commercial phenolic resoles under various temperature programs. (C) 2005 Elsevier B.V. All rights reserved.
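To make the comparison concrete, here is a minimal sketch, not taken from the paper, of how the Friedman and KAS isoconversional fits are typically computed from DSC data recorded at several heating rates; the numerical values, array names and function names are illustrative assumptions.

```python
# Minimal sketch of Friedman and KAS isoconversional fits (illustrative only).
# At a fixed conversion alpha, the Friedman method fits ln(da/dt) vs 1/T and the
# KAS method fits ln(beta/T^2) vs 1/T; in both cases the slope is -Ea/R.
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def friedman_Ea(T_alpha, dadt_alpha):
    """Activation energy (J/mol) at one conversion level.
    T_alpha, dadt_alpha: one entry per heating rate, interpolated to that conversion."""
    slope, _ = np.polyfit(1.0 / T_alpha, np.log(dadt_alpha), 1)
    return -slope * R

def kas_Ea(T_alpha, beta):
    """Activation energy (J/mol) at one conversion level via the KAS linearization."""
    slope, _ = np.polyfit(1.0 / T_alpha, np.log(beta / T_alpha**2), 1)
    return -slope * R

# Hypothetical data at three heating rates for a single conversion level (alpha = 0.5).
beta = np.array([5.0, 10.0, 20.0])               # heating rates, K/min
T_alpha = np.array([410.0, 420.0, 431.0])        # temperatures at alpha = 0.5, K
dadt_alpha = np.array([1.2e-3, 2.3e-3, 4.5e-3])  # cure rates at alpha = 0.5, 1/s
print(friedman_Ea(T_alpha, dadt_alpha), kas_Ea(T_alpha, beta))
```

Repeating such a fit over a grid of conversion levels yields the activation energy dependence on conversion that the abstract compares across methods.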
We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards. We design a new algorithm, Variance-weighted Optimistic Q-Learning (VOQL), based on Q-learning, and bound its regret assuming that the regression function class is closed under Bellman backups and has bounded Eluder dimension. As a special case, VOQL achieves $\widetilde{O}(d\sqrt{TH} + d^{6}H^{5})$ regret over T episodes for a horizon-H MDP under (d-dimensional) linear function approximation, which is asymptotically optimal. Our algorithm incorporates weighted regression-based upper and lower bounds on the optimal value function to obtain this improved regret. The algorithm is computationally efficient given a regression oracle over the function class, making this the first computationally tractable and statistically optimal approach for linear MDPs.
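As a rough illustration of the two ingredients the abstract highlights, here is a generic sketch of variance-weighted ridge regression combined with an elliptical-potential optimism bonus; it is not the authors' VOQL implementation, and the parameter names (lam, sigma_min2, beta_bonus) are assumptions.

```python
# Generic sketch: variance-weighted ridge regression plus an optimism bonus (illustrative).
import numpy as np

def weighted_ridge(Phi, y, sigma2, lam=1.0, sigma_min2=1e-2):
    """Phi: (n, d) features, y: (n,) regression targets, sigma2: (n,) variance estimates."""
    w = 1.0 / np.maximum(sigma2, sigma_min2)                      # down-weight high-variance targets
    Lambda = Phi.T @ (w[:, None] * Phi) + lam * np.eye(Phi.shape[1])
    theta = np.linalg.solve(Lambda, Phi.T @ (w * y))
    return theta, Lambda

def optimistic_q(phi, theta, Lambda, beta_bonus=1.0):
    """Optimistic value estimate: point prediction plus an elliptical-potential bonus."""
    bonus = beta_bonus * np.sqrt(phi @ np.linalg.solve(Lambda, phi))
    return phi @ theta + bonus

# Usage with random data, just to show the shapes involved.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 5)); y = rng.normal(size=100); s2 = rng.uniform(0.1, 2.0, 100)
theta, Lam = weighted_ridge(Phi, y, s2)
print(optimistic_q(Phi[0], theta, Lam))
```

Subtracting the same bonus gives a matching lower bound, in the spirit of the paper's upper and lower confidence sequences on the optimal value function.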
This paper is concerned with the asynchronous form of Q-learning, which applies a stochastic approximation scheme to Markovian data samples. Motivated by recent advances in offline reinforcement learning, we develop an algorithmic framework that incorporates the principle of pessimism into asynchronous Q-learning, penalizing infrequently visited state-action pairs based on suitable lower confidence bounds (LCBs). This framework leads to, among other things, improved sample efficiency and enhanced adaptivity in the presence of near-expert data. In some important scenarios, our approach permits the observed data to cover only part of the state-action space, in stark contrast to prior theory that requires uniform coverage of all state-action pairs. When coupled with the idea of variance reduction, asynchronous Q-learning with LCB penalization achieves near-optimal sample complexity, provided that the target accuracy level is small enough. In comparison, prior works were suboptimal in terms of the dependency on the effective horizon even when i.i.d. sampling is permitted. Our results deliver the first theoretical support for the use of the pessimism principle in the presence of Markovian non-i.i.d. data.
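The following is a simplified, tabular sketch of the core idea, not the paper's exact algorithm: an asynchronous Q-learning update whose target is penalized by an LCB-style bonus that shrinks with the visit count of the state-action pair. The step-size schedule and the bonus constant c_b are illustrative assumptions.

```python
# Simplified sketch of asynchronous Q-learning with LCB-style pessimism (illustrative).
import numpy as np

def lcb_q_learning(trajectory, n_states, n_actions, gamma=0.99, c_b=1.0):
    """trajectory: iterable of (s, a, r, s_next) tuples collected along a Markovian sample path."""
    Q = np.zeros((n_states, n_actions))
    N = np.zeros((n_states, n_actions))               # visit counts per (s, a)
    for s, a, r, s_next in trajectory:
        N[s, a] += 1
        eta = 1.0 / (1.0 + (1.0 - gamma) * N[s, a])   # illustrative step-size schedule
        penalty = c_b / np.sqrt(N[s, a])              # pessimism: larger for rarely visited pairs
        target = r - penalty + gamma * Q[s_next].max()
        Q[s, a] += eta * (target - Q[s, a])
    return np.clip(Q, 0.0, None)                      # keep estimates nonnegative for rewards in [0, 1]
```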
ISBN: 9781728198095 (Print)
The current paper combines the main features of first-order Active Disturbance Rejection Control (ADRC) with Virtual Reference Feedback Tuning (VRFT) to automatically determine the parameters of the controller without the process model. The development of the resulting data-driven algorithm, called the first-order ADRC-VRFT algorithm, is exemplified in the control of cart position, arm angular position and payload position of three-degree-of-freedom tower crane systems (TCSs) using three Single Input-Single Output (SISO) loops running in parallel. The three SISO first-order ADRC-VRFT algorithms benefit from a twofold validation through experiments on real-time TCS equipment: model-free (without the process model) and model-based (making use of the process model). The algorithms are also compared on the experimental results using an objective function as a performance index.
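For context, here is a minimal sketch of the kind of discrete-time first-order ADRC loop that each SISO channel would run; the controller gain kp, input gain b0 and observer bandwidth wo are exactly the sort of parameters a VRFT-style data-driven step would deliver. The class, signal names and sampling period Ts are assumptions, not the paper's implementation.

```python
# Minimal sketch of a discrete-time first-order ADRC controller (illustrative).
class FirstOrderADRC:
    def __init__(self, kp, b0, wo, Ts):
        self.kp, self.b0, self.Ts = kp, b0, Ts
        self.l1, self.l2 = 2.0 * wo, wo * wo     # extended state observer gains from bandwidth wo
        self.z1, self.z2 = 0.0, 0.0              # estimated output and estimated total disturbance
        self.u = 0.0                             # previous control input

    def step(self, r, y):
        # Extended state observer update using the previous control input.
        e = y - self.z1
        self.z1 += self.Ts * (self.z2 + self.b0 * self.u + self.l1 * e)
        self.z2 += self.Ts * (self.l2 * e)
        # Disturbance-rejecting control law.
        u0 = self.kp * (r - self.z1)
        self.u = (u0 - self.z2) / self.b0
        return self.u
```

In the paper's setup, three such loops (cart, arm and payload positions) would run in parallel, each with its own data-driven parameter set.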
ISBN: 9781509004614 (Print)
This paper addresses the development of general-purpose game agents able to learn a vast number of games using the same architecture. The article analyzes the main existing approaches to general game playing, reviews their performance and proposes future research directions. Methods such as deep learning, reinforcement learning and evolutionary algorithms are considered for this problem. The testing platform is the popular video game console Atari 2600. Research into developing general-purpose agents for games is closely related to achieving artificial general intelligence (AGI).
This article examines the problem of developing a simple, model-free algorithm for detecting and identifying the time instant when a Lithium-Sulfur (Li-S) cell passes through its "dip point" during discharge. The dip point marks a sharp transition between two different sets of redox reactions involved in Li-S battery discharge, and is characterized by a significant change in the slope of battery potential with respect to charge processed. This makes it possible to detect the dip point accurately with a simple algorithm that estimates this slope using a moving-horizon least-squares method and then flags the dip point when the slope changes. We validate this algorithm both with a physics-based battery simulation and experimentally, using custom-fabricated Li-S coin cells. One potential benefit of this algorithm is that it makes it possible to accurately pinpoint the cell's arrival in the low-plateau region, which is important in light of the well-recognized difficulties associated with Li-S battery state estimation in this region. This opens the door to potential innovations in Li-S battery pack balancing that rely on dip-point detection as a simpler and more reliable alternative to full battery state estimation.
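An illustrative sketch of the detection idea described in the abstract, assuming streaming samples of charge processed and cell potential: the dV/dQ slope is estimated by least squares over a moving horizon, and the dip point is flagged when the slope over the most recent horizon differs sharply from the slope over the preceding one. The window length and threshold are hypothetical tuning parameters, not values from the paper.

```python
# Illustrative sketch of moving-horizon least-squares dip-point detection.
import numpy as np
from collections import deque

class DipPointDetector:
    """Flags a dip-point-like event by comparing the dV/dQ slope over the most
    recent moving horizon with the slope over the preceding horizon."""
    def __init__(self, window=50, slope_jump=0.5):
        self.recent = deque(maxlen=window)       # (charge, voltage) samples, newest horizon
        self.lagged = deque(maxlen=window)       # samples from the preceding horizon
        self.slope_jump = slope_jump             # threshold on slope change, V/Ah (assumed)

    @staticmethod
    def _slope(buf):
        q = np.array([p[0] for p in buf]); v = np.array([p[1] for p in buf])
        return np.polyfit(q, v, 1)[0]            # least-squares estimate of dV/dQ

    def update(self, charge, voltage):
        if len(self.recent) == self.recent.maxlen:
            self.lagged.append(self.recent[0])   # oldest recent sample rolls into the lag buffer
        self.recent.append((charge, voltage))
        if len(self.lagged) < self.lagged.maxlen:
            return False                         # not enough history yet
        return abs(self._slope(self.recent) - self._slope(self.lagged)) > self.slope_jump
```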