Search Results - Inner Mongolia University Library

Regret Analysis of Learning-Based MPC With Partially Unknown Cost Function

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, vol. 69, no. 5, pp. 3246-3253

Authors: Dogan, Ilgin; Shen, Zuo-Jun Max; Aswani, Anil

Affiliations: Univ Calif Berkeley Ind Engn & Operat Res Berkeley CA 94720 USA; Univ Hong Kong Fac Engn Hong Kong Peoples R China; Univ Hong Kong Fac Business & Econ Hong Kong Peoples R China

The exploration-exploitation tradeoff is an inherent challenge in data-driven adaptive control. Though this tradeoff has been studied for multiarmed bandits (MABs) and reinforcement learning for linear systems, it is less well studied for learning-based control of nonlinear systems. A significant theoretical challenge in the nonlinear setting is that there is no explicit characterization of an optimal controller for a given set of cost and system parameters. We propose the use of a finite-horizon oracle controller with full knowledge of parameters as a reasonable surrogate to an optimal controller. This allows us to develop policies in the context of learning-based model-predictive control (MPC) and conduct a control-theoretic analysis using techniques from MPC and optimization theory to show that these policies achieve low regret with respect to this finite-horizon oracle. Our simulations exhibit the low regret of our policy on a heating, ventilation, and air-conditioning model with partially unknown cost function.

Keywords: Cost function; Costs; Linear systems; HVAC; Control systems; Adaptation models; Ventilation; Learning-based control; model-predictive control (MPC); nonmyopic exploitation; restless bandits

Data-driven Temperature Control of Internal Mixers

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Zhou, Zhihao; Chi, Ronghu

Affiliations: Qingdao Univ Sci & Technol Sch Automat & Elect Engn Qingdao 266061 Peoples R China

ISBN: 9798350321050 (print)

This paper proposes a model-free adaptive control (MFAC) strategy for internal mixer temperature (IMT) systems, which are characterized by large time lags, large inertia, time-varying behavior, and disturbances. A dynamic linearization method is applied to reconstruct the IMT system into a linear incremental form to facilitate the subsequent controller design. Both input saturation and an event-triggering condition are considered in the proposed algorithm, where the former is added to avoid overshooting and the latter is used to reduce the number of controller updates. In addition, the unknown parameter in the obtained linear model can be identified by the proposed estimation algorithm using only I/O data. The effectiveness of the proposed MFAC is verified through simulations.

Keywords: Internal Mixer Temperature System; Data-driven Method; Dynamic Linearization

Data-Driven Adaptive Distributed Localization of Multi-Agent Systems With Sensor Failure

IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2024, vol. 71, no. 9, pp. 11229-11238

Authors: Lv, Yunkai; Ren, Hongliang; Zhang, Hao; Wang, Zhuping; Yan, Huaicheng

Affiliations: East China Univ Sci & Technol Sch Informat Sci & Engn Shanghai 200237 Peoples R China; Tongji Univ Dept Control Sci & Engn Shanghai 200092 Peoples R China; Chinese Univ Hong Kong CUHK Fac Engn Dept Elect Engn Hong Kong 999077 Peoples R China; Chinese Univ Hong Kong CUHK Shun Hing Inst Adv Engn Hong Kong 999077 Peoples R China; East China Univ Sci & Technol Key Lab Adv Control & Optimizat Chem Proc Minist Educ Shanghai 200237 Peoples R China

This work solves the localization estimation of dynamic multi-agent systems (MASs) with sensor multiplicative failures, which is more general yet challenging to address than static sensor networks with ideal conditions. Barycentric coordinate is introduced to characterize the relative positions between agents. A new linear data model is constructed to represent the relationship between barycentric coordinates and relative distance. Based on the linear model, an adaptive parameter estimation algorithm is designed, and then it is applied to solve the relative distance compensation problem of the MASs with sensor multiplicative failures. Using the estimated parameter, a data-driven adaptive distributed localization estimation scheme based on iterative learning is proposed, in which only the measured relative distance data are available instead of the system model information. A key to obtaining accurate localization is overcoming the difficulties from inaccurate relative distance variables due to sensor failure via the data-driven adaptive relative distance compensation method. The numerical examples and experimental results verify the effectiveness of the proposed methods.

Keywords: Location awareness; Estimation; Robot sensing systems; Multi-agent systems; Data models; Adaptation models; Convergence; Adaptive data-driven distributed localization; iterative learning; sensor multiplicative failures

Speed and heading control of an unmanned surface vehicle using deep reinforcement learning

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Wu, Ting; Ye, Hui; Xiang, Zhengrong; Yang, Xiaofei

Affiliations: Jiangsu Univ Sci & Technol Sch Automat Zhenjiang 212100 Jiangsu Peoples R China; Nanjing Univ Sci & Technol Sch Automat Nanjing 210094 Peoples R China

ISBN: 9798350321050 (print)

In this paper, a deep reinforcement learning-based speed and heading control method is proposed for an unmanned surface vehicle (USV). A deep deterministic policy gradient (DDPG) algorithm, which is combined with an actor-critic reinforcement learning mechanism, is adopted to provide continuous control variables by interacting with the environment. Moreover, two types of reward functions are created for speed and heading control of the USV. The control policy is trained by trial and error so that the USV can be guided to achieve the desired speed and heading angle steadily and rapidly. Simulation results verify the feasibility and effectiveness of the proposed approach by comparisons with classical PID control and S plane control.

Keywords: Deep reinforcement learning; DDPG algorithm; unmanned surface vehicle

Model-Free Adaptive Control for Nonlinear Systems Under Sparse Sensor Attacks

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Chen, Yifan; Liu, Dong

Affiliations: Shenyang Aerosp Univ Coll Automat Shenyang 110136 Peoples R China

ISBN: 9798350321050 (print)

This paper discusses the model-free adaptive control for nonlinear systems under sparse sensor attacks. Firstly, it is proposed that there are multiple transmission channels in the sensor-to-controller transmission network. Secondly, system sensors are affected by DoS attacks and FDI attacks. Then, a channel switching mechanism is used to adjust the channel to compensate for adverse effects of attacks. Finally, it is proved that the tracking error of the system converges to a tiny constant and a numerical simulation example demonstrates the validity of the proposed method.

Keywords: Model-free adaptive control; sparse sensor attacks; DoS attacks; FDI attacks; channel switching mechanism

Learning Optimal Control Policy for Unknown Discrete-Time Systems

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, vol. 70, no. 11, pp. 4191-4195

Authors: Lai, Jing; Xiong, Junlin

Affiliations: Univ Sci & Technol China Dept Automat Hefei 230026 Peoples R China

This brief studies the optimal control policy learning problem for discrete-time linear systems. A data-driven model-free algorithm is proposed by using the data matrices of the augmented system state and the increasing of the discount factor. The control gains generated by the proposed algorithm are proven to converge to the optimal one. Compared with the existing work, our model-free algorithm avoids the dependence on initial stabilizing control policy and the use of Kronecker product. Some numerical examples are provided to illustrate the proposed algorithm and analysis results.

Keywords: Model-free stabilizing control data-driven reinforcement learning

Data-Driven Combined Longitudinal and Lateral Control for the Car Following Problem

IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2025, vol. 33, no. 3, pp. 991-1005

Authors: Cui, Leilei; Chakraborty, Sayan; Ozbay, Kaan; Jiang, Zhong-Ping

Affiliations: MIT Cambridge MA 02139 USA; NYU Tandon Sch Engn Dept Elect & Comp Engn Control & Networks Lab Brooklyn NY 11201 USA; NYU C2SMARTER Ctr Tandon Sch Engn Dept Civil & Urban Engn Brooklyn NY 11201 USA; NYU Dept Elect & Comp Engn Dept Civil & Urban Engn Control & Networks Lab Tandon Sch Engn Brooklyn NY 11201 USA

This article studies the problem of data-driven combined longitudinal and lateral control of autonomous vehicles (AVs) such that the AV can stay within a safe but minimum distance from its leading vehicle and, at the same time, in the lane. Most of the existing methods for combined longitudinal and lateral control are either model-based or developed by purely data-driven methods such as reinforcement learning. Traditional model-based control approaches are insufficient to address the adaptive optimal control design issue for AVs in dynamically changing environments and are subject to model uncertainty. Moreover, the conventional reinforcement learning approaches require a large volume of data, and cannot guarantee the stability of the vehicle. These limitations are addressed by integrating the advanced control theory with reinforcement learning techniques. To be more specific, by utilizing adaptive dynamic programming (ADP) techniques and using the motion data collected from the vehicles, a policy iteration algorithm is proposed such that the control policy is iteratively optimized in the absence of the precise knowledge of the AV's dynamical model. Furthermore, the stability of the AV is guaranteed with the control policy generated at each iteration of the algorithm. The efficiency of the proposed approach is validated by the integrated simulation of SUMO and CommonRoad.

Keywords: Adaptation models; Vehicle dynamics; Mathematical models; Transportation; Roads; Reinforcement learning; Nonlinear dynamical systems; Electronic mail; Dynamic programming; Accuracy; Adaptive dynamic programming (ADP); combined longitudinal and lateral control; connected vehicles

Heterogeneous AGVs Scheduling in Hospital Using ALNS-based Metaheuristic Algorithm

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Song, Xueming; Zhu, Ke; Zhao, Yuxing; Zhang, Jianming

Affiliations: Zhejiang Univ Coll Control Sci & Technol Hangzhou 310000 Peoples R China; Zhejiang Univ Robot Inst Yuyao 314500 Peoples R China

ISBN: 9798350321050 (print)

Automated Guided Vehicles (AGVs) provide a better solution to hospital logistics. In this paper, a mathematical model for point-to-point pickup and delivery tasks in a hospital with time windows and capacity constraints based on a heterogeneous AGV fleet is established, and a meta-heuristic algorithm based on ALNS is designed to solve the static scheduling problem of AGVs in the hospital environment. The effectiveness of the proposed algorithm is verified by numerical experiments and comparison with the basic algorithm. Finally, we summarize directions for further work.

Keywords: AGV; heterogeneous; ALNS; scheduling; PDP

Secure control for the discrete-time CPSs under DoS attacks via a switching strategy

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Zhang, Ruifeng; Li, Guitong; Yang, Rongni

Affiliations: Shandong Univ Sch Control Sci & Engn Jinan 250061 Peoples R China

ISBN: 9798350321050 (print)

In this work, the stability analysis and stabilization problem for one class of discrete-time cyber-physical systems (CPSs) under denial-of-service (DoS) attacks is investigated. Firstly, according to appropriate formulation of DoS attacks, different scenarios based on the presence or absence of DoS attacks are developed. Then the input-to-state stability (ISS) and globally asymptotical stability (GAS) of the considered CPSs can be guaranteed in terms of DoS frequency and duration restrictions, respectively. Finally, one example is given to verify the applicability of our theoretical result.

Keywords: Cyber-physical systems denial-of-service attacks frequency duration stability

Hamiltonian-driven Adaptive Dynamic Programming With Efficient Experience Replay

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, vol. 35, no. 3, pp. 3278-3290

Authors: Yang, Yongliang; Pan, Yongping; Xu, Cheng-Zhong; Wunsch, Donald C.

Affiliations: Univ Sci & Technol Beijing Sch Automat & Elect Engn Key Lab Knowledge Automat Ind Proc Minist Educ Beijing 100083 Peoples R China; Sun Yat Sen Univ Sch Adv Mfg Shenzhen 518107 Peoples R China; Univ Macau Dept Comp & Informat Sci State Key Lab Internet Things Smart City Macau Peoples R China; Missouri Univ Sci & Technol Dept Elect & Comp Engn Rolla MO 65409 USA

This article presents a novel efficient experience-replay-based adaptive dynamic programming (ADP) for the optimal control problem of a class of nonlinear dynamical systems within the Hamiltonian-driven framework. The quasi-Hamiltonian is presented for the policy evaluation problem with an admissible policy. With the quasi-Hamiltonian, a novel composite critic learning mechanism is developed to combine the instantaneous data with the historical data. In addition, the pseudo-Hamiltonian is defined to deal with the performance optimization problem. Based on the pseudo-Hamiltonian, the conventional Hamilton-Jacobi-Bellman (HJB) equation can be represented in a filtered form, which can be implemented online. Theoretical analysis is investigated in terms of the convergence of the adaptive critic design and the stability of the closed-loop systems, where parameter convergence can be achieved under a weakened excitation condition. Simulation studies are investigated to verify the efficacy of the presented design scheme.

Keywords: Mathematical models; Optimal control; Optimization; Convergence; Iterative algorithms; Dynamic programming; Learning systems; Hamilton-Jacobi-Bellman (HJB) equation; Hamiltonian-driven adaptive dynamic programming (ADP); pseudo-Hamiltonian; quasi-Hamiltonian; relaxed excitation condition
