Search Results - Inner Mongolia University Library

IEEE 7th International Conference on Industrial Cyber-Physical Systems (ICPS)

Authors: Lian, Bosen; Wu, Jiacheng

Affiliations: Auburn Univ, Elect & Comp Engn Dept, Auburn, AL 36849, USA; Zhejiang Univ, State Key Lab Ind Control Technol, Hangzhou 310007, Peoples R China

ISBN: (print) 9798350363029; 9798350363012

This paper studies the synchronization problem of two-player multiagent systems through reinforcement learning methods. A Nash-minmax strategy is formulated, where the interactions of two players in the same agent are non-zero-sum, while interactions of players between agents are zero-sum games. We propose an offline model-based reinforcement learning algorithm to identify Nash solutions for players within each agent, as well as the worst control solutions for players in neighboring antagonistic agents. On this basis, a data-driven off-policy algorithm is provided to alleviate the requirement for accurate system dynamics in the offline algorithm. In addition, the convergence of the proposed algorithms is analyzed. Finally, simulation results verify the effectiveness of the designed algorithms.

Keywords: multiagent systems reinforcement learning Nash Minmax

Reinforcement Learning Driving Strategy based on Auxiliary Task for Multi-Scenarios Autonomous Driving

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Sun, Jingbo; Fang, Xing; Zhang, Qichao

Affiliations: Peng Cheng Laboratory, Shenzhen, Peoples R China; Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence, Beijing, Peoples R China; Univ Chinese Acad Sci, Sch Artif Intelligence, Beijing, Peoples R China

ISBN: (print) 9798350321050

Reinforcement learning (RL) has made great progress in autonomous driving applications. However, using a single RL-based driving policy for multi-scenario autonomous driving remains challenging. Observations and reward measurements differ across scenarios, and autonomous driving also involves multi-source heterogeneous observations. To address these problems, we propose a reinforcement learning framework based on an auxiliary task. First, we design a reward function that enables vehicles to learn safe and efficient strategies. Further, an auxiliary task is designed to learn the characteristics of different scenarios so that the ego agent can adopt different strategies for different scenarios. Finally, to handle driving in multiple scenarios, we propose a representation network based on multi-layer perceptron (MLP), convolutional neural network (CNN), and Transformer networks to learn from multi-source heterogeneous observations. These observations consist of the ego vehicle state, the bird's eye view (BEV) state, and neighbour vehicle states. Experiments show that our method achieves a higher success rate than a popular reinforcement learning algorithm.

Keywords: Reinforcement learning; Autonomous Driving; Multiple Scenarios; Auxiliary Task

Robot Skill Learning and Generalization based on Human-robot Collaboration and Computer Vision

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Zhang, Xin; Zhang, Ruiqing; Huang, Deqing; Li, Yanan

Affiliations: School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China

ISBN: (print) 9798350361674

The paper presents a novel approach for human-robot skill transfer. First, we propose a method that combines dynamic time warping (DTW) with the Gaussian mixture model (GMM) to reconstruct the demonstrated skills, including reference path and pose, for each workpiece. Second, we integrate hand-eye coordination to facilitate dataset creation and employ YOLOv8 for model training. Finally, we use a neural network to obtain the current workpiece's category and pose information and then transform the previous skills into the current workpiece coordinate system. The effectiveness and robustness of the proposed method have been validated on a 7-DOF Sawyer robot equipped with a camera. © 2024 IEEE.

Keywords: Robot learning
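
The dynamic time warping (DTW) step named in the abstract above can be illustrated with the textbook dynamic-programming recurrence. This is a minimal sketch and not the authors' implementation: the function name `dtw_distance`, the absolute-difference cost, and the one-dimensional inputs are assumptions made purely for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (hypothetical helper).

    Uses the absolute difference as the local cost, which is an assumption.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Two demonstrations that trace the same path at different speeds, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align with zero cost under this recurrence, which is why DTW is a common choice for aligning demonstrations before fitting a model such as a GMM.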

How Well Do Reinforcement Learning Approaches Cope With Disruptions? The Case of Traffic Signal Control

IEEE Access, 2023, vol. 11, pp. 36504-36515

Authors: Korecki, Marcin; Dailisan, Damian; Helbing, Dirk

Affiliations: Swiss Fed Inst Technol, Computat Social Sci, CH-8092 Zurich, Switzerland; Complex Sci Hub Vienna, A-1080 Vienna, Austria

Data-driven and machine-learning-based methods are increasingly used in attempts to master the challenges of the world. But are they really the best approaches to manage complex dynamical systems? Our aim is to gain more insights into this question by studying various popular reinforcement learning methods for traffic signal control, namely in disrupted scenarios characterized by significant, unpredictable variations. The results are expected to be relevant in subject areas ranging from traffic physics to transportation theory, from dynamics in networks to complex systems, from control theory to self-organization, and from adaptive heuristics to machine learning.

Keywords: Road traffic; Reinforcement learning; Traffic control; Schedules; Optimization; Benchmark testing; Disruptive technologies; Machine learning; Intelligent vehicles; Traffic networks; reinforcement learning; self-organization; signal control; disruptions; benchmark

Hybrid Variable Structure DBN Mission Decision-Making Method for UAV Swarm

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Liu, Bowei; Sun, Jingliang; Long, Teng; Liu, Dawei; Cao, Yan

Affiliations: Beijing Inst Technol, Sch Aerosp Engn, Beijing 100081, Peoples R China; Minist Educ, Key Lab Dynam & Control Flight Vehicle, Beijing 100081, Peoples R China; Beijing Inst Technol, Chongqing Innovat Ctr, Chongqing 401135, Peoples R China; Res & Dev Acad Machinery Equipment, Beijing 100089, Peoples R China

ISBN: (print) 9798350321050

To cope with the dynamic mission decision-making problem for UAV swarms in complex environments, a hybrid variable structure-based dynamic Bayesian network (HVSDBN) inference decision-making method is proposed. First, the UAV swarm mission decision-making model is established to assess the UAV swarm state and threat state accurately. To further improve the accuracy of decision-making, the threat assessment model and swarm state assessment model are built by using mixed continuous and discrete variables, respectively. Furthermore, a dynamic HVSDBN decision-making algorithm based on hybrid performance-capability parameters is proposed, which can adjust the structure of the decision model according to prior information and observation data to improve the adaptability of the solution strategy. Simulation results demonstrate that the HVSDBN method can improve the variance of decision results by 25.03% compared with the traditional method, which effectively improves the accuracy of UAV swarm mission decision-making in complex dynamic environments.

Keywords: UAV Swarm; Dynamic Bayesian Network; Variable Structure; Mission Decision-Making

Contrastive representation learning for time series via compound consistency and hierarchical contrasting

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Zheng, Teng; Cao, Guanghao; Chen, Lei; Hao, Kuangrong

Affiliations: Donghua Univ, Minist Educ, Engn Res Ctr Digitized Text & Apparel Technol, Shanghai 201620, Peoples R China; Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China

ISBN: (print) 9798350321050

In this paper, a novel contrastive representation learning framework for time series data is proposed. The framework is designed to learn general representations of time series at various semantic levels and is capable of transferring across different datasets. The framework incorporates two key components. Firstly, a hierarchical contrasting method is used to consider both the temporal and instance dimensions of the time series and captures information at different levels through maximum pooling at corresponding timestamps, enabling the model to learn fine-grained and multi-scale time-stamped representations for time series prediction tasks. Secondly, a compound consistency constraint is leveraged, which combines transformation consistency and temporal-frequency consistency, to effectively learn a universal representation of the time series, thereby ensuring its transferability across different datasets. Additionally, the framework considers both the temporal and frequency information of the time series, and uses an adaptive wavelet transform to obtain the frequency domain representation while maintaining temporal alignment, facilitating the contrast of temporal-frequency consistency. Finally, the proposed framework is evaluated through extensive experiments on time series prediction tasks and compared with existing models on four public datasets. The results show that the linear regressor trained with the representations learned by the proposed model outperforms existing time series prediction models in terms of prediction accuracy and transferability.

Keywords: Contrastive learning; Time Series; Time-Frequency Consistency; Hierarchical Contrasting
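
The abstract above describes capturing information at different levels through max pooling along the time axis. The sketch below shows one plausible way such multi-scale views could be produced. It is an illustrative assumption rather than the paper's method: the function names, the kernel size of 2, and the use of one scalar feature per timestamp are all hypothetical.

```python
def max_pool_time(series, kernel=2):
    """Non-overlapping max pooling along the time axis (hypothetical helper)."""
    return [max(series[i:i + kernel]) for i in range(0, len(series), kernel)]


def multi_scale_views(series, kernel=2):
    """Return the series at successively coarser time scales.

    The first view is the input itself; each later view pools the previous
    one until a single value remains.
    """
    if kernel < 2:
        raise ValueError("kernel must be at least 2 so that pooling shrinks the series")
    views = [list(series)]
    while len(views[-1]) > 1:
        views.append(max_pool_time(views[-1], kernel))
    return views
```

In a hierarchical scheme of this kind, a contrastive objective could then be evaluated on each view in turn, so that both fine-grained and coarse temporal structure contribute to the learned representation.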

Improved Residual Reinforcement Learning for Dynamic Obstacle Avoidance of Robotic Arm

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Liu, Zhenting; Liu, Shan

Affiliations: College of Control Science and Engineering, Zhejiang University, State Key Laboratory of Industrial Control Technology, Hangzhou 310027, China

ISBN: (print) 9798350361674

This paper proposes an improved residual deep reinforcement learning method for robot arm dynamic obstacle avoidance and position servo. The proposed method first simplifies the state space by constructing key points and pre-trains a model capable of completing obstacle avoidance tasks using the simplified state. Then, when training with real state information, a guiding network is used to help accumulate good samples, which improves the training efficiency. To overcome the convergence difficulty of residual DQN in robot arm obstacle avoidance, this paper incorporates the action of the feedback controller into the action space and uses incremental reward values to evaluate the actions. Simulation results demonstrate that the proposed method can effectively achieve robot arm dynamic obstacle avoidance and position servo. © 2024 IEEE.

Keywords: Deep reinforcement learning

Distributed Adaptive Formation Control of Connected Vehicles with Actuator Saturation and Time-Varying Spacing

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Ji, Honghai; La, Xiaoyan; Fan, Lingling; Wang, Li; Liu, Shida; Ren, Ye

Affiliations: School of Electronics & Control Engineering, North China University of Technology, Beijing 100144, China; School of Automation, Beijing Information Science and Technology University, Beijing 100096, China

ISBN: (print) 9798350361674

This study investigates the problem of distributed adaptive formation control of connected vehicles with actuator saturation and time-varying spacing. First, optimization performance metrics are defined based on the requirements of vehicle formation control to ensure communication coordination among the following vehicles while enabling them to track the leader vehicle's trajectory. Second, time-varying spacing is introduced to meet the formation requirements in different scenarios. Furthermore, actuator saturation is imposed to ensure passenger comfort during the ride. Finally, the effectiveness of the proposed algorithm is validated through three numerical simulation cases. © 2024 IEEE.

Keywords: Adaptive control systems

Data-Driven Model-Free Adaptive Control for Motion Control Systems with Saturation Constraints

13th IEEE Data Driven Control and Learning Systems Conference, DDCLS 2024

Authors: Mi, Baohan; Huo, Xin; Li, Rongmei; Xu, Fan; Zhao, Hui

Affiliations: Control and Simulation Center, Harbin Institute of Technology, Harbin 150080, China

ISBN: (print) 9798350361674

In practical motion control systems, saturation constraints are commonly encountered and seriously restrict tracking performance. This paper devises an anti-windup scheme for motion control systems controlled by full-form dynamic linearization based model-free adaptive control (FFDL-MFAC). The proposed anti-windup scheme includes a saturation identifier and a modified control algorithm. With the proposed hysteresis-deadband characteristic modules, the identifier is robust to signal noise. In the modified control algorithm, the cumulative error term is cut off by the output of the identifier when the actuator saturates, which helps the system exit the saturation constraint once it is reached. In addition, the performance and the BIBO stability of the system controlled by the proposed scheme are substantiated. Finally, numerical simulations demonstrate the superiority of the proposed scheme, even in the presence of signal noise and disturbances. © 2024 IEEE.

Keywords: Adaptive control systems
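
The abstract above describes a saturation identifier with hysteresis-deadband behaviour whose output cuts off the accumulated error term. The sketch below illustrates that general idea with a generic hysteresis flag and a conditional accumulator. It is not the FFDL-MFAC scheme from the paper, whose identifier design is not given in the abstract: the class name, the `limit` and `band` parameters, and the simple error accumulator are all assumptions for illustration.

```python
class HysteresisSaturationFlag:
    """Generic hysteresis flag for actuator saturation (hypothetical design).

    The flag switches on only when the command magnitude exceeds
    limit + band, and off only when it falls below limit - band, so noise
    near the limit does not make the flag chatter.
    """

    def __init__(self, limit, band):
        self.limit = limit
        self.band = band
        self.saturated = False

    def update(self, u_cmd):
        magnitude = abs(u_cmd)
        if not self.saturated and magnitude >= self.limit + self.band:
            self.saturated = True
        elif self.saturated and magnitude <= self.limit - self.band:
            self.saturated = False
        return self.saturated


def accumulate_error(integral, error, saturated):
    """Freeze the accumulated error while the flag reports saturation."""
    return integral if saturated else integral + error
```

With `limit=1.0` and `band=0.1`, a command that hovers between 0.95 and 1.05 leaves the flag in whatever state it already had, which is the noise tolerance that a hysteresis band is meant to provide.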

Time series generator adversarial network with stochastic process for the degradation generation and prediction

IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS)

Authors: Shangguan, Anqi; Feng, Nan; Mu, Lingxia; Fei, Rong; Hei, Xinhong; Xie, Guo

Affiliations: Xian Univ Technol, Shaanxi Key Lab Complex Syst Control & Intelligen, Xian 710048, Peoples R China; Xian Univ Technol, Sch Comp Sci & Engn, Xian 710048, Peoples R China

ISBN: (print) 9798350321050

Because of the cumbersome collection process and high cost, the degradation data collected for a product usually consist of small samples, which affects the accuracy of reliability evaluation. It is therefore necessary to expand the degradation data to improve the accuracy of later reliability assessment. To this end, a degradation generation and prediction method is proposed that combines the time series generator adversarial network (TimeGAN) with a stochastic process. First, the input degradation data are expanded by a sliding window to improve later training accuracy. Then, the construction of the generator in TimeGAN is linked with the stochastic process to make the generated data more realistic. Finally, degradation prediction results are obtained with a Gated Recurrent Unit (GRU). Two datasets and different generation methods are adopted to evaluate the effectiveness of the proposed method. The results show that the proposed method yields the smallest Kullback-Leibler (KL) divergence and the smallest prediction error compared with the other methods. The proposed method is thus shown to be valid for degradation generation and prediction, and can be used for further reliability assessment of products in industrial systems.

Keywords: Time GAN; Stochastic process; Degradation generation; Small samples
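
Two building blocks named in the abstract above, sliding-window expansion of a short series and the KL divergence used as an evaluation metric, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the `stride` and `eps` parameters, and the use of discrete probability vectors as inputs are hypothetical.

```python
import math


def sliding_windows(series, size, stride=1):
    """Cut a series into overlapping windows of a fixed length."""
    return [
        series[i:i + size]
        for i in range(0, len(series) - size + 1, stride)
    ]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    eps guards against division by zero and log(0); it is an assumed
    smoothing constant, not a value from the paper.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )
```

A series of five points with a window of three yields three training samples instead of one, which is the sense in which windowing expands a small dataset. A KL divergence closer to zero indicates that the generated distribution is closer to the real one.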
