Search Results - Inner Mongolia University Library

A Transfer Learning Framework for Deep Multi-Agent Reinforcement Learning

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, No. 11, pp. 2346-2348

Authors: Yi Liu, Xiang Wu, Yuming Bo, Jiacun Wang, Lifeng Ma (School of Automation, Nanjing University of Science and Technology; Department of Computer Science and Software Engineering, Monmouth University)

Dear Editor, This letter presents a new transfer learning framework for deep multi-agent reinforcement learning (DMARL) to reduce the convergence difficulty and training time when applying DMARL to a new scenario [1...

Keywords: Deep agent Framework

Reinforcement learning-based unknown reference tracking control of HMASs with nonidentical communication delays

Science China (Information Sciences), 2023, Vol. 66, No. 7, pp. 46-57

Authors: Yong XU, Zheng-Guang WU, Wei-Wei CHE, Deyuan MENG (School of Automation, Beijing Institute of Technology; Institute of Cyber-Systems and Control, Zhejiang University; College of Mathematics and Computer Science, Zhejiang Normal University; Department of Automation, Qingdao University; School of Automation Science and Electrical Engineering, Beihang University (BUAA))

This paper focuses on the optimal output synchronization control problem of heterogeneous multiagent systems (HMASs) subject to nonidentical communication delays by a reinforcement learning *** with existing studies assuming that the precise model of the leader is globally or distributively accessible to all or some of the followers, the leader's precise dynamical model is entirely inaccessible to all the followers in this paper. A data-based learning algorithm is first proposed to reconstruct the leader's unknown system matrix online. A distributed predictor subject to communication delays is further devised to estimate the leader's state, where interaction delays are allowed to be nonidentical. Then, a learning-based local controller, together with a discounted performance function, is projected to reach the optimal output synchronization. Bellman equations and game algebraic Riccati equations are constructed to learn the optimal solution by developing a model-based reinforcement learning (RL) algorithm online without solving regulator equations, which is followed by a model-free off-policy RL algorithm to relax the requirement of all agents' dynamics faced by the model-based RL algorithm. The optimal tracking control of HMASs subject to unknown leader dynamics and communication delays is shown to be solvable under the proposed RL algorithms. Finally, the effectiveness of theoretical analysis is verified by numerical simulations.

Keywords: heterogeneous multiagent systems (HMAS); reinforcement learning (RL); optimal output synchronization; communication delays

Dynamic Modeling of Robotic Manipulator via an Augmented Deep Lagrangian Network

Tsinghua Science and Technology, 2024, Vol. 29, No. 5, pp. 1604-1614

Authors: Shuangshuang Wu, Zhiming Li, Wenbai Chen, Fuchun Sun (School of Automation, Beijing Information Science and Technology University, Beijing 100192, China; Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)

Learning the accurate dynamics of robotic systems directly from the trajectory data is currently a prominent research *** physics-enforced networks, exemplified by Hamiltonian neural networks and Lagrangian neural networks, demonstrate proficiency in modeling ideal physical systems, but face limitations when applied to systems with uncertain non-conservative dynamics due to the inherent constraints of the conservation laws *** this paper, we present a novel augmented deep Lagrangian network, which seamlessly integrates a deep Lagrangian network with a standard deep *** fusion aims to effectively model uncertainties that surpass the limitations of conventional Lagrangian *** proposed network is applied to learn inverse dynamics model of two multi-degree manipulators including a 6-dof UR-5 robot and a 7-dof SARCOS manipulator under *** experimental results clearly demonstrate that our approach exhibits superior modeling precision and enhanced physical credibility.

Keywords: deep Lagrangian network; nonconservative dynamics; multi-degree manipulator; inverse dynamic modeling

Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, Vol. 47, No. 7, pp. 5884-5900

Authors: Tang, Yingzhi; Zhang, Qijian; Liu, Yebin; Hou, Junhui (City University of Hong Kong, Department of Computer Science, Hong Kong; Tsinghua University, Department of Automation, Beijing, China)

The latest trends in the research field of single-view human reconstruction are devoted to learning deep implicit functions constrained by explicit body shape priors. Despite the remarkable performance improvements compared with traditional processing pipelines, existing learning approaches still exhibit limitations in terms of flexibility, generalizability, robustness, and/or representation capability. To comprehensively address the above issues, in this paper, we investigate an explicit point-based human reconstruction framework named HaP, which utilizes point clouds as the intermediate representation of the target geometric structure. Technically, our approach features fully explicit point cloud estimation (exploiting depth and SMPL), manipulation (SMPL rectification), generation (built upon diffusion), and refinement (displacement learning and depth replacement) in the 3D geometric space, instead of an implicit learning process that can be ambiguous and less controllable. Extensive experiments demonstrate that our framework achieves quantitative performance improvements of 20% to 40% over current state-of-the-art methods, and better qualitative results. Our promising results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design. In addition, we newly contribute a real-scanned 3D human dataset featuring more intricate geometric details. We will make our code and data publicly available at https://***/yztang4/HaP. © 1979-2012 IEEE.

Keywords: 3D reconstruction

Comparing Player Rating Awarding Systems: Evaluating Performance Metrics in Soccer: An in-Depth Analysis of Rating Methodologies and Their Impact on Player Assessment and Team Performance

11th International Conference on Electrical and Electronics Engineering, ICEEE 2024

Authors: Toderici, Dan; Lupsa, Sergiu-Rares; Lemnaru, Camelia (Technical University of Cluj-Napoca, Faculty of Automation and Computer Science, Computer Science Department, Cluj-Napoca, Romania)

ISBN: (print) 9798350362541

The primary objective of this paper is to develop a method for accurately and consistently rating football players. A secondary objective is to evaluate the suitability of artificial neural networks for this purpose, particularly with the available data. To address these objectives, data was collected from various sources, including APIs and the WhoScored website via web scraping. Multiple solutions were explored for implementing the player rating system, encompassing both AI-based and mathematical approaches. In the pursuit of AI-based models, ratings provided by WhoScored were utilized as a target due to their recognized precision and consistency. Following the implementation of various models, comprehensive testing and evaluation using diverse metrics and methods were conducted to identify the most effective approach, or determine if certain models are better suited for specific scenarios. Overall, this research contributes to the integration of data from multiple sources to enhance player rating methodologies in football analysis. © 2024 IEEE.

Keywords: Football

Overhead-free Noise-tolerant Federated Learning: A New Baseline

Machine Intelligence Research, 2024, Vol. 21, No. 3, pp. 526-537

Authors: Shiyi Lin, Deming Zhai, Feilong Zhang, Junjun Jiang, Xianming Liu, Xiangyang Ji (Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150000, China; Department of Automation, Tsinghua University, Beijing 100084, China)

Federated learning (FL) is a promising decentralized machine learning approach that enables multiple distributed clients to train a model jointly while keeping their data private. However, in real-world scenarios, the supervised training data stored in local clients inevitably suffer from imperfect annotations, resulting in subjective, inconsistent and biased labels. These noisy labels can harm the collaborative aggregation process of FL by inducing inconsistent decision boundaries. Unfortunately, few attempts have been made towards noise-tolerant federated learning, with most of them relying on the strategy of transmitting overhead messages to assist noisy labels detection and correction, which increases the communication burden as well as privacy risks. In this paper, we propose a simple yet effective method for noise-tolerant FL based on the well-established co-training framework. Our method leverages the inherent discrepancy in the learning ability of the local and global models in FL, which can be regarded as two complementary views. By iteratively exchanging samples with their high confident predictions, the two models “teach each other” to suppress the influence of noisy labels. The proposed scheme enjoys the benefit of overhead cost-free and can serve as a robust and efficient baseline for noise-tolerant federated learning. Experimental results demonstrate that our method outperforms existing approaches, highlighting the superiority of our method.

Keywords: Federated learning; noise-label learning; privacy-preserving machine learning; edge intelligence; distributed machine learning

Reinforcement Learning Algorithms with Graph Convolution Networks for Traffic Signal Control

8th International Conference on Intelligent Transport Systems, INTSYS 2024

Authors: Salmalge, Shreya; Bhatnagar, Shalabh (Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, India)

ISBN: (print) 9783031863691

Traffic congestion is the root cause of various social and economic problems like longer travel times, increased pollution, and fuel or energy consumption. Addressing the issue is becoming increasingly crucial with rising city traffic and limited road infrastructure. The way we change traffic signals has a significant impact on congestion in road networks. We implement reinforcement learning algorithms for controlling traffic signals adaptive to congestion in incoming roads at junctions. Road networks can be viewed as graphs with intersections as nodes and roads as edges. This motivates us to use graph convolutional networks (GCN) as function approximators in various RL algorithms applied to traffic signal control. We implement Deep Q-learning (DQN), Graph Convolutional Q-learning (GCQN), Graph Convolutional Actor-Critic (GCAC), and individual-DQN models to learn a deterministic policy for adaptive traffic signal control. We also present a comparison of the performances of these models and infer that GCQN models are better suited to work for large road networks. To the best of our knowledge, the Graph Convolutional Actor-Critic model is not used in any existing traffic signal control method. We also compare the GCQN and GCAC models against existing and state-of-the-art approaches. Experimental evaluation shows that our proposed method achieves performance levels comparable to the state-of-the-art techniques. © ICST Institute for computer sciences, Social Informatics and Telecommunications Engineering 2025.

Keywords: Travel time

Mixed Reality Application in Robot Navigation and Control

16th International Conference on Applied and Theoretical Electricity, ICATE 2024

Authors: Cismaru, Stefan-Irinel; Petcu, Florina-Luminita; Resceanu, Ionut-Cristian; Roibu, Horatiu; Trasculescu, Andrei-Costin; Pana, Cristina-Floriana; Bizdoaca, Nicu-George (University of Craiova, Faculty of Automation, Computers and Electronics, Mechatronics and Robotics Department, Craiova, Romania; University of Craiova, Faculty of Automation, Computers and Electronics, Computer Science Department, Craiova, Romania)

ISBN: (print) 9798350388107

Among the emerging technologies, Mixed Reality (MR) has provided the means to interact with holograms. A very distant future is now near and accessible, which allows for the replacement of classic controllers for robots and is driven by MR. Holograms involve dynamic 3D elements that are integrated into the environment, enhancing control means for the user. These elements may be visually attended to and interacted with via eye concentration or hand gestures, and they have the capability to execute a variety of activities. This emerging technology enables interactions that are otherwise unattainable in alternative settings or contexts, such as the presence of an avatar inside the program. Furthermore, it has the potential to be expanded to govern other mechatronic devices, including robots. The purpose of this work is to offer an improved and interactive method of controlling a mechatronic platform by using holograms, such that the adaptability of this technology may be shown in the context of operating real systems. Important for the authors of this paper is the broadening of the spectrum in which the application lies, thus contributing to the process followed for the development and applicability of the solution in key areas such as industry, training, and medicine. © 2024 IEEE.

Keywords: Holograms

Constrained Multi-Objective Optimization With Deep Reinforcement Learning Assisted Operator Selection

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, No. 4, pp. 919-931

Authors: Fei Ming, Wenyin Gong, Ling Wang, Yaochu Jin (School of Computer Science, China University of Geosciences, Wuhan 430074, China; Department of Automation, Tsinghua University, Beijing 100084, China; Faculty of Technology, Bielefeld University, North Rhine-Westphalia, 33619 Bielefeld, Germany)

Solving constrained multi-objective optimization problems with evolutionary algorithms has attracted considerable *** constrained multi-objective optimization evolutionary algorithms (CMOEAs) have been developed with the use of different algorithmic strategies, evolutionary operators, and constraint-handling *** performance of CMOEAs may be heavily dependent on the operators used; however, it is usually difficult to select suitable operators for the problem at ***, improving operator selection is promising and necessary for *** work proposes an online operator selection framework assisted by Deep Reinforcement *** dynamics of the population, including convergence, diversity, and feasibility, are regarded as the state; the candidate operators are considered as actions; and the improvement of the population state is treated as the *** using a Q-network to learn a policy to estimate the Q-values of all actions, the proposed approach can adaptively select an operator that maximizes the improvement of the population according to the current state and thereby improve the algorithmic *** framework is embedded into four popular CMOEAs and assessed on 42 benchmark *** experimental results reveal that the proposed Deep Reinforcement Learning-assisted operator selection significantly improves the performance of these CMOEAs and the resulting algorithm obtains better versatility compared to nine state-of-the-art CMOEAs.
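
The state/action/reward formulation in this abstract can be illustrated with a small sketch. This is not the authors' method: the paper uses a Q-network, whereas the code below swaps in tabular Q-learning with epsilon-greedy selection to keep the example self-contained. The class name `OperatorSelector`, the operator labels, the state labels, and all hyperparameter values are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List


class OperatorSelector:
    """Tabular Q-learning stand-in for a Q-network (illustrative assumption)."""

    def __init__(self, operators: List[str], alpha: float = 0.1,
                 gamma: float = 0.9, epsilon: float = 0.1, seed: int = 0) -> None:
        self.operators = operators
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.rng = random.Random(seed)
        # Q-table: discretized population state -> {operator: Q-value}
        self.q: Dict[Hashable, Dict[str, float]] = defaultdict(
            lambda: {op: 0.0 for op in operators})

    def select(self, state: Hashable) -> str:
        """Pick an operator: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.operators)
        values = self.q[state]
        return max(self.operators, key=values.__getitem__)

    def update(self, state: Hashable, op: str, reward: float,
               next_state: Hashable) -> None:
        """One-step Q-learning update; reward = improvement of the population."""
        target = reward + self.gamma * max(self.q[next_state].values())
        self.q[state][op] += self.alpha * (target - self.q[state][op])


if __name__ == "__main__":
    selector = OperatorSelector(["GA", "DE"], epsilon=0.0)
    # Hypothetical discretized states summarizing convergence/diversity/feasibility.
    selector.update("early", "DE", reward=1.0, next_state="mid")
    print(selector.select("early"))
```

In an evolutionary loop, `select` would be called once per generation with the current population state, and `update` afterwards with the measured improvement.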

Keywords: Constrained multi-objective optimization; deep Q-learning; deep reinforcement learning (DRL); evolutionary algorithms; evolutionary operator selection

Random-Order Online Independent Set of Intervals and Hyperrectangles

32nd Annual European Symposium on Algorithms, ESA 2024

Authors: Garg, Mohit; Kar, Debajyoti; Khan, Arindam (Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, India)

ISBN: (print) 9783959773386

In the Maximum Independent Set of Hyperrectangles problem, we are given a set of $n$ (possibly overlapping) $d$-dimensional axis-aligned hyperrectangles, and the goal is to find a subset of non-overlapping hyperrectangles of maximum cardinality. For $d = 1$, this corresponds to the classical Interval Scheduling problem, where a simple greedy algorithm returns an optimal solution. In the offline setting, for $d$-dimensional hyperrectangles, polynomial time $(\log n)^{O(d)}$-approximation algorithms are known [16]. However, the problem becomes notably challenging in the online setting, where the input objects (hyperrectangles) appear one by one in an adversarial order, and on the arrival of an object, the algorithm needs to make an immediate and irrevocable decision whether or not to select the object while maintaining the feasibility. Even for interval scheduling, an $\Omega(n)$ lower bound is known on the competitive ratio. To circumvent these negative results, in this work, we study the online maximum independent set of axis-aligned hyperrectangles in the random-order arrival model, where the adversary specifies the set of input objects which then arrive in a uniformly random order. Starting from the prototypical secretary problem, the random-order model has received significant attention to study algorithms beyond the worst-case competitive analysis (see the survey by Gupta and Singla [40]). Surprisingly, we show that the problem in the random-order model almost matches the best-known offline approximation guarantees, up to polylogarithmic factors. In particular, we give a simple $(\log n)^{O(d)}$-competitive algorithm for $d$-dimensional hyperrectangles in this model, which runs in $\tilde{O}_d(n)$ time. Our approach also yields $(\log n)^{O(d)}$-competitive algorithms in the random-order model for more general objects such as $d$-dimensional fat objects and ellipsoids. Furthermore, all our competitiveness guarantees hold with high probability, and not just in expectation. © Mohit Garg, Debajyoti Kar, and Arind
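
The greedy algorithm mentioned for the one-dimensional offline case is the classical earliest-finish-time rule, which can be sketched as follows. This covers only the offline Interval Scheduling setting, not the paper's random-order online algorithm. The function name `max_independent_intervals`, the half-open interval convention, and the demo data are assumptions made for illustration.

```python
from typing import List, Tuple

# An interval is (start, end), treated as half-open [start, end),
# so intervals that merely touch at an endpoint do not overlap.
Interval = Tuple[float, float]


def max_independent_intervals(intervals: List[Interval]) -> List[Interval]:
    """Earliest-finish-time greedy for offline interval scheduling (d = 1)."""
    chosen: List[Interval] = []
    last_end = float("-inf")
    # Scan intervals in order of increasing right endpoint.
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:  # compatible with everything chosen so far
            chosen.append((start, end))
            last_end = end
    return chosen


if __name__ == "__main__":
    demo = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
            (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
    print(max_independent_intervals(demo))
    # prints [(1, 4), (5, 7), (8, 11), (12, 16)]
```

Sorting dominates the running time, so the sketch runs in O(n log n). It needs the whole input up front, which the online setting described in the abstract does not allow.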

Keywords: Approximation algorithms
