Search Results - Inner Mongolia University Library

Mixed Motivation Driven Social Multi-Agent Reinforcement Learning for Autonomous Driving

IEEE/CAA Journal of Automatica Sinica, 2025, Vol. 12, No. 6, pp. 1272-1282

Authors: Long Chen, Peng Deng, Lingxi Li, Xuemin Hu

Affiliations: State Key Laboratory of Multimodal Artificial Intelligence Systems and the State Key Laboratory of Management and Control for Complex Systems Chinese Academy of Sciences Beijing WAYTOUS Inc. Beijing Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) Shenzhen Institute of Artificial Intelligence and Robotics Xi'an Jiaotong University Xi'an China School of Computer Science and Information Engineering Hubei University Wuhan China Purdue School of Engineering and Technology Indiana University-Purdue University Indianapolis Indianapolis IN USA School of Artificial Intelligence Hubei University Wuhan Key Laboratory of Intelligent Sensing System and Security (Hubei University) Ministry of Education Wuhan China

Although great achievements have been made in autonomous driving technologies, autonomous vehicles (AVs) still exhibit limitations in intelligence and lack social coordination, which is primarily attributed to their reliance on single-agent technologies that neglect inter-AV interactions. Current research on multi-agent autonomous driving (MAAD) predominantly focuses on either distributed individual learning or centralized cooperative learning, ignoring the mixed-motive nature of MAAD systems, where each agent is not only self-interested in reaching its own destination but also needs to coordinate with other traffic participants to enhance efficiency and safety. Inspired by the mixed motivation of human driving behavior and the human learning process, we propose a novel mixed motivation driven social multi-agent reinforcement learning method for autonomous driving. In our method, a multi-agent reinforcement learning (MARL) algorithm called Social Learning Policy Optimization (SoLPO), which takes advantage of both the individual and social learning paradigms, is proposed to empower agents to rapidly acquire self-interested policies and effectively learn socially coordinated behavior. Based on the proposed SoLPO, we further develop a mixed-motive MARL method for autonomous driving combined with a social reward integration module that can model the mixed-motive nature of MAAD systems by integrating individual and neighbor rewards into a social learning objective for improved learning speed and effectiveness. Experiments conducted on the MetaDrive simulator show that our proposed method outperforms existing state-of-the-art MARL approaches in metrics including the success rate, safety, and efficiency. Moreover, the AVs trained by our method form coordinated social norms and exhibit human-like driving behavior, demonstrating a high degree of social coordination.

Keywords: Measurement; Reinforcement learning; Safety; Driver behavior; Autonomous vehicles; Optimization

Green Routing Game: Strategic Logistical Planning using Mixed Fleets of ICEVs and EVs

arXiv, 2022

Authors: Sasahara, Hampei; Dán, György; Amin, Saurabh; Sandberg, Henrik

Affiliations: The Department of Systems and Control Engineering School of Engineering Tokyo Institute of Technology Tokyo 152-8552 Japan The Division of Network and Systems Engineering School of Electrical Engineering and Computer Science KTH Royal Institute of Technology Stockholm SE-100 44 Sweden The Laboratory for Information and Decision Systems Massachusetts Institute of Technology Cambridge MA 02139 United States The Division of Decision and Control Systems School of Electrical Engineering and Computer Science KTH Royal Institute of Technology Stockholm SE-100 44 Sweden

This paper introduces a "green" routing game between multiple logistic operators (players), each owning a mixed fleet of internal combustion engine vehicle (ICEV) and electric vehicle (EV) trucks. Each player faces the cost of delayed delivery (due to charging requirements of EVs) and a pollution cost levied on the ICEVs. This cost structure models: 1) the limited battery capacity of EVs and their charging requirement; 2) the shared nature of charging facilities; 3) the pollution cost levied by a regulatory agency on the use of ICEVs. We characterize the Nash equilibria of this game and derive a condition for uniqueness of the equilibrium. We also use the gradient projection method to compute this equilibrium in a distributed manner. Our equilibrium analysis is useful for analyzing the trade-off that players face between a higher delay due to congestion at charging locations when the share of EVs increases and a higher pollution cost when the share of ICEVs increases. A numerical example suggests that increasing the marginal pollution cost can dramatically reduce the inefficiency of equilibria. © 2022, CC BY-NC-ND.

Keywords: Economic and social effects

Generalized Multi-kernel Maximum Correntropy Kalman Filter for Disturbance Estimation

arXiv, 2023

Authors: Li, Shilei; Shi, Dawei; Lou, Yunjiang; Zou, Wulin; Shi, Ling

Affiliations: The Department of Electronic and Computer Engineering The Hong Kong University of Science and Technology Hong Kong The School of Automation Beijing Institute of Technology China The State Key Laboratory of Robotics and System School of Mechanical Engineering and Automation Harbin Institute of Technology Shenzhen Shenzhen 518055 China Xeno Dynamics Control Department Xeno Dynamics Co. Ltd Shenzhen 518055 China

Disturbance observers have attracted continuing research effort and are widely used in many applications. Among them, the Kalman filter-based disturbance observer is attractive because it estimates the state and the disturbance simultaneously and is optimal for a linear system with Gaussian noise. Unfortunately, the noise in the disturbance channel typically exhibits a heavy-tailed distribution because the nominal disturbance dynamics usually do not align with the practical ones. To handle this issue, we propose a generalized multi-kernel maximum correntropy Kalman filter for disturbance estimation, which is less conservative because it adopts different kernel bandwidths for different channels, and which exhibits excellent performance both with and without external disturbance. The convergence of the fixed-point iteration and the complexity of the proposed algorithm are given. Simulations on a robotic manipulator reveal that the proposed algorithm is very efficient in disturbance estimation with moderate algorithm complexity. Copyright © 2023, The Authors. All rights reserved.

Keywords: Kalman filters

Stochastic Stability of Discrete-time Phase-coupled Oscillators over Uncertain and Random Networks

arXiv, 2021

Authors: Jafarian, Matin; Mamduhi, Mohammad H.; Johansson, Karl H.

Affiliations: The Delft Center for Systems and Control Delft University of Technology Netherlands The Automatic Control Laboratory ETH Zürich Switzerland The Division of Decision and Control Systems School of Electrical Engineering and Computer Science KTH Royal Institute of Technology Stockholm Sweden Digital Futures Sweden

This article studies stochastic relative phase stability, i.e., stochastic phase-cohesiveness, of discrete-time phase-coupled oscillators. Stochastic phase-cohesiveness in two types of networks is studied. First, we consider oscillators coupled with 2π-periodic odd functions over underlying undirected graphs subject to both multiplicative and additive stochastic uncertainties. We prove stochastic phase-cohesiveness of the network with respect to two specific sets, namely the in-phase and anti-phase sets, by deriving sufficient coupling conditions. We show the dependency of these conditions on the size of the mean values of additive and multiplicative uncertainties, as well as on the sign of the mean values of multiplicative uncertainties. Furthermore, we discuss the results under a relaxation of the odd property of the coupling function. Second, we study an uncertain network in which the multiplicative uncertainties are governed by a Bernoulli process, representing the well-known Erdős-Rényi network. We assume constant exogenous frequencies and derive sufficient conditions for achieving both stochastic phase-cohesive and phase-locked solutions, i.e., stochastic phase-cohesiveness with respect to the origin. For the latter case, where identical exogenous frequencies are assumed, we prove that any positive probability of connectivity leads to phase-locking. Thorough analyses are provided, and insights obtained from stochastic analysis are discussed, along with numerical simulations to validate the analytical results. © 2021, CC BY.

Keywords: Stochastic systems
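
Illustrative note (not taken from the paper): the abstract above concerns discrete-time oscillators coupled through a 2π-periodic odd function over an undirected graph. The sketch below simulates only the noise-free special case, assuming the sine function as the coupling and a simple additive update rule. The function names, the step size `tau`, the coupling gain `coupling`, and the ring graph are all assumptions chosen for illustration. The paper's stochastic uncertainties are omitted.

```python
import math


def step(theta, omega, adj, coupling, tau):
    """One discrete-time update of all phases (assumed update rule).

    theta_i <- theta_i + tau * (omega_i + coupling * sum_j a_ij * sin(theta_j - theta_i))
    """
    n = len(theta)
    new_theta = []
    for i in range(n):
        # sin is a 2*pi-periodic odd function, used here as the coupling.
        pull = sum(adj[i][j] * math.sin(theta[j] - theta[i]) for j in range(n))
        new_theta.append(theta[i] + tau * (omega[i] + coupling * pull))
    return new_theta


def simulate(theta0, omega, adj, coupling, tau, n_steps):
    """Iterate the update n_steps times and return the final phases."""
    theta = list(theta0)
    for _ in range(n_steps):
        theta = step(theta, omega, adj, coupling, tau)
    return theta


def phase_spread(theta):
    """Largest pairwise phase difference (no wrapping; meaningful while
    all phases stay within half a circle of each other)."""
    return max(theta) - min(theta)


# Hypothetical example: four oscillators on an undirected ring graph
# with identical exogenous frequencies.
ring = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
initial = [0.0, 0.5, 1.0, 1.5]
final = simulate(initial, [1.0] * 4, ring, coupling=1.0, tau=0.1, n_steps=200)
print(phase_spread(initial), phase_spread(final))
```

In this deterministic setting the spread between phases shrinks toward zero, which corresponds to the in-phase behavior the abstract refers to. How noise and random links change this picture is the subject of the paper itself.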

Timeliness-Aware Multiple Rumor Sources Estimation in Dynamic Online Social Networks

SSRN, 2024

Authors: Huang, Da-Wen; Wu, Wenjie; Bi, Jichao; Li, Junli; Gan, Chenquan; Zhou, Wei

Affiliations: College of Computer Science Sichuan Normal University Chengdu China State Key Laboratory of Industrial Control Technology Zhejiang University Hangzhou China School of Cyber Security and Information Law Chongqing University of Posts and Telecommunications Chongqing China School of Big Data and Software Engineering Chongqing University Chongqing China

Identifying rumor sources in online social networks (OSNs) plays a crucial role in controlling the spread of rumors and mitigating the damage caused by them. However, most studies are not suitable for identifying rumor sources in dynamic OSNs with community structures. Moreover, these studies tend to neglect the impact of timeliness on the spread of rumors, potentially leading to an overestimation of rumors' propagation ability. To overcome these limitations, this paper proposes a multiple rumor sources estimation framework for OSNs. First, a community-based dynamic network model is introduced to depict the temporal nature and community feature of OSNs. Second, a meticulously designed microcosmic SIR model, incorporating the timeliness of rumor topics, is developed to unravel the complex dynamics of rumor propagation. Then, a computationally efficient multiple rumor sources estimation algorithm is proposed. This algorithm utilizes the infection information collected by sensors and applies the maximum likelihood estimation (MLE) method to identify the rumor sources. Finally, experimental results on real-world and synthetic temporal networks demonstrate the effectiveness of the proposed rumor source estimation algorithm. © 2024, The Authors. All rights reserved.

Keywords: Social networking (online)

A Multi-Stage Goal-Driven Network for Pedestrian Trajectory Prediction

arXiv, 2024

Authors: Wu, Xiuen; Wang, Tao; Cai, Yuanzheng; Liang, Lingyu; Papageorgiou, George

Affiliations: Fujian Provincial Key Laboratory of Information Processing and Intelligent Control Minjiang University Fuzhou China College of Computer and Data Science Fuzhou University Fuzhou China School of Electronic and Information Engineering South China University of Technology Guangzhou China SYSTEMA Research Center European University Cyprus Nicosia Cyprus

Pedestrian trajectory prediction plays a pivotal role in ensuring the safety and efficiency of various applications, including autonomous vehicles and traffic management systems. This paper proposes a novel method for pedestrian trajectory prediction, called the multi-stage goal-driven network (MGNet). Diverging from prior approaches that rely on stepwise recursive prediction and the forecasting of a single long-term goal, MGNet directs trajectory generation by forecasting intermediate stage goals, thereby reducing prediction errors. The network comprises three main components: a conditional variational autoencoder (CVAE), an attention module, and a multi-stage goal evaluator. Trajectories are encoded using the CVAE to learn the approximate distribution of pedestrians' future trajectories, and combined with an attention mechanism to capture the temporal dependency between trajectory sequences. The pivotal module is the multi-stage goal evaluator, which uses the encoded feature vectors to predict intermediate goals, effectively minimizing cumulative errors in the recursive inference process. The effectiveness of MGNet is demonstrated through comprehensive experiments on the JAAD and PIE datasets. Comparative evaluations against state-of-the-art algorithms reveal significant performance improvements achieved by our proposed method. Copyright © 2024, The Authors. All rights reserved.

Keywords: Trajectories

Enhancing Deep Reinforcement Learning: A Tutorial on Generative Diffusion Models in Network Optimization

arXiv, 2023

Authors: Du, Hongyang; Zhang, Ruichen; Liu, Yinqiu; Wang, Jiacheng; Lin, Yijing; Li, Zonghang; Niyato, Dusit; Kang, Jiawen; Xiong, Zehui; Cui, Shuguang; Ai, Bo; Zhou, Haibo; Kim, Dong In

Affiliations: The School of Computer Science and Engineering The Energy Research Institute @ NTU Interdisciplinary Graduate Program Nanyang Technological University Singapore The School of Computer Science and Engineering Nanyang Technological University Singapore The State Key Laboratory of Networking and Switching Technology Beijing University of Posts and Telecommunications China The School of Information and Communication Engineering University of Electronic Science and Technology of China Chengdu China The School of Automation Guangdong University of Technology China The Pillar of Information Systems Technology and Design Singapore University of Technology and Design Singapore Shenzhen China The State Key Laboratory of Rail Traffic Control and Safety Beijing Jiaotong University Beijing 100044 China School of Electronic Science and Engineering Nanjing University Jiangsu Nanjing 210093 China The Department of Electrical and Computer Engineering Sungkyunkwan University Suwon 16419 Republic of Korea

Generative Diffusion Models (GDMs) have emerged as a transformative force in the realm of Generative Artificial Intelligence (GenAI), demonstrating their versatility and efficacy across various applications. The ability to model complex data distributions and generate high-quality samples has made GDMs particularly effective in tasks such as image generation and reinforcement learning. Furthermore, their iterative nature, which involves a series of noise addition and denoising steps, is a powerful and unique approach to learning and generating data. This paper serves as a comprehensive tutorial on applying GDMs in network optimization tasks. We delve into the strengths of GDMs, emphasizing their wide applicability across various domains, such as vision, text, and audio generation. We detail how GDMs can be effectively harnessed to solve complex optimization problems inherent in networks. The paper first provides a basic background of GDMs and their applications in network optimization. This is followed by a series of case studies, showcasing the integration of GDMs with Deep Reinforcement Learning (DRL), incentive mechanism design, Semantic Communications (SemCom), Internet of Vehicles (IoV) networks, etc. These case studies underscore the practicality and efficacy of GDMs in real-world scenarios, offering insights into network design. We conclude with a discussion on potential future directions for GDM research and applications, providing major insights into how they can continue to shape the future of network optimization. © 2023, CC BY.

Keywords: Reinforcement learning

Channel and space-based joint rate allocation algorithm

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Dayong Wang, Chao Yuan, Yu Sun, Xin Lu, Hui Guo, Frederic Dufaux, Ce Zhu

Affiliations: Key Laboratory of Big Data Intelligent Computing Chongqing University of Posts and Telecommunications Guangxi Key Laboratory of Machine Vision and Intelligent Control Wuzhou University Chongqing Key Laboratory of Image Cognition Chongqing University of Posts and Telecommunications China Department of Computer Science University of Central Arkansas Faculty of Computing Engineering and Media (CEM) De Montfort University UK Université Paris-Saclay CNRS CentraleSupélec Laboratoire Des Signaux et Systèmes France School of Information and Communication Engineering University of Electronic Science and Technology of China

ISBN (digital): 9798350368741

ISBN (print): 9798350368758

Rate control is a critical component of image and video compression. Particularly under limited network bandwidth, bitrate control is essential to ensure efficient image transmission by effectively allocating channel resources. In this research, since both the channel and spatial dimensions are related to rate allocation, we first propose a joint Channel-wise and Spatial-wise Quantization scheme to determine optimal quantization parameters. Subsequently, we develop a quantization step estimation network that obtains the parameters needed to allocate rate efficiently according to the target rate. Experiments demonstrate that our algorithm significantly improves compressed image quality with minimal bitrate distortion and achieves accurate rate control with an average bitrate error of nearly 3%.

Keywords: Image quality; Quantization (signal); Image coding; Bit rate; Signal processing algorithms; Estimation; Rate-distortion; Video compression; Resource management; Speech processing

Discrete-Time ZND Algorithms for Time-Dependent LQ Decomposition Applied to Sound Source Localization

11th International Conference on Intelligent Control and Information Processing, ICICIP 2021

Authors: Guo, Jinjin; Zhang, Yunong

Affiliations: Sun Yat-sen University School of Computer Science and Engineering Guangzhou 510006 China Research Institute of Sun Yat-sen University in Shenzhen Shenzhen 518057 China Guangdong Key Laboratory of Modern Control Technology Guangzhou 510070 China

ISBN (print): 9781665425155

To solve the discrete-time LQ decomposition (DTLQD) problem, a 5-step Adams-Bashforth-type (5SAB-type) discrete-time zeroing neural dynamics (DTZND) algorithm is proposed by combining the 5-step Adams-Bashforth (AB) method with the continuous-time zeroing neural dynamics (CTZND) model. For comparison, general 4-step and 3-step Zhang et al. discretization (ZeaD) formulas are also presented and used to discretize the CTZND model. The corresponding 4-step ZeaD-type (4SZeaD-type) and 3-step ZeaD-type (3SZeaD-type) DTZND algorithms are thus developed. Theoretical analyses and results show that the proposed 5SAB-type DTZND algorithm has higher computational precision than the 4SZeaD-type and 3SZeaD-type DTZND algorithms. Two numerical examples further validate the effectiveness of the three DTZND algorithms and the superiority of the proposed 5SAB-type DTZND algorithm. Moreover, the proposed DTZND algorithms are applied to sound source localization based on the time difference of arrival (TDOA) technique. © 2021 IEEE.

Keywords: Time difference of arrival
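
Illustrative note (not taken from the paper): the abstract above builds its algorithm on the 5-step Adams-Bashforth (AB) method. The sketch below shows only the generic textbook 5-step AB formula applied to a scalar ordinary differential equation, not the paper's DTZND algorithm or its LQ decomposition. The function names, the RK4 start-up steps, and the test equation y' = -y are assumptions chosen for illustration.

```python
import math

# Standard 5-step Adams-Bashforth weights, newest slope first:
# y_{k+1} = y_k + h * (1901 f_k - 2774 f_{k-1} + 2616 f_{k-2}
#                      - 1274 f_{k-3} + 251 f_{k-4}) / 720
AB5 = (1901 / 720, -2774 / 720, 2616 / 720, -1274 / 720, 251 / 720)


def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step, used only to produce start-up values."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def ab5_solve(f, t0, y0, h, n_steps):
    """Integrate y' = f(t, y) for n_steps steps of size h with 5-step AB."""
    ts, ys = [t0], [y0]
    # A 5-step method needs five known points; the first four steps
    # are bootstrapped with RK4 (an assumption of this sketch).
    for _ in range(min(4, n_steps)):
        ys.append(rk4_step(f, ts[-1], ys[-1], h))
        ts.append(ts[-1] + h)
    slopes = [f(t, y) for t, y in zip(ts, ys)]
    while len(ys) <= n_steps:
        recent = slopes[-1:-6:-1]  # last five slopes, newest first
        y_next = ys[-1] + h * sum(c * s for c, s in zip(AB5, recent))
        t_next = ts[-1] + h
        ts.append(t_next)
        ys.append(y_next)
        slopes.append(f(t_next, y_next))
    return ts, ys


# Hypothetical example: y' = -y, y(0) = 1, whose exact solution is exp(-t).
ts, ys = ab5_solve(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
print(abs(ys[-1] - math.exp(-ts[-1])))
```

The formula is explicit: each new value needs only one fresh slope evaluation plus four stored ones, which is the usual reason multistep formulas of this kind are attractive for discretizing continuous-time models.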

Context-aware Emotion Recognition Based on Vision-Language Pre-trained Model

International Conference on Advanced Robotics and Mechatronics (ICARM)

Authors: XingLin Li, Xinde Li, Chuanfei Hu, Huaping Liu

Affiliations: School of Automation Southeast University Nanjing China Key Laboratory of Measurement and Control of Complex Systems of Engineering Ministry of Education Nanjing China Nanjing Center for Applied Mathematics Nanjing China Southeast University Shenzhen Research Institute Shenzhen China Department of Computer Science and Technology Tsinghua University Beijing China

ISBN (digital): 9798350385724

ISBN (print): 9798350385731

Given the difficulty of recognizing ambiguous emotions in facial expression recognition tasks, we propose a visual-language model named CAER-CLIP to address this challenge. The name CAER-CLIP stands for Context-Aware Emotion Recognition (CAER), and the model incorporates the structure of the Contrastive Language–Image Pre-training (CLIP) model as a promising alternative to a classifier. The CAER-CLIP model has two parts. In the visual part, facial expressions and contextual information of the image are simultaneously extracted to obtain the final feature embeddings, which are then used as a learnable “class” token for text-image pairing with the desired module. In the textual part, we use the text labels of the emotion recognition classes as input. The outputs of the two parts are merged and compared to learn the parameters of the model. The experiments demonstrate the effectiveness of the proposed method and show that our CAER-CLIP outperforms the state-of-the-art results on the CAER benchmark. The ablation experiments verify the effectiveness of both the classifier-based and text-based (ours without classifier) models, demonstrating that our method with the CAER-CLIP structure performs better and that incorporating a text encoder into the deep network architecture effectively enhances recognition accuracy.

Keywords: Training; Emotion recognition; Visualization; Accuracy; Text recognition; Face recognition; Benchmark testing; Feature extraction; Data mining; Context modeling
