In recent years, multiagent reinforcement learning (MARL) has demonstrated considerable potential across diverse applications. However, in environments with sparse rewards, the scarcity of reward signals may give rise to reward conflicts among agents. In these scenarios, each agent tends to compete for the limited rewards rather than collaborate toward collective team objectives. This not only makes learning harder but also constrains the overall learning performance of agents, ultimately compromising the attainment of team goals. To mitigate the conflicting competition for rewards among agents in MARL, we introduce the bidirectional influence and interaction (BDII) MARL framework. The approach draws inspiration from human social cooperation, specifically the concept of "sharing joys and sorrows." The fundamental idea behind BDII is to let agents share their individual rewards with collaborators, fostering cooperative rather than competitive behavior and thereby resolving the reward conflicts that arise among agents in sparse-reward environments. BDII incorporates two key factors: the Gaussian kernel distance between agents (physical distance) and the policy diversity among agents (logical distance). Together, these two factors dynamically adjust the reward allocation coefficients, which in turn form the reward distribution weights. These weights facilitate the equitable sharing of agents' contributions to rewards, promoting a cooperative learning environment. Through extensive experimental evaluations, we substantiate the efficacy of BDII in addressing the challenge of reward conflicts in MARL. Our research findings affirm that BDII significantly mitigates reward conflicts, ensuring that agents consistently align with the origi
Motion and appearance cues play a crucial role in Multi-object Tracking (MOT) algorithms for associating objects across consecutive frames. While most MOT methods prioritize accurate motion modeling and distincti...
Document-level event extraction has achieved significant progress based on template generation methods. However, there is no reasonable regulation and restriction in the existing template-based generation methods...
ISBN: (print) 9781450399951
Digital signal processors (DSPs) hold significant potential for important applications in deep neural networks. However, there is currently a lack of research focused on shared-memory CPU-DSP heterogeneous chips. This paper proposes CD-Sched, an automated scheduling framework that aims to address this gap. By predicting the latency of operators on both CPU and DSP, CD-Sched automatically schedules the computation of each operator to the appropriate computing device. This scheduling optimization accelerates the computation of individual operators and ultimately shortens the overall training time of neural networks. In end-to-end training tasks, CD-Sched reduces the overall training time by approximately 10.77% on average.
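The scheduling idea in this abstract, predicting per-operator latency on each device and placing the operator on the faster one, can be illustrated with a minimal Python sketch. This is not CD-Sched's implementation: the function names, the operator names, and the latency numbers are hypothetical, and the latency predictor itself is assumed to exist and is represented only by its output table. The sketch also ignores any synchronization or dependency cost between devices.

```python
from typing import Dict

# Hypothetical predicted latencies in milliseconds: operator -> device -> latency.
# In a real system these values would come from a latency predictor.
PredictedLatency = Dict[str, Dict[str, float]]


def schedule_operators(predicted_ms: PredictedLatency) -> Dict[str, str]:
    """Place each operator on the device with the lowest predicted latency.

    On a tie, the device listed first for that operator is chosen.
    """
    return {
        op: min(latencies, key=latencies.get)
        for op, latencies in predicted_ms.items()
    }


def total_predicted_latency(
    predicted_ms: PredictedLatency, placement: Dict[str, str]
) -> float:
    """Sum the predicted latency of every operator on its assigned device."""
    return sum(predicted_ms[op][device] for op, device in placement.items())


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    example = {
        "conv2d": {"cpu": 4.0, "dsp": 1.5},
        "relu": {"cpu": 0.2, "dsp": 0.6},
        "matmul": {"cpu": 3.0, "dsp": 2.0},
    }
    placement = schedule_operators(example)
    print(placement)
    print(total_predicted_latency(example, placement))
```

A per-operator greedy choice like this is the simplest possible policy; whether the framework uses it, or a policy that also accounts for operator dependencies, is not stated in the abstract.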
The existing graph neural network (GNN) systems adopt sample-based training on large-scale graphs over multiple GPUs. Although they support large-scale graph training, large data loading overhead is still a bottleneck...
Network traffic classification is crucial for network security and network management and is one of the most important network tasks. Current state-of-the-art traffic classifiers are based on deep learning models to a...
Broadcast authentication is a critical security service in wireless sensor networks. A protocol named $\mu\text{TESLA}$ [1] has been proposed to provide efficient authentication service for such networks. However, when applied to applications such as time synchronization and fire alarms, in which broadcast messages are sent infrequently, $\mu\text{TESLA}$ suffers from wasted key resources and slow message verification. This paper presents a new protocol named GBA (Generalized broadcast authentication) for efficient broadcast authentication in these applications. GBA uses the one-way key chain mechanism of $\mu\text{TESLA}$, but modifies the association between keys and time intervals and changes the key disclosure mechanism according to the message transmission model of these applications. The proposed technique makes full use of key resources and shortens the message verification time to an acceptable level. Analysis and experiments show that GBA is more efficient and practical than $\mu\text{TESLA}$ in applications with various message transmission models.
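The one-way key chain mentioned above can be sketched in a few lines of Python. The sketch shows only the generic mechanism: each key is the hash of the next one, so a receiver that trusts an earlier key can check a later disclosed key by hashing it forward. It does not reproduce GBA's modified key-to-interval association or its disclosure schedule, which the abstract does not specify. SHA-256 as the one-way function and all function names are assumptions made for illustration.

```python
import hashlib
import hmac
import os
from typing import List, Optional


def make_key_chain(length: int, seed: Optional[bytes] = None) -> List[bytes]:
    """Return [K_0, ..., K_length] with K_i = H(K_{i+1}).

    K_length is the random seed and K_0 is the commitment that receivers
    are assumed to obtain through an authenticated channel.
    """
    last_key = seed if seed is not None else os.urandom(32)
    chain = [last_key]
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    chain.reverse()  # after reversing, chain[0] is the commitment K_0
    return chain


def verify_disclosed_key(
    key: bytes, index: int, trusted_key: bytes, trusted_index: int
) -> bool:
    """Check a disclosed key K_index against an already trusted K_trusted_index.

    Hashing K_index (index - trusted_index) times must reproduce the trusted key.
    """
    if index <= trusted_index:
        return False
    digest = key
    for _ in range(index - trusted_index):
        digest = hashlib.sha256(digest).digest()
    return hmac.compare_digest(digest, trusted_key)
```

Because verification cost grows with the gap between the trusted index and the disclosed index, long gaps between broadcasts make verification slower, which is consistent with the slow-verification problem the abstract describes for infrequent messages.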
Jamming attacks can severely affect the performance of wireless sensor networks (WSNs) due to the broadcast nature of the wireless medium. To localize the source of the attack, this paper proposes a jammer localization algorithm named Minimum-circle-covering based localization (MCCL). Compared with existing solutions that rely on wireless propagation parameters, MCCL depends only on the location information of sensor nodes at the border of the jammed region. MCCL uses plane geometry, specifically the minimum circle covering technique, to form an approximate jammed region, and the center of that region is taken as the estimated position of the jammer. Simulation results show that MCCL achieves higher accuracy than other existing solutions with respect to the jammer's transmission range and sensitivity to node density.
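The geometric core of this idea, taking the center of the smallest circle that covers the border nodes as the jammer estimate, can be sketched as follows. This is a brute-force illustration that tries every circle defined by two or three points and keeps the smallest one covering all of them; it is not necessarily the procedure used in the paper, and the function names and the sample coordinates are hypothetical. It runs in roughly O(n^4) time, so it suits only small sets of border nodes.

```python
import itertools
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Circle = Tuple[Point, float]  # (center, radius)


def circle_from_two(a: Point, b: Point) -> Circle:
    """Circle whose diameter is the segment a-b."""
    center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    return center, math.dist(a, b) / 2


def circle_from_three(a: Point, b: Point, c: Point) -> Optional[Circle]:
    """Circumscribed circle of a, b, c, or None if they are (nearly) collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.dist((ux, uy), a)


def covers(circle: Circle, points: Sequence[Point], eps: float = 1e-9) -> bool:
    """True if every point lies inside or on the circle, within tolerance eps."""
    center, radius = circle
    return all(math.dist(center, p) <= radius + eps for p in points)


def minimum_covering_circle(points: Sequence[Point]) -> Circle:
    """Smallest circle covering all points, found by exhaustive search."""
    pts: List[Point] = [(float(x), float(y)) for x, y in points]
    if not pts:
        raise ValueError("at least one point is required")
    if len(pts) == 1:
        return pts[0], 0.0
    candidates = [circle_from_two(a, b) for a, b in itertools.combinations(pts, 2)]
    for triple in itertools.combinations(pts, 3):
        circle = circle_from_three(*triple)
        if circle is not None:
            candidates.append(circle)
    valid = [c for c in candidates if covers(c, pts)]
    return min(valid, key=lambda c: c[1])


def estimate_jammer_position(border_nodes: Sequence[Point]) -> Point:
    """Use the center of the minimum covering circle as the position estimate."""
    center, _radius = minimum_covering_circle(border_nodes)
    return center
```

For larger inputs, a randomized incremental method such as Welzl's algorithm computes the same circle in expected linear time.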
In recent years, the problem of lake eutrophication has become increasingly severe. The monitoring and control of cyanobacteria in lakes are of great significance. The information obtained by existing monitoring metho...