ISBN (print): 9783030861308; 9783030861292
Mobile sensing has become a promising paradigm for monitoring the environmental state. When equipped with sensors, a group of unmanned vehicles can autonomously move around for distributed sensing. To maximize the sensing coverage, a critical challenge is to coordinate the decentralized vehicles for cooperation. In this work, we propose a novel algorithm, Comm-Q, in which the vehicles learn to communicate for cooperation via multi-agent reinforcement learning. At each step, every vehicle can broadcast a message to the others and condition on the aggregated messages it receives to update its sensing policy. The messages themselves are also learned via reinforcement learning. In addition, we decompose and reshape the reward function for more efficient policy training. Experimental results show that our algorithm is scalable and converges quickly during the training phase; it also significantly outperforms other baselines during execution. The results validate that the communication messages play an important role in coordinating the behaviors of the different vehicles.
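To make the broadcast-and-aggregate scheme concrete, here is a minimal PyTorch sketch of one decision step in the spirit of Comm-Q. All names, layer sizes, and the mean-aggregation rule are my own assumptions for illustration, not the paper's implementation: each vehicle encodes its observation into a message, broadcasts it, and conditions its Q-values on its own observation plus the mean of the messages it receives.

```python
# Hypothetical sketch of one Comm-Q-style decision step (names and sizes
# are assumptions, not from the paper).
import torch
import torch.nn as nn

class CommQAgent(nn.Module):
    def __init__(self, obs_dim, msg_dim, n_actions, hidden=64):
        super().__init__()
        # Message head: encodes the local observation into a broadcast message.
        self.msg_head = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, msg_dim))
        # Q-network: conditions on (own observation, aggregated incoming messages).
        self.q_net = nn.Sequential(nn.Linear(obs_dim + msg_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, n_actions))

    def message(self, obs):
        return self.msg_head(obs)

    def q_values(self, obs, agg_msg):
        return self.q_net(torch.cat([obs, agg_msg], dim=-1))

def step(agents, observations):
    # Every agent broadcasts; each receives the mean of the others' messages.
    msgs = [a.message(o) for a, o in zip(agents, observations)]
    actions = []
    for i, (a, o) in enumerate(zip(agents, observations)):
        others = [m for j, m in enumerate(msgs) if j != i]
        agg = torch.stack(others).mean(dim=0)
        actions.append(a.q_values(o, agg).argmax(dim=-1))
    return actions

agents = [CommQAgent(obs_dim=8, msg_dim=4, n_actions=5) for _ in range(3)]
obs = [torch.randn(8) for _ in range(3)]
print(step(agents, obs))
```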
Cooperative multi-agent reinforcement learning (MARL) has achieved significant results, most notably by leveraging the representation-learning abilities of deep neural networks. However, large centralized approaches quickly become infeasible as the number of agents grows, and fully decentralized approaches can miss important opportunities for information sharing and coordination. Furthermore, not all agents are equal: in some cases, individual agents may not even be able to send messages to other agents or to explicitly model them. This paper considers the case where there is a single, powerful central agent that can observe the entire observation space, and multiple low-powered local agents that receive only local observations and cannot communicate with each other. The central agent's job is to learn what message to send to each local agent based on the global observations, not by centrally solving the entire problem and sending action commands, but by determining what additional information an individual agent should receive so that it can make a better decision. In this work, we present our MARL algorithm HAMMER, describe where it is most applicable, and implement it in the cooperative navigation and multi-agent walker domains. Empirical results show that (1) learned communication does indeed improve system performance, (2) results generalize to heterogeneous local agents, and (3) results generalize to different reward structures.
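The following is a rough sketch of the HAMMER-style information flow under assumed dimensions and architectures (the class names and layer sizes are hypothetical, not from the paper): a central network maps the global observation to one message per local agent, and each local policy acts on its local observation concatenated with its private message.

```python
# Hypothetical sketch of a central-messenger architecture (assumed names/sizes).
import torch
import torch.nn as nn

class CentralAgent(nn.Module):
    def __init__(self, global_dim, msg_dim, n_agents, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(global_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_agents * msg_dim))
        self.n_agents, self.msg_dim = n_agents, msg_dim

    def forward(self, global_obs):
        # One learned message per local agent, not an action command.
        return self.net(global_obs).view(self.n_agents, self.msg_dim)

class LocalAgent(nn.Module):
    def __init__(self, obs_dim, msg_dim, n_actions, hidden=64):
        super().__init__()
        self.pi = nn.Sequential(nn.Linear(obs_dim + msg_dim, hidden), nn.ReLU(),
                                nn.Linear(hidden, n_actions))

    def act(self, obs, msg):
        # Local agents never see each other; they only get their own message.
        logits = self.pi(torch.cat([obs, msg], dim=-1))
        return torch.distributions.Categorical(logits=logits).sample()

central = CentralAgent(global_dim=16, msg_dim=4, n_agents=3)
locals_ = [LocalAgent(obs_dim=6, msg_dim=4, n_actions=5) for _ in range(3)]
messages = central(torch.randn(16))
actions = [a.act(torch.randn(6), messages[i]) for i, a in enumerate(locals_)]
print(actions)
```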
We propose a novel formulation of the "effectiveness problem" in communications, put forth by Shannon and Weaver in their seminal work "The Mathematical Theory of Communication", by considering multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. Specifically, we consider a multi-agent partially observable Markov decision process (MA-POMDP) in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy channel is treated explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action that the agent takes. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. This framework generalizes both the traditional communication problem, where the main goal is to convey a message reliably over a noisy channel, and the "learning to communicate" framework that has received recent attention in the MARL literature, where the underlying communication channels are assumed to be error-free. We show via examples that the joint policy learned using the proposed framework is superior to one where communication is considered separately from the underlying MA-POMDP. This framework is very powerful: it has many real-world applications, from autonomous vehicle planning to drone swarm control, and it opens up the rich toolbox of deep reinforcement learning for the design of multi-user communication systems.
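As a toy illustration of treating the channel as part of the environment dynamics (my own construction, not the paper's code), the sketch below passes an agent's real-valued message through an additive white Gaussian noise channel at a target SNR; the receiving agent only ever observes the noisy version.

```python
# Toy noisy-channel step (illustrative construction, not the paper's code).
import torch

def awgn_channel(msg, snr_db):
    # Additive white Gaussian noise scaled to a target signal-to-noise ratio.
    signal_power = msg.pow(2).mean()
    noise_power = signal_power / (10 ** (snr_db / 10))
    return msg + torch.randn_like(msg) * noise_power.sqrt()

sent = torch.tensor([1.0, -1.0, 1.0, 1.0])  # message chosen as part of the action
received = awgn_channel(sent, snr_db=5.0)   # what the other agent observes
print(sent, received)
```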
Communication is a critical factor for a large multi-agent world to stay organized and productive. Recently, deep reinforcement learning (DRL) has been adopted to learn communication among multiple intelligent agents. However, in the DRL setting, an increasing number of communication messages introduces two problems: (1) some messages are usually redundant; (2) even when all messages are necessary, processing a large number of them efficiently remains a big challenge. In this paper, we propose a DRL method named Double Attentional Actor-Critic Message Processor (DAACMP) to jointly address these two problems. Specifically, DAACMP adopts two attention mechanisms. The first is embedded in the actor so that it can adaptively select the important messages from all received messages. The second is embedded in the critic so that all important messages can be processed efficiently. We evaluate DAACMP on three multi-agent tasks with seven different settings. Results show that DAACMP not only outperforms several state-of-the-art methods but also achieves better scalability in all tasks. Furthermore, we conduct experiments that reveal insights about the proposed attention mechanisms and the learned policies.
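A rough sketch of the actor-side message attention, using PyTorch's nn.MultiheadAttention as a stand-in for the paper's mechanism (the dimensions and the choice of this particular module are my assumptions): the agent's own hidden state serves as the query and the received messages as keys and values, so important messages receive high attention weight and redundant ones are down-weighted.

```python
# Illustrative attention over incoming messages (stand-in module, assumed dims).
import torch
import torch.nn as nn

msg_dim, n_msgs = 16, 6
attn = nn.MultiheadAttention(embed_dim=msg_dim, num_heads=2, batch_first=True)

own_state = torch.randn(1, 1, msg_dim)      # query: the agent's own encoding
messages = torch.randn(1, n_msgs, msg_dim)  # keys/values: received messages
selected, weights = attn(own_state, messages, messages)
print(weights)  # attention weights show which messages the actor relies on
```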
ISBN (print): 9781538674628
Though caching on edge servers is widely acknowledged to be essential, it is not trivial to cache content on edge servers adaptively without any prior knowledge of the distribution of content popularity across the users. Several edge caching algorithms based on multi-agent reinforcement learning (MARL) have been proposed in the literature for dynamic control; however, they ignore the non-stationarity and partial-observability issues common in multi-agent systems. In an MARL-based edge caching application where agents collaborate toward a common goal, communication is essential, as their decisions are jointly applied to improve collective intelligence. However, most existing methods for exchanging messages between agents do not account for the induced communication overhead, which is critical in real-world multi-agent applications. In this paper, we propose a new MARL framework for edge caching in which agents learn to construct, exchange, and interpret collective messages for individual benefit, while controlling the complex collaborative task of cache replacement in a communication-efficient manner. Using a standard edge caching model, we show that, even with limited communication and the delays it introduces, our proposed framework outperforms existing rule-based and learning-based caching policies.
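One simple way to realize communication-efficient messaging, sketched below under my own assumptions (the gating design and names are hypothetical, not the paper's method), is to pair each caching agent's message encoder with a learned gate that decides whether broadcasting the message is worth the overhead.

```python
# Hypothetical gated-messaging sketch (assumed design, not the paper's method).
import torch
import torch.nn as nn

class GatedMessenger(nn.Module):
    def __init__(self, stats_dim, msg_dim, hidden=32):
        super().__init__()
        # Encoder: compresses local request statistics into a short message.
        self.encoder = nn.Sequential(nn.Linear(stats_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, msg_dim))
        # Gate: scores whether sending this message is worth the overhead.
        self.gate = nn.Sequential(nn.Linear(stats_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1))

    def forward(self, local_stats):
        msg = self.encoder(local_stats)
        send_prob = torch.sigmoid(self.gate(local_stats))
        # Hard sample for illustration; training the gate end-to-end would
        # need e.g. a straight-through estimator or a policy-gradient term.
        send = torch.bernoulli(send_prob)  # 1: broadcast, 0: stay silent
        return msg * send, send_prob

m = GatedMessenger(stats_dim=10, msg_dim=4)
msg, p = m(torch.randn(10))
print(msg, p)
```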
ISBN (print): 9781450394321
This paper investigates multi-agent reinforcement learning (MARL) communication mechanisms in large-scale scenarios. We propose a novel framework, Learning Structured Communication (LSC), that leverages a flexible and efficient communication topology. LSC adaptively groups agents into diverse hierarchical formations over episodes, using an auxiliary task and a hierarchical routing protocol. On the formed topology, we learn a hierarchical graph neural network that facilitates effective message generation and propagation for both inter- and intra-group communication. Unlike state-of-the-art communication mechanisms, LSC possesses a detailed and learnable design for hierarchical communication. Numerical experiments on challenging tasks demonstrate that the proposed LSC exhibits high communication efficiency and global cooperation capability.
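As a much-simplified illustration of two-level message passing in the spirit of a hierarchical topology (the fixed grouping and mean-aggregation rules here are my own simplification; LSC learns both the grouping and the graph neural network), agents first aggregate within their group, the group summaries are pooled globally, and every agent then receives its own feature concatenated with the group and global summaries.

```python
# Toy two-level message passing (my simplification; LSC learns the topology).
import torch

def hierarchical_exchange(agent_feats, groups):
    # Intra-group: each group's summary is the mean of its members' features.
    summaries = {g: torch.stack([agent_feats[i] for i in members]).mean(dim=0)
                 for g, members in groups.items()}
    # Inter-group: a global summary aggregates all group summaries.
    global_summary = torch.stack(list(summaries.values())).mean(dim=0)
    # Broadcast down: each agent gets [own, group summary, global summary].
    out = {}
    for g, members in groups.items():
        for i in members:
            out[i] = torch.cat([agent_feats[i], summaries[g], global_summary])
    return out

feats = {i: torch.randn(4) for i in range(6)}
groups = {"A": [0, 1, 2], "B": [3, 4, 5]}
print({i: v.shape for i, v in hierarchical_exchange(feats, groups).items()})
```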