With countless promising applications in domains such as IoT and Industry 4.0, task-oriented communication design (TOCD) is attracting growing attention from the research community. This paper presents a novel approach for designing scalable task-oriented quantization and communications in cooperative multi-agent systems (MAS). The proposed approach uses the TOCD framework and the value of information (VoI) concept to enable efficient communication of quantized observations among agents while maximizing the average return of the MAS, a quantity that measures the MAS's task effectiveness. The computational complexity of learning the VoI, however, grows exponentially with the number of agents. We therefore propose a three-step framework: (i) learning the VoI, using reinforcement learning (RL), for a two-agent system; (ii) designing the quantization policy for an N-agent MAS using the learned VoI for a range of bit-budgets; and (iii) learning the agents' control policies using RL while following the quantization policies designed in the previous step. Our analytical results show that the proposed framework applies to a wide range of problems. Numerical results show a striking reduction in the computational complexity of obtaining the VoI needed for TOCD in a MAS problem, without compromising the average return of the MAS.
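Step (ii) of the framework above can be illustrated with a minimal sketch. It assumes a VoI score has already been learned for each observation, and it bins observations by VoI rank into at most 2^bits codewords. The function name `design_quantizer`, the dictionary input, and the equal-population binning rule are all hypothetical choices made for illustration; the abstract does not specify how the quantization policy is built from the VoI.

```python
from typing import Dict, Hashable


def design_quantizer(voi: Dict[Hashable, float], bits: int) -> Dict[Hashable, int]:
    """Map each observation to a codeword index in [0, 2**bits).

    Assumption (not from the abstract): observations with similar VoI
    share a codeword, and bins hold roughly equal numbers of observations.
    """
    if bits < 0:
        raise ValueError("bits must be non-negative")
    levels = 2 ** bits
    # Rank observations from lowest to highest VoI.
    ranked = sorted(voi, key=voi.get)
    n = len(ranked)
    # Integer binning keeps every index strictly below `levels`.
    return {obs: (rank * levels) // n for rank, obs in enumerate(ranked)}


# Hypothetical VoI scores for four observations, with a 1-bit budget.
example_voi = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.2}
example_quantizer = design_quantizer(example_voi, bits=1)
```

With a 1-bit budget, the two lowest-VoI observations share one codeword and the two highest-VoI observations share the other. Repeating the call for several values of `bits` mirrors the "range of bit-budgets" mentioned in the abstract.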
ISBN: 9781665491228 (print)
We consider a distributed quantization problem that arises when multiple edge devices, i.e., agents, are controlled via a centralized controller (CC). While agents have to communicate their observations to the CC for decision-making, the bit-budgeted agent-CC links may limit the task-effectiveness of the system, which is measured by the system's average sum of stage costs/rewards. As a result, each agent, given its local processing resources, should compress/quantize its observation such that the average sum of stage costs/rewards of the control task is minimally impacted. We address the problem of maximizing the average sum of stage rewards by proposing two Action-Based State Aggregation (ABSA) algorithms that carry out the indirect and joint design of control and communication policies in the multi-agent system (MAS). While the applicability of ABSA-1 is limited to single-agent systems, it provides an analytical framework that acts as a stepping stone to the design of ABSA-2, which carries out the joint design of control and communication for an MAS. We evaluate the algorithms, with average return as the performance metric, through numerical experiments on a multi-agent geometric consensus problem.
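The general idea behind action-based state aggregation can be sketched as follows: observations that lead to the same control action are merged into one codeword, so the link only needs enough bits to tell the actions apart. This is a simplified single-agent illustration that assumes a control policy is already known. The names `aggregate_by_action` and `bits_needed` and the dictionary-based policy are hypothetical, and the sketch should not be read as the ABSA-1 or ABSA-2 algorithm itself.

```python
from typing import Dict, Hashable


def aggregate_by_action(policy: Dict[Hashable, Hashable]) -> Dict[Hashable, int]:
    """Assign one codeword per distinct action; observations sharing an
    action share that codeword (assumes `policy` maps observation -> action)."""
    codebook: Dict[Hashable, int] = {}
    quantizer: Dict[Hashable, int] = {}
    for obs, action in policy.items():
        # First time an action is seen, it gets the next free codeword index.
        code = codebook.setdefault(action, len(codebook))
        quantizer[obs] = code
    return quantizer


def bits_needed(quantizer: Dict[Hashable, int]) -> int:
    """Smallest number of bits that can index every codeword in use."""
    n = len(set(quantizer.values()))
    if n <= 1:
        return 0
    return (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 2


# Hypothetical policy over four observations and three actions.
example_policy = {0: "left", 1: "left", 2: "right", 3: "stay"}
example_codes = aggregate_by_action(example_policy)
example_bits = bits_needed(example_codes)
```

In this example four observations collapse to three codewords, so a 2-bit link loses nothing relative to sending the raw observation. When the bit budget is smaller than `bits_needed`, some actions can no longer be distinguished, which is the regime where the average return is affected.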