The mutual information is analyzed as a function of the input distribution using an identity due to Topsoe for channels with (possibly multiple) linear constraints and finite input and output sets. The mutual informat...
Dynamic light fields provide a richer, more realistic 3D representation of a moving scene. However, this richness comes with higher data rates, which increase storage and transmission requirements. We propose a novel app...
To control and shape the spatial wavefront of reconfigurable leaky-wave antenna (LWA) by integrating a single layered transmissive phase gradient metasurface (TPGMS) is *** amplitude and phase of transmission is contr...
In this system paper, we present our approach to the DJI Robomaster AI Challenge 2022, where our team, ERA-IITK, ranked 3rd worldwide among 83 teams. We primarily focus on system hardware design and algorithmic comp...
Malware is conventionally written in the C/C++ programming languages. However, a recent trend has been observed where other languages are being used to write malware. One such language is the Rust programming language...
We experimentally investigate the quality of single photons produced with long-pulse above-band excitation of a quantum dot embedded in a semiconductor nanowire and model the results via rate equations and Markov chains....
Modern methods for spoken language identification (LID) have demonstrated promising results when trained on large datasets. However, their effectiveness is considerably impacted by the discrepancies between the traini...
High-fidelity, switching-level modeling of power electronics simulations can be computationally intensive and time-consuming. This computational burden has recently escalated further due to the increased number of con...
This study investigates partial demagnetization faults arising from stator interturn faults in a surface-mounted permanent-magnet-type brushless direct current motor. Because of rotor demagnetization, the fault severi...
Offline reinforcement learning (RL), which seeks to learn an optimal policy using offline data, has garnered significant interest due to its potential in critical applications where online data collection is infeasible or expensive. This work explores the benefit of federated learning for offline RL, aiming at collaboratively leveraging offline datasets at multiple agents. Focusing on finite-horizon episodic tabular Markov decision processes (MDPs), we design FedLCB-Q, a variant of the popular model-free Q-learning algorithm tailored for federated offline RL. FedLCB-Q updates local Q-functions at agents with novel learning rate schedules and aggregates them at a central server using importance averaging and a carefully designed pessimistic penalty term. Our sample complexity analysis reveals that, with appropriately chosen parameters and synchronization schedules, FedLCB-Q achieves linear speedup in terms of the number of agents without requiring high-quality datasets at individual agents, as long as the local datasets collectively cover the state-action space visited by the optimal policy, highlighting the power of collaboration in the federated setting. In fact, the sample complexity almost matches that of the single-agent counterpart, as if all the data are stored at a central location, up to polynomial factors of the horizon length. Furthermore, FedLCB-Q is communication-efficient, where the number of communication rounds is only linear with respect to the horizon length up to logarithmic factors. Copyright 2024 by the author(s)
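The server-side aggregation described in the abstract can be illustrated with a rough sketch. This is not the FedLCB-Q update itself: the abstract does not give the exact importance weights or penalty, so the sketch assumes weights proportional to each agent's visit count and a penalty that shrinks as the inverse square root of the pooled count. The function `aggregate_q` and the parameter `penalty_scale` are hypothetical names introduced only for illustration.

```python
import math


def aggregate_q(local_q, local_counts, penalty_scale=1.0):
    """Count-weighted average of local Q-values minus a pessimism penalty.

    local_q:       per-agent Q estimates for one (state, action) pair
    local_counts:  per-agent visit counts for that pair
    penalty_scale: hypothetical constant controlling the penalty size
                   (an assumption, not a value from the paper)
    """
    total = sum(local_counts)
    if total == 0:
        # No agent has data for this pair, so stay fully pessimistic.
        return 0.0
    # Assumed form of importance averaging: agents that visited the pair
    # more often contribute more to the aggregate.
    weighted = sum(q * n for q, n in zip(local_q, local_counts)) / total
    # Assumed penalty form: it shrinks as the pooled sample count grows.
    penalty = penalty_scale / math.sqrt(total)
    return max(weighted - penalty, 0.0)


if __name__ == "__main__":
    # Two agents: the second has three times as many samples as the first.
    print(aggregate_q([1.0, 3.0], [1, 3], penalty_scale=1.0))
```

Under these assumptions the penalty depends on the pooled count rather than on any single agent's count, which gives some intuition for why collective coverage of the relevant state-action pairs can suffice even when each local dataset is sparse.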