Multi-modal applications are expected to dominate in the 5G and beyond-5G (B5G) era. However, traditional source coding methods are neither efficient nor reliable, because they neglect semantic redundancy and the mutual influence between sources of different modalities. Cross-modal source coding (CMSC) has been proposed as a promising solution, but two main challenges remain: determining the optimum rate of CMSC under delay and reliability constraints, and designing a practical CMSC scheme that operates near the optimum rate. To tackle these challenges, this paper studies the optimum source coding rate of CMSC and its practical implementation. On the theoretical side, an (n, epsilon)-achievable rate region is derived, which characterizes the source coding rates attainable at a fixed blocklength n and a target error probability epsilon. The optimum source coding rate can then be approximated by calculating the infimum of the (n, epsilon)-achievable rate region using a rate dispersion function. On the technical side, a general implementation of CMSC is proposed that fully leverages channel coding and artificial intelligence (AI) semantic analysis to achieve the optimum rate. Numerical results demonstrate that, when multi-modal sources are semantically correlated, CMSC obtains a 50% improvement in theory and a 37.5% improvement in practice over a baseline model abstracted from traditional schemes.
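To give a feel for how a dispersion term enters a finite-blocklength rate, the sketch below evaluates the classical second-order (normal) approximation for almost-lossless coding of a single memoryless binary source: rate ≈ H + sqrt(V/n) · Q^{-1}(epsilon). This is not the CMSC rate region derived in the paper, which the abstract does not state; it is a well-known single-source stand-in, and the function names and the Bernoulli source model are assumptions made purely for illustration.

```python
import math
from statistics import NormalDist


def entropy_bits(p: float) -> float:
    """Entropy H (bits/symbol) of a Bernoulli(p) source, 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def varentropy_bits(p: float) -> float:
    """Variance V of the information -log2 P(X) for a Bernoulli(p) source.

    This plays the role of a dispersion term in the single-source case.
    """
    return p * (1 - p) * math.log2((1 - p) / p) ** 2


def normal_approx_rate(p: float, n: int, eps: float) -> float:
    """Second-order approximation H + sqrt(V/n) * Q^{-1}(eps), in bits/symbol.

    Illustrative only: higher-order terms are omitted, and this is the
    classical single-source expression, not the paper's CMSC result.
    """
    # Q^{-1}(eps) is the (1 - eps) quantile of the standard normal.
    q_inv = NormalDist().inv_cdf(1.0 - eps)
    return entropy_bits(p) + math.sqrt(varentropy_bits(p) / n) * q_inv


if __name__ == "__main__":
    # Hypothetical source parameter and reliability target.
    p, eps = 0.11, 1e-3
    for n in (100, 1000, 10000):
        print(n, round(normal_approx_rate(p, n, eps), 4))
```

Under this approximation, the rate penalty over the entropy shrinks as 1/sqrt(n) when the blocklength grows and increases as the target error probability epsilon is tightened, which is the delay and reliability trade-off the abstract refers to.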