Remote inference allows lightweight edge devices, such as autonomous drones, to perform vision tasks that exceed their computational, energy, or processing-delay budget. In such applications, reliable transmission of information is challenging because channel quality varies widely. Traditional approaches, which combine spatio-temporal transforms, quantization, and entropy coding with digital transmission, may suffer a sudden drop in quality (the digital cliff) when the channel quality falls below the level assumed at design time. This problem can be addressed with linear coding and transmission (LCT), a joint source and channel coding scheme that relies on linear operators only and yields a per-pixel reconstruction error commensurate with the wireless channel quality. In this paper, we propose CV-Cast: the first LCT scheme optimized for computer vision task accuracy instead of per-pixel distortion. For instance, at a channel signal-to-noise ratio of 10 dB, CV-Cast requires transmitting 28% fewer symbols than a baseline LCT scheme in semantic segmentation and 15% fewer in object detection. Simulations involving a realistic 5G channel model confirm the smooth decrease in accuracy achieved with CV-Cast, whereas images encoded with JPEG or learned image coding (LIC) and transmitted using classical schemes at low Eb/N0 suffer from the digital cliff.
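The graceful degradation that LCT provides can be illustrated with a minimal sketch. It is not the CV-Cast method: it assumes a generic SoftCast-style baseline in which each chunk of transform coefficients is scaled by a gain proportional to its variance raised to the power -1/4, sent over an additive white Gaussian noise channel, and recovered with a linear MMSE estimator. The transform step is omitted, the coefficients are synthetic, and the names `lct_send_receive` and `mse_vs_snr` are hypothetical.

```python
import numpy as np

def lct_send_receive(coeffs, snr_db, rng):
    """Scale, transmit over AWGN, and linearly decode chunks of coefficients.

    coeffs: array of shape (n_chunks, chunk_len), assumed zero-mean per chunk.
    snr_db: channel SNR in dB, with unit average transmit power per symbol.
    """
    lam = np.maximum(coeffs.var(axis=1), 1e-12)      # per-chunk variance
    g = lam ** -0.25                                 # assumed SoftCast-style gains
    g *= np.sqrt(len(lam) / np.sum(g ** 2 * lam))    # normalize to unit average power
    noise_var = 10.0 ** (-snr_db / 10.0)
    received = g[:, None] * coeffs + rng.normal(0.0, np.sqrt(noise_var), coeffs.shape)
    w = g * lam / (g ** 2 * lam + noise_var)         # linear MMSE decoder weights
    return w[:, None] * received

def mse_vs_snr(snrs_db, seed=0):
    """Mean squared reconstruction error on synthetic coefficients, per SNR."""
    rng = np.random.default_rng(seed)
    std = np.linspace(10.0, 0.5, 64)                 # synthetic decaying chunk energies
    coeffs = rng.normal(0.0, 1.0, (64, 256)) * std[:, None]
    return [float(np.mean((lct_send_receive(coeffs, s, rng) - coeffs) ** 2))
            for s in snrs_db]

if __name__ == "__main__":
    for snr, err in zip([0, 5, 10, 15, 20], mse_vs_snr([0, 5, 10, 15, 20])):
        print(f"SNR {snr:2d} dB -> MSE {err:.4f}")
```

Under these assumptions the reconstruction error shrinks steadily as the SNR rises, with no threshold below which decoding fails outright. A task-oriented scheme such as CV-Cast would replace the per-pixel distortion criterion behind these gains with one tied to task accuracy.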
ISBN: (print) 9798350303582; 9798350303599
Image communication increasingly involves machine-to-machine delivery. For example, images acquired by an autonomous drone can be compressed and sent to an edge server over a wireless network for resource-intensive processing. Traditional compression techniques involving transform, quantization, and entropy coding reach high compression efficiency, but channel conditions worse than expected may lead to a sharp decrease in the decoded image quality. As an alternative, linear coding and transmission (LCT) systems have been proposed to avoid this digital cliff problem: the reconstructed image quality decreases gradually as channel conditions degrade. This paper presents a comprehensive evaluation of computer vision tasks whose input images are processed and transmitted using LCT. It also analyses the benefits of retraining the network to account for impairments due to LCT and the noisy channel. Considering object detection and semantic segmentation on images transmitted and received by LCT systems, we show that task accuracy degrades smoothly as the channel quality decreases, avoiding the cliff effect. Retraining with noisy images processed by LCT reduces the detection mAP degradation from 23.8% to 4.4% and the segmentation mIoU degradation from 43.2% to 8.1% when the channel signal-to-noise ratio is 10 dB.