Efficient point cloud coding has become increasingly critical for applications such as virtual reality, autonomous driving, and digital twin systems, where rich and interactive 3D data representations can make a functional difference. Deep learning has emerged as a powerful tool in this domain: it compresses point clouds more efficiently than conventional coding methods and also allows computer vision tasks to be performed effectively in the compressed domain. This makes available, for the first time, a common compressed visual representation that is effective for both humans and machines. Taking advantage of this potential, JPEG has recently finalized the JPEG Pleno learning-based Point Cloud coding (PCC) standard, which offers efficient lossy coding of static point clouds and targets both human visualization and machine processing by leveraging deep learning models for geometry and color coding. The geometry is processed directly in its original 3D form using sparse convolutional neural networks, while the color data is projected onto 2D images and encoded using the JPEG AI standard, which is also learning-based. The goal of this paper is to provide a complete technical description of the JPEG PCC standard, along with a thorough benchmarking of its performance against the state of the art, while highlighting its main strengths and weaknesses. In terms of compression performance, JPEG PCC outperforms the conventional MPEG PCC standards, especially in geometry coding, where it achieves significant rate reductions. Color compression performance is less competitive, but this is offset by the advantages of a fully learning-based coding framework for both geometry and color and by the associated effective compressed-domain processing.
ISBN (digital): 9781665459631
ISBN (print): 9781665459631
Deep learning (DL)-based coding has recently become very popular for multimedia data, notably images and point clouds (PCs). Training a DL coding model using the backpropagation algorithm requires a differentiable loss function. Thus, for PC joint geometry and color coding, both the PC geometry and color distortion metrics must be differentiable. Since the distortion/quality metrics commonly used for the final PC quality assessment do not meet this criterion, new PC distortion metrics have to be designed for DL-based training purposes. Moreover, for PC joint geometry and color coding, it is critical to define the balance between the geometry and color distortions in a meaningful way, ideally driven by human perception and subjective quality assessment. In this context, this paper proposes a perceptually-driven design for a differentiable PC joint geometry and color distortion metric to be used for training purposes in DL-based coding, notably to define the relative weights for the geometry and color distortions. The obtained perceptually-driven weights achieve a rate reduction of around 3% relative to the default balanced weights, at no complexity cost. This is the first proposal in the literature with this purpose and this perceptual approach.
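As a rough illustration of a weighted joint geometry and color distortion, the following minimal Python sketch combines two smooth (and therefore differentiable) terms with relative weights. The abstract does not state which distortion terms the paper uses, so the binary cross-entropy for voxel occupancy, the mean squared error for color, the function names, and the 0.5/0.5 "balanced" default are all assumptions made for illustration only, not the authors' metric.

```python
import math


def geometry_distortion(occupancy_probs, occupancy_truth):
    """Mean binary cross-entropy between predicted voxel occupancy
    probabilities and ground-truth occupancy (0 or 1).
    Assumed geometry term; smooth in the predicted probabilities."""
    eps = 1e-12  # keeps log() finite at probabilities of exactly 0 or 1
    total = 0.0
    for p, t in zip(occupancy_probs, occupancy_truth):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(occupancy_truth)


def color_distortion(decoded_colors, original_colors):
    """Mean squared error over the color components of co-located points.
    Assumed color term; smooth in the decoded colors."""
    squared_errors = [
        (d - o) ** 2
        for decoded, original in zip(decoded_colors, original_colors)
        for d, o in zip(decoded, original)
    ]
    return sum(squared_errors) / len(squared_errors)


def joint_distortion(d_geometry, d_color, w_geometry=0.5, w_color=0.5):
    """Weighted sum of the two distortions. The default 0.5/0.5 split is a
    hypothetical stand-in for 'balanced' weights; perceptually-driven
    weights would replace these values."""
    return w_geometry * d_geometry + w_color * d_color


if __name__ == "__main__":
    d_g = geometry_distortion([0.9, 0.2, 0.7], [1, 0, 1])
    d_c = color_distortion([(0.5, 0.4, 0.3)], [(0.6, 0.4, 0.2)])
    print(joint_distortion(d_g, d_c))
```

In a real training loop the same weighted sum would be written in an automatic-differentiation framework and added to the rate term; the plain-Python version here only shows how the relative weights enter the loss.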
Intelligent video coding (IVC), which dates back to the late 1980s with the concept of encoding videos with knowledge and semantics, includes visual content compact representation models and methods enabling structura...