Search results - Inner Mongolia University Library

Deep Learning-Based Point Cloud Coding and Super-Resolution: A Joint Geometry and Color Approach

IEEE TRANSACTIONS ON MULTIMEDIA, 2025, vol. 27, pp. 914-926

Authors: Guarda, Andre F. R.; Ruivo, Manuel; Coelho, Luis; Seleem, Abdelrahman; Rodrigues, Nuno M. M.; Pereira, Fernando

Affiliations: Inst Telecomunicacoes P-1049001 Lisbon Portugal; Univ Lisbon Inst Super Tecn P-1049001 Lisbon Portugal; Inst Telecomunicacoes P-1049001 Lisbon Portugal; Univ Lisbon Inst Super Tecn Inst Telecomunicacoes P-1049001 Lisbon Portugal; South Valley Univ Qena 83523 Egypt; ESTG Politecn Leiria P-2411901 Leiria Portugal; Inst Telecomunicacoes P-2411901 Leiria Portugal

In this golden age of multimedia, realistic content is in high demand with users seeking more immersive and interactive experiences. As a result, new image modalities for 3D representations have emerged in recent years, among which point clouds have deserved special attention. Naturally, with this increase in demand, efficient storage and transmission became a must, with standardization groups such as MPEG and JPEG entering the scene, as happened before with other types of visual media. In a surprising development, JPEG issued a Call for Proposals on point cloud coding targeting exclusively learning-based solutions, in parallel to a similar call for image coding. This is a natural consequence of the growing popularity of deep learning, which, due to its excellent performance, is currently dominant in the multimedia processing field, including coding. This article presents the coding solution selected by JPEG as the best-performing response to the Call for Proposals and adopted as the first version of the JPEG Pleno Point Cloud Coding Verification Model, in practice the first step for developing a standard. The proposed solution offers a novel joint geometry and color approach for point cloud coding, in which a single deep learning model processes both geometry and color simultaneously. To maximize the rate-distortion (RD) performance for a large range of point clouds, the proposed solution uses down-sampling and learning-based super-resolution as pre- and post-processing steps. Compared to the MPEG point cloud coding standards, the proposed coding solution comfortably outperforms G-PCC for geometry, color, and joint quality metrics.
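The down-sampling pre-processing step mentioned in the abstract can be illustrated with a minimal sketch. It assumes integer voxel coordinates and a plain grid down-sampling; the function names are hypothetical, and the learned super-resolution model is not reproduced here (a naive coordinate rescaling stands in for it only to show what information is lost).

```python
def downsample(points, factor):
    """Grid down-sampling: scale integer voxel coordinates down and merge duplicates."""
    return sorted({(x // factor, y // factor, z // factor) for x, y, z in points})


def naive_upsample(points, factor):
    """Stand-in for the learned super-resolution step: only rescales coordinates."""
    return [(x * factor, y * factor, z * factor) for x, y, z in points]


# Toy point cloud with integer (x, y, z) voxel coordinates.
cloud = [(0, 0, 0), (1, 0, 0), (2, 3, 1), (3, 3, 1), (7, 7, 7)]
coarse = downsample(cloud, 2)
print(len(cloud), "->", len(coarse))  # 5 -> 3
print(naive_upsample(coarse, 2))
```

Fewer points reach the codec after down-sampling, which lowers the rate, but the rescaled result no longer contains the merged points. Recovering them is the role the abstract assigns to learning-based super-resolution.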

Keywords: Encoding; Geometry; Image coding; Transform coding; Point cloud compression; Standards; Image color analysis; Deep learning; geometry and color; JPEG Pleno standard; point cloud coding; point cloud super-resolution


The JPEG Pleno Learning-Based Point Cloud Coding Standard: Serving Man and Machine


IEEE ACCESS, 2025, vol. 13, pp. 43289-43315

Authors: Guarda, Andre F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando

Affiliations: Inst Telecomunicacoes P-1049001 Lisbon Portugal; Politecn Leiria ESTG P-2411901 Leiria Portugal; Univ Lisbon Inst Super Tecn P-1049001 Lisbon Portugal

Efficient point cloud coding has become increasingly critical for multiple applications such as virtual reality, autonomous driving, and digital twin systems, where rich and interactive 3D data representations may functionally make the difference. Deep learning has emerged as a powerful tool in this domain, offering advanced techniques for compressing point clouds more efficiently than conventional coding methods while also allowing computer vision tasks to be performed effectively in the compressed domain, thus making available, for the first time, a common compressed visual representation effective for both man and machine. Taking advantage of this potential, JPEG has recently finalized the JPEG Pleno Learning-based Point Cloud Coding (PCC) standard, offering efficient lossy coding of static point clouds and targeting both human visualization and machine processing by leveraging deep learning models for geometry and color coding. The geometry is processed directly in its original 3D form using sparse convolutional neural networks, while the color data is projected onto 2D images and encoded using the JPEG AI standard, which is also learning-based. The goal of this paper is to provide a complete technical description of the JPEG PCC standard, along with a thorough benchmarking of its performance against the state of the art, while highlighting its main strengths and weaknesses. In terms of compression performance, JPEG PCC outperforms the conventional MPEG PCC standards, especially in geometry coding, achieving significant rate reductions. Color compression performance is less competitive, but this is offset by the power of a fully learning-based coding framework for both geometry and color and the associated effective compressed domain processing.

Keywords: Transform coding; Point cloud compression; Encoding; Standards; Image coding; Three-dimensional displays; Geometry; Image color analysis; Artificial intelligence; Codecs; JPEG Pleno standard; learning-based coding; man and machine; point cloud coding


JPEG PLENO LEARNING-BASED POINT CLOUD CODING: A PERFORMANCE ANALYSIS


30th IEEE International Conference on Image Processing (ICIP)

Authors: Prazeres, Joao; Rodrigues, Rafael; Pereira, Manuela; Pinheiro, Antonio M. G.

Affiliations: Univ Beira Interior Covilha Portugal; Inst Telecomunicacoes Covilha Portugal

ISBN: (print) 9781728198354

In this paper, a stability analysis of the JPEG Pleno Learning-based Point Cloud Coding Verification Model (VmUC) is performed. The codec is a deep learning-based solution that is able to compress both color and geometry. Three different training sessions were conducted using the default training set and cost function, and six point clouds were encoded/decoded with the resulting operating points for six target distortion/bitrate ratios. The VmUC performance was compared with the MPEG codecs V-PCC and G-PCC, considering three objective metrics, notably PSNR MSE D1, PSNR MSE D2, and PCQM. PSNR MSE D1 was also computed at each training epoch for the six decoded point clouds. It is concluded that the VmUC is able to outperform G-PCC and V-PCC in geometry encoding. However, it is outperformed by V-PCC in color encoding across all three training sessions. Furthermore, it is also shown that the codec does not present a high level of stability, as its performance changes considerably between training sessions.

Keywords: point cloud coding; deep learning-based codecs; objective evaluation


Deep Learning-based Point Cloud Coding for Immersive Experiences


30th ACM International Conference on Multimedia (MM)

Author: Pereira, Fernando

Affiliation: Univ Lisbon Inst Super Tecn Inst Telecomunicacoes Av Rovisco Pais P-1049001 Lisbon Portugal

ISBN: (print) 9781450392037

The recent advances in visual data acquisition and consumption have led to the emergence of the so-called plenoptic visual models, where point clouds (PCs) are playing an increasingly important role. Point clouds are a 3D visual model where the visual scene is represented through a set of points and associated attributes, notably color. To offer realistic and immersive experiences, point clouds need to have millions, or even billions, of points, thus asking for efficient representation and coding solutions. This is critical for emerging applications and services, notably virtual and augmented reality, personal communications and meetings, education and medical applications, and virtual museum tours. The point cloud coding field has received many contributions in recent years, notably adopting deep learning-based approaches, and it is critical for the future of immersive media experiences. In this context, the key objective of this tutorial is to review the most relevant point cloud coding solutions available in the literature, with a special focus on deep learning-based solutions and their specific novel features. Special attention will be dedicated to the ongoing standardization projects in this domain, notably in JPEG and MPEG.

Keywords: 3D multimedia; point cloud coding; deep learning; immersive experiences


Neighborhood Adaptive Loss Function for Deep Learning-Based Point Cloud Coding With Implicit and Explicit Quantization


IEEE MULTIMEDIA, 2021, vol. 28, no. 3, pp. 107-116

Authors: Guarda, Andre F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando

Affiliations: Univ Lisbon Inst Super Tecn Lisbon Portugal; Inst Telecomunicacoes Aveiro Portugal; ESTG Politecn Leiria Leiria Portugal

As the interest in deep learning tools continues to rise, new multimedia research fields begin to discover its potential. Both image and point cloud coding are good examples of technologies where deep learning-based solutions have recently displayed very competitive performance. In this context, this article brings two novel contributions to the point cloud geometry coding state of the art: first, a novel neighborhood adaptive distortion metric to be used in the training loss function, which significantly improves the rate-distortion performance with commonly used objective quality metrics; second, an explicit quantization approach at training and coding time to generate varying rate/quality with a single trained deep learning coding model, effectively reducing the training complexity and storage requirements. The result is an improved deep learning-based point cloud geometry coding solution, which is both more compression efficient and less demanding in training complexity and storage.
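The idea of obtaining several rate/quality points from one trained model through explicit quantization can be sketched as follows. This is a generic uniform scalar quantizer applied to a made-up latent vector, not the scheme used in the article; the function names, the latent values, and the use of zeroth-order entropy as a rate proxy are all assumptions for illustration.

```python
import math
from collections import Counter


def quantize(latents, step):
    """Uniform scalar quantization: map each value to an integer symbol index."""
    return [round(v / step) for v in latents]


def dequantize(symbols, step):
    """Reconstruct approximate latent values from the symbol indices."""
    return [s * step for s in symbols]


def empirical_entropy_bits(symbols):
    """Zeroth-order entropy in bits per symbol, used here as a rough rate proxy."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical latent values standing in for the output of a trained encoder.
latents = [-1.7, -0.6, -0.2, 0.1, 0.4, 0.9, 1.3, 2.2]
for step in (0.5, 2.0):
    symbols = quantize(latents, step)
    rate = empirical_entropy_bits(symbols)
    distortion = mse(latents, dequantize(symbols, step))
    print(f"step={step}: rate~{rate:.2f} bits/symbol, mse={distortion:.3f}")
```

A larger step merges more latent values into the same symbol, so the rate proxy drops while the reconstruction error grows. Changing only the step at coding time is what lets a single set of trained weights cover several operating points.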

Keywords: Encoding; Training; Distortion; Geometry; Measurement; Three-dimensional displays; Image coding; point cloud coding; deep learning; neighborhood adaptive distortion; adaptive loss function; explicit quantization


Coding and Streaming System Design for Interactive 360-Degree Video Applications and Scalable Octree-Based Point Cloud Coding


Author: Mao, Yixiang

Affiliation: New York University Tandon School of Engineering

Degree level: Ph.D., Doctor of Philosophy

Efficient coding and streaming of 360-degree video and point cloud video are critical for the continued development of lifelike virtual reality (VR) experiences. Interactive 360-degree video applications, e.g., video conferencing, require an extremely low delay in video delivery and robustness to both network dynamics and field of view (FoV) prediction errors. We propose a frame-level FoV-adaptive coding structure that varies the bit rates for different regions of a coded frame based on the predicted FoV. Integrating such frame-level FoV adaptation with temporal predictive coding is challenging due to the temporal variations of the FoV. We propose novel ways of modeling the influence of FoV dynamics on the quality-rate performance of temporal predictive coding. Compared with other benchmark systems, our system shows significantly improved rendered video quality, while achieving very low end-to-end delay and low frame-freeze probability. Octree-based point cloud representation and compression have been adopted by the MPEG G-PCC standard. However, the standard uses only handcrafted methods to predict the probability that a leaf node is non-empty, which is used for entropy coding. We propose a 3D convolution-based machine learning model to predict such probabilities for geometry coding using the context information from the previously and currently coded octree levels. We further propose a convolution-based model to upsample the decoded point cloud at a coarse resolution on the decoder side. Integration of the two approaches significantly improves the octree-based geometry coding performance. A key advantage of our work over prior related studies is that our octree-based entropy coding model is naturally scalable. This benefits the future design of point cloud streaming systems.
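The octree part of this abstract can be illustrated with a minimal sketch, assuming integer voxel coordinates and a fixed tree depth. It builds the per-level child-occupancy codes and computes the ideal entropy-coding cost of one node for a given set of predicted occupancy probabilities. The function names and the probability values are hypothetical; the thesis predicts the probabilities with a 3D convolutional model, which is replaced here by hand-set numbers.

```python
import math


def octree_occupancy(points, depth):
    """Return one dict per octree level mapping each occupied node to its
    8-bit child-occupancy code. `points` are integer (x, y, z) voxel
    coordinates in the range [0, 2**depth)."""
    levels = []
    for level in range(depth):
        shift = depth - level - 1  # bits remaining below the child at this level
        codes = {}
        for x, y, z in points:
            parent = (x >> (shift + 1), y >> (shift + 1), z >> (shift + 1))
            child = (((x >> shift) & 1) << 2) | (((y >> shift) & 1) << 1) | ((z >> shift) & 1)
            codes[parent] = codes.get(parent, 0) | (1 << child)
        levels.append(codes)
    return levels


def ideal_bits(occupancy_code, p_occupied):
    """Ideal cost in bits of coding one node's 8 child flags, given the
    predicted probability that each child is occupied."""
    bits = 0.0
    for child in range(8):
        occupied = (occupancy_code >> child) & 1
        p = p_occupied[child] if occupied else 1.0 - p_occupied[child]
        bits += -math.log2(p)
    return bits


levels = octree_occupancy([(0, 0, 0), (3, 3, 3)], depth=2)
root_code = levels[0][(0, 0, 0)]
print(root_code)                         # 129: children 0 and 7 are occupied
print(ideal_bits(root_code, [0.5] * 8))  # 8.0 bits with an uninformative predictor
print(ideal_bits(root_code, [0.9] + [0.1] * 6 + [0.9]))
```

The closer the predicted probabilities are to the true occupancy, the fewer bits the entropy coder needs, which is why a better predictor translates directly into rate savings. Because each level only refines the previous one, decoding can stop at any level and still yield a coarser point cloud.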

Keywords: 360-degree video coding; 360-degree video streaming; point cloud coding; point cloud compression


Point Cloud Geometry Coding with Relational Neighborhood Self-Attention


26th International Workshop on Multimedia Signal Processing

Authors: Ghafari, Mohammadreza; Guarda, Andre F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando

Affiliations: Univ Lisbon Inst Super Tecn Inst Telecomunicacoes Lisbon Portugal; Inst Telecomunicacoes Lisbon Portugal; Politecn Leiria ESTG Inst Telecomunicacoes Lisbon Portugal

ISBN: (print) 9798350387261; 9798350387254

In the ever-evolving landscape of deep learning, attention models have helped boost performance in diverse fields such as computer vision and natural language processing. Following this trend, this paper proposes a novel Relational Neighborhood Self-Attention (RNSA) model, specifically designed for point cloud (PC) geometry coding, to be integrated in the emerging learning-based JPEG PCC standard. The RNSA model proposes three new methods: first, to effectively learn correlations between the points by capturing the relational features and positions of neighboring points; second, to address the inefficiencies of conventional dot product attention, a novel Relational Scoring method is adopted to generate an attention map able to capture both linear and non-linear relationships between points and their neighbors; third, the created attention maps are normalized by Sparsemax instead of Softmax to generate sparse probabilities and assign higher scores to the most important neighbors while marginalizing the less significant ones. Experimental results show that the proposed attention model achieves around 8% gains in both BD-Rate PSNR D1 and PSNR D2 compared to the baseline codec, i.e., JPEG PCC, while adding a small number of model parameters to JPEG PCC.
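The difference between Softmax and Sparsemax normalization mentioned in this abstract can be seen in a small sketch. It implements the standard sort-and-threshold form of Sparsemax (the Euclidean projection of the scores onto the probability simplex) on a plain list of made-up attention scores. How the RNSA model applies it inside its attention layers is not described here, so this shows only the normalization itself.

```python
import math


def softmax(scores):
    """Standard softmax: every output is strictly positive."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Sparsemax: project the scores onto the probability simplex.
    Low scores are mapped to exactly zero."""
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        # The support is the largest k for which this inequality still holds.
        if 1.0 + k * z_k > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


# Hypothetical attention scores for three neighbors of one point.
scores = [2.0, 1.0, 0.1]
print(softmax(scores))
print(sparsemax(scores))  # [1.0, 0.0, 0.0]
```

Softmax always leaves some weight on every neighbor, whereas Sparsemax can assign exactly zero to weak ones, which matches the stated goal of marginalizing the less significant neighbors.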

Keywords: Deep Learning; Self-Attention; point cloud coding; JPEG PCC


LEARNING-BASED RATE CONTROL FOR LEARNING-BASED POINT CLOUD GEOMETRY CODING


30th IEEE International Conference on Image Processing (ICIP)

Authors: Ruivo, Manuel; Guarda, Andre F. R.; Pereira, Fernando

Affiliations: Univ Lisbon Inst Super Tecn Lisbon Portugal; Inst Telecomun Lisbon Portugal

ISBN: (print) 9781728198354

Point clouds represent one of the most versatile 3D visual representation models, as they can provide the user with the six degrees of freedom required for a truly immersive experience. In the last decade, several point cloud coding solutions have been proposed using distinct approaches, notably two MPEG standards, addressing static and dynamic point cloud coding. More recently, learning-based coding approaches have also started to be considered for point cloud coding. The performance of these solutions has been so competitive that JPEG has already decided to develop a point cloud coding standard adopting this novel approach. This paper proposes the first learning-based rate control mechanism to minimize the complexity associated with the selection of appropriate coding parameters for the learning-based point cloud geometry codec adopted as the initial Verification Model for the development of the JPEG Pleno Learning-based Point Cloud Coding standard.

Keywords: point cloud coding; deep learning; rate control


Learning-based Point Cloud Geometry Coding Rate Control


Data Compression Conference (DCC)

Authors: Ruivo, Manuel; Guarda, Andre F. R.; Pereira, Fernando

Affiliations: Univ Lisbon IST Inst Telecomunicacoes Lisbon Portugal; Inst Telecomunicacoes Lisbon Portugal

ISBN: (print) 9798350347951

Multimedia applications have been evolving towards providing users with more immersive and realistic experiences. A common way to model the light available for the users' eyes is the so-called plenoptic function - a powerful 7D representation of light. There are three main types of 3D representation models for the plenoptic function, capable of expressing the light information needed to offer 6-Degrees of Freedom (DoF) experiences, namely light fields, meshes, and point clouds (PCs). This paper focuses on PCs since they allow representing and processing objects directly in the 3D space, facilitating user interaction and navigation in a multitude of application domains. Since the illusion of real surfaces is provided by high-density point sets, a good quality of experience requires a rather large set of points to represent a single PC, thus originating huge amounts of data to be stored and/or transmitted. Consequently, PC coding (PCC) with significant compression levels is a must to reduce the PC data to more manageable sizes and bring PC-based applications to practical deployment. The promising results for image coding led the Joint Photographic Experts Group (JPEG) to launch a standardization project especially targeting Deep Learning (DL)-based PCC, with a final Call for Proposals in January 2022. The best performing response to this call [1] became the JPEG Pleno Learning-based PCC Verification Model (VM), which is the seed codec for the final standard. In this codec, the rate may be controlled through a set of coding parameters, largely depending on the specific PC to code, notably its sparsity and homogeneity. © 2023 IEEE.

Keywords: coding rate control; deep learning based coding; point cloud coding


Point cloud coding: A privileged view driven by a classification taxonomy


SIGNAL PROCESSING-IMAGE COMMUNICATION, 2020, vol. 85, pp. 115862-115862

Authors: Pereira, Fernando; Dricot, Antoine; Ascenso, Joao; Brites, Catarina

Affiliations: Inst Telecomunicacoes Lisbon Portugal; Univ Lisbon Inst Super Tecn Lisbon Portugal

Humans mainly communicate among themselves and with the world around them using light and vision, which implies that visual representation technologies play a central role in human societies. While visual representation has been based on the 2D representation paradigm for many decades, multiple developments are nowadays pressing towards the adoption of more realistic and immersive 3D visual representation models. Point clouds are one of these emerging representation models. However, the huge amount of data involved asks for highly efficient coding solutions, some of which have recently started to be developed by the MPEG and JPEG standardization groups. In this hectic context, this paper proposes a privileged view over the current point cloud coding technologies, driven by a novel, appropriate classification taxonomy. For this purpose, some of the most representative point cloud coding solutions available in the literature will be reviewed to exercise the most relevant classification paths in the proposed taxonomy. It is expected that this type of classification taxonomy and privileged view may help in better understanding the point cloud coding landscape for further solid and consistent advancements in this emerging technical area.

Keywords: point cloud coding; Taxonomy; Geometry; Attributes; Voxel; Octree; Patch; Graph
