ISBN: 9798331529543; 9798331529550 (print)
Video-based point cloud compression (V-PCC) converts dynamic point cloud data into video sequences so that traditional video codecs can encode it efficiently. However, this lossy compression scheme introduces artifacts that degrade the color attributes of the data. This paper introduces a framework designed to enhance the color quality of V-PCC-compressed point clouds. We propose the lightweight de-compression Unet (LDC-Unet), a 2D neural network, to optimize the projection maps generated during V-PCC encoding. The optimized 2D maps are then back-projected to 3D space to enhance the corresponding point cloud attributes. Additionally, we introduce a transfer learning strategy and develop a customized natural image dataset for the initial training. The model is then fine-tuned using the projection maps of the compressed point clouds. This strategy effectively addresses the scarcity of point cloud training data. Our experiments, conducted on the public 8i voxelized full bodies long sequences (8iVSLF) dataset, demonstrate the effectiveness of the proposed method in improving color quality.
ISBN: 9781479902880 (print)
One of the most striking properties of natural image statistics is their scale invariance. Intuitively, a natural image contains the same content at different scales and, dually, content of the same scale recurs across the scales of the image. Unlike previous scale-invariance work, which decomposes an image into its local band-pass filter components, this paper seeks a general model of the natural image patch distribution to describe scale invariance in the visual world. Building on this model, a novel strategy for high-fidelity image restoration is presented that characterizes the nonlocal self-similarity of natural images across scales in a unified statistical manner. Combining scale invariance with nonlocal self-similarity in this way yields a more reliable and robust estimate. Extensive experiments on image restoration from partial random samples show that the proposed algorithm achieves significant performance improvements over current state-of-the-art schemes.
ISBN: 9798331529543; 9798331529550 (print)
AI-generated images (AGIs) are increasingly utilized across diverse domains due to their ability to quickly produce high-quality visuals. However, assessing the quality of AGIs remains challenging due to their inherent variability and distinctive distortions. To address these challenges, we propose a novel AGI quality assessment method named SIRQA, which enhances feature representation by integrating visual features with textual prompts, effectively measuring the alignment between the generated images and the described content to improve the precision of quality assessment. Specifically, SIRQA employs self-ranking and inter-ranking mechanisms to refine feature representation. The self-ranking mechanism maintains consistency between feature distances and sampling scales, ensuring that features from similar sampling scales are positioned closer together. Additionally, the inter-ranking mechanism sorts the weighted similarity scores between images and prompts to align with the ranking in the label space. Extensive experiments on the AGIQA3K and PKUI2IQA datasets show that SIRQA outperforms eight state-of-the-art algorithms in terms of both Spearman's rank correlation coefficient (SRCC) and Pearson linear correlation coefficient (PLCC).
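For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how SRCC and PLCC are commonly computed between predicted quality scores and subjective scores. It is not taken from the paper: the function names and the sample values `predicted` and `mos` are illustrative assumptions, and quality-assessment studies often apply a nonlinear mapping to the predictions before computing PLCC, which is omitted here.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical model outputs and subjective scores, for illustration only.
predicted = [0.1, 0.4, 0.35, 0.8]
mos = [1.0, 2.0, 3.0, 4.0]
srcc = spearman(predicted, mos)
plcc = pearson(predicted, mos)
```

SRCC depends only on the ordering of the scores, so it measures monotonic agreement, while PLCC measures how close the relationship is to linear.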
ISBN: 9781479902880 (print)
Predicting the aesthetic appeal of images is of great interest for a number of applications, from image retrieval to visual quality optimization. In this paper, we report a preliminary study on the relationship between visual attention deployment and aesthetic appeal judgment. In particular, we seek to validate experimentally the simplicity and compositional rules of thumb that have been applied by photographers and modeled by computer vision scientists in computational aesthetics algorithms. Our results confirm that both simplicity and composition matter for the aesthetic appeal of images, and indicate effective ways to compute them directly from the saliency distribution of an image.
ISBN: 9781424448562 (print)
In traditional asymmetric stereo video encoding schemes, the view for one eye is represented by a high-quality sequence and the view for the other eye by a lower-quality one. However, if the low-quality view is presented to the observer's dominant eye, the masking effect does not work. Based on this characteristic of human vision, this paper proposes a GOP-based resolution cross-switching asymmetric encoding scheme. By allocating degradation to both views in a balanced way over time, the scheme achieves, in our experiments, better compression efficiency than the JMVM reference software and better subjective visual quality than the traditional asymmetric stereo video encoding scheme. It therefore offers a trade-off between compression performance and subjective visual quality.
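The idea of switching the degraded view per group of pictures (GOP) can be sketched as a simple schedule. This is only an illustration under an assumed pattern: the abstract does not state the switching rule, so strict alternation on every GOP, the function name `low_res_view_schedule`, and the labels `"L"` and `"R"` are all hypothetical.

```python
def low_res_view_schedule(num_frames, gop_size):
    """Return, per frame, which view ("L" or "R") is coded at reduced resolution.

    Assumption: the reduced-resolution view alternates on every GOP boundary,
    so each eye receives the degraded view for about half of the sequence.
    """
    schedule = []
    for frame in range(num_frames):
        gop_index = frame // gop_size
        schedule.append("R" if gop_index % 2 == 0 else "L")
    return schedule

# Six frames with a GOP size of two: R, R, L, L, R, R.
example = low_res_view_schedule(6, 2)
```

Under this assumed rule, neither eye is given the low-resolution view permanently, which is the balancing property the abstract describes.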
ISBN: 9789612480363 (print)
We are living in the Information Age, and information has become a critically important component of our life. Due to the success of the Internet, the amount of available information, including immense volumes of visual information, is growing explosively. Therefore, means for its faultless circulation and handling are urgently required. Considerable research efforts are dedicated today to addressing this necessity, but they are seriously hampered by the lack of a common agreement about "What actually is visual information?" Without answering this question, all our remarkable efforts inevitably end up as plain alchemy. I am trying to find a remedy for this bizarre and absurd situation. I propose my own definition of information (derived from Kolmogorov's complexity theory) and, from this point of view, attempt to revise the state of the art of contemporary image processing convention.
ISBN: 0819407437 (print)
An audio-graphic teleconferencing system has been developed that uses ordinary personal computers (PCs) interconnected over a basic rate (2B+D) ISDN line. The system supports high-speed transmission of 200-dpi resolution documents read by an optical scanner and presented on the displays of the conference participants. While looking at the same material, the conferees can converse interactively and, via an LCD tablet, make handwritten notations on the document for all the participants to see. This paper describes the configuration and performance of the system, focusing mainly on the ISDN-based multimedia transmission method and the method of reducing and enlarging binary images.
ISBN: 9781479961399 (print)
Transform coding based on the discrete cosine transform (DCT) has been widely used in image coding standards. However, the coded images often suffer from severe visual distortions such as blocking artifacts. In this paper, we propose a novel patch-based image deblocking method to reduce blocking artifacts. Image patches are clustered and reconstructed by a low-rank approximation that is weighted by the geodesic distance. Experimental results show that the proposed method achieves higher PSNR than state-of-the-art deblocking and denoising methods and that the processed images present good visual quality.
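The low-rank reconstruction step can be illustrated with a truncated singular value decomposition of a matrix whose rows are the vectorized patches of one cluster. This is a minimal, unweighted sketch: the geodesic-distance weighting and the clustering procedure are not specified in the abstract and are left out, and the function name `low_rank_approx` and the choice of `rank` are assumptions.

```python
import numpy as np

def low_rank_approx(patches, rank):
    """Best rank-`rank` approximation (Frobenius norm) of a patch matrix.

    patches: array of shape (num_patches, patch_dim), one vectorized patch per row.
    Assumption: plain truncated SVD, with no per-patch weighting.
    """
    u, s, vt = np.linalg.svd(patches, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

# Illustrative data: near-identical patches plus a small perturbation.
rng = np.random.default_rng(0)
clean = np.outer(np.ones(8), rng.random(16))        # rank-1 cluster
noisy = clean + 0.01 * rng.standard_normal(clean.shape)
restored = low_rank_approx(noisy, rank=1)
```

Because similar patches lie close to a low-dimensional subspace, discarding the small singular values suppresses the components that differ from patch to patch, which is where much of the coding distortion tends to sit.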
ISBN: 9781479902880 (print)
Depth estimation from a single image is an active problem in computer vision and is very important for 2D-to-3D image conversion. Generally, the amount of blur in a defocused image varies with the depth of the object in the scene, so scene depth can be recovered by measuring the blur. In this paper, a new method for depth estimation based on the focus/defocus cue is presented, in which the entropy of the high-frequency subband of a wavelet decomposition is used as the measure of blur. The proposed method requires no threshold selection and provides a pixel-level depth map. The experimental results show that the method is effective and reliable.
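A blur measure of this kind can be sketched as the Shannon entropy of the coefficients in a high-frequency wavelet subband. The abstract does not name the wavelet, the subband, or the entropy estimator, so the one-level Haar transform, the diagonal (HH) subband, the 16-bin histogram, and the function names below are all assumptions; the mapping from the measure to depth is not shown.

```python
import numpy as np

def haar_hh(block):
    """Diagonal (HH) detail subband of a one-level 2D Haar transform.

    block: 2D float array with even height and width.
    """
    a = block[0::2, 0::2]
    b = block[0::2, 1::2]
    c = block[1::2, 0::2]
    d = block[1::2, 1::2]
    return (a - b - c + d) / 2.0

def subband_entropy(coeffs, bins=16):
    """Shannon entropy (bits) of the histogram of coefficient magnitudes."""
    hist, _ = np.histogram(np.abs(coeffs), bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

# Illustrative blocks: a random texture versus a perfectly flat region.
rng = np.random.default_rng(0)
textured = rng.random((32, 32))
flat = np.full((32, 32), 0.5)
e_textured = subband_entropy(haar_hh(textured))
e_flat = subband_entropy(haar_hh(flat))
```

With this estimator, a flat block produces all-zero detail coefficients and therefore zero entropy, while a textured block spreads its coefficients over many histogram bins. Evaluating the measure in a window around each pixel would give a per-pixel value without any threshold.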
ISBN: 9781728185514 (print)
In this paper, we propose a novel algorithm for summarization-based image resizing. Previous approaches required the precise locations of repeating patterns to be detected before the pattern removal step in resizing. However, it is difficult to find repeating patterns that are illuminated under different lighting conditions and viewed from different perspectives. To solve this problem, we first identify the regularity unit of the repeating patterns statistically. We then use the regularity unit in shift-map optimization to obtain a better resized image. The experimental results show that our method is competitive with other well-known methods.