ISBN: 9781665436496 (print)
Advances in technology bring changes to many fields, and those changes sometimes raise problems that must be solved or improved. In digital image and video processing, recent progress in coding methods and technologies has greatly improved the viewer experience. As a result, techniques for evaluating image and video quality have become more important. Although experiments with human observers give the most accurate results, the rapid growth in content has made such experiments impractical. Effective, automatic quality assessment metrics are therefore needed to evaluate and optimize advanced compression technologies in terms of perceived visual quality. In this paper, we compare the performance of several new-generation image and video quality assessment metrics.
ISBN: 9781479902880 (print)
One of the most striking properties of natural image statistics is their scale invariance. Intuitively, a natural image contains the same content at different scales and, dually, content of the same scale recurs throughout the scales of the image. Unlike previous work on scale invariance, which decomposes an image into its local band-pass filter components, this paper seeks a general model of the natural image paths distribution to describe scale invariance in the visual world. A novel strategy for high-fidelity image restoration is then presented that characterizes the nonlocal self-similarity of natural images throughout scales in a unified statistical manner. This offers a powerful mechanism for combining the scale invariance and nonlocal self-similarity of natural images to ensure a more reliable and robust estimation. Extensive experiments on image restoration from partial random samples show that the proposed algorithm achieves significant performance improvements over current state-of-the-art schemes.
ISBN: 9781424414369 (print)
The aim of objective image quality assessment is to find an automatic algorithm that evaluates the quality of pictures or video as a human observer would. To reach this goal, researchers try to simulate the human visual system (HVS). Visual attention is a main feature of the HVS, but few studies have used it in image quality assessment. In this work, we investigate the use of visual attention information in the final pooling step of quality metrics. The rationale for this choice is that an artefact is likely to be more annoying in a salient region than in other areas. To shed light on this point, a quality assessment campaign was conducted during which eye movements were recorded. The results show that applying visual attention to image quality assessment is not trivial, even with the ground truth.
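The pooling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' metric: the function name `pooled_quality`, the toy distortion map, and the toy saliency map are all assumptions made for illustration, and the pooling rule shown (a saliency-weighted mean) is only one plausible way to let salient regions count more.

```python
def pooled_quality(distortion_map, saliency_map=None):
    """Pool a per-pixel distortion map into a single score.

    Without a saliency map this is a plain spatial mean.
    With one, each pixel's distortion is weighted by its saliency
    (hypothetical weighting rule, chosen for illustration).
    """
    flat_d = [d for row in distortion_map for d in row]
    if saliency_map is None:
        return sum(flat_d) / len(flat_d)
    flat_s = [s for row in saliency_map for s in row]
    total = sum(flat_s)
    if total == 0:
        # No salient pixel at all: fall back to the plain mean.
        return sum(flat_d) / len(flat_d)
    return sum(d * s for d, s in zip(flat_d, flat_s)) / total


# Toy 2x2 example: a single artefact sits in the most salient pixel.
distortion = [[0.0, 0.0], [0.0, 1.0]]
saliency = [[0.1, 0.1], [0.1, 0.7]]

uniform_score = pooled_quality(distortion)
weighted_score = pooled_quality(distortion, saliency)
```

In this toy case the weighted score is higher than the uniform one because the artefact coincides with the salient region. The abstract's own finding is that such weighting does not trivially improve agreement with human judgments.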
ISBN: 9798331529543; 9798331529550 (print)
Video-based point cloud compression (V-PCC) converts dynamic point cloud data into video sequences and uses traditional video codecs for efficient encoding. However, this lossy compression scheme introduces artifacts that degrade the color attributes of the data. This paper introduces a framework designed to enhance color quality in V-PCC compressed point clouds. We propose the lightweight de-compression Unet (LDC-Unet), a 2D neural network, to optimize the projection maps generated during V-PCC encoding. The optimized 2D maps are then back-projected to 3D space to enhance the corresponding point cloud attributes. Additionally, we introduce a transfer learning strategy and develop a customized natural image dataset for the initial training. The model is then fine-tuned using the projection maps of the compressed point clouds. The whole strategy effectively addresses the scarcity of point cloud training data. Our experiments, conducted on the public 8i voxelized full bodies long sequences (8iVSLF) dataset, demonstrate the effectiveness of the proposed method in improving color quality.
ISBN: 9781424414369 (print)
Querying by Visual Thesaurus (VT) is a novel paradigm for content-based image retrieval: when the starting example is inappropriate, it lets the user compose a query by arranging the visual patches of the starting "page zero" according to a mental image. The desired results can be refined by introducing a spatial description into the retrieval procedure. This paper presents a novel approach to modeling the spatial relations between the visual patches. We define the Weighted Angle Spatial Histogram (WASH), which combines the angular computation between pairs of regions of interest with their respective topological regularity or irregularity. WASH has shown great robustness to region shape and scale in the image because segmented regions are considered as a composition of elementary relevant and minor subregions. We tested our approach on a generic database and compared it with other state-of-the-art techniques.
ISBN: 9798331529543; 9798331529550 (print)
AI-generated images (AGIs) are increasingly utilized across diverse domains due to their ability to quickly produce high-quality visuals. However, assessing the quality of AGIs remains challenging due to their inherent variability and distinctive distortions. To address these challenges, we propose a novel AGI quality assessment method named SIRQA, which enhances feature representation by integrating visual features with textual prompts, effectively measuring the alignment between the generated images and the described content to improve the precision of quality assessment. Specifically, SIRQA employs self-ranking and inter-ranking mechanisms to refine feature representation. The self-ranking mechanism maintains consistency between feature distances and sampling scales, ensuring that features from similar sampling scales are positioned closer together. Additionally, the inter-ranking mechanism sorts the weighted similarity scores between images and prompts to align with the ranking in the label space. Extensive experiments on the AGIQA3K and PKUI2IQA datasets show that SIRQA outperforms eight state-of-the-art algorithms in terms of both Spearman's rank correlation coefficient (SRCC) and Pearson linear correlation coefficient (PLCC).
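For readers unfamiliar with the two evaluation criteria named in this abstract, the following standard-library sketch computes PLCC and SRCC between predicted scores and subjective scores. The sample values in `predicted` and `mos` are invented for illustration and are not taken from the paper. Quality assessment studies often fit a nonlinear mapping before computing PLCC; that step is omitted here.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented example: model predictions vs. mean opinion scores.
predicted = [0.2, 0.4, 0.5, 0.9]
mos = [1.0, 2.0, 2.5, 5.0]

plcc = pearson(predicted, mos)
srcc = spearman(predicted, mos)
```

SRCC depends only on the ordering of the scores, so it measures monotonic agreement, while PLCC measures how well the predictions follow the subjective scores linearly.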
ISBN: 9780769530598 (print)
Post-processing of medical images often requires interpolation. Taking cues from the human visual system, we propose an interpolation kernel consisting of a linear combination of Gaussians at different scales. We compare the efficacy of the proposed kernel with other interpolation kernels, particularly in the processing of medical images. The basic algorithm has been implemented on a TI DM642-based hardware platform for real-time filtering and programmed for post-processing of ultrasound video frames (20 frames/s) from a commercially available Siemens medical ultrasound scanner.
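A kernel built as a linear combination of Gaussians at different scales can be sketched in one dimension as follows. The abstract does not give the scales or mixing weights, so `sigmas`, `weights`, and `radius` below are hypothetical placeholders, and the normalized weighted-average resampling is an assumption rather than the authors' implementation.

```python
import math


def gaussian_mixture_kernel(x, sigmas=(0.5, 1.0), weights=(0.7, 0.3)):
    """Kernel value at offset x: weighted sum of Gaussians at several scales.

    sigmas and weights are illustrative values, not taken from the paper.
    """
    return sum(w * math.exp(-(x * x) / (2 * s * s))
               for w, s in zip(weights, sigmas))


def interpolate(samples, t, radius=3):
    """Estimate a 1-D signal at fractional position t.

    samples are assumed to lie at integer positions 0..len(samples)-1.
    Kernel weights are normalized so a constant signal stays constant.
    """
    lo = max(0, math.floor(t) - radius)
    hi = min(len(samples) - 1, math.ceil(t) + radius)
    numerator = 0.0
    denominator = 0.0
    for i in range(lo, hi + 1):
        k = gaussian_mixture_kernel(t - i)
        numerator += k * samples[i]
        denominator += k
    return numerator / denominator


ramp = [float(i) for i in range(10)]
midpoint_value = interpolate(ramp, 4.5)
```

Because a Gaussian kernel is nonzero at neighboring sample positions, this form smooths slightly rather than passing exactly through the original samples; an actual design would need to choose the scales and weights with that trade-off in mind.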
ISBN: 9781479902880 (print)
Predicting the aesthetic appeal of images is of great interest for a number of applications, from image retrieval to visual quality optimization. In this paper, we report a preliminary study on the relationship between visual attention deployment and aesthetic appeal judgment. In particular, we seek to validate through a scientific approach those simplicity and compositional rules of thumb that have been applied by photographers and modeled by computer vision scientists in computational aesthetics algorithms. Our results provide a confirmation that both simplicity and composition matter for aesthetic appeal of images, and indicate effective ways to compute them directly from the saliency distribution of an image.
ISBN: 9781424448562 (print)
In traditional asymmetric stereo video encoding schemes, one eye is presented with a high-quality sequence and the other with a lower-quality one. However, if the low-quality view is presented to the observer's dominant eye, the masking effect does not work. Based on this characteristic of human vision, this paper proposes a GOP-based resolution cross-switching asymmetric encoding scheme. By allocating degradation to both views in a balanced way over time, the scheme achieves, in our experiments, better compression efficiency than the JMVM reference software and better subjective visual quality than the traditional asymmetric stereo video encoding scheme. Our stereo video coding scheme thus offers a trade-off between compression performance and subjective visual quality.
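The cross-switching idea can be sketched as a per-frame schedule that decides which view is coded at reduced resolution. The abstract only says that degradation is allocated to both views in a balanced way over time, so the strict GOP-by-GOP alternation below, along with the function name and parameters, is an assumption for illustration.

```python
def cross_switch_schedule(num_frames, gop_size):
    """Return, for each frame, which view is coded at reduced resolution.

    Assumed rule: the low-resolution view switches at every GOP boundary,
    starting with the left view. The paper's actual rule may differ.
    """
    schedule = []
    for frame in range(num_frames):
        gop_index = frame // gop_size
        schedule.append("left" if gop_index % 2 == 0 else "right")
    return schedule


# Eight frames with a GOP size of 2: the degraded view alternates every 2 frames.
example_schedule = cross_switch_schedule(8, 2)
```

Under this rule each eye receives the degraded view for about half of the sequence, so neither eye, dominant or not, is shown the low-quality view throughout.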
ISBN: 0819407437 (print)
An audio-graphic teleconferencing system has been developed that uses ordinary personal computers (PCs) interconnected over a basic rate (2B+D) ISDN line. The system supports high-speed transmission of 200-dpi documents read by an optical scanner and presented on the displays of the conference participants. While looking at the same material, the conferees can converse interactively and use an LCD tablet to make handwritten notations on the document for all participants to see. This paper describes the configuration and performance of the system, focusing mainly on the ISDN-based multimedia transmission method and the method of reducing and enlarging binary images.