ISBN (print): 9798331529543; 9798331529550
AI-generated images (AGIs) are increasingly utilized across diverse domains because high-quality visuals can be produced quickly. However, assessing the quality of AGIs remains challenging due to their inherent variability and distinctive distortions. To address these challenges, we propose a novel AGI quality assessment method named SIRQA, which enhances feature representation by integrating visual features with textual prompts, effectively measuring the alignment between the generated images and the described content to improve the precision of quality assessment. Specifically, SIRQA employs self-ranking and inter-ranking mechanisms to refine feature representation. The self-ranking mechanism maintains consistency between feature distances and sampling scales, ensuring that features from similar sampling scales are positioned closer together. Additionally, the inter-ranking mechanism sorts the weighted similarity scores between images and prompts to align with the ranking in the label space. Extensive experiments on the AGIQA3K and PKUI2IQA datasets show that SIRQA outperforms eight state-of-the-art algorithms in terms of both Spearman's rank correlation coefficient (SRCC) and Pearson linear correlation coefficient (PLCC).
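The two evaluation metrics named in this abstract, SRCC and PLCC, are standard and can be computed as in the minimal sketch below. The function names and the sample scores are hypothetical and only illustrate the metrics; nothing here reproduces SIRQA itself or its reported results.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: predicted quality scores vs. mean opinion scores (MOS).
predicted = [0.1, 0.4, 0.5, 0.9]
mos = [1.0, 2.0, 2.5, 5.0]
srcc = spearman(predicted, mos)
plcc = pearson(predicted, mos)
```

SRCC depends only on the ordering of the scores, so a predictor that ranks every image correctly reaches an SRCC of 1 even when its scores are not linearly related to the subjective scores, whereas PLCC also penalizes nonlinearity.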
ISBN (print): 9781479902880
Predicting the aesthetic appeal of images is of great interest for a number of applications, from image retrieval to visual quality optimization. In this paper, we report a preliminary study on the relationship between visual attention deployment and aesthetic appeal judgment. In particular, we seek to validate through a scientific approach those simplicity and compositional rules of thumb that have been applied by photographers and modeled by computer vision scientists in computational aesthetics algorithms. Our results confirm that both simplicity and composition matter for the aesthetic appeal of images, and indicate effective ways to compute them directly from the saliency distribution of an image.
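The abstract does not say how simplicity and composition are computed from the saliency distribution. As a rough illustration only, the sketch below uses two common proxies that are assumptions here, not the authors' measures: the fraction of strongly salient pixels as a simplicity proxy, and the distance from the saliency centroid to the nearest rule-of-thirds intersection as a composition proxy. All function names and the threshold are hypothetical.

```python
import math


def saliency_centroid(saliency):
    """Saliency-weighted centre of mass as (row, col), normalised to [0, 1].

    Assumes a rectangular map of at least 2x2 with a positive total saliency.
    """
    rows, cols = len(saliency), len(saliency[0])
    total = sum(sum(row) for row in saliency)
    r = sum(i * v for i, row in enumerate(saliency) for v in row) / total
    c = sum(j * v for row in saliency for j, v in enumerate(row)) / total
    return r / (rows - 1), c / (cols - 1)


def thirds_distance(saliency):
    """Distance from the saliency centroid to the nearest rule-of-thirds point."""
    r, c = saliency_centroid(saliency)
    points = [(a, b) for a in (1 / 3, 2 / 3) for b in (1 / 3, 2 / 3)]
    return min(math.hypot(r - a, c - b) for a, b in points)


def salient_area_fraction(saliency, threshold=0.5):
    """Share of pixels whose saliency exceeds `threshold` times the maximum.

    A small value means attention is concentrated on a small region.
    """
    peak = max(max(row) for row in saliency)
    flat = [v for row in saliency for v in row]
    return sum(v > threshold * peak for v in flat) / len(flat)


# Hypothetical 4x4 saliency maps: one concentrated, one uniform.
concentrated = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
uniform = [[1, 1, 1, 1] for _ in range(4)]
```

On these toy maps the concentrated one has a small salient area and a centroid sitting on a thirds intersection, while the uniform one has its centroid at the image centre.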
ISBN (print): 9781424448562
In the traditional asymmetric stereo video encoding scheme, one eye's view is represented by a high-quality sequence and the other eye's view by a lower-quality one. However, if the low-quality view is presented to the observer's dominant eye, the masking effect will not work. Based on this human visual characteristic, this paper proposes a GOP-based resolution cross-switching asymmetric encoding scheme. Because degradation is allocated to both views in a balanced way over time, our experimental results show better compression efficiency than the JMVM reference software and better subjective visual quality than the traditional asymmetric stereo video encoding scheme. Our stereo video coding scheme thus offers a trade-off between compression performance and subjective visual quality.
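The idea of balancing degradation across the two views over time can be sketched as a simple schedule. The version below assumes that the reduced-resolution view alternates at every GOP boundary; the abstract does not state the exact switching rule, so this rule, the function names, and the "full"/"reduced" labels are all illustrative.

```python
def view_resolutions(frame_index, gop_size):
    """Return (left, right) resolution labels for a frame.

    Assumption: the view coded at reduced resolution alternates every GOP.
    """
    gop_index = frame_index // gop_size
    if gop_index % 2 == 0:
        return ("full", "reduced")
    return ("reduced", "full")


def reduced_share(num_frames, gop_size):
    """Fraction of frames in which each view is coded at reduced resolution."""
    left = sum(
        view_resolutions(i, gop_size)[0] == "reduced" for i in range(num_frames)
    )
    return left / num_frames, (num_frames - left) / num_frames


# Hypothetical sequence: 32 frames with a GOP size of 8.
shares = reduced_share(32, 8)
```

With an even number of GOPs each view is reduced for half of the frames, so neither eye, dominant or not, receives the low-quality view all the time.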
ISBN (print): 9789612480363
We are living in the Information Age, and information has become a critically important component of our life. Due to the success of the Internet, the amount of available information, including immense volumes of visual information, is growing explosively. Therefore, means for its faultless circulation and handling are urgently required. Considerable research efforts are dedicated today to addressing this necessity, but they are seriously hampered by the lack of a common agreement about "What actually is visual information?" Without answering this question, all our remarkable efforts inevitably end up as plain alchemy. I am trying to find a remedy for this bizarre and absurd situation. I propose my own definition of information (derived from Kolmogorov's complexity theory), and from this point of view, attempt to revise the state of the art of contemporary image processing convention.
ISBN (print): 0819407437
An audio-graphic teleconferencing system has been developed that uses ordinary personal computers (PCs) interconnected over a basic rate (2B+D) ISDN line. The system supports high-speed transmission of 200-dpi resolution documents read by an optical scanner and presented on the displays of the conference participants. While looking at the same material, the conferees can converse interactively and use an LCD tablet to make handwritten notations on the document for all the participants to see. This paper describes the configuration and performance of the system, focusing mainly on the ISDN-based multimedia transmission method and the method of reducing and enlarging binary images.
ISBN (print): 0780386744
Based on the Radon transform and the 2-D Fourier transform, this paper presents a novel digital image-watermarking scheme that is invariant to geometrical attacks such as rotation, scaling and translation (RST). The watermark is embedded into the middle frequency band obtained by taking the Radon transform of a circular area selected from the original unmarked image and then extracting the 2-D Fourier magnitude of this Radon-transformed image. The original unmarked image is not required to extract the watermark. Furthermore, to prevent the watermarked image from degrading due to the inverse Radon transform, the final watermarked image is the sum of the original image and the watermark signal obtained by the inverse Radon transform. Experimental results show that the proposed scheme is robust to RST attacks.
ISBN (print): 9781479961399
Transform coding based on the discrete cosine transform (DCT) has been widely used in image coding standards. However, the coded images often suffer from severe visual distortions such as blocking artifacts. In this paper, we propose a novel patch-based image deblocking method to reduce blocking artifacts. Image patches are clustered and reconstructed by a low-rank approximation that is weighted by the geodesic distance. Experimental results show that the proposed method achieves higher PSNR than state-of-the-art deblocking and denoising methods, and that the processed images present good visual quality.
ISBN (print): 9781479902880
Depth estimation from a single image is an active problem in computer vision and is very important for 2D/3D image conversion. Generally, the depth of an object in the scene varies with the amount of blur in the defocused image, so depth in the scene can be recovered by measuring the blur. In this paper, a new method for depth estimation based on the focus/defocus cue is presented, in which the entropy of the high-frequency subband of a wavelet decomposition is taken as the measure of blur. The proposed method does not require selecting a threshold and can provide a pixel-level depth map. The experimental results show that this method is effective and reliable.
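The blur measure described above can be illustrated with a small sketch. It assumes a one-level Haar wavelet applied to a 1-D row of pixels and a histogram-based Shannon entropy with a fixed bin width; the abstract does not specify the wavelet, the entropy definition, or how the measure is mapped to depth, so these choices and all function names are hypothetical.

```python
import math
from collections import Counter


def haar_detail(signal):
    """One-level Haar high-frequency (detail) coefficients of an even-length signal."""
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal) - 1, 2)
    ]


def shannon_entropy(coefficients, bin_width=1.0):
    """Shannon entropy in bits of the histogram of quantised coefficients."""
    bins = Counter(round(c / bin_width) for c in coefficients)
    n = len(coefficients)
    return -sum((k / n) * math.log2(k / n) for k in bins.values())


def blur_measure(signal, bin_width=1.0):
    """Entropy of the detail subband; lower values suggest a smoother region."""
    return shannon_entropy(haar_detail(signal), bin_width)


# Hypothetical pixel rows: a completely smooth one and a textured one.
smooth = [5, 5, 5, 5, 5, 5, 5, 5]
textured = [0, 10, 3, 3, 20, 0, 7, 1]
```

Blurring suppresses high-frequency content, so the detail coefficients of a defocused region cluster near zero and their histogram entropy drops; in this sketch the smooth row scores lower than the textured one.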
ISBN (print): 9781728185514
In this paper, we propose a novel algorithm for summarization-based image resizing. Previous methods required detecting the precise locations of repeating patterns before the pattern removal step in resizing. However, it is difficult to find repeating patterns that are illuminated under different lighting conditions and viewed from different perspectives. To solve this problem, we first identify the regularity unit of repeating patterns by statistics. We then use the regularity unit for shift-map optimization to obtain a better resized image. The experimental results show that our method is competitive with other well-known methods.
ISBN (print): 0819452114
We previously presented a method for selective image sharpness enhancement. The method is based on a simultaneous nonlinear reaction-diffusion time evolution equipped with a nonlinear diffusion term, a reaction term and an isotropic peaking term, and it sharpens only degraded edges blurred by various causes without increasing the visibility of nuisance factors such as random noise. This paper applies our simultaneous nonlinear reaction-diffusion method to the removal of image blur caused by image motion. Motion blur is not only shift-variant but also anisotropic. To adapt our simultaneous nonlinear reaction-diffusion method to motion de-blurring, we replace the isotropic Laplacian operator, included in the isotropic peaking term of our prototypal method, with an anisotropic operator that takes into account the direction of the estimated image motion. Preliminary experiments using artificially generated test images show that our method achieves excellent motion de-blurring.