ISBN: 9781728185514 (print)
In video coding, compressing high-frequency components, including noise and visually imperceptible content, is a persistently hard problem: such components consume a large amount of bandwidth while providing limited quality improvement. Directly applying denoising methods degrades coding performance and is therefore unsuitable for the video coding scenario. In this work, we propose a video pre-processing approach that uses an edge-preserving filter designed specifically for video coding, whose parameters are optimized for rate-distortion (R-D) performance. The proposed pre-processing removes components with low R-D cost-effectiveness before encoding while keeping important structural components, leading to higher coding efficiency and better subjective quality. Compared with conventional denoising filters, the proposed method with the R-D optimized edge-preserving filter improves coding efficiency by up to -5.2% BD-rate (that is, a bitrate saving of up to 5.2% at equal quality) with low computational complexity.
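The BD-rate figure quoted above is a Bjøntegaard delta rate: the average bitrate difference between two rate-distortion curves at equal quality. The abstract does not say how it was computed, so the following is only a minimal sketch of the commonly used cubic-fit formulation; the function name `bd_rate` and the sample rate/PSNR points are illustrative assumptions, not values from the paper.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of the test codec versus the anchor at equal PSNR.

    Negative values mean the test configuration needs fewer bits.
    Each curve needs at least four (rate, PSNR) points for the cubic fit.
    """
    log_rate_anchor = np.log(np.asarray(rate_anchor, dtype=float))
    log_rate_test = np.log(np.asarray(rate_test, dtype=float))
    psnr_anchor = np.asarray(psnr_anchor, dtype=float)
    psnr_test = np.asarray(psnr_test, dtype=float)

    # Fit log-rate as a cubic polynomial of PSNR for each curve.
    poly_anchor = np.polyfit(psnr_anchor, log_rate_anchor, 3)
    poly_test = np.polyfit(psnr_test, log_rate_test, 3)

    # Integrate both fits over the PSNR interval the two curves share.
    low = max(psnr_anchor.min(), psnr_test.min())
    high = min(psnr_anchor.max(), psnr_test.max())
    integral_anchor = np.polyval(np.polyint(poly_anchor), high) - np.polyval(np.polyint(poly_anchor), low)
    integral_test = np.polyval(np.polyint(poly_test), high) - np.polyval(np.polyint(poly_test), low)

    # Mean log-rate difference, converted back to a percentage.
    mean_log_diff = (integral_test - integral_anchor) / (high - low)
    return (np.exp(mean_log_diff) - 1.0) * 100.0

# Hypothetical operating points (kbit/s, dB); the test curve uses 10% fewer bits everywhere.
anchor_rate = np.array([1000.0, 2000.0, 4000.0, 8000.0])
psnr = np.array([32.0, 35.0, 38.0, 41.0])
saving = bd_rate(anchor_rate, psnr, 0.9 * anchor_rate, psnr)
print(f"BD-rate: {saving:.2f}%")
```

With this sign convention, a pre-processing filter that lets the encoder reach the same PSNR with fewer bits yields a negative BD-rate, which matches how the result above is reported.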
ISBN: 9781479902880 (print)
Predicting the aesthetic appeal of images is of great interest for a number of applications, from image retrieval to visual quality optimization. In this paper, we report a preliminary study on the relationship between visual attention deployment and aesthetic appeal judgment. In particular, we seek to validate through a scientific approach those simplicity and compositional rules of thumb that have been applied by photographers and modeled by computer vision scientists in computational aesthetics algorithms. Our results provide a confirmation that both simplicity and composition matter for aesthetic appeal of images, and indicate effective ways to compute them directly from the saliency distribution of an image.
ISBN: 9781728185514 (print)
Non-Lambertian objects have an appearance that depends on the viewer's position relative to the surrounding scene. Unlike diffuse objects, their features move non-linearly with the camera, which prevents rendering them with existing Depth Image-Based Rendering (DIBR) approaches or triangulating their surface with Structure-from-Motion (SfM). In this paper, we propose an extension of the DIBR paradigm that describes these non-linearities by replacing the depth maps with more complete multi-channel "non-Lambertian maps", without attempting a 3D reconstruction of the scene. We study the importance of each coefficient of the proposed map, measuring the trade-off between visual quality and data volume in order to render non-Lambertian objects optimally. We compare our method with other state-of-the-art image-based rendering methods and outperform them, with promising subjective and objective results on a challenging dataset.
ISBN: 9781728180687 (print)
Car counting on drone-based images is a challenging task in computer vision. Most advanced counting methods are based on density maps. Usually, density maps are first generated by convolving ground-truth point maps with a Gaussian kernel for later model learning (generation). Then, the counting network learns to predict density maps from input images (estimation). Most studies focus on the estimation problem while overlooking the generation problem. In this paper, we propose a training framework that learns to generate density maps and trains the generation and estimation subnetworks jointly. Experiments demonstrate that our method outperforms other density-map-based methods and achieves the best performance on drone-based car counting.
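The conventional generation step described above (a Gaussian placed at each annotated point) can be sketched as follows. This illustrates only the fixed-kernel baseline, not the learned generation subnetwork the paper proposes; the function name, the `sigma` value and the sample points are assumptions for illustration. Each Gaussian is renormalised over the image so that the map integrates to the object count even near the borders.

```python
import numpy as np

def gaussian_density_map(points, shape, sigma=4.0):
    """Fixed-kernel density map: one unit-mass Gaussian per annotated (row, col) point."""
    height, width = shape
    rows, cols = np.mgrid[0:height, 0:width]
    density = np.zeros(shape, dtype=np.float64)
    for point_row, point_col in points:
        squared_dist = (rows - point_row) ** 2 + (cols - point_col) ** 2
        blob = np.exp(-squared_dist / (2.0 * sigma ** 2))
        # Renormalise so every annotated car contributes exactly 1 to the total.
        density += blob / blob.sum()
    return density

# Three hypothetical car annotations on a 64x64 image.
points = [(10, 12), (30, 40), (31, 42)]
density = gaussian_density_map(points, (64, 64))
estimated_count = density.sum()  # summing the map recovers the number of cars
```

A counting network trained on such maps predicts `density` from the image, and the count is read off as the sum of the predicted map.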
ISBN: 9781467373142 (print)
In this paper, we propose a novel interpolation algorithm that is adapted to the human visual system (HVS) and suitable for real-time image upscaling. The defined statistical features are first computed in a local window of the low-resolution (LR) image. Then the most correlated neighbours of a missing pixel in the high-resolution (HR) image are adaptively selected based on local structural analysis of prominent edges and fine textures. Finally, the unknown pixel values are estimated through a directional clustering model (DCM) that incorporates HVS information into the weighting coefficients. Extensive experimental results show that the proposed image interpolation method accurately reconstructs the structures of the HR image at arbitrary magnification factors and effectively suppresses jaggy and ringing artifacts with low computational complexity.
ISBN: 9781728185514 (print)
The exponential increase of digital data and the limited capacity of current storage devices have made clear the need to explore new storage solutions. Thanks to its biological properties, DNA has proven to be a potential candidate for this task, allowing information to be stored at high density for hundreds or even thousands of years. With the release of nanopore sequencing technologies, DNA data storage is one step closer to becoming a reality. Many works have proposed simulators for this sequencing step, aiming to ease the development of algorithms that process nanopore-sequenced reads. However, these simulators target the sequencing of complete genomes, whose characteristics differ from those of synthetic DNA. This work presents a nanopore sequencing simulator targeting synthetic DNA in the context of DNA data storage.
ISBN: 9781728180687 (print)
In this paper, we propose an optimized model based on the visual attention mechanism (VAM) for no-reference stereoscopic image quality assessment (SIQA). A CNN model is designed around a dual attention mechanism (DAM), which comprises a channel attention mechanism and a spatial attention mechanism. The channel attention mechanism gives high weight to features that contribute strongly to the final quality score and low weight to features that contribute little. The spatial attention mechanism operates within a feature map, assigning different weights to different areas according to the importance of each region. In addition, a data selection strategy is designed for the CNN model. Following the VAM, visual saliency guides data selection, and a certain proportion of salient patches is used to fine-tune the network. The same operation is performed on the test set, which removes data redundancy and improves performance. Experimental results on two public databases show that the proposed model is superior to state-of-the-art SIQA methods. Cross-database validation shows that the model generalizes well and remains effective.
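The abstract does not give the layer-level design of the dual attention mechanism, so the following is only a rough sketch of the two ideas on a single feature tensor, assuming a squeeze-and-excitation style channel gate and a parameter-free pooling stand-in for the spatial gate. All names, shapes and the random weights are hypothetical; a real model would learn these weights and would typically use a convolution for the spatial mask.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, w_reduce, w_expand):
    """Reweight channels of a (C, H, W) tensor; returns the scaled tensor and per-channel weights."""
    squeezed = features.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)    # bottleneck layer with ReLU
    weights = sigmoid(w_expand @ hidden)             # one weight in (0, 1) per channel
    return features * weights[:, None, None], weights

def spatial_attention(features):
    """Reweight locations of a (C, H, W) tensor; returns the scaled tensor and an (H, W) mask."""
    # Assumption: average of channel-wise mean and max stands in for a learned convolution.
    pooled = 0.5 * (features.mean(axis=0) + features.max(axis=0))
    mask = sigmoid(pooled)                           # one weight in (0, 1) per location
    return features * mask[None, :, :], mask

# Hypothetical feature tensor and untrained weights, for shape checking only.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 16, 16))
w_reduce = rng.standard_normal((2, 8))
w_expand = rng.standard_normal((8, 2))

channel_scaled, channel_weights = channel_attention(features, w_reduce, w_expand)
refined, spatial_mask = spatial_attention(channel_scaled)
```

Applying the channel gate first and the spatial gate second is one common ordering; the abstract does not state which ordering the authors use.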
ISBN: 9781479902880 (print)
Growing societal awareness of privacy and security is pushing the development of signal processing techniques in the encrypted domain. Data compression in the encrypted domain has attracted much attention in recent years because it avoids leaking the data source during compression. This paper proposes an improved block-by-block compression scheme for encrypted images with a flexible compression ratio. The original image is encrypted by permuting the blocks of the image and then permuting the pixels within the blocks. During compression, randomly chosen pixels are used as reference information, and the remaining pixels are compressed by a coset code. At the decoder side, side information (SI), generated by combining the correlation among blocks with image restoration from partial random samples (IRPRS), is used to assist decompression. Moreover, an adaptive method for selecting the system parameters is given. The experimental results show that the proposed method achieves a better reconstruction than the earlier method.
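The permutation-based encryption step described above (permute blocks, then permute pixels within blocks) can be sketched as follows. This is a minimal illustration under assumptions the abstract does not confirm: a grayscale image whose sides are multiples of the block size, a single pixel permutation shared by all blocks, and a seeded NumPy generator standing in for the secret key. The function names are hypothetical.

```python
import numpy as np

def to_blocks(image, block):
    """Split an (H, W) image into rows of flattened block x block tiles."""
    height, width = image.shape
    tiles = image.reshape(height // block, block, width // block, block).swapaxes(1, 2)
    return tiles.reshape(-1, block * block)

def from_blocks(blocks, shape, block):
    """Inverse of to_blocks."""
    height, width = shape
    tiles = blocks.reshape(height // block, width // block, block, block).swapaxes(1, 2)
    return tiles.reshape(height, width)

def encrypt(image, block, seed):
    rng = np.random.default_rng(seed)                # seed plays the role of the key
    blocks = to_blocks(image, block)
    block_perm = rng.permutation(blocks.shape[0])    # reorder the blocks
    pixel_perm = rng.permutation(blocks.shape[1])    # reorder pixels inside each block
    return from_blocks(blocks[block_perm][:, pixel_perm], image.shape, block)

def decrypt(cipher, block, seed):
    rng = np.random.default_rng(seed)                # regenerate the same permutations
    blocks = to_blocks(cipher, block)
    block_perm = rng.permutation(blocks.shape[0])
    pixel_perm = rng.permutation(blocks.shape[1])
    unpermuted_pixels = np.empty_like(blocks)
    unpermuted_pixels[:, pixel_perm] = blocks        # undo the pixel permutation
    restored = np.empty_like(blocks)
    restored[block_perm] = unpermuted_pixels         # undo the block permutation
    return from_blocks(restored, cipher.shape, block)

# Hypothetical 8x8 test image with 4x4 blocks.
image = np.arange(64, dtype=np.uint8).reshape(8, 8)
cipher = encrypt(image, block=4, seed=7)
recovered = decrypt(cipher, block=4, seed=7)
```

Because the encryption only rearranges pixels, the cipher image keeps the original histogram, which is what allows a compressor to treat some pixels as reference samples without knowing the key.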
ISBN: 9781728185514 (print)
Stereo image super-resolution (SR) has made great progress in recent years. However, existing methods have two major problems: the parallax correction is insufficient, and cross-view information fusion occurs only at the beginning of the network. To address these problems, we propose a two-stage parallax correction and multi-stage cross-view fusion network for better stereo image SR results. Specifically, the two-stage parallax correction module consists of horizontal parallax correction and refined parallax correction. The first stage corrects horizontal parallax by parallax attention. The second stage uses deformable convolution to refine the horizontal parallax and correct the vertical parallax simultaneously. Then, multiple cascaded enhanced residual spatial feature transform blocks are developed to fuse cross-view information at multiple stages. Extensive experiments show that our method achieves state-of-the-art performance on the KITTI2012, KITTI2015, Middlebury and Flickr1024 datasets.
ISBN: 9781728180687 (print)
The design of stereo image quality assessment (SIQA) methods is often not well grounded in the biological theory of human vision, so many SIQA methods fail to achieve good consistency with subjective perception. Research on the visual system tends to focus on the dorsal and ventral pathways and ignores the information asymmetry in the early visual pathways. Notably, the ON and OFF receptive fields of retinal ganglion cells (RGCs) respond asymmetrically to the statistical features of images. Inspired by this, we propose an SIQA method based on monocular and binocular visual features that takes into account the asymmetry between local bright and dark contrast features in the early visual pathways. First, the response maps of ON and OFF cells in RGCs are extracted for the left and right views respectively. Then, the different information fusion modes of the visual cortex are used to fuse the response maps of the left and right views. Finally, monocular and binocular features are extracted and fed to support vector regression (SVR) for quality regression. Experimental results show that the proposed method is superior to several mainstream SIQA metrics on two publicly available databases.