ISBN: 9781728185514 (print)
With the development of stereoscopic imaging technology, stereoscopic image quality assessment (SIQA) has become increasingly important, and designing a method consistent with human visual perception remains challenging because of the complex relationship between the binocular views. In this article, we first build a convolutional neural network (CNN) based on the visual pathway of the human visual system (HVS), which simulates different parts of the pathway such as the optic chiasm, the lateral geniculate nucleus (LGN), and the visual cortex. Second, the two pathways of our method simulate the 'what' and 'where' visual pathways respectively and are endowed with different feature extraction capabilities. Finally, we find a different application for 3D convolution, employing it to fuse the information from the left and right views rather than only to extract temporal features in video. The experimental results show that the proposed method agrees more closely with subjective scores and generalizes well.
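The idea of using a 3D convolution to fuse the two views, rather than to span time, can be sketched as follows. This is a minimal illustration and not the authors' network: the function name `fuse_views_3d`, the single-channel feature maps, and the depth-2 kernel are all assumptions made for the example.

```python
def fuse_views_3d(left, right, kernel):
    """Valid 3D cross-correlation over the depth-2 stack [left, right].

    left, right: H x W lists of floats (one feature map per view).
    kernel: 2 x k x k list of weights (hypothetical shape). Its depth of 2
    spans both views, so every output value mixes a left-view and a
    right-view neighbourhood.
    """
    volume = [left, right]          # depth axis = view index, not time
    h, w = len(left), len(left[0])
    k = len(kernel[0])
    fused = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            acc = 0.0
            for d in range(2):              # across the two views
                for di in range(k):         # vertical offset
                    for dj in range(k):     # horizontal offset
                        acc += kernel[d][di][dj] * volume[d][i + di][j + dj]
            row.append(acc)
        fused.append(row)
    return fused
```

With a 1 x 1 spatial kernel of weights +1 (left) and -1 (right), the output reduces to a per-pixel difference between the views, which shows how the depth axis carries binocular rather than temporal information.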
ISBN: 9781665475921 (print)
To speed up image classification, which conventionally takes reconstructed images as input, compressed-domain methods use the compressed images directly, without decompression. This typically costs some accuracy. Our goal in this paper is to raise the accuracy of compressed-domain classification using the compressed images output by NN-based image compression networks. First, we design a hybrid objective loss function that includes the reconstruction loss of a deep feature map. Second, an image reconstruction layer is integrated into the image classification network to up-sample the compressed representation. These methods substantially increase compressed-domain classification accuracy and add no extra computational complexity. Experimental results on the ImageNet benchmark show that our design outperforms the latest work, ResNet-41, by a large margin of about 4.49% in top-1 classification accuracy. In addition, the accuracy gap to the method using reconstructed images is reduced to 0.47%. Moreover, our classification network has the lowest computational complexity and model complexity.
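A hybrid objective of this kind can be sketched as a classification term plus a weighted feature-reconstruction term. The abstract does not give the exact form, so the choices below are assumptions: cross-entropy for classification, mean squared error between deep feature vectors, and a hypothetical weight `lam`.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def feature_mse(pred, target):
    """Mean squared error between two flattened deep feature maps."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def hybrid_loss(logits, label, feat_pred, feat_target, lam=0.1):
    """Classification loss plus lam times feature reconstruction loss.

    lam is a hypothetical balancing weight, not a value from the paper.
    """
    return cross_entropy(logits, label) + lam * feature_mse(feat_pred, feat_target)
```

Because the reconstruction term is only evaluated during training, a loss of this shape leaves inference cost unchanged, which is consistent with the claim that no extra computational complexity is needed.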
ISBN: 9781479961399 (print)
Transform coding based on the discrete cosine transform (DCT) has been widely used in image coding standards. However, the coded images often suffer from severe visual distortions such as blocking artifacts. In this paper, we propose a novel patch-based image deblocking method to reduce blocking artifacts. Image patches are clustered and reconstructed by a low-rank approximation that is weighted by the geodesic distance. Experimental results show that the proposed method achieves higher PSNR than state-of-the-art deblocking and denoising methods, and that the processed images have good visual quality.
ISBN: 9781728185514 (print)
Existing cross-component video coding technologies have shown great potential for improving coding efficiency. The fundamental insight of cross-component coding is to exploit the statistical correlations among different color components. In this paper, a Cross-Component Sample Offset (CCSO) approach for image and video coding is proposed, inspired by the observation that the luma component tends to contain more texture while the chroma components are relatively smoother. The key component of CCSO is a nonlinear offset mapping mechanism implemented as a look-up table (LUT). The input of the mapping is the co-located reconstructed samples of the luma component, and the output is the offset values applied to the chroma component. The proposed method has been implemented on top of a recent version of libaom. Experimental results show that the proposed approach brings a 1.16% Random Access (RA) BD-rate saving on top of AV1 with a marginal increase in encoding and decoding time.
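The LUT-based mapping from co-located luma samples to a chroma offset can be sketched as below. This is an illustration under stated assumptions, not the libaom implementation: the two horizontal luma neighbours, the three-level quantization with a hypothetical `threshold`, the 9-entry LUT, and 4:4:4 sampling (so luma and chroma share coordinates) are all choices made for the example.

```python
def quantize(delta, threshold):
    """Map a luma difference to one of three levels: -1, 0, or 1."""
    if delta > threshold:
        return 1
    if delta < -threshold:
        return -1
    return 0


def ccso_offset(luma, y, x, lut, threshold=8):
    """Look up a chroma offset from the luma samples around (y, x).

    Assumed inputs: the left and right luma neighbours relative to the
    co-located sample. lut holds 3 * 3 = 9 offsets (hypothetical layout).
    """
    center = luma[y][x]
    d0 = quantize(luma[y][x - 1] - center, threshold)
    d1 = quantize(luma[y][x + 1] - center, threshold)
    return lut[(d0 + 1) * 3 + (d1 + 1)]


def apply_ccso(chroma, luma, lut, threshold=8, max_val=255):
    """Add the looked-up offset to each interior chroma sample, with clipping.

    Border columns are left unchanged to keep the sketch short.
    """
    out = [row[:] for row in chroma]
    for y in range(len(chroma)):
        for x in range(1, len(chroma[0]) - 1):
            value = chroma[y][x] + ccso_offset(luma, y, x, lut, threshold)
            out[y][x] = min(max(value, 0), max_val)
    return out
```

The mapping is nonlinear because the offset depends only on which quantized pattern the luma neighbourhood falls into, not on a linear function of the luma values.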
ISBN: 9781728185514 (print)
Recently, pre-processed video transcoding has attracted wide attention and has been increasingly used in practical applications to improve the perceptual experience and save transmission resources. However, very few works have evaluated the performance of pre-processing methods. In this paper, we select source (SRC) videos and various pre-processing approaches to construct the first Pre-processed and Transcoded Video Database (PTVD). We then conduct a subjective experiment, which shows that, compared with video sent directly to the codec at the same bitrate, appropriate pre-processing methods do improve perceptual quality. Finally, existing image/video quality metrics are evaluated on our database. The results indicate that the performance of existing image/video quality assessment (IQA/VQA) approaches remains to be improved. We will make our database publicly available soon.
ISBN: 9781665475921 (print)
Learned image compression (LIC) has shown superior compression ability. Quantization is an unavoidable stage that generates the quantized latent for entropy coding. To solve the non-differentiability of quantization in the training phase, many differentiable approximate quantization methods have been proposed. However, the derivative of the quantized latent with respect to the non-quantized latent is set to one in most previous methods. As a result, the quantization error between the non-quantized and quantized latents is not taken into account during gradient descent. To address this issue, we exploit a gradient scaling method that scales the gradient of the non-quantized latent during back-propagation. The experimental results show that our method outperforms recent LIC quantization methods.
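The difference between a unit derivative and a scaled gradient can be sketched as follows. The abstract does not state the scaling rule, so the factor `1 + alpha * error` used here is purely hypothetical, as are the function names and the parameter `alpha`; the sketch only shows where the quantization error could enter the backward pass.

```python
import math


def quantize_forward(y):
    """Round each latent value to the nearest integer (halves round up)."""
    return [float(math.floor(v + 0.5)) for v in y]


def ste_backward(grad_out):
    """Straight-through style backward pass: derivative fixed to one."""
    return list(grad_out)


def scaled_backward(y, grad_out, alpha=1.0):
    """Backward pass that scales each gradient by the quantization error.

    The scale 1 + alpha * |y - round(y)| is a hypothetical choice for
    illustration; the paper's actual scaling function is not given here.
    """
    grads = []
    for v, g in zip(y, grad_out):
        error = abs(v - math.floor(v + 0.5))   # lies in [0, 0.5]
        grads.append(g * (1.0 + alpha * error))
    return grads
```

With a unit derivative, a latent at 0.2 and a latent at exactly 1.0 receive gradients that ignore how far each is from its quantized value; the scaled version makes that distance visible to the optimizer.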
ISBN: 9781467373142 (print)
In this paper, we show that many conventional features used in computational image quality assessment (IQA) methods can hardly characterize perceived distortions across various image characteristics and distortion types, resulting in relatively low prediction performance for visual quality scores. To solve this problem, we propose a new IQA method, called the Structural Contrast-Quality Index (SC-QI), which is based on the structural contrast index (SCI) as a very effective feature. SCI can adaptively quantify perceived distortions depending on various image characteristics and distortion types. In addition to SCI, other perceptually important features that reflect the effects of the contrast sensitivity function and chrominance component variation are also combined into the proposed SC-QI. Our comprehensive experiments on three large IQA datasets verify that the proposed SC-QI outperforms state-of-the-art methods at lower computational complexity.
ISBN: 9781665475921 (print)
Tire pattern image classification is an important computer vision problem in public security, as it can help police investigate criminal cases. It remains challenging because of the small diversity among different classes. Generally, a tire pattern image classification system requires two characteristics: high accuracy and low computation. In this paper, we first assume that capturing rich feature representations will benefit tire classification and that learning through a lightweight network will improve computing efficiency. We then propose a simple yet efficient two-stage training mechanism: 1) we learn a feature extractor using a Variational Auto-Encoder framework constrained by contrastive learning, projecting images into a latent space with rich feature representations; 2) we train a single-layer linear classification network on the features extracted by the previously trained encoder. The Top-1 and Top-5 accuracies on the tire pattern dataset are 89.8% and 96.6% respectively, validating the effectiveness of our strategy.
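The Top-1 and Top-5 figures follow the usual top-k definition: a sample counts as correct when its true class is among the k highest-scoring classes. A minimal sketch of that computation, with a hypothetical function name and toy scores rather than the paper's data:

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k best-scored classes.

    scores: one list of per-class scores per sample.
    labels: the true class index of each sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

In the two-stage setup described above, `scores` would be the outputs of the single-layer linear classifier applied to features from the frozen encoder.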
ISBN: 9781728185514 (print)
With the rapid growth of deep learning in computer vision, the integration of deep learning with traditional video coding has brought significant improvements, especially when a super-resolution neural network is applied as the post-processing module in a down-sampling-based video compression framework. However, the pre-processing module lacks back-propagated gradients for jointly optimizing down-sampling and up-sampling because the traditional video codec is non-differentiable. In this paper, we propose an end-to-end down-sampling-based video compression framework that applies convolutional neural networks for both down-sampling and up-sampling. We use a virtual codec neural network to approximate the actual video codec so that the gradient can be effectively back-propagated for joint training. Experimental results show the superiority of our proposed framework over predefined down-sampling-based video compression and over various joint training methods.
ISBN: 9781728180687 (print)
Images captured in low-light conditions suffer from low visibility. Enhancing the visibility of low-light images has broad application in various computer vision tasks. Based on the classical Retinex model, previous methods treat the reflectance component as a well-exposed image. In this paper, we introduce blurring distortion into the Retinex model to cover more general and challenging scenarios. We further propose a two-stage framework that extracts the reflectance image and removes the blurring distortion separately. Specifically, we optimize the whole network by embedding a mechanism that is robust to pixel misalignment in the training dataset. The experimental results show that our proposed method achieves promising results.
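For reference, the classical Retinex model writes the observed image as the element-wise product of reflectance and illumination. One way a blur term might enter is shown in the second line; that form, and the symbols $K$ and $\tilde{R}$, are assumptions for illustration, since the abstract does not give the exact formulation.

```latex
% Classical Retinex: observed image S, reflectance R, illumination L
S = R \circ L
% Hypothetical extension: the extracted reflectance \tilde{R} is a blurred
% version of the sharp reflectance R, with blur kernel K (assumed notation)
S = \tilde{R} \circ L, \qquad \tilde{R} = K \ast R
```

Under this reading, the first stage would estimate $\tilde{R}$ from $S$, and the second stage would recover $R$ from $\tilde{R}$ by removing the blur.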