ISBN: 9781665475921 (print)
In this paper, we propose a high-frequency guided CNN for video compression artifact reduction. In the proposed method, the high-frequency component of the Y channel is extracted and used to guide the quality enhancement of all Y, U, and V channels, because the high-frequency component contains the edge and contour information of the objects in the image, which is of vital importance to both subjective and objective quality. The proposed method consists of two modules: the high-frequency guidance module and the quality enhancement module. The high-frequency guidance module uses multiple octave convolutions to extract the high-frequency component of the Y channel and then fuses it into the features of the Y, U, and V channels. In the quality enhancement module, multiple CNN residual blocks are used for the quality enhancement of the Y, U, and V channels. The proposed method was integrated into both HM-16.22 and VTM-16.0. The results on the JVET test sequences under the All Intra configuration show the effectiveness of the proposed method. Compared with HEVC, the proposed method achieves average BD-rate changes of -12.3%, -22.7%, and -23.5% for the Y, U, and V channels, respectively. Compared with VVC, the corresponding average BD-rate changes are -6.7%, -12.3%, and -13.2%.
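To make "high-frequency component" concrete, here is a minimal sketch that treats it as the residual left after low-pass filtering the Y channel. This is an assumption for illustration only: the abstract uses learned octave convolutions, whereas this sketch swaps in a fixed 3x3 box blur, and the names `box_blur` and `high_frequency` are hypothetical.

```python
def box_blur(img):
    """3x3 mean filter with edge replication; img is a list of rows of floats."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the image border
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def high_frequency(img):
    """High-frequency residual: the input minus its low-pass (blurred) version."""
    low = box_blur(img)
    return [[img[y][x] - low[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]


# A toy Y channel with a vertical step edge between columns 1 and 2.
y_channel = [[0.0, 0.0, 100.0, 100.0] for _ in range(4)]
hf = high_frequency(y_channel)
```

The residual is zero in flat areas and non-zero only next to the step, which is the edge and contour information the abstract refers to.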
ISBN: 9781479961399 (print)
This paper presents the development of a fast Free-Viewpoint Video (FVV) rendering algorithm that exploits the parallelism offered by General Purpose Graphics Processing Units (GPGPUs). The system generates virtual views using Depth Image-Based Rendering (DIBR) algorithms, implemented with the NVIDIA Compute Unified Device Architecture (CUDA). A novel reference image brightness adjustment algorithm is also discussed; it exploits the correspondences between matching pixels in the reference images to avoid drastic brightness switching while navigating between views. The developed solution ensures that data transfers are kept to a minimum, thus improving the overall rendering speed. Objective and subjective test results show that, for typical free-view scenarios, the proposed algorithm can be successfully deployed in real-time FVV systems, providing a good Quality of Experience (QoE).
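The core DIBR step can be sketched as a forward warp of one scanline: each reference pixel is shifted by a disparity derived from its depth, and a z-buffer keeps the nearest surface when two pixels land on the same target. This is a sequential Python illustration under assumed conventions (rectified cameras, disparity = focal length x baseline / depth); the function name and parameters are hypothetical, and it does not reproduce the paper's CUDA implementation.

```python
def warp_scanline(colors, depths, focal_px, baseline):
    """Forward-warp one scanline to a virtual view shifted along the baseline.

    Returns the warped colors; positions that receive no pixel stay None
    (disocclusion holes that a later stage would have to fill).
    """
    width = len(colors)
    out = [None] * width
    zbuf = [float("inf")] * width
    for x in range(width):
        disparity = focal_px * baseline / depths[x]  # nearer pixels move further
        tx = int(round(x - disparity))
        if 0 <= tx < width and depths[x] < zbuf[tx]:
            zbuf[tx] = depths[x]  # keep the surface closest to the camera
            out[tx] = colors[x]
    return out


warped = warp_scanline([10, 20, 30, 40], [2.0, 2.0, 1.0, 1.0],
                       focal_px=2.0, baseline=1.0)
# warped == [30, 40, None, None]
```

Each iteration of the loop depends only on its own pixel apart from the z-buffer update, which is why this kind of warp lends itself to one-thread-per-pixel GPU execution.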
ISBN: 0819450235 (print)
This paper presents a new approach to color image denoising that takes a human visual system (HVS) model into account. The denoising process takes place in the wavelet transform domain. A contrast sensitivity function (CSF) implementation is incorporated into the wavelet-based algorithm, applying an invariant single-factor weighting per subband followed by noise masking. Experimental results show that the new approach performs well in terms of perceptual error metrics and visual quality.
ISBN: 9781479902880 (print)
This paper presents a motion-based depth estimation algorithm for automatic 2D-to-3D video conversion that employs the co-occurrence matrix of motion vectors (MVCM). Video scenes possess distinct MVCM signatures, which makes it possible to exploit the corresponding motion-depth relation for depth generation. The subsequent motion-compensated depth updating scheme provides stable and comfortable 3D visual quality when views are synthesized by depth-image-based rendering. Simulation results on several high-definition image sequences indicate that the proposed algorithm produces better and more reasonable depth than two other motion-based depth estimation algorithms. With the adaptive depth estimation scheme using MVCM, the proposed 2D-to-3D video conversion algorithm can accommodate a great variety of visual content. It thus provides an efficient and reliable solution to the problem of automatic 3D video content creation.
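As a rough illustration of a motion-vector co-occurrence matrix, the sketch below quantizes each block's motion magnitude into a few levels and counts how often level a is followed by level b in horizontally adjacent blocks. The quantization rule, the neighbour relation, and the names `quantize_mv` and `mv_cooccurrence` are assumptions; the abstract does not specify how the MVCM is built.

```python
def quantize_mv(mv, step=2, levels=4):
    """Map a motion vector (dx, dy) to a magnitude level in [0, levels - 1]."""
    magnitude = (mv[0] ** 2 + mv[1] ** 2) ** 0.5
    return min(int(magnitude // step), levels - 1)


def mv_cooccurrence(mv_field, step=2, levels=4):
    """Count level pairs (a, b) over horizontally adjacent blocks."""
    labels = [[quantize_mv(mv, step, levels) for mv in row] for row in mv_field]
    matrix = [[0] * levels for _ in range(levels)]
    for row in labels:
        for a, b in zip(row, row[1:]):
            matrix[a][b] += 1
    return matrix


# Static background on the left, a moving object on the right.
field = [
    [(0, 0), (0, 0), (3, 4)],
    [(0, 0), (3, 4), (3, 4)],
]
mvcm = mv_cooccurrence(field)
```

Mass on the diagonal indicates regions of uniform motion, while off-diagonal entries such as `mvcm[0][2]` mark boundaries between static and moving areas; a scene-level signature of this kind is what a motion-to-depth rule could be keyed on.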
ISBN: 0819452114 (print)
Wyner-Ziv coding refers to lossy source coding with side information at the decoder. Recently, some practical applications of Wyner-Ziv coding to video compression have been studied (1-4) because of its error-robustness advantage over conventional video coding standards. Based on recent theoretical results on successive Wyner-Ziv coding (5,6), we propose in this paper a practical layered Wyner-Ziv video codec using the DCT, a nested scalar quantizer (NSQ), and irregular LDPC-code-based Slepian-Wolf coding (that is, lossless source coding with side information). The DCT is applied as an approximation to the conditional KLT (7,8), which makes the components of the transformed block conditionally independent given the side information. NSQ is a binning scheme that facilitates layered bit-plane coding of the bin indices while reducing the bit rate. LDPC-code-based Slepian-Wolf coding exploits the correlation between the quantized version of the source and the side information to achieve further compression. Unlike previous work, an attractive feature of our proposed system is that video encoding is done only once, while decoding is allowed at many lower bit rates without quality loss.
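The binning idea behind a nested scalar quantizer can be sketched in a few lines: the encoder quantizes uniformly and sends only the index modulo the number of bins, and the decoder resolves the ambiguity by picking the candidate closest to its side information. This is a textbook-style sketch under assumed parameters (`step`, `num_bins`), not the codec described in the abstract, and it omits the DCT and LDPC stages.

```python
def nsq_encode(x, step, num_bins):
    """Quantize x uniformly, then keep only the index modulo num_bins."""
    index = round(x / step)
    return index % num_bins


def nsq_decode(bin_index, side_info, step, num_bins):
    """Pick the index in the received bin whose value is closest to side_info."""
    base = round(side_info / step)
    k = round((base - bin_index) / num_bins)
    candidates = [bin_index + (k + d) * num_bins for d in (-1, 0, 1)]
    best = min(candidates, key=lambda i: abs(i * step - side_info))
    return best * step


bin_index = nsq_encode(10.3, step=1.0, num_bins=4)       # index 10 -> bin 2
good = nsq_decode(bin_index, side_info=9.6, step=1.0, num_bins=4)
bad = nsq_decode(bin_index, side_info=13.0, step=1.0, num_bins=4)
# good == 10.0 (side information close to the source)
# bad == 14.0  (side information too far away: wrong member of the bin)
```

Sending 2 bits for the bin instead of the full index is where the rate saving comes from; the second decode shows the price, which is that decoding only succeeds when the side information is well correlated with the source.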
ISBN: 9781479961399 (print)
Increasing the spatial resolution is an ongoing research topic in image processing. A recently presented approach applies a non-regular sampling mask to a low-resolution sensor and subsequently reconstructs the masked area via an extrapolation algorithm to obtain a high-resolution image. This paper introduces an acceleration of this approach for use with full-color sensors. Instead of applying the effective, yet computationally expensive, extrapolation algorithm to each of the three RGB channels, a color space conversion is performed and only the luminance channel is reconstructed with this algorithm. As natural images contain much less information in the chrominance channels, a fast linear interpolation technique can be used for them to accelerate the whole reconstruction procedure. Simulation results show that an average speed-up factor of 2.9 is achieved, while the loss in visual quality remains imperceptible. Comparisons of PSNR results confirm this.
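The two cheap ingredients of this pipeline can be sketched as follows: a luminance/chrominance conversion and a linear interpolation that fills missing chroma samples. The abstract does not name the color space, so the BT.601-derived full-range (JFIF-style) coefficients below are an assumption, as are the function names; the expensive luminance extrapolation is deliberately left out.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601-style conversion (assumed; not stated in the abstract)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def linear_fill(samples):
    """Fill None entries of a 1-D signal by linear interpolation.

    Gaps at either end copy the nearest known sample. Assumes at least one
    known sample is present.
    """
    known = [i for i, v in enumerate(samples) if v is not None]
    out = list(samples)
    for i, v in enumerate(samples):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = samples[right]
        elif right is None:
            out[i] = samples[left]
        else:
            t = (i - left) / (right - left)
            out[i] = (1 - t) * samples[left] + t * samples[right]
    return out


chroma_row = linear_fill([10.0, None, 30.0, None])
# chroma_row == [10.0, 20.0, 30.0, 30.0]
```

Only one of the three channels then goes through the costly reconstruction, which is consistent with a speed-up that approaches, but does not reach, a factor of 3.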
ISBN: 0819452114 (print)
Object-oriented coding in the MPEG-4 standard enables the separate processing of foreground objects and the scene background (sprite). Since the background sprite only has to be sent once, transmission bandwidth can be saved. This paper shows that merging several views of a non-changing scene background into a single background sprite is usually not the most efficient way to transmit the background image. We have found that the counter-intuitive approach of splitting the background into several independent parts can reduce the overall amount of data. For this reason, we propose an algorithm that provides an optimal partitioning of a video sequence into independent background sprites (a multi-sprite), resulting in a significant reduction of the coding cost involved. Additionally, our algorithm produces background sprites of better quality by ensuring that the sprite resolution is at least the final display resolution throughout the sequence. Even though our sprite generation algorithm creates multiple sprites instead of a single background sprite, it is fully compatible with the existing MPEG-4 standard. The algorithm has been evaluated with several test sequences, including the well-known Table-tennis and Stefan sequences. The total coding cost could be reduced by factors of about 2.7 or even higher.
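One way to picture an "optimal partitioning" of a sequence into sprites is a dynamic program over contiguous frame ranges. The sketch below assumes that each sprite covers a contiguous range of frames and that a caller-supplied `segment_cost(i, j)` gives the coding cost of one sprite for frames i to j-1; both are assumptions for illustration, and the abstract does not state that the paper's algorithm works this way.

```python
def best_partition(num_frames, segment_cost):
    """Minimum-cost split of frames [0, num_frames) into contiguous segments."""
    inf = float("inf")
    best = [0.0] + [inf] * num_frames   # best[j]: cheapest cost of frames [0, j)
    cut = [0] * (num_frames + 1)        # cut[j]: start of the last segment
    for j in range(1, num_frames + 1):
        for i in range(j):
            cost = best[i] + segment_cost(i, j)
            if cost < best[j]:
                best[j], cut[j] = cost, i
    segments, j = [], num_frames
    while j > 0:                        # walk the cuts back to frame 0
        segments.append((cut[j], j))
        j = cut[j]
    return best[num_frames], segments[::-1]


def toy_cost(i, j):
    """Hypothetical cost: fixed overhead plus growth with the range covered."""
    return 10 + (j - i) ** 2


total, parts = best_partition(6, toy_cost)
# total == 38, parts == [(0, 3), (3, 6)]; a single sprite would cost 46
```

With a cost that grows faster than linearly in the range covered, as sprite area can under camera pan and zoom, two sprites beat one despite the extra per-sprite overhead, which mirrors the counter-intuitive result reported above.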
ISBN: 9781509064946 (print)
In this work, a novel bit allocation method based on visual attention and distortion sensitivity is developed for JPEG2000. Although a visual attention map for an image can be estimated using well-known saliency map methods, a true visual attention map can be obtained by conducting experiments to determine fixation points and their durations. A perception model can then turn these fixation durations into visual attention levels. Besides visual attention, visual distortion sensitivity may guide the bit allocation process effectively, because the human visual system is more sensitive to distortion around edges than to distortion in complex textured areas. In this work, a novel visual distortion sensitivity method is proposed that considers all edges without applying a threshold to the gradient magnitude and uses the local entropy of the gradient orientation distribution. The visual attention and distortion sensitivity levels of each code-block thus determine its quantization parameters. First, bit allocation based on the visual attention map yields higher subjective evaluation scores than bit allocation based on post-compression rate-distortion optimization or on a previously proposed saliency-map-based method. Second, it is shown that the use of visual distortion sensitivity allows higher objective evaluation scores to be attained.
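The entropy of the gradient orientation distribution can be sketched as below: every pixel contributes to an orientation histogram in proportion to its gradient magnitude (so no magnitude threshold is needed), and the entropy of that histogram separates clean edges from complex texture. The central-difference gradient, the 8-bin histogram, and the function name are assumptions; the abstract gives no formula.

```python
import math


def orientation_entropy(block, bins=8):
    """Entropy (bits) of the magnitude-weighted gradient orientation histogram."""
    h, w = len(block), len(block[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = block[y][x + 1] - block[y][x - 1]
            gy = block[y + 1][x] - block[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue  # orientation is undefined where there is no gradient
            angle = math.atan2(gy, gx) % math.pi  # orientation in [0, pi)
            hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    total = sum(hist)
    if total == 0.0:
        return 0.0
    return -sum(p / total * math.log2(p / total) for p in hist if p > 0)


edge_block = [[0, 0, 50, 100, 100] for _ in range(5)]             # one clean edge
texture_block = [[x * x + y * y for x in range(5)] for y in range(5)]  # mixed directions

edge_entropy = orientation_entropy(edge_block)
texture_entropy = orientation_entropy(texture_block)
```

A block with one dominant orientation gives an entropy near zero, marking it as distortion-sensitive, while a block whose gradients point in many directions gives a higher entropy and could tolerate coarser quantization.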
ISBN: 9798350388978; 9798350388961 (print)
In computer vision applications, image enhancement is important for improving image quality and extracting meaningful information. Noise removal is a commonly used image enhancement technique. In this study, the Batch Renormalization Denoising Network (BRDNet), which performs well in noise removal, is used as the base model and is augmented with the Bottleneck Attention Module (BAM) to improve performance. The proposed method is tested on different datasets at different noise levels, and the results are compared. In quantitative experiments, an increase in PSNR was observed, and the visual results were found to be closer to the target images.
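For reference, the PSNR metric used in the quantitative comparison can be computed as follows. This is the standard definition for images with a known peak value (255 is assumed here for 8-bit data); the function name is illustrative.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [[100, 100], [100, 100]]
denoised = [[110, 90], [100, 100]]
score = psnr(clean, denoised)
```

A higher value means the denoised output is numerically closer to the target image, so an increase in PSNR after adding an attention module indicates a smaller mean squared error.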
ISBN: 9781424448562 (print)
Disparity estimation is an important technique in stereo video coding. This paper presents a disparity estimation algorithm based on edge detection. The algorithm exploits a characteristic of human vision, namely that the human eye is more sensitive to distortion in edge regions. Therefore, joint estimation is used for edge detection. Large code block sizes are used for coding the background and flat regions, while small block sizes are used for coding the edge regions. Compared with the disparity estimation algorithm proposed in [8], the proposed algorithm can greatly improve the encoding speed of stereo video without affecting subjective image quality.
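The edge-dependent block size choice can be sketched as a simple rule: a block that contains a strong intensity step is coded with small blocks, anything else with large blocks. The neighbour-difference edge test, the threshold of 20, and the sizes 16 and 4 are all hypothetical values chosen for illustration; the abstract does not give the paper's edge detector or block sizes.

```python
def has_edge(block, threshold=20):
    """True if any horizontally or vertically adjacent pixels differ strongly."""
    h, w = len(block), len(block[0])
    for y in range(h):
        for x in range(w):
            if x + 1 < w and abs(block[y][x + 1] - block[y][x]) > threshold:
                return True
            if y + 1 < h and abs(block[y + 1][x] - block[y][x]) > threshold:
                return True
    return False


def choose_block_size(block, large=16, small=4, threshold=20):
    """Small blocks for edge regions, large blocks for flat or background areas."""
    return small if has_edge(block, threshold) else large


flat_region = [[80, 82, 81, 80] for _ in range(4)]
edge_region = [[80, 80, 200, 200] for _ in range(4)]
sizes = (choose_block_size(flat_region), choose_block_size(edge_region))
# sizes == (16, 4)
```

Spending the finer search only where an edge is present is what allows encoding time to drop while the regions viewers are most sensitive to keep their accuracy.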