ISBN: (Print) 9781728180687
The amount of volumetric brain image data is increasing rapidly and demands vast resources for storage and transmission, so an efficient volumetric compression method is urgently needed. Recent years have witnessed progress in deep learning-based approaches for two-dimensional (2D) natural image compression, but learned volumetric image compression remains unexplored. In this paper, we propose the first end-to-end learning framework for volumetric image compression by extending advanced 2D image compression techniques to volumetric images. Specifically, a convolutional autoencoder compresses 3D image cubes, and non-local attention models are embedded in the autoencoder to jointly capture local and global correlations. Both hyperprior and autoregressive models are used for conditional probability estimation in entropy coding. To reduce model complexity, we introduce a convolutional long short-term memory network for the autoregressive model based on channel-wise prediction. Experimental results on volumetric mouse brain images show that the proposed method outperforms JPEG2000-3D, HEVC and state-of-the-art 2D methods.
ISBN: (Print) 9781479902880
The key task in image set compression is to efficiently remove redundancy both among images and within a single image. In this paper, we propose the first multi-model prediction (MoP) method for image set compression to significantly reduce inter-image redundancy. Unlike previous prediction methods, our MoP enhances the correlation between images using feature-based geometric multi-model fitting. Based on the estimated geometric models, multiple deformed prediction images are generated to reduce geometric distortions in different image regions. Block-based adaptive motion compensation is then adopted to further eliminate local variances. Experimental results demonstrate the advantage of our approach, especially for images with complicated scenes and geometric relationships.
ISBN: (Print) 9781728180687
As more and more personal photos are shared and tagged on social media, security and privacy protection are receiving unprecedented attention. Avoiding privacy risks such as unintended verification becomes increasingly challenging. To enable people to enjoy uploading photos without having to consider these privacy concerns, it is crucial to study techniques that allow individuals to limit the identity information leaked in visual data. In this paper, we propose a novel two-stage hybrid model that generates visually pleasing de-identified face images from a single input. Meanwhile, we preserve visual similarity with the original face to retain data usability. Our approach combines the latest advances in GAN-based face generation with well-designed adjustable randomness. In our experiments, our method produces visually pleasing de-identified output while preserving high similarity to the original image content. Moreover, our method adapts well to verifiers of unknown structure, which further improves its practical value.
ISBN: (Print) 9781728180687
Recently, plenoptic images have attracted great attention because of their applications in various scenarios. However, their high resolution and special pixel distribution structure pose huge challenges for storage and transmission. To adapt compression to the structural characteristics of plenoptic images, we propose a Data Structure Adaptive 3D-convolutional (DSA-3D) autoencoder. The DSA-3D autoencoder enables up-sampling and down-sampling of the sub-aperture sequence along the angular or spatial resolution, thereby avoiding the artifacts caused by directly compressing the plenoptic image and achieving better compression efficiency. In addition, we propose a special and efficient Square rearrangement to generate the sub-aperture sequence. We compare the Square and Zigzag sub-aperture sequence rearrangements and analyze the compression efficiency of block image compression and whole image compression. Compared with the traditional hybrid encoders HEVC, JPEG2000 and JPEG PLENO (WaSP), the proposed DSA-3D (Square) autoencoder achieves superior performance in terms of PSNR.
ISBN: (Print) 0819444111
In this paper, we present a novel design of a wavelet-based video coding algorithm within a conventional hybrid framework of temporal motion-compensated prediction and transform coding. Our proposed algorithm incorporates multi-frame motion compensation as an effective means of improving the quality of the temporal prediction. In addition, we follow the rate-distortion optimizing strategy of using a Lagrangian cost function to discriminate between different decisions in the video encoding process. Finally, we demonstrate that context-based adaptive arithmetic coding is a key element for fast adaptation and high coding efficiency. The combination of overlapped block motion compensation and frame-based transform coding enables blocking-artifact-free and hence subjectively more pleasing video. In comparison with a highly optimized MPEG-4 Advanced Simple Profile coder, our proposed scheme provides significant performance gains in objective quality of 2.0-3.5 dB PSNR.
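The Lagrangian decision rule mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common form J = D + λR, where D is distortion and R is rate in bits. The candidate names, distortion values, and bit counts below are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # hypothetical label for one encoding decision
    distortion: float  # D, e.g. sum of squared errors for the block
    rate_bits: float   # R, estimated number of bits to code the block


def lagrangian_cost(c: Candidate, lam: float) -> float:
    """Return J = D + lambda * R for one candidate."""
    return c.distortion + lam * c.rate_bits


def choose(candidates, lam: float) -> Candidate:
    """Pick the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda c: lagrangian_cost(c, lam))


# Illustrative numbers only (assumed, not measured).
candidates = [
    Candidate("skip", distortion=900.0, rate_bits=1.0),
    Candidate("inter_1ref", distortion=400.0, rate_bits=40.0),
    Candidate("inter_multi_ref", distortion=250.0, rate_bits=75.0),
]

for lam in (1.0, 10.0, 100.0):
    print(lam, choose(candidates, lam).name)
```

A small λ favors the low-distortion, high-rate choice, and a large λ favors the cheap one, which is how a single multiplier trades quality against bitrate across all encoder decisions.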
Authors: Yang, J; Delp, EJ
Purdue Univ, Sch Elect & Comp Engn, VIPER Video & Image Proc Lab, W Lafayette, IN 47907 USA
ISBN: (Print) 0819444111
A synchronization scheme that specifies the start position of each macroblock in a compressed video bitstream for low data rate wireless applications is proposed by extending the Error Resilient Entropy Coding (EREC) method. Our scheme is implemented as a transcoder, placed before and after the channel, for an MPEG-4 Simple Profile bitstream. We compare our technique with the error resilient tools of H.263(+) in terms of reconstructed video quality (measured in PSNR) and the number of extra redundancy bits incurred by the transcoder. A simple syntax-based codeword repair method is also proposed so that the transcoder generates an MPEG-4 compliant bitstream, which can then be decoded with a standard MPEG-4 decoder such as MoMuSys (FDIS V1.0).
ISBN: (Print) 9781479961399
Since human vision has much greater resolution at the center of the visual field than elsewhere, different quality assessment criteria should be applied to image areas with different visual resolutions. This paper proposes a foveation-based image quality assessment method that adopts different window sizes when assessing a single image. Visual salience models, which estimate visual attention regions, are used to determine the foveation center, and foveation resolution models guide the selection of window sizes across the spatial extent of the image. Finally, the quality scores obtained with different window sizes are pooled into a single value for the image. The proposed method has been applied to the IQA metrics SSIM, PSNR, and UQI. The results show that both Spearman and Kendall correlation coefficients can be improved significantly by our foveation-based method.
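The Spearman coefficient used above to judge an IQA metric measures how well the metric's ranking of images agrees with the ranking given by human opinion scores. The following sketch computes it from scratch as the Pearson correlation of average ranks. The score lists are made-up illustrative values, not data from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical objective scores and subjective mean opinion scores.
objective = [0.91, 0.85, 0.60, 0.72]
subjective = [4.5, 4.1, 2.0, 3.2]
print(spearman(objective, subjective))
```

Because only ranks matter, a metric can score well here even when its raw values are on a different scale from the subjective scores.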
ISBN: (Print) 9781479902880
Depth image upsampling is an important issue in three-dimensional (3D) applications. However, edge blurring remains a challenging problem in depth image upsampling, resulting in jagged artifacts in synthesized views that are visually unpleasant. In this paper, an edge-preserving single depth image interpolation (ESDI) method is proposed. Specifically, a local planar hypothesis (LPH), which assumes that depth in natural scenes is clustered into local planes, is first explored. Then finite candidates generation (FCG) is proposed to generate a limited set of discrete values that satisfy the LPH for the interpolated pixels. Finally, the optimal combination of candidates is formulated as an energy minimization problem with a constraint in the gradient domain, solved by the iterated conditional modes (ICM) algorithm. Experiments demonstrate that ESDI produces high-resolution (HR) depth images with clear and sharp edges, and synthesized views of desirable quality.
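To make the ICM step concrete, here is a toy sketch of iterated conditional modes on a 1D signal where each pixel may take only one of a finite set of candidate values. The energy used here (a squared data term plus a weighted absolute difference between neighbors) is an assumption chosen for illustration; it is not the gradient-domain energy of ESDI, and the candidate sets are hypothetical.

```python
def total_energy(labels, observed, weight):
    """Data term plus weighted smoothness term over neighboring pixels."""
    data = sum((l - o) ** 2 for l, o in zip(labels, observed))
    smooth = sum(abs(a - b) for a, b in zip(labels, labels[1:]))
    return data + weight * smooth


def icm(observed, candidates, weight=30.0, max_sweeps=20):
    """Greedy coordinate-wise minimization over finite candidate sets."""
    # Start from the candidate nearest to each observed value.
    labels = [min(c, key=lambda v: abs(v - o))
              for c, o in zip(candidates, observed)]
    n = len(labels)

    def local_energy(i, value):
        # Energy terms that depend on pixel i, with its neighbors held fixed.
        e = (value - observed[i]) ** 2
        if i > 0:
            e += weight * abs(value - labels[i - 1])
        if i < n - 1:
            e += weight * abs(value - labels[i + 1])
        return e

    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            best = min(candidates[i], key=lambda v: local_energy(i, v))
            if local_energy(i, best) < local_energy(i, labels[i]):
                labels[i] = best
                changed = True
        if not changed:  # converged to a local minimum
            break
    return labels


observed = [10, 12, 48, 11, 10]               # toy values with one outlier
candidates = [[10, 50] for _ in observed]     # hypothetical candidate sets
result = icm(observed, candidates)
print(result, total_energy(result, observed, 30.0))
```

Each update can only lower the total energy, so the sweeps terminate, but ICM reaches a local minimum that depends on the initialization rather than a guaranteed global optimum.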
ISBN: (Print) 9798331529543; 9798331529550
With the rapid development of video-on-demand (VOD) and real-time streaming video technologies, the accurate objective assessment of streaming video Quality of Experience (QoE) has become a focal point for optimizing streaming-related technologies. However, evaluating streaming video QoE presents numerous challenges because of the inherent transmission distortions caused by poor Quality of Service (QoS) conditions, such as intermittent stalling, rebuffering, and drastic changes in video sharpness due to bitrate fluctuations. This paper introduces a large and diverse in-the-wild streaming video QoE evaluation dataset, the SJLIVE-1k dataset. This work addresses the limitations of existing datasets, which lack in-the-wild video sequences under real network conditions and contain insufficient video content. Furthermore, we propose an end-to-end objective QoE evaluation strategy that extracts video content and QoS features from the video itself without using any extra information. By implementing self-supervised contrastive learning as the "reminder" to bridge the gap between the different types of features, our approach achieves state-of-the-art results across three datasets. Our proposed dataset will be released to facilitate further research.
ISBN: (Print) 9781665475921
Sea fog recognition is a challenging and significant semantic segmentation task in remote sensing images. The fully supervised learning method relies on pixel-level labels, which are labor-intensive and time-consuming to produce. Moreover, it is impossible to accurately annotate all pixels of the sea fog region due to the limited ability of the human eye to distinguish between low clouds and sea fog. In this paper, we propose a novel approach of point-based annotation for weakly supervised semantic segmentation with auxiliary information from International Comprehensive Ocean-Atmosphere Data Set (ICOADS) visibility data. It needs only several definite points for both foreground and background, which significantly reduces the manual annotation cost. We conduct extensive experiments on Himawari-8 satellite remote sensing images to demonstrate the effectiveness of our annotation method. The mean intersection over union (mIoU) and overall recognition accuracy of our annotation method reach 82.72% and 95.18%, respectively. Compared with the fully supervised learning method, the accuracy and the recognition rate of sea fog area are improved with a maximum increase of 7.69% and 9.69%, respectively.
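The two reported measures, mIoU and overall accuracy, can both be derived from a confusion matrix. The sketch below assumes the standard definitions (per-class IoU = TP / (TP + FP + FN), averaged over classes) and uses a tiny made-up label set with class 0 as background and class 1 as sea fog; it does not reproduce the paper's evaluation protocol.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """m[t][p] counts pixels with true class t predicted as class p."""
    m = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m


def mean_iou(m):
    """Average of per-class IoU over classes that appear at all."""
    ious = []
    for c in range(len(m)):
        tp = m[c][c]
        fn = sum(m[c]) - tp
        fp = sum(row[c] for row in m) - tp
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix is empty")
    return sum(ious) / len(ious)


def overall_accuracy(m):
    """Fraction of all pixels that were classified correctly."""
    total = sum(sum(row) for row in m)
    return sum(m[c][c] for c in range(len(m))) / total


# Hypothetical flattened masks: 0 = background, 1 = sea fog.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
m = confusion_matrix(y_true, y_pred, num_classes=2)
print(mean_iou(m), overall_accuracy(m))
```

Overall accuracy can stay high when one class dominates the image, which is why segmentation results usually report mIoU alongside it.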