ISBN: 0819452114 (print)
In this paper, we develop a tree-structured predictive partial matching (PPM) scheme for progressive compression of PointTexture images. By combining PPM with tree-structured coding, the proposed algorithm compresses 3D depth information progressively into a single bitstream. The algorithm also compresses color information using a differential pulse-code modulation (DPCM) coder and interleaves the compressed depth and color information efficiently. Thus, the decoder can reconstruct 3D models from the coarsest resolution to the highest resolution from a single bitstream. Simulation results demonstrate that the proposed algorithm provides much better compression performance than a universal Lempel-Ziv coder, WinZip.
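The DPCM idea mentioned for the color channel can be illustrated with a minimal sketch. This is not the paper's coder: the abstract does not specify the predictor, the quantizer, or the entropy coder, so the example assumes a lossless previous-sample predictor on a 1-D list of integers, and the function names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
def dpcm_encode(samples):
    """Return prediction residuals using a previous-sample predictor.

    Assumption: the first sample is predicted as 0, and no quantization
    is applied, so the scheme is lossless.
    """
    residuals = []
    prediction = 0
    for value in samples:
        residuals.append(value - prediction)
        prediction = value  # the decoder can reproduce this value exactly
    return residuals


def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating residuals onto the prediction."""
    samples = []
    prediction = 0
    for residual in residuals:
        value = prediction + residual
        samples.append(value)
        prediction = value
    return samples


colors = [120, 122, 121, 125, 130]   # illustrative color samples
residuals = dpcm_encode(colors)      # small values after the first sample
restored = dpcm_decode(residuals)
```

Neighboring samples tend to be similar, so the residuals cluster near zero and are cheaper to entropy-code than the raw values.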
ISBN: 9798331529543; 9798331529550 (print)
To achieve efficient compression for both human vision and machine perception, scalable coding methods have been proposed in recent years. However, existing methods do not fully eliminate the redundancy between features corresponding to different tasks, resulting in suboptimal coding performance. In this paper, we propose a frequency-aware hierarchical image compression framework designed for humans and machines. Specifically, we investigate task relationships from a frequency perspective, using only high-frequency (HF) information for machine vision tasks and both HF and low-frequency (LF) features for image reconstruction. In addition, a residual-block-embedded octave convolution module is designed to enhance the information interaction between HF and LF features. A dual-frequency channel-wise entropy model is also applied to exploit the correlation between different tasks, thereby improving multi-task performance. The experiments show that the proposed method offers -69.3% to -75.3% coding gains on machine vision tasks compared to the relevant benchmarks, and -19.1% gains over a state-of-the-art scalable image codec in terms of image reconstruction quality.
Authors: Li, Z; Delp, EJ
Purdue Univ, Sch Elect & Comp Engn, VIPER Video & Image Proc Lab, W Lafayette, IN 47907 USA
ISBN: 0819452114 (print)
This paper investigates the motion prediction techniques used in hybrid video coding. We first present a unified interpretation of motion prediction in terms of the prediction of motion threads. It is demonstrated that most current motion prediction techniques can be regarded as linear predictors of motion threads. Based on this new interpretation of motion prediction, we discuss the optimal motion predictor in the framework of Markov universal prediction. We define Markov predictability such that it upper-bounds the optimal prediction performance in the perfect reconstruction scenario. Most current video applications use lossy coding, which results in imperfect reconstructions of the motion threads used in prediction. However, the optimality established for the perfect reconstruction scenario still holds in this case in an almost sure sense.
ISBN: 0819452114 (print)
Whether the perception of facial beauty is a universal concept has long been debated among psychologists and anthropologists. In this paper, we performed experiments to evaluate the extent of beauty universality by asking a number of diverse human referees to grade the same collection of female facial images. The results show that the different individuals gave similar votes, supporting the concept of beauty universality. We then trained an automated classifier using the human votes as the ground truth and used it to classify an independent test set of facial images. The high accuracy achieved indicates that this classifier can be used as a general, automated tool for objective classification of female facial beauty. Potential applications exist in the entertainment industry and plastic surgery.
ISBN: 9781728180687 (print)
A 4-dimensional (4D) image can be viewed as a stack of volumetric images over channels of observation depth or temporal frames. Such data contains rich information, but its large volume places high demands on storage and transmission resources. In this paper, we present a lossless 4D image compression algorithm that extends the CCSDS-123.0-B-1 standard. Instead of separately compressing the volumetric image at each channel of a 4D image, the proposed algorithm efficiently exploits redundancy across the fourth dimension of the data. Experiments conducted on two types of 4D images demonstrate the effectiveness of the proposed lossless compression method.
ISBN: 9781728180687 (print)
More fake news quotes untampered images from other sources with ulterior motives than relies on image forgery. Such elaborate engraftments keep the inconsistency between images and text reports stealthy, thereby passing off the spurious as genuine. This paper proposes an architecture named News image Steganography (NIS) to reveal this inconsistency through GAN-based image steganography. An extractive summary of a news image is generated from its source texts, and a learned steganographic algorithm encodes and decodes the summary of the image in a manner that approaches perceptual invisibility. Once an encoded image is quoted, its source summary can be decoded and presented as the ground truth to verify the quoting news. The paired encoder and decoder endow images with the capability to carry their summaries imperceptibly. NIS reveals the underlying inconsistency and thereby, according to our experiments and investigations, improves the accuracy of identifying fake news that engrafts untampered images.
ISBN: 0819452114 (print)
Flat panel displays, such as liquid crystal displays (LCDs), typically emit light during the whole frame time. In contrast, traditional cathode ray tubes (CRTs) emit light as very short pulses, which gives the CRT a better dynamic resolution. As a consequence, LCDs suffer from motion artifacts, which are visible as a blurring of moving objects. Based on a straightforward frequency domain analysis that takes into account the eye tracking of the viewer, we propose a method for reducing these artifacts. This method, 'motion compensated inverse filtering', uses motion vectors to apply a pre-correction to the video data. As such, we are able to recover the sharpness of moving images on LCDs to a large extent.
ISBN: 9781665435536 (print)
In visual inspection, quality assurance is difficult because results vary with the skill and degree of fatigue of the inspector. Recently, visual inspection methods based on image processing with deep learning have been proposed. When using deep learning, the dataset is important. In this paper, we describe a method that detects painting defects using image processing, automatically generates data for deep learning, and uses these data for classification with deep learning.
ISBN: 9781665475921 (print)
A tensor display is a type of 3D light field display composed of multiple transparent screens and a backlight. It can render a scene with correct depth, allowing viewers to see a 3D scene without wearing glasses. The analysis of state-of-the-art tensor displays assumes that the content is Lambertian. To extend their capabilities, we analyze the limitations of displaying non-Lambertian scenes and propose a new method to factorize non-Lambertian scenes using disparity analysis. Moreover, we demonstrate a new prototype of a tensor display with three layers of full HD content at 60 fps. Compared with the state of the art, the evaluation results verify that the proposed non-Lambertian rendering method displays non-Lambertian scenes at higher quality both in simulation and on a prototyped tensor display.
ISBN: 9781728185514 (print)
With the rapid development of whole-brain imaging technology, a large number of brain images have been produced, creating a great demand for efficient brain image compression methods. At present, the most commonly used compression methods, such as JP3D, are all based on the 3-D wavelet transform. However, traditional 3-D wavelet transforms are designed manually under certain assumptions about the signal, and brain images are not as ideal as assumed. Moreover, these transforms are not directly optimized for the compression task. To solve these problems, we propose a trainable 3-D wavelet transform based on the lifting scheme, in which the predict and update steps are replaced by 3-D convolutional neural networks. The proposed transform is then embedded into an end-to-end compression scheme called iWave3D, which is trained with a large number of brain images to directly minimize the rate-distortion loss. Experimental results demonstrate that our method outperforms JP3D significantly, by 2.012 dB in terms of average BD-PSNR.
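The lifting structure that the abstract builds on can be sketched in one dimension. This is an illustrative assumption-laden example, not iWave3D: the paper uses 3-D convolutional networks for the predict and update steps, whereas this sketch plugs in fixed integer Haar-style functions, and all names (`lifting_forward`, `lifting_inverse`, `haar_predict`, `haar_update`) are hypothetical. The input is assumed to have even length.

```python
def lifting_forward(signal, predict, update):
    """One lifting step: split, predict, update. Assumes even length."""
    even = signal[0::2]
    odd = signal[1::2]
    # Detail (high-pass): what the predictor fails to explain in the odd samples.
    detail = [o - p for o, p in zip(odd, predict(even))]
    # Approximation (low-pass): even samples corrected using the details.
    approx = [e + u for e, u in zip(even, update(detail))]
    return approx, detail


def lifting_inverse(approx, detail, predict, update):
    """Undo lifting_forward by running the same steps in reverse order."""
    even = [a - u for a, u in zip(approx, update(detail))]
    odd = [d + p for d, p in zip(detail, predict(even))]
    signal = [0] * (len(even) + len(odd))
    signal[0::2] = even
    signal[1::2] = odd
    return signal


def haar_predict(even):
    # Placeholder predictor: each odd sample is predicted by its even neighbor.
    return list(even)


def haar_update(detail):
    # Placeholder update: integer half of the detail (floor division).
    return [d // 2 for d in detail]


x = [10, 12, 8, 6, 7, 7]
approx, detail = lifting_forward(x, haar_predict, haar_update)
restored = lifting_inverse(approx, detail, haar_predict, haar_update)
```

The inverse subtracts exactly what the forward pass added, using the same inputs, so reconstruction is exact for any choice of `predict` and `update`. That property is what allows the hand-designed steps to be swapped for learned ones without losing invertibility.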