ISBN (print): 9781728185514
Learned image compression (LIC) has demonstrated strong performance on reconstruction-quality-driven metrics (e.g., PSNR, MS-SSIM) and on machine vision tasks such as image understanding. However, most LIC frameworks operate in the pixel domain, which requires a full decoding pass. In this paper, we develop a learned compressed-domain framework for machine vision tasks. 1) By sending the compressed latent representation directly to the task network, the decoding computation can be eliminated, reducing complexity. 2) By sorting the latent channels by entropy, only selected channels are transmitted to the task network, which reduces the bitrate. As a result, compared with traditional pixel-domain methods, we reduce multiply-add operations (MACs) by about 1/3 and inference time by about 1/5 while keeping the same accuracy. Moreover, the proposed channel selection contributes up to 6.8% bitrate savings.
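The abstract does not spell out how channels are selected; a minimal sketch of the entropy-based selection idea, assuming per-channel bit costs from the entropy model are available (the helper below and its names are hypothetical), might look like:

```python
import torch

def select_channels_by_entropy(latent, channel_bits, keep_ratio=0.5):
    """Keep only the highest-entropy latent channels.

    latent:        (C, H, W) quantized latent from the LIC encoder
    channel_bits:  (C,) estimated bits per channel from the entropy model
    keep_ratio:    fraction of channels transmitted to the task network
    """
    num_keep = max(1, int(latent.shape[0] * keep_ratio))
    # Channels with the largest estimated bit cost carry the most information.
    keep_idx = torch.argsort(channel_bits, descending=True)[:num_keep]
    pruned = torch.zeros_like(latent)
    pruned[keep_idx] = latent[keep_idx]  # untransmitted channels stay zero
    return pruned, keep_idx

# Toy usage with random stand-ins for the encoder output and bit estimates.
latent = torch.randn(192, 16, 16)
channel_bits = torch.rand(192)
pruned, kept = select_channels_by_entropy(latent, channel_bits, keep_ratio=2 / 3)
```

Zeroing rather than dropping channels keeps the tensor shape the task network expects.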
ISBN (print): 9781479903085
Most existing digital image watermarking schemes suffer from an inherent conflict between imperceptibility and robustness because watermarks are embedded through parameter modification. Zero-watermark techniques resolve this dilemma by extracting invariant features from the image as the "embedded" watermark. In this paper we propose an image zero-watermark scheme based on the visual attention regions of images. In the proposed scheme, a visual attention model selects the top-N salient areas, from which a set of Scale Invariant Feature Transform (SIFT) descriptors is extracted as the watermark. The distance between each pair of SIFT descriptors from the reference and test images is calculated by Kullback-Leibler (KL) divergence after mapping into a high-dimensional space. The final distance between the two sets of SIFT descriptors is determined by ensemble similarity. The experimental results indicate that the proposed scheme outperforms an image zero-watermark scheme based on color and edge histograms (CEH) and is robust to attacks of geometric distortion, contrast/luminance distortion, and JPEG compression.
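The abstract does not give the exact distance computation; a hedged sketch that treats each 128-dimensional SIFT descriptor as a normalized histogram and symmetrizes the KL divergence (both simplifying assumptions) could be:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-10):
    """KL divergence D(p || q) between two normalized histograms."""
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def descriptor_distance(d1, d2):
    """Symmetrized KL distance between two SIFT descriptors, each treated
    as a 128-bin histogram (SIFT entries are non-negative)."""
    return 0.5 * (kl_divergence(d1, d2) + kl_divergence(d2, d1))

# Toy usage with random non-negative 128-D vectors standing in for SIFT.
rng = np.random.default_rng(0)
ref, test = rng.random(128), rng.random(128)
print(descriptor_distance(ref, test))
```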
ISBN (print): 9781728185514
Recent advances in sensor technology and the wide deployment of visual sensors have led to a new class of applications in which images are compressed not mainly for pixel recovery for human consumption, but for communication to cloud-side machine vision tasks such as classification, identification, detection, and tracking. This opens up new research dimensions for learning-based compression that directly optimizes the loss function of the vision task, and therefore achieves better compression performance than recovering pixels first and then performing the vision task. In this work, we developed a learning-based compression scheme that learns a compact feature representation and an appropriate bitstream for the task of visual object detection. A Variational Auto-Encoder (VAE) framework is adopted to learn the compact representation, while a bridge network is trained to drive the detection loss function. Simulation results demonstrate that this approach achieves a new state of the art in task-driven compression efficiency compared with pixel-recovery approaches, including both learning-based and handcrafted solutions.
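The bridge network is not detailed in the abstract; one plausible reading, sketched under that assumption, is a small convolutional adapter that maps the VAE latent to a feature map a detection head can consume (all module names and sizes below are illustrative):

```python
import torch
import torch.nn as nn

class BridgeNetwork(nn.Module):
    """Hypothetical adapter from a VAE latent to detection-ready features."""

    def __init__(self, latent_channels=192, feature_channels=256):
        super().__init__()
        self.adapt = nn.Sequential(
            nn.Conv2d(latent_channels, feature_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feature_channels, feature_channels, 3, padding=1),
        )

    def forward(self, latent):
        # The output feature map is what a detection head would consume;
        # training backpropagates the detection loss through this bridge.
        return self.adapt(latent)

# Toy usage: a 192-channel latent at 1/16 the resolution of a 512x512 input.
features = BridgeNetwork()(torch.randn(1, 192, 32, 32))
print(features.shape)  # torch.Size([1, 256, 32, 32])
```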
ISBN (print): 0819452114
Many image and video compression standards such as JPEG, MPEG, or H.263 are based on the discrete cosine transform (DCT), quantization, and Huffman coding. Quantization error is the major source of image quality degradation. The conventional dequantization method assumes a uniform distribution of DCT coefficients, so the reconstruction value is the center of each quantization interval. However, DCT coefficients are generally regarded as following a Laplacian probability density function (pdf). We derive the optimal reconstruction value in closed form under the Laplacian pdf and show the effect of the correction on image quality. We estimate the Laplacian pdf parameter for each DCT coefficient and obtain a corrected reconstruction value from the proposed theoretical predictions. The corrected value depends on the Laplacian pdf parameter and the quantization step size Q. The PSNR improvement due to the change in dequantization value is about 0.2 to 0.4 dB. We also analyze the reason for the limited improvement.
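The closed-form expression is not reproduced in the abstract, but the standard minimum-mean-squared-error reconstruction for a quantization interval [a, b] under a one-sided Laplacian pdf is the conditional centroid of that interval; a sketch (the interval layout and the lambda value are assumptions for illustration):

```python
import math

def laplacian_centroid(a, b, lam):
    """MMSE reconstruction value for a quantization interval [a, b] under a
    one-sided Laplacian pdf f(x) = lam * exp(-lam * x), x >= 0: the
    conditional mean E[X | a <= X < b]."""
    ea, eb = math.exp(-lam * a), math.exp(-lam * b)
    return ((a + 1 / lam) * ea - (b + 1 / lam) * eb) / (ea - eb)

# Example: step size Q = 16, coefficient quantized to level 2, giving the
# interval [1.5*Q, 2.5*Q) under mid-tread quantization (an assumption).
Q, lam = 16.0, 0.05
print(laplacian_centroid(1.5 * Q, 2.5 * Q, lam))  # ~30.9, below the center 32
```

The centroid sits below the interval midpoint because Laplacian mass concentrates toward zero, which is the direction of the correction.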
ISBN (print): 0819452114
In this paper, we present a novel hybrid image coding scheme for real-time transmission of computer screen video. Based on the Mixed Raster Content (MRC) multilayer imaging model, the background picture is compressed with lossy JPEG, while the foreground layer, consisting of text and graphics, is compressed with a block-based lossless coding algorithm that integrates shape-based coding, palette-based coding, palette reuse, and the LZW algorithm. The key technique is to extract text and graphics from the background accurately and with low complexity. Shape primitives such as lines, rectangles, and isolated pixels with prominent colors are found to be significant clues for textual and graphical content. The shape-based coding in our lossless algorithm provides the intelligence to extract computer-generated text and graphics elegantly and easily. Experimental results demonstrate the efficiency and low complexity of the proposed hybrid image coding scheme.
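The exact extraction rule is not given in the abstract; a common MRC-style heuristic, and purely an illustrative assumption here, classifies a block as foreground text/graphics when it uses only a few distinct colors:

```python
import numpy as np

def is_foreground_block(block, max_colors=8):
    """Heuristic MRC layer test: computer-generated text/graphics blocks
    typically use very few distinct colors, unlike natural pictures.

    block: (H, W, 3) uint8 RGB block, e.g. 16x16
    """
    colors = np.unique(block.reshape(-1, 3), axis=0)
    return len(colors) <= max_colors

# Toy usage: a flat two-color block (text-like) vs. a noisy block.
text_like = np.zeros((16, 16, 3), dtype=np.uint8)
text_like[4:12, 4:12] = 255
noisy = np.random.default_rng(0).integers(0, 256, (16, 16, 3), dtype=np.uint8)
print(is_foreground_block(text_like), is_foreground_block(noisy))  # True False
```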
ISBN (print): 9781728180687
Synthesizing images from text is an important problem with various applications. Most existing studies of text-to-image generation use supervised methods and rely on fully labeled datasets, but detailed and accurate descriptions of images are onerous to obtain. In this paper, we introduce a simple but effective semi-supervised approach that treats the feature of an unlabeled image as a "Pseudo Text Feature", so that unlabeled data can participate in the subsequent training process. To achieve this, we design a Modality-invariant Semantic-consistent Module, which aims to make the image feature and the text feature indistinguishable while maintaining their semantic information. Extensive qualitative and quantitative experiments on the MNIST and Oxford-102 flower datasets demonstrate the effectiveness of our semi-supervised method in comparison to supervised ones. We also show that the proposed method can easily be plugged into other visual generation models, such as image translation, and performs well.
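The abstract only names the module; one hedged way to make image and text features indistinguishable is an adversarially trained modality discriminator, sketched below (the architecture and losses are illustrative assumptions, not the paper's exact design):

```python
import torch
import torch.nn as nn

# Hypothetical modality discriminator: predicts whether a 256-D feature
# came from the text encoder (target 1) or the image encoder (target 0).
discriminator = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 1))
bce = nn.BCEWithLogitsLoss()

def modality_losses(image_feat, text_feat):
    """Adversarial pair of losses: the discriminator learns to separate the
    modalities, while the image encoder is pushed to make its features
    indistinguishable from text features, so an unlabeled image feature can
    serve as a pseudo text feature."""
    d_loss = bce(discriminator(text_feat), torch.ones(len(text_feat), 1)) \
           + bce(discriminator(image_feat.detach()), torch.zeros(len(image_feat), 1))
    g_loss = bce(discriminator(image_feat), torch.ones(len(image_feat), 1))
    return d_loss, g_loss

# Toy usage with random 256-D batch features.
d_loss, g_loss = modality_losses(torch.randn(8, 256), torch.randn(8, 256))
print(d_loss.item(), g_loss.item())
```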
ISBN (print): 9781728185514
Advances in cameras and web technology have made it easy to capture and share large amounts of face video with an unknown audience for uncontrollable purposes. This raises increasing concerns about unwanted identity-relevant computer vision systems invading the subjects' privacy. Previous de-identification methods rely on designing novel neural networks and processing face videos frame by frame, which ignores the redundancy and continuity of the data. Besides, these techniques cannot balance privacy and utility well, and per-frame processing easily causes flicker. In this paper, we present deep motion flow, which creates remarkable de-identified face videos with a good privacy-utility tradeoff. It calculates the relative dense motion flow between every two adjacent original frames and runs high-quality image anonymization only on the first frame. The de-identified video is then obtained from the anonymized first frame via the relative dense motion flow. Extensive experiments demonstrate the effectiveness of our proposed de-identification method.
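The flow estimator used in the paper is not named in the abstract; a minimal sketch of the propagation step with OpenCV's Farnebäck dense flow as a stand-in might look like:

```python
import cv2
import numpy as np

def propagate(anon_prev, orig_prev, orig_next):
    """Warp the anonymized previous frame onto the next frame using dense
    motion flow estimated between the two original frames."""
    g_prev = cv2.cvtColor(orig_prev, cv2.COLOR_BGR2GRAY)
    g_next = cv2.cvtColor(orig_next, cv2.COLOR_BGR2GRAY)
    # Flow from next back to prev, so the warp is a backward sampling map:
    # g_prev(x + flow(x)) ~= g_next(x).
    flow = cv2.calcOpticalFlowFarneback(g_next, g_prev, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g_next.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(anon_prev, map_x, map_y, cv2.INTER_LINEAR)

# Toy usage with random frames standing in for real video.
rng = np.random.default_rng(0)
f0 = rng.integers(0, 256, (120, 160, 3), dtype=np.uint8)
f1 = rng.integers(0, 256, (120, 160, 3), dtype=np.uint8)
warped = propagate(f0.copy(), f0, f1)  # f0.copy() stands in for the anonymized frame
```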
ISBN (print): 9798331529543; 9798331529550
While saliency detection for images has been extensively studied during the past decades, little work has explored the influence of different viewing devices (e.g., tablet computers, mobile phones) on human visual attention behavior. The lack of research in this field hinders progress in cross-device image saliency detection. In this paper, we first establish a novel cross-device saliency detection (CDSD) database based on eye-tracking experiments and investigate subjects' visual attention behavior when using different viewing devices. Then, we evaluate several classic saliency detection models on the CDSD database; the evaluation results indicate that the cross-device performance of these models needs further improvement. Finally, some meaningful discussions are provided that might enlighten the design of cross-device saliency detection models. The proposed CDSD database will be made publicly available.
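The abstract does not say which metrics were used in the evaluation; the normalized scanpath saliency (NSS), a standard fixation-based score and purely an illustrative choice here, is computed as:

```python
import numpy as np

def nss(saliency_map, fixation_map):
    """Normalized Scanpath Saliency: mean of the z-scored saliency map at
    the recorded human fixation locations (higher is better).

    saliency_map: (H, W) float predictions
    fixation_map: (H, W) binary array, nonzero at fixated pixels
    """
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-10)
    return float(s[fixation_map.astype(bool)].mean())

# Toy usage with a random prediction and sparse random "fixations".
rng = np.random.default_rng(0)
pred = rng.random((480, 640))
fixations = rng.random((480, 640)) > 0.999
print(nss(pred, fixations))
```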
ISBN (print): 9781665475921
Applying encryption technology to image retrieval can ensure the security and privacy of personal images. Related research in this field has focused on the organic combination of encryption algorithms and artificial feature extraction. Many existing encrypted image retrieval schemes cannot prevent feature leakage and file size increase, or cannot achieve satisfactory retrieval performance. In this paper, a new end-to-end encrypted image retrieval scheme is presented. First, images are encrypted using block rotation, new orthogonal transforms, and block permutation during the JPEG compression process. Second, we combine the triplet loss and the cross-entropy loss to train a network model containing gMLP modules, by end-to-end learning, to extract cipher-image features. Compared with manual feature extraction, such as extracting a color histogram, the end-to-end mechanism economizes on manpower. Experimental results show that our scheme achieves good retrieval performance while remaining compression friendly and leaking no features.
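The exact combination of the two losses is not given in the abstract; a minimal sketch with a weighted sum (the weighting scheme is an assumption) could be:

```python
import torch
import torch.nn as nn

triplet = nn.TripletMarginLoss(margin=1.0)
ce = nn.CrossEntropyLoss()

def combined_loss(anchor, positive, negative, logits, labels, alpha=0.5):
    """Weighted sum of a triplet loss on cipher-image embeddings and a
    cross-entropy loss on class logits; alpha trades off metric learning
    against classification (the 0.5 default is an assumption)."""
    return alpha * triplet(anchor, positive, negative) \
         + (1 - alpha) * ce(logits, labels)

# Toy usage: a batch of 8 samples, 128-D embeddings, 10 classes.
a, p, n = (torch.randn(8, 128) for _ in range(3))
logits, labels = torch.randn(8, 10), torch.randint(0, 10, (8,))
print(combined_loss(a, p, n, logits, labels).item())
```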
ISBN (print): 0819452114
This paper proposes a fully digital auto-focusing algorithm for restoring images containing differently out-of-focus objects, which can restore the background as well as all objects. We assume that the out-of-focus blur is isotropic, such as a circle of confusion (COC) or a two-dimensional Gaussian blur. The proposed algorithm can therefore segment objects and estimate the point spread function (PSF) using the width of the ramp in the one-dimensional step response. The algorithm is built on object-based image segmentation and restoration. Experimental results show that the proposed object-based image restoration algorithm can efficiently remove space-variant out-of-focus blur from images with multiple blurred objects.
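The abstract describes PSF estimation from the ramp width only at a high level; a toy sketch of that idea for the Gaussian case, measuring the 10-90% rise width of a 1-D step response (the 2.563*sigma mapping follows from the Gaussian CDF; the synthetic check assumes SciPy is available):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def estimate_gaussian_sigma(step_response):
    """Estimate the sigma of an isotropic Gaussian PSF from the 10-90%
    rise width of a 1-D edge (step) response. For a Gaussian-blurred
    ideal step the rise width is ~2.563 * sigma (from the Gaussian CDF)."""
    r = (step_response - step_response.min()) / np.ptp(step_response)
    lo = np.argmax(r >= 0.1)  # first sample at or above 10%
    hi = np.argmax(r >= 0.9)  # first sample at or above 90%
    return (hi - lo) / 2.563

# Synthetic check: blur an ideal step with a known sigma and recover it.
step = np.repeat([0.0, 1.0], 100)
blurred = gaussian_filter1d(step, sigma=4.0)
print(estimate_gaussian_sigma(blurred))  # close to 4.0
```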