Recent landing missions on Mars have provided access to Martian surface images, which are an important means of understanding the evolution and habitability of Mars in terms of climate, geography, etc. Transmitting Martian images thus calls for efficient compression methods that ensure high-quality reconstruction after long-distance communication, a problem on which research has yet to start. To address this issue, we propose in this letter a learned structure-based hybrid (LSH) framework to compress Martian images. More specifically, we first observe that structural consistency exists across Martian images, which motivates us to propose a structural compression network (SCN). The aim of SCN is to compactly represent the structural information of Martian images, thus allowing for compression at extremely low bit-rates. Then, we propose a detail compensation network (DCN) to reconstruct the details that are missing when we restore from the structural information, which improves compression efficiency by reducing bit-rates. The experimental results verify the superior performance of our LSH method in compressing Martian images compared with existing state-of-the-art methods.
Compressed-domain visual task schemes, where visual processing or computer vision is performed directly on compressed-domain representations, have been shown to achieve higher computational efficiency during training and deployment by avoiding the need to decode the compressed visual information, while delivering competitive or even better performance than the corresponding spatial-domain visual tasks. This work is concerned with learning-based compressed-domain image classification, where image classification is performed directly on compressed-domain representations, also known as latent representations, that are obtained using a learning-based visual encoder. In this paper, a compressed-domain Vision Transformer (cViT) is proposed to perform image classification in the learning-based compressed domain. For this purpose, the Vision Transformer (ViT) architecture is adopted and modified to perform classification directly in the compressed domain. As part of this work, a novel feature patch embedding is introduced that leverages the within- and cross-channel information in the compressed domain. Also, an adaptation training strategy is designed to take the weights of the pre-trained spatial-domain ViT and adapt them to the compressed-domain classification task. Furthermore, the pre-trained ViT weights are used, through interpolation, to initialize the position embedding and further improve the performance of cViT. The experimental results show that the proposed cViT outperforms existing compressed-domain classification networks in terms of Top-1 and Top-5 classification accuracy. Moreover, the proposed cViT yields competitive classification accuracy with significantly higher computational efficiency than pixel-domain approaches.
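The position-embedding initialization mentioned above can be illustrated with a minimal sketch. The abstract does not say which interpolation scheme cViT uses, so the bilinear, align-corners resampling below is an assumption, and `resize_pos_embed` is a hypothetical helper name rather than anything taken from the paper.

```python
def resize_pos_embed(grid, new_h, new_w):
    """Bilinearly resample a grid of position-embedding vectors.

    grid: old_h x old_w nested list, each cell a list of floats.
    Illustrative only: the interpolation used by cViT is not stated in the abstract.
    """
    old_h, old_w = len(grid), len(grid[0])

    def coords(n_new, n_old):
        # Sampling positions in the old grid (align-corners convention).
        if n_new == 1:
            return [0.0]
        step = (n_old - 1) / (n_new - 1)
        return [i * step for i in range(n_new)]

    def lerp(a, b, t):
        # Element-wise linear interpolation between two vectors.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    out = []
    for y in coords(new_h, old_h):
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        ty = y - y0
        row = []
        for x in coords(new_w, old_w):
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            tx = x - x0
            top = lerp(grid[y0][x0], grid[y0][x1], tx)
            bottom = lerp(grid[y1][x0], grid[y1][x1], tx)
            row.append(lerp(top, bottom, ty))
        out.append(row)
    return out


# A 2x2 grid of 1-dimensional embeddings resampled to 3x3.
pretrained = [[[0.0], [1.0]], [[2.0], [3.0]]]
resized = resize_pos_embed(pretrained, 3, 3)
```

Resampling lets a position table learned for one token grid be reused when the latent representation has a different spatial size, which is the situation the abstract describes.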
As a speciality, radiology produces the highest volume of medical images in clinical establishments compared to other commonly employed imaging modalities such as digital pathology, ophthalmic imaging, etc. Archiving this massive quantity of images with large file sizes is a major problem, since the costs associated with storing medical images continue to rise with an increase in the cost of electronic storage devices. One possible solution is to compress them for effective storage. The prime challenge is that each modality is distinctively characterized by the dynamic range and resolution of the signal and by its spatial and statistical distribution. Such variations in medical images are different from those in camera-acquired natural scene images. Thus, conventional natural image compression algorithms such as J2K and JPEG often fail to preserve the clinically relevant details present in medical images. We address this challenge by developing a modality-specific compressor and a modality-agnostic generic decompressor, implemented using a deep neural network (DNN) and capable of preserving clinically relevant image information. The architecture of the DNN is obtained through design space exploration (DSE), with the objective of achieving the least computational complexity at the highest compression and a target high quality factor, thereby leading to a low power requirement for computation. The neural compressed bitstream is further compressed using lossless Huffman encoding to obtain a variable bit length and high-density compression (20x to 400x). Experimental validation is performed on X-ray, CT and MRI. Through quantitative measurement and clinical validation with a radiologist in the loop, we experimentally demonstrate that our approach outperforms traditional methods such as JPEG and J2K operating at matching compression factors.
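The lossless Huffman stage described above can be sketched as follows. This is a textbook Huffman coder applied to a made-up list of quantized symbols, not the authors' implementation; the function names and the sample data are assumptions for illustration.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a {symbol: bitstring} prefix code from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the codewords of all symbols into one bitstring."""
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Invert encode(); relies on the prefix-free property of the code."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical quantized latent values; frequent symbols get shorter codes.
latents = [0, 0, 0, 0, 1, 1, 2, -1]
code = huffman_code(latents)
bits = encode(latents, code)
restored = decode(bits, code)
```

Because frequent symbols receive shorter codewords, the number of bits per symbol varies, which is the variable bit length the abstract refers to.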
ISBN (digital): 9781510645233
ISBN (print): 9781510645233; 9781510645226
Learning-based image coding has shown promising results in recent years. Unlike traditional approaches to image compression, learning-based codecs use deep neural networks to reduce the dimensionality of the input at the stage where a linear transform would previously have been applied. The signal representation after this stage, called the latent space, carries the information in such a way that it can be interpreted by other deep neural networks without the need to decode it. One of the tasks that can benefit from this possibility is super resolution. In this paper, we explore the possibilities and propose an approach to super resolution that is applied in the latent space. We focus on the fixed compression model, where the encoder part of the network is frozen and an enhanced decoder is learned. Additionally, we assess the performance of the proposed approach.
ISBN (print): 9781665492577
Deep-learning image coding solutions have recently shown similar or better compression efficiency than conventional solutions based on hand-crafted transforms and spatial prediction techniques. These deep-learning codecs require a large training set of images and a training methodology to obtain a suitable model (set of parameters) for efficient compression. Training is performed with an optimization algorithm that minimizes the loss function. The loss function therefore plays a key role in the overall performance, and it includes a differentiable quality metric that attempts to mimic human perception. The main objective of this paper is to study the perceptual impact of several image quality metrics that can be used in the loss function of the training process, through a crowdsourcing subjective image quality assessment study. From this study, it is possible to conclude that the choice of the quality metric is critical for the perceptual performance of the deep-learning codec and that the perceptual performance can vary depending on the image content.
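As a rough illustration of how a quality metric enters the training loss, learned codecs are commonly trained on a rate-distortion objective of the form rate + lambda * distortion. The abstract does not give the exact loss or the metrics studied, so the form below, the names `rd_loss`, `mse` and `mae`, and the numbers are all assumptions.

```python
def mse(original, reconstructed):
    """Mean squared error between two equally long pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def mae(original, reconstructed):
    """Mean absolute error, used here only as a second swappable metric."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed)) / len(original)


def rd_loss(rate_bpp, original, reconstructed, lam, distortion_fn=mse):
    """Rate-distortion loss: estimated rate plus weighted distortion.

    distortion_fn is the pluggable quality metric; a perceptual metric
    would be substituted here in place of MSE.
    """
    return rate_bpp + lam * distortion_fn(original, reconstructed)


# Same reconstruction and rate, scored with two different metrics.
original = [0.0, 1.0]
reconstructed = [0.0, 0.0]
loss_mse = rd_loss(0.5, original, reconstructed, lam=2.0)
loss_mae = rd_loss(0.5, original, reconstructed, lam=2.0, distortion_fn=mae)
```

Swapping `distortion_fn` changes which reconstruction errors the optimizer penalizes, which is why the choice of metric can alter the perceptual behaviour of the trained codec.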
ISBN (print): 9781728185514
Recent advances in sensor technology and the wide deployment of visual sensors have led to a new application in which images are compressed not mainly for pixel recovery for human consumption, but for communication to cloud-side machine vision tasks such as classification, identification, detection and tracking. This opens up new research directions for learning-based compression that directly optimizes the loss function of the vision task and therefore achieves better compression performance than first recovering the pixels and then performing the vision task. In this work, we developed a learning-based compression scheme that learns a compact feature representation and appropriate bitstreams for the task of visual object detection. A Variational Auto-Encoder (VAE) framework is adopted for learning a compact representation, while a bridge network is trained to drive the detection loss function. Simulation results demonstrate that this approach achieves a new state of the art in task-driven compression efficiency compared with pixel recovery approaches, including both learning-based and handcrafted solutions.
ISBN (print): 9781728198354
Recently, learning-based image compression has attracted a lot of attention, leading to the development of a new JPEG AI standard based on neural networks. Typically, this type of coding solution has much lower encoding complexity than conventional coding standards such as HEVC and VVC (Intra mode) but much higher decoding complexity. Therefore, to promote the wide adoption of learning-based image compression, especially on resource-constrained (such as mobile) devices, it is important to achieve lower decoding complexity even at the cost of some coding efficiency. This paper proposes a complexity-scalable decoder that controls the decoding complexity through a novel procedure that learns the filters of the convolutional layers at the decoder while varying the number of channels at each layer, effectively yielding decoding networks that range from simple to more complex. A regularization loss is employed with pruning after training to obtain a set of scalable layers, which may use more or fewer channels depending on the complexity budget. Experimental results show that complexity can be significantly reduced while still allowing competitive rate-distortion performance.
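The idea of a layer that uses more or fewer channels depending on the complexity budget can be sketched with a pointwise (1x1) layer that evaluates only its first few filters. This is a simplified illustration under assumed names (`pointwise_layer`, `macs`) and toy weights; the paper's regularization loss and pruning procedure are not reproduced here.

```python
def pointwise_layer(x, weights, active_out):
    """Apply the first `active_out` filters of a 1x1 layer to one pixel.

    x: list of input-channel values (possibly already reduced by the
       previous layer); weights: out_ch x in_ch nested list.
    Only the weight columns matching len(x) are used, so layers can be chained
    at a reduced width.
    """
    in_ch = len(x)
    return [
        sum(w * v for w, v in zip(weights[o][:in_ch], x))
        for o in range(active_out)
    ]


def macs(in_ch, out_ch):
    """Multiply-accumulate operations per pixel for a 1x1 layer."""
    return in_ch * out_ch


# Toy two-layer decoder: 2 -> 3 -> 1 channels at full width.
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [[1.0, 1.0, 1.0]]
pixel = [2.0, 3.0]

full = pointwise_layer(pointwise_layer(pixel, w1, 3), w2, 1)
reduced = pointwise_layer(pointwise_layer(pixel, w1, 2), w2, 1)

full_cost = macs(2, 3) + macs(3, 1)
reduced_cost = macs(2, 2) + macs(2, 1)
```

Running the same weights at a smaller width lowers the per-pixel operation count at the price of a different (typically less accurate) output, which is the trade-off between decoding complexity and coding efficiency described above.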
The rapid growth of research in image compression, especially deep learning-based image compression techniques, poses new challenges to objective image quality assessment (IQA). Typical artifacts encountered in the emerging image codecs are significantly different from those produced by traditional block-based codecs, making existing objective IQA algorithms inapplicable. To advance the development of objective IQA algorithms for recent compression artifacts, we built a learning-based compressed image quality assessment (LCIQA) database involving traditional block-based image codecs, hybrid neural network based image codecs, and convolutional neural network based and generative adversarial network (GAN) based end-to-end optimized image coding approaches. Our study confirms the statistical difference and the human perception difference between reconstructions from learned compression and from traditional block-based compression. We propose a two-step deep learning model for learning-based compressed image quality assessment. Extensive experiments on the LCIQA database demonstrate that our proposed model performs better than its counterparts on learning-based compressed images, especially on GAN compressed images, and achieves performance competitive with state-of-the-art IQA metrics on traditional compressed images.