ISBN:
(print) 9798350364613; 9798350364606
The large amount of floating-point data generated by scientific applications makes data compression essential for I/O performance and efficient storage. However, floating-point data is difficult to compress losslessly, and most compression algorithms are only effective on some files. In this paper, we study the benefit of compressing each file with a potentially different algorithm. For this purpose, we created AdaptiveFC, which is based on a tool that can chain data transformations together to generate millions of compression algorithms. AdaptiveFC uses a genetic algorithm to quickly identify an effective compressor in this vast search space for a given file. A comparison of AdaptiveFC to 15 leading lossless CPU compressors on 77 files from 6 datasets in the SDRBench suite shows that per-file compression yields higher compression ratios on average than any individual algorithm.
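The per-file search described above can be illustrated with a minimal sketch. It assumes three toy byte-level transformations (`identity`, `delta_bytes`, `shuffle4`) chained in front of `zlib`; these transformations, the chain length, and the genetic-algorithm parameters are illustrative assumptions, not AdaptiveFC's actual components, and the matching decoders are omitted.

```python
import random
import struct
import zlib

# Hypothetical transformations; the abstract does not list the real ones.
def identity(b):
    return b

def delta_bytes(b):
    # Replace each byte by its difference (mod 256) from the previous byte.
    out = bytearray(len(b))
    prev = 0
    for i, v in enumerate(b):
        out[i] = (v - prev) & 0xFF
        prev = v
    return bytes(out)

def shuffle4(b):
    # Group byte 0 of every 4-byte word, then byte 1, and so on.
    n = len(b) - len(b) % 4
    body = b[:n]
    return b"".join(body[k::4] for k in range(4)) + b[n:]

TRANSFORMS = [identity, delta_bytes, shuffle4]

def compressed_size(chain, data):
    # Apply the chain of transformations, then a general-purpose coder.
    for idx in chain:
        data = TRANSFORMS[idx](data)
    return len(zlib.compress(data, 9))

def genetic_search(data, chain_len=3, pop_size=8, generations=10, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.randrange(len(TRANSFORMS)) for _ in range(chain_len))
           for _ in range(pop_size)]
    pop[0] = (0,) * chain_len  # keep plain zlib as a baseline candidate
    for _ in range(generations):
        pop.sort(key=lambda c: compressed_size(c, data))
        parents = pop[:pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, chain_len)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < 0.3:  # occasional mutation of one stage
                child[rng.randrange(chain_len)] = rng.randrange(len(TRANSFORMS))
            children.append(tuple(child))
        pop = parents + children
    return min(pop, key=lambda c: compressed_size(c, data))

# Example input: 256 single-precision floats forming a smooth ramp.
sample = struct.pack("<256f", *[i * 0.5 for i in range(256)])
best_chain = genetic_search(sample)
```

Because the all-identity chain is seeded into the population and the best half always survives, the selected chain never compresses worse than plain `zlib` on the given file. Running the search once per file is what makes the choice adaptive.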
ISBN:
(print) 9798350390155; 9798350390162
Light fields (LFs) are characterized by high dimensionality, complex structure, and large data volume, which makes efficient compression of LF videos challenging. Existing methods compress LF videos with a traditional multi-view video encoder, but the resulting bitrates are still high. To address this, we propose encoding only sparse key view sequences of the LF video to achieve low bitrates. The proposed similarity-based prediction structure fully exploits the spatial-angular-temporal correlations in LF videos. At the decoder side, we reconstruct the uncoded non-key view sequences with a two-step refinement reconstruction network. To avoid error propagation, the network first refines the coded key view sequences by reducing compression artifacts. To handle the distortion caused by warping, the network then refines the texture details of the synthesized non-key view sequences. Experimental results show that the proposed algorithm achieves superior performance at low bitrates.
ISBN:
(print) 9798350349405; 9798350349399
Today, image and video data is not only viewed by humans, but also automatically analyzed by computer vision algorithms. However, current coding standards are optimized for human perception. For this reason, research on video coding for machines aims to develop coding methods designed for machines as the information sink. Since many of these analysis algorithms are based on neural networks, most proposals for video coding for machines build upon neural compression. So far, the best coding performance is achieved by optimizing the compression with the task loss of the analysis network, which requires ground truth data. However, ground truth data is difficult to obtain, so an optimization without ground truth is preferred. In this paper, we present an annotation-free optimization strategy for video coding for machines. We measure the distortion by calculating the task loss of the analysis network, where the predictions on the compressed image are compared with the predictions on the original image instead of with the ground truth data. Our results show that this strategy can even outperform training with ground truth data, with rate savings of up to 7.5 %. By using the non-annotated training data, the rate gains can be further increased up to 8.2 %.
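The annotation-free distortion described above can be sketched for a classification task as follows. This is a minimal illustration under stated assumptions: the analysis network is represented only by its output logits, the prediction on the original image is turned into a hard pseudo-label, and the function names are hypothetical. The abstract does not specify the task, the loss, or whether hard or soft targets are used.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def annotation_free_distortion(logits_original, logits_compressed, eps=1e-12):
    # Pseudo-label: the class predicted on the ORIGINAL image (assumed hard label).
    pseudo_label = max(range(len(logits_original)), key=lambda i: logits_original[i])
    # Task loss: cross-entropy of the prediction on the COMPRESSED image
    # against the pseudo-label, so no human annotation is required.
    probs = softmax(logits_compressed)
    return -math.log(probs[pseudo_label] + eps)

# The prediction is unchanged by compression: low distortion.
d_same = annotation_free_distortion([4.0, 0.0, 0.0], [4.0, 0.0, 0.0])
# Compression flips the predicted class: high distortion.
d_flip = annotation_free_distortion([4.0, 0.0, 0.0], [0.0, 4.0, 0.0])
```

In training, such a distortion term would typically be combined with a rate term; the exact weighting is not given in the abstract.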
ISBN:
(print) 9798350384581; 9798350384574
Image compression is increasingly important in applications like intelligent driving and smart surveillance systems. This study presents a novel cross view capture distributed image compression network (CVCDIC) to improve the compression quality by using decoder side information. The CVCDIC's decoder utilizes feature extraction networks to extract features from both the primary image and the side information. Furthermore, a multi-level cross view attention module is designed to capture interrelated details between images at multiple hierarchical levels. Finally, a spatial refinement module, constructed on the foundation of information distillation networks, is designed to further refine the quality of reconstructed images. The results show that CVCDIC can achieve an MS-SSIM of 0.978 at 0.15 bpp, surpassing DSIN (0.925), NDIC (0.956), and ATN (0.955) on the KITTI Stereo dataset.
ISBN:
(print) 9798350349405; 9798350349399
Learning-based point cloud compression has achieved tremendous progress in recent years. However, existing methods often train a single occupancy distribution predictor that is optimal for the entire training dataset in an amortized sense, and such a predictor struggles to handle point clouds with unique characteristics. In this work, we focus on lossless point cloud compression and propose a novel context-adaptive entropy model to achieve adaptive occupancy prediction. Specifically, given a baseline entropy model and a point cloud, we first integrate adapters into diverse feature extraction modules. These adapters are then trained to be specifically attuned to the input cloud. Finally, the trained adapter parameters are encoded and transmitted along with the point cloud bitstream, which allows us to recover the integrated model at the decoder. The experimental results demonstrate that our method enhances the performance of the entropy model, especially on data that conventional methods compress poorly.
ISBN:
(print) 9798350353006
This paper explores the possibility of extending the capability of pre-trained neural image compressors (e.g., adapting to new data or target bitrates) without breaking backward compatibility, the ability to decode bitstreams encoded by the original model. We refer to this problem as continual learning of image compression. Our initial findings show that baseline solutions, such as end-to-end fine-tuning, do not preserve the desired backward compatibility. To tackle this, we propose a knowledge replay training strategy that effectively addresses this issue. We also design a new model architecture that enables more effective continual learning than existing baselines. Experiments are conducted for two scenarios: data-incremental learning and rate-incremental learning. The main conclusion of this paper is that neural image compressors can be fine-tuned to achieve better performance (compared to their pre-trained version) on new data and rates without compromising backward compatibility. The code is publicly available online.
ISBN:
(print) 9798350351439; 9798350351422
Deep neural networks (DNNs) achieve state-of-the-art performance in video anomaly detection. However, the usage of DNNs is limited in practice due to their computational overhead, generally requiring significant resources and specialized hardware. Further, despite recent progress, current evaluation criteria of video anomaly detection algorithms are flawed, preventing meaningful comparisons among algorithms. In response to these challenges, we propose (1) a compression-based technique referred to as Spatio-Temporal N-Gram Prediction by Partial Matching (STNG PPM) and (2) simple modifications to current evaluation criteria for improved interpretation and broader applicability across algorithms. STNG PPM does not require specialized hardware, has few parameters to tune, and is competitive with DNNs on multiple benchmark data sets in video anomaly detection.
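The general idea behind compression-based anomaly detection can be sketched as follows: fit a context model to normal data and use the code length of new data under that model as the anomaly score. The sketch is a simplified fixed-order n-gram model with add-one smoothing, not PPM with escape symbols and not the authors' STNG PPM. The class name, the integer symbol alphabet, and the assumption that video has already been quantized into a symbol sequence are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class NGramModel:
    def __init__(self, order=2, alphabet_size=4):
        self.order = order
        self.alphabet_size = alphabet_size
        # Maps a context (tuple of preceding symbols) to next-symbol counts.
        self.counts = defaultdict(Counter)

    def train(self, seq):
        # Count which symbol follows each context in normal data.
        for i in range(self.order, len(seq)):
            ctx = tuple(seq[i - self.order:i])
            self.counts[ctx][seq[i]] += 1

    def code_length_bits(self, seq):
        # Ideal code length: sum of -log2 p(symbol | context).
        bits = 0.0
        for i in range(self.order, len(seq)):
            ctx = tuple(seq[i - self.order:i])
            seen = self.counts.get(ctx, Counter())
            total = sum(seen.values())
            # Add-one smoothing keeps unseen symbols at nonzero probability.
            p = (seen[seq[i]] + 1) / (total + self.alphabet_size)
            bits -= math.log2(p)
        return bits

model = NGramModel(order=2, alphabet_size=4)
model.train([0, 1, 2, 3] * 50)                           # "normal" pattern
normal_score = model.code_length_bits([0, 1, 2, 3] * 5)
anomaly_score = model.code_length_bits([3, 2, 1, 0] * 5)  # unseen contexts
```

Sequences that the model compresses poorly receive a high score and can be flagged as anomalous by thresholding; how the threshold is chosen is outside the scope of this sketch.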
ISBN:
(print) 9798350390155; 9798350390162
Compared to traditional image compression methods, learned image compression (LIC) methods have demonstrated increasingly superior rate-distortion performance. However, LIC networks are often regarded as black boxes that still lack a theoretical understanding. Sparse coding provides sparse and interpretable modeling for analyzing or synthesizing natural images in various signal and image processing applications. Therefore, we introduce convolutional sparse coding (CSC) into the transform network to enhance the interpretability of LIC methods. In this paper, we first employ CSC layers to give the LIC network a theoretical model, and adopt a weight sharing strategy in the encoder-decoder pair together with an attention mechanism to balance complexity and performance. Additionally, we analyze the model's robustness against input perturbations and consider the impact of the sparsity trade-off parameter in the CSC layer optimization process. Experimental results demonstrate that our method achieves performance comparable to the corresponding baseline while being more robust.
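For orientation, the standard convolutional sparse coding problem can be written as below. The notation is an assumption, since the abstract gives no formulas: $x$ is the input signal, $d_k$ are the convolutional filters, $z_k$ the sparse feature maps, and $\lambda$ the sparsity trade-off parameter. The second line is one iteration of the classical ISTA solver with step size $\eta$, where $D$ denotes the linear operator $z \mapsto \sum_k d_k * z_k$. Whether the paper's CSC layers unfold exactly this iteration is not stated.

```latex
\min_{\{z_k\}} \; \frac{1}{2} \Bigl\| x - \sum_{k=1}^{K} d_k * z_k \Bigr\|_2^2
  + \lambda \sum_{k=1}^{K} \| z_k \|_1

z^{(t+1)} = S_{\eta\lambda}\!\bigl( z^{(t)} + \eta\, D^{\top} ( x - D z^{(t)} ) \bigr),
\qquad
S_{\theta}(u) = \operatorname{sign}(u)\,\max(|u| - \theta,\, 0)
```

A larger $\lambda$ raises the soft-threshold $\eta\lambda$ and therefore yields sparser feature maps at the cost of reconstruction accuracy, which is the trade-off the abstract refers to.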
ISBN:
(print) 9798331529543; 9798331529550
Video-based point cloud compression (V-PCC) converts dynamic point cloud data into video sequences and uses traditional video codecs for efficient encoding. However, this lossy compression scheme introduces artifacts that degrade the color attributes of the data. This paper introduces a framework designed to enhance the color quality of V-PCC compressed point clouds. We propose the lightweight de-compression Unet (LDC-Unet), a 2D neural network, to optimize the projection maps generated during V-PCC encoding. The optimized 2D maps are then back-projected to 3D space to enhance the corresponding point cloud attributes. Additionally, we introduce a transfer learning strategy and develop a customized natural image dataset for the initial training. The model is then fine-tuned using the projection maps of the compressed point clouds. The whole strategy effectively addresses the scarcity of point cloud training data. Our experiments, conducted on the public 8i voxelized full bodies long sequences (8iVSLF) dataset, demonstrate the effectiveness of our proposed method in improving the color quality.
Data compression is critical in modern technological systems, enhancing memory storage efficiency, reducing transmission loads, improving device performance, and advancing data processing methodologies. In signal proc...