This paper provides a brief overview of the innovative problem of devising and implementing big OLAP data cube compression algorithms in column-oriented Cloud/Edge data infrastructures, an emerging need for next-gener...
ISBN: 9781665478939 (print)
Current methods which compress multisets at an optimal rate have computational complexity that scales linearly with alphabet size, making them too slow to be practical in many real-world settings. We show how to convert a compression algorithm for sequences into one for multisets, in exchange for an additional complexity term that is quasi-linear in sequence length. This allows us to compress multisets of independent and identically distributed symbols at an optimal rate, with computational complexity decoupled from the alphabet size. The key insight is to avoid encoding the multiset directly, and instead compress a proxy sequence, using a technique called "bits-back coding". We demonstrate the method experimentally on two tasks which are intractable with previous optimal-rate methods: compression of multisets of images and JavaScript Object Notation (JSON) files. Code for our experiments is available at https://***/facebookresearch/multiset-compression.
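To make the gap between sequence and multiset compression concrete, the following minimal sketch counts how many bits are spent describing the order of a sequence, which a multiset code does not need to pay for. It is not the authors' bits-back implementation; the function name `ordering_bits` is hypothetical, and the calculation relies only on the standard fact that a multiset of n symbols with multiplicities m_1, ..., m_k has n!/(m_1! ... m_k!) distinct orderings.

```python
from collections import Counter
from math import factorial, log2


def ordering_bits(symbols):
    """Bits needed to pick one ordering of `symbols` given only its multiset.

    This is log2 of the multinomial coefficient n! / (m_1! * ... * m_k!),
    i.e. the information an ideal multiset code can avoid storing.
    """
    counts = Counter(symbols)
    n_orderings = factorial(len(symbols))
    for multiplicity in counts.values():
        # Each division is exact because the running value stays a
        # multinomial-style integer at every step.
        n_orderings //= factorial(multiplicity)
    return log2(n_orderings)


# Hypothetical example input: 11 symbols with multiplicities 5, 2, 2, 1, 1.
saving = ordering_bits("abracadabra")
print(f"order information: {saving:.2f} bits")
```

A sequence in which all symbols are identical carries no order information, so the saving is zero; the saving is largest when all symbols are distinct.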
ISBN: 9798331541378 (print)
Data encoding is a key step in almost all applications of quantum computation, but one that is not very well understood in general. Extensively researched in quantum machine learning, it is pivotal to exploiting the speedups offered by quantum algorithms. Failure to efficiently represent data risks weakening or nullifying prospective quantum advantages. While naive methods, such as basis encoding, will almost always work, they are extremely inefficient in terms of qubit use, rendering them infeasible for near-term applications. We focus on data encoding for genomics, a deeply data-driven field with computationally intensive algorithms throughout, from the initial DNA sequencing to pangenome assembly and more. We first examine the compressibility of genomic data from an information-theoretic perspective and how compression affects the quantum state preparation complexity. Here, there are two opposing forces at work: compression reduces data size but at the cost of the beneficial structure within the data. We review some standard approaches to data encoding in the context of genomics data, considering both the preparation complexity, in terms of gate counts, qubit use, and depth, and the usefulness of the resulting states. We also present some new data encoding methods designed specifically for genomics that have low-depth preparation circuits while still maintaining the essential features the algorithms require.
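As a rough illustration of the two quantities this abstract contrasts, the sketch below computes the qubit cost of a naive basis encoding and the zero-order empirical entropy of a DNA string. Both function names and the fixed 2-bit-per-base code are assumptions made for illustration; the abstract does not specify the encodings or the compressibility measure the authors use.

```python
from collections import Counter
from math import log2

BITS_PER_BASE = 2  # assumption: fixed-length code over the alphabet A, C, G, T


def basis_encoding_qubits(seq):
    """Qubits used when every classical bit of the sequence gets its own qubit."""
    return BITS_PER_BASE * len(seq)


def empirical_entropy_bits(seq):
    """Zero-order empirical entropy in bits per base (ignores context)."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Hypothetical toy read; real genomic data would be far longer.
read = "ACGTACGTAAAA"
print(basis_encoding_qubits(read), "qubits for basis encoding")
print(f"{empirical_entropy_bits(read):.3f} bits per base (zero-order estimate)")
```

When the entropy estimate falls well below 2 bits per base, the sequence is compressible in this simple sense, although a zero-order estimate says nothing about the longer-range structure that the abstract notes compression may destroy.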
ISBN: 9781665478939 (print)
We propose a new structured pruning framework for compressing Deep Neural Networks (DNNs) with skip-connections, based on measuring the statistical dependency between hidden layers and predicted outputs. The dependence measure, defined by the energy statistics of hidden layers, serves as a model-free measure of information between the feature maps and the output of the network. The estimated dependence measure is subsequently used to prune a collection of redundant and uninformative layers. Extensive numerical experiments on various architectures show the efficacy of the proposed pruning approach, with performance competitive with state-of-the-art methods.
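One well-known energy statistic for measuring dependence is sample distance correlation. The sketch below computes it for two lists of scalars, as a stand-in for whatever measure the authors actually use, which the abstract does not specify. The function names are hypothetical, and real feature maps would be vectors compared with Euclidean distance rather than scalars compared with absolute difference.

```python
from math import sqrt


def _centered_distances(xs):
    """Pairwise |x_i - x_j| matrix with row, column, and grand means removed."""
    n = len(xs)
    d = [[abs(a - b) for b in xs] for a in xs]
    row_mean = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(xs, ys):
    """Sample distance correlation in [0, 1]; returns 0.0 if either input is constant."""
    n = len(xs)
    a = _centered_distances(xs)
    b = _centered_distances(ys)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    dvar_x = mean_product(a, a)
    dvar_y = mean_product(b, b)
    if dvar_x == 0 or dvar_y == 0:
        return 0.0
    # Clamp tiny negative values caused by floating-point rounding.
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))


# Hypothetical data: a layer activation and an output that depends on it linearly.
activation = [0.1, 0.4, 0.35, 0.8, 0.9]
output = [2 * v + 1 for v in activation]
print(distance_correlation(activation, output))
```

In a pruning setting, a layer whose features show low dependence with the network output would be a candidate for removal; the threshold and the exact ranking rule are design choices not given in the abstract.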
Training deep learning (DL) models often takes a significant amount of time and is thus typically performed on expensive GPUs to speed up the process. However, data loading has recently been identified as one of the m...
Multimedia compression has been a fundamental and significant research topic in industry for the past several decades, with sustained efforts to improve compression techniques. It is always a trade-off between size and qualit...
Semantic communications aim to convey precise meanings efficiently rather than transmitting bits accurately. Inspired by lossy compression with side information, we propose a communication scheme that transmits only t...
ISBN: 9781665493468 (print)
In video compression, coding efficiency is improved by reusing pixels from previously decoded frames via motion and residual compensation. We define two levels of hierarchical redundancy in video frames: 1) first-order: redundancy in pixel space, i.e., similarities in pixel values across neighboring frames, which is effectively captured using motion and residual compensation; and 2) second-order: redundancy in motion and residual maps due to smooth motion in natural videos. While most of the existing neural video coding literature addresses first-order redundancy, we tackle the problem of capturing second-order redundancy in neural video codecs via predictors. We introduce generic motion and residual predictors that learn to extrapolate from previously decoded data. These predictors are lightweight, and can be employed with most neural video codecs in order to improve their rate-distortion performance. Moreover, while RGB is the dominant colorspace in neural video coding literature, we introduce general modifications for neural video codecs to embrace the YUV420 colorspace and report YUV420 results. Our experiments show that using our predictors with a well-known neural video codec leads to 38% and 34% bitrate savings in the RGB and YUV420 colorspaces, respectively, measured on the UVG dataset.
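The idea of second-order redundancy can be sketched with a toy example: if motion is smooth, the next motion vector is close to an extrapolation of the previous two, so only a small correction needs to be coded. The predictor below is a hand-written constant-velocity rule used purely as a stand-in; the paper's predictors are learned, and the function names and sample values here are hypothetical.

```python
def extrapolate(prev2, prev1):
    """Constant-velocity guess for the next motion: prev1 + (prev1 - prev2)."""
    return [2 * b - a for a, b in zip(prev2, prev1)]


def prediction_residual(actual, predicted):
    """What still has to be coded once the prediction is subtracted."""
    return [x - p for x, p in zip(actual, predicted)]


# Hypothetical motion values (dx, dy) for one block over three decoded frames.
motion = [[1.0, 0.5], [2.0, 1.0], [3.0, 1.5]]

predicted = extrapolate(motion[0], motion[1])
residual = prediction_residual(motion[2], predicted)
print(predicted, residual)
```

Because the decoder has the same previously decoded motion, it can form the same prediction and add the transmitted correction back, so no extra side information is required for the prediction itself.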
To enhance the efficiency and accuracy of direction of arrival (DOA) estimation, a novel self-calibrating algorithm is introduced, which incorporates channel compression to address array amplitude-phase errors. By ini...
Convolutional neural networks (CNNs) possess strong expressive capabilities, but with the increasing size and cost of modern CNNs, deploying large-scale CNNs in resource-limited environments remains a challenge, makin...