A generalization of compression via substring enumeration (CSE) to k-th order Markov sources with a finite alphabet is proposed, and an upper bound on the codeword length of the proposed method is presented. We analyze the worst-case maximum redundancy of CSE for k-th order Markov sources with a finite alphabet. The compression ratio of the proposed method converges asymptotically to the optimal one for k-th order Markov sources with a finite alphabet as the length n of the source string tends to infinity.
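As a rough illustration of the benchmark that such a compression ratio is measured against, the sketch below computes the empirical k-th order conditional entropy of a string, a plug-in estimate of the entropy rate of a k-th order Markov source. It is not the CSE encoder; the function name and the choice of non-circular contexts are assumptions made only for this example.

```python
from collections import Counter
from math import log2


def empirical_conditional_entropy(s: str, k: int) -> float:
    """Empirical k-th order conditional entropy of s, in bits per symbol.

    Hypothetical helper for illustration: contexts are the k symbols
    preceding each position (no wrap-around), so the first k symbols are
    used only as context.
    """
    if k < 0 or len(s) <= k:
        raise ValueError("need k >= 0 and len(s) > k")
    context_counts = Counter()
    pair_counts = Counter()
    for i in range(k, len(s)):
        ctx = s[i - k:i]
        context_counts[ctx] += 1
        pair_counts[(ctx, s[i])] += 1
    total = len(s) - k
    entropy = 0.0
    for (ctx, _symbol), count in pair_counts.items():
        # Weight of the (context, symbol) pair times the surprisal of the
        # symbol given its context.
        entropy -= (count / total) * log2(count / context_counts[ctx])
    return entropy


if __name__ == "__main__":
    print(empirical_conditional_entropy("abababab", 1))
    print(empirical_conditional_entropy("aabb", 0))
```

A universal code for the k-th order Markov class would be expected to approach this quantity, per symbol, on long strings drawn from such a source.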
ISBN (print): 9780769543529
In this paper, we investigate the redundancy in the universal compression of finite-length smooth parametric sources. Rissanen demonstrated that for a smooth parametric source with $d$ unknown parameters, the expected redundancy for regular codes is asymptotically given by $\frac{d}{2}\log n + o(\log n)$ for almost all sources. Clarke and Barron derived the "minimax expected redundancy" for memoryless sources, which is the maximum redundancy of the best code over the space of source parameters. However, the minimax redundancy is attained at a particular parameter value and therefore provides little insight into the redundancy at other source parameters. We have derived a lower bound on the compression of finite-length memoryless sequences using a probabilistic treatment. In this paper, we extend our analysis to smooth parametric sequences. We focus on two-part codes with an asymptotic $O(1)$ extra redundancy. We also require that the length function be regular, which is not restrictive since all codes that we know of are regular. We derive a lower bound on the probability that the source is compressed with redundancy greater than any redundancy level $R_0$, i.e., we find a lower bound on $\mathbb{P}[R_n(l_{2p},\theta)>R_0]$, where $R_n(l_{2p},\theta)$ is the redundancy in the compression of a parametric sequence of length $n$ using a two-part length function $l_{2p}$ for the source parameter $\theta$.
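To give a feel for the size of the leading term $\frac{d}{2}\log n$ at finite lengths, the following sketch evaluates it numerically. The function name is hypothetical, the logarithm is assumed to be base 2 (so the result is in bits), and the $o(\log n)$ term is simply dropped.

```python
import math


def leading_redundancy_bits(d: int, n: int) -> float:
    """Leading term (d/2) * log2(n) of the expected redundancy, in bits.

    Illustrative only: base-2 logarithm is an assumption, and the
    o(log n) correction from the asymptotic statement is ignored.
    """
    if d < 0 or n < 1:
        raise ValueError("need d >= 0 and n >= 1")
    return 0.5 * d * math.log2(n)


if __name__ == "__main__":
    for n in (64, 1024, 2 ** 20):
        total = leading_redundancy_bits(d=1, n=n)
        # Total overhead grows like log n, so the per-symbol overhead
        # shrinks as the sequence gets longer.
        print(n, total, total / n)
```

For short sequences the per-symbol overhead is far from negligible, which is why the finite-length regime is treated separately from the asymptotic one.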
ISBN (print): 9781424417650
A generalized latent semantic analysis framework using a universal source coding algorithm for content-based image retrieval is proposed. Using the multidimensional incremental parsing algorithm, a multidimensional extension of the Lempel-Ziv data compression method, a given image is compressed at a moderate bitrate while a dictionary that implicitly embeds source statistics is constructed. Instead of concatenating the dictionaries of all images in a corpus, we sequentially compress the images using the previously constructed dictionary and end up with a visual lexicon that contains the least number of visual words covering all the images in the corpus. From the latent semantic analysis of the co-occurrence pattern of visual words over the images, a similarity between a given query and an image from the corpus is measured. An application of the proposed technique to a database of 20,000 natural scene images demonstrates that the performance of the proposed system compares favorably with that of existing approaches.
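The idea of parsing several inputs against one growing dictionary can be sketched in one dimension with LZ78-style incremental parsing. This is a simplified stand-in, not the multidimensional algorithm of the paper; the function name and the use of character strings in place of image patches are assumptions.

```python
def incremental_parse(sequence, dictionary=None):
    """LZ78-style incremental parsing of a 1-D sequence.

    Each new phrase is the shortest prefix of the remaining input that is
    not yet in the dictionary. Passing in an existing dictionary lets a
    later sequence reuse the phrases learned from earlier ones.
    """
    if dictionary is None:
        dictionary = {}
    phrases = []
    current = ()
    for symbol in sequence:
        current += (symbol,)
        if current not in dictionary:
            dictionary[current] = len(dictionary) + 1  # next free index
            phrases.append(current)
            current = ()
    if current:
        # Input ended inside a phrase that is already in the dictionary.
        phrases.append(current)
    return phrases, dictionary


if __name__ == "__main__":
    phrases, lexicon = incremental_parse("abababab")
    print(["".join(p) for p in phrases])  # ['a', 'b', 'ab', 'aba', 'b']
    # Second input parsed against the same lexicon: only one new phrase.
    phrases2, lexicon = incremental_parse("abab", lexicon)
    print(["".join(p) for p in phrases2], len(lexicon))
```

Because the second input is parsed against the existing dictionary, it adds only the phrases it needs, which mirrors the way a shared lexicon stays smaller than a concatenation of per-input dictionaries.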
ISBN (print): 9781665421607; 9781665421591
We consider a novel variant of lossy coding in which the distortion measure is revealed only to the encoder and only at run-time, as well as an extension of it in which the distortion constraint is also revealed at run-time. Two forms of rate redundancy are used to analyze the performance, and achievability results of both a pointwise and minimax nature are demonstrated. One proof uses appropriate quantization of the space of distortion measures while another uses ideas from VC dimension and growth functions.
ISBN (print): 9781665421607; 9781665421591
We show the existence of universal, variable-rate rate-distortion codes that meet the distortion constraint almost surely and approach the rate-distortion function uniformly with respect to an unknown source distribution and a distortion measure that is only revealed to the encoder and only at run-time. If the convergence only needs to be uniform with respect to the source distribution and not the distortion measure, then we provide an explicit bound on the minimax rate of convergence. Our construction combines conventional random coding with a zero-rate uncoded transmission scheme. The proof uses exact asymptotics from large deviations, acceptance-rejection sampling, the VC dimension of distortion measures, and the identification of an explicit, code-independent, finite-blocklength quantity, which converges to the rate-distortion function, that controls the performance of the best codes.
ISBN (print): 0780324536
We analyze the relationship between a Minimum Description Length (MDL) estimator (posterior mode) and a Bayes estimator for exponential families. We show the following results concerning these estimators: a) both the Bayes estimator with Jeffreys prior and the MDL estimator with the uniform prior with respect to the expectation parameter are nearly equivalent to a bias-corrected maximum-likelihood estimator with respect to the canonical parameter; b) both the Bayes estimator with the uniform prior with respect to the canonical parameter and the MDL estimator with Jeffreys prior are nearly equivalent to the maximum-likelihood estimator (MLE), which is unbiased with respect to the expectation parameter. Together, these results suggest a striking symmetry between the two estimators, since the canonical and the expectation parameters of an exponential family form a dual pair from the point of view of information geometry. Moreover, a) implies that we can approximate a Bayes estimator with Jeffreys prior simply by deriving an appropriate MDL estimator or an appropriate bias-corrected MLE. This is important because a Bayes mixture density with Jeffreys prior is known to be maximin in universal coding [7].
ISBN (print): 9781467377041
We consider the universal source coding problem for first-order stationary, irreducible and aperiodic Markov sources at short blocklengths. Achievability is derived based on the Type Size Code, a previously introduced algorithm for universal compression of memoryless sources at finite blocklengths, which encodes strings based on type class size. We derive the third-order asymptotic coding rate of the Type Size Code for this model class. We also present a converse on the third-order coding rate for the general class of fixed-to-variable codes and show the optimality of Type Size Codes for such Markov sources.
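The quantity that such a code is built on can be illustrated for the memoryless case, where the type of a string is its vector of symbol counts and the type class size is a multinomial coefficient. The sketch below only computes that size; it is not the encoder, the function name is hypothetical, and the Markov case (where the type is based on transition counts) is not covered.

```python
from collections import Counter
from math import factorial


def type_class_size(string: str) -> int:
    """Number of strings with the same symbol counts as `string`.

    Memoryless (zeroth-order) type only: n! / (n_1! * n_2! * ...),
    where n_i is the number of occurrences of the i-th symbol.
    """
    size = factorial(len(string))
    for count in Counter(string).values():
        size //= factorial(count)  # each division is exact
    return size


if __name__ == "__main__":
    # Strings in small type classes are the ones a size-ordered code can
    # describe with few bits.
    for s in ("aaaa", "aaab", "aabb"):
        print(s, type_class_size(s))
```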
We propose a variation of the Context Tree Weighting algorithm for tree sources, modified so that the growth of the context resembles Lempel-Ziv parsing. We analyze this algorithm, give a concise upper bound on the individual redundancy for any tree source, and prove the asymptotic optimality of the data compression rate for any stationary and ergodic source.
The universal lossless source coding problem is one of the most important problems in communication systems. The aim of source coding is to compress data in order to reduce costs in digital communication. Traditional universal source coding schemes are usually designed for stationary sources. Recently, some universal codes for nonstationary sources have been proposed. Independent piecewise identically distributed (i.p.i.d.) sources are simple nonstationary sources whose parameter changes discontinuously. In this paper, we consider a new class of i.p.i.d. sources, and we prove that Bayes codes minimize the mean redundancy when the parameter transition pattern is known and the parameter is unknown.
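A minimal sketch of the setting, assuming a binary alphabet and a Jeffreys (Beta(1/2, 1/2)) prior, neither of which is stated in the abstract: when the change points are known, a sequential Bayes mixture code can restart its counts at each change point and code every segment separately. The function name and the change-point representation are hypothetical.

```python
from math import log2


def bayes_code_length(bits, change_points=()):
    """Ideal code length, in bits, of a sequential Bayes mixture code.

    Assumptions for illustration: binary symbols (0 or 1), Jeffreys prior
    on the Bernoulli parameter, and `change_points` holding the indices at
    which the parameter is known to change (counts restart there).
    """
    boundaries = set(change_points)
    zeros = ones = 0
    length = 0.0
    for i, bit in enumerate(bits):
        if i in boundaries:
            zeros = ones = 0  # parameter changed: forget the old segment
        # Predictive probability of a 1 under the Beta(1/2, 1/2) prior.
        p_one = (ones + 0.5) / (zeros + ones + 1.0)
        p = p_one if bit == 1 else 1.0 - p_one
        length -= log2(p)
        if bit == 1:
            ones += 1
        else:
            zeros += 1
    return length


if __name__ == "__main__":
    data = [0] * 8 + [1] * 8
    print(bayes_code_length(data))                     # change point ignored
    print(bayes_code_length(data, change_points=(8,)))  # change point used
```

On this toy input, using the known change point gives a shorter code length than treating the whole sequence as stationary.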