Average case universal compression of independent and identically distributed (i.i.d.) sources is investigated, where the source alphabet is large, and may be sublinear in size or even larger than the compressed data sequence length n. In particular, the well-known results, including Rissanen's strongest sense lower bound, for fixed-size alphabets are extended to the case where the alphabet size k is allowed to grow with n. It is shown that as long as k = o(n), instead of the coding cost in the fixed-size alphabet case of 0.5 log n extra code bits for each one of the k - 1 unknown probability parameters, the cost is now 0.5 log(n/k) code bits for each unknown parameter. This result is shown to be the lower bound in the minimax and maximin senses, as well as for almost every source in the class. Achievability of this bound is demonstrated with two-part codes based on quantization of the maximum-likelihood (ML) probability parameters, as well as by using the well-known Krichevsky-Trofimov (KT) low-complexity sequential probability estimates. For very large alphabets, k >> n, it is shown that an average minimax and maximin bound on the redundancy is essentially (to first order) log(k/n) bits per symbol. This bound is shown to be achievable both with two-part codes and with a sequential modification of the KT estimates. For k = Theta(n), the redundancy is Theta(1) bits per symbol. Finally, sequential codes are designed for coding sequences in which only m < min{k, n} alphabet symbols occur.
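The KT sequential estimate mentioned in the abstract assigns the next symbol a probability proportional to its count plus 1/2. A minimal sketch (function and variable names are my own, for illustration only):

```python
# Krichevsky-Trofimov (KT) sequential probability estimate: after n symbols
# with per-symbol counts n_a over an alphabet of size k, the probability that
# the next symbol is a is (n_a + 1/2) / (n + k/2).
from collections import Counter
from math import log2

def kt_codelength(seq, alphabet):
    """Ideal code length, -sum_i log2 P_KT(x_i | x^{i-1}), in bits."""
    k = len(alphabet)
    counts = Counter()
    n = 0
    bits = 0.0
    for sym in seq:
        p = (counts[sym] + 0.5) / (n + k / 2)
        bits += -log2(p)
        counts[sym] += 1
        n += 1
    return bits

# A biased binary source: the KT code length is close to n*H plus roughly
# 0.5*log2(n) redundancy per free parameter.
print(kt_codelength("a" * 90 + "b" * 10, "ab"))
```

For a binary alphabet this reduces to the familiar (n_a + 1/2)/(n + 1) add-half rule.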
Consider approximate (lossy) matching of a source string ~ P, with a random codebook generated from reproduction distribution Q, at a specified distortion d. Recent work determined the minimum coding rate R1 = R(P, Q, d) for this setting. We observe that for large word length, with high probability, the matching codeword is typical with a distribution Q1 which is different from Q. If a new random codebook is generated ~ Q1, then the source string will favor codewords which are typical with a new distribution Q2, resulting in minimum coding rate R2 = R(P, Q1, d), and so on. We show that the sequences of distributions Q1, Q2, ..., and rates R1, R2, ..., generated by this procedure converge to an optimum reproduction distribution Q* and the rate-distortion function R(P, d), respectively. We also derive a fixed rate-distortion slope version of this natural type selection process. In the latter case, an iteration of the process stochastically simulates an iteration of the Blahut-Arimoto (BA) algorithm for rate-distortion function computation (without recourse to prior knowledge of the underlying source distribution). To strengthen these limit statements, we also characterize the steady-state error of these procedures when iterating at a finite string length. Implications of the main results provide fresh insights into the workings of lossy variants of the Lempel-Ziv algorithm for adaptive compression.
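The fixed-slope BA iteration that the natural type selection process stochastically simulates can be sketched as follows (a toy deterministic version with hypothetical names; P is the source distribution, Q the current reproduction distribution, d the distortion matrix, s the slope):

```python
# One fixed-slope Blahut-Arimoto step for rate-distortion computation.
import numpy as np

def ba_step(P, Q, d, s):
    """Map Q -> Q' via the fixed-slope BA update."""
    # Test channel q(y|x) proportional to Q(y) * exp(-s * d(x, y)).
    W = Q[None, :] * np.exp(-s * d)          # shape (|X|, |Y|)
    W /= W.sum(axis=1, keepdims=True)
    # New reproduction marginal Q'(y) = sum_x P(x) q(y|x).
    return P @ W

# Binary source with Hamming distortion: iterating converges to the
# optimal reproduction distribution Q* for this slope.
P = np.array([0.8, 0.2])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
Q = np.array([0.5, 0.5])
for _ in range(200):
    Q = ba_step(P, Q, d, s=2.0)
print(Q)
```

The abstract's process replaces the exact expectation over P with the empirical outcome of codebook matching at finite string length, which is what introduces the steady-state error it characterizes.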
A universal variable-to-fixed length algorithm for binary memoryless sources which converges to the entropy of the source at the optimal rate is known. We study the problem of universal variable-to-fixed length coding for the class of Markov sources with finite alphabets. We give an upper bound on the performance of the code for large dictionary sizes and show that the code is optimal in the sense that no codes exist that have better asymptotic performance. The optimal redundancy is shown to be H log log M / log M, where H is the entropy rate of the source and M is the code size. This result is analogous to Rissanen's result for fixed-to-variable length codes. We investigate the performance of a variable-to-fixed coding method which does not need to store the dictionaries, either at the coder or the decoder. We also consider the performance of both these source codes on individual sequences. For individual sequences, we bound the performance in terms of the best code length achievable by a class of coders. All the codes that we consider are prefix-free and complete.
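Variable-to-fixed codes parse the source into a dictionary of variable-length words, each encoded with a fixed log M bits. The classical Tunstall construction for a known memoryless source (a standard textbook building block, not the paper's universal Markov-source code) illustrates the dictionary: repeatedly split the most probable leaf until M words exist.

```python
# Tunstall dictionary construction for a binary memoryless source with
# P(1) = p. The resulting word set is prefix-free and complete.
import heapq

def tunstall(p, M):
    """Return a complete prefix-free dictionary of M binary strings."""
    heap = [(-(1 - p), "0"), (-p, "1")]   # max-heap on word probability
    heapq.heapify(heap)
    while len(heap) < M:                  # each split adds one leaf
        q, w = heapq.heappop(heap)
        q = -q
        heapq.heappush(heap, (-(q * (1 - p)), w + "0"))
        heapq.heappush(heap, (-(q * p), w + "1"))
    return sorted(w for _, w in heap)

print(tunstall(0.2, 4))
```

Splitting the most probable word first keeps the word probabilities as balanced as possible, which is what drives the per-bit redundancy down as M grows.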
We generalize B.S. Clarke and A.R. Barron's analysis of the Bayes method to FSMX sources. The FSMX source considered here is specified by the set of all states and its parameter value. First, we show the asymptotic codelengths of individual sequences under the Bayes codes for FSMX sources. Second, we show the asymptotic expected codelengths. The Bayesian posterior density and the maximum-likelihood estimator satisfy asymptotic normality for the finite ergodic Markov source, and this is the key to our analysis.
Exponential error bounds achievable by universal coding and decoding are derived for frame-asynchronous discrete memoryless multiple-access channels with two senders, via the method of subtypes, a refinement of the method of types. An empirical entropy decoder is employed. A key tool is an improved packing lemma that overcomes the technical difficulty caused by codeword repetitions via a new induction-based argument. The asymptotic form of the bounds admits numerical evaluation. This demonstrates that error exponents achievable by synchronous transmission can be exceeded via controlled asynchronism, i.e., a deliberate shift of the codewords.
We derive the asymptotics of the redundancy of Bayes rules for Markov chains of fixed order over a finite alphabet, extending the work of Barron and Clarke on independent and identically distributed (i.i.d.) sources. The asymptotics are derived when the actual source lies in the class of phi-mixing sources, which strictly includes Markov chains. These results can be used to derive minimax asymptotic rates of convergence for universal codes when a Markov chain of fixed order is used as a model.
The method of types is one of the key technical tools in Shannon Theory, and this tool is valuable also in other fields. In this paper, some key applications will be presented in sufficient detail enabling an interested nonspecialist to gain a working knowledge of the method, and a wide selection of further applications will be surveyed. These range from hypothesis testing and large deviations theory through error exponents for discrete memoryless channels and capacity of arbitrarily varying channels to multiuser problems. While the method of types is suitable primarily for discrete memoryless models, its extensions to certain models with memory will also be discussed.
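The method's core counting facts are easy to state concretely: length-n sequences over a k-ary alphabet fall into only C(n + k - 1, k - 1) types (polynomial in n), while each type class has multinomial size n! / (n_1! ... n_k!), roughly 2^{nH} for empirical entropy H. A short illustration:

```python
# Basic method-of-types quantities.
from math import comb, factorial

def num_types(n, k):
    """Number of empirical distributions of length-n k-ary sequences."""
    return comb(n + k - 1, k - 1)

def type_class_size(counts):
    """Number of sequences sharing the type with the given symbol counts."""
    n = sum(counts)
    size = factorial(n)
    for c in counts:
        size //= factorial(c)
    return size

print(num_types(10, 2))          # 11 binary types of length 10
print(type_class_size([7, 3]))   # 10! / (7! 3!) = 120 sequences
```

Because the number of types grows only polynomially while type classes grow exponentially, probabilities and exponents can be analyzed type by type, which is the engine behind the applications the survey covers.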
We investigate a type of lossless source code called a grammar-based code, which, in response to any input data string x over a fixed finite alphabet, selects a context-free grammar G(x) representing x in the sense that x is the unique string belonging to the language generated by G(x). Lossless compression of x takes place indirectly via compression of the production rules of the grammar G(x). It is shown that, subject to some mild restrictions, a grammar-based code is a universal code with respect to the family of finite-state information sources over the finite alphabet. Redundancy bounds for grammar-based codes are established. Reduction rules for designing grammar-based codes are presented.
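A toy greedy pair-replacement scheme (in the spirit of grammar-based codes such as Re-Pair; this is my own sketch, not the paper's reduction rules) shows how a grammar G(x) generating exactly x can be built: repeatedly replace the most frequent adjacent pair with a fresh nonterminal.

```python
# Build a context-free grammar whose only derivation is x by greedy
# pair replacement.
from collections import Counter

def build_grammar(x):
    rules = {}
    seq = list(x)
    next_nt = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:                      # no pair repeats: done
            break
        nt = f"R{next_nt}"                # fresh nonterminal
        next_nt += 1
        rules[nt] = pair
        out, i = [], 0
        while i < len(seq):               # replace non-overlapping occurrences
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules                     # start production and rule set

print(build_grammar("abab"))
```

The compressed representation is then an encoding of the start production and the rules; repeated substrings are paid for once, which is the mechanism behind the universality result.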
In this paper, we characterize functions that simulate independent unbiased coin flips from independent coin flips of unknown bias. We call such functions randomizing. Our characterization of randomizing functions enables us to identify the functions that generate the largest average number of fair coin flips from a fixed number of biased coin flips. We show that these optimal functions are efficiently computable. Then we generalize the characterization, and we present a method to simulate an arbitrary rational probability distribution optimally (in terms of the average number of output digits) and efficiently (in terms of computational complexity) from outputs of many-faced dice of unknown distribution. We also study randomizing functions on exhaustive prefix-free sets.
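The simplest example of a randomizing function of this kind is the classical von Neumann procedure (shown here for illustration; the paper's optimal functions extract strictly more fair bits per biased flip): for a coin with unknown bias, the pairs 01 and 10 are equally likely, so mapping 01 to 0 and 10 to 1 while discarding 00 and 11 yields unbiased output bits.

```python
# Von Neumann's procedure: unbiased bits from a coin of unknown bias.
def von_neumann(bits):
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:               # 01 -> 0, 10 -> 1; 00 and 11 are discarded
            out.append(a)
    return out

print(von_neumann([0, 1, 1, 1, 1, 0, 0, 0]))  # -> [0, 1]
```

Its output rate is only p(1-p) fair bits per input flip, which is why characterizing and computing the optimal randomizing functions is of interest.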
Uniform quantization with dither, or lattice quantization with dither in the vector case, followed by a universal lossless source encoder (entropy coder), is a simple procedure for universal coding with distortion of a source that may take values in a continuous alphabet. The rate of this universal coding scheme is examined, and a general expression is derived for it. An upper bound for the redundancy of this scheme, defined as the difference between its rate and the minimal possible rate given by the rate-distortion function of the source, is derived. This bound holds for all distortion levels. Furthermore, a composite upper bound on the redundancy as a function of the quantizer resolution, which leads to a tighter bound in the high-rate (low-distortion) case, is presented.
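The scalar building block can be sketched in a few lines (a one-dimensional illustration with names of my own choosing): encoder and decoder share a dither Z uniform on (-Delta/2, Delta/2]; the encoder quantizes X + Z to the lattice Delta*Z and the decoder subtracts Z, so the reconstruction error is uniform and independent of X.

```python
# Uniform scalar quantization with subtractive dither.
import random

def dithered_quantize(x, delta, z):
    """Encoder: round x + z to the nearest lattice point delta * k."""
    return delta * round((x + z) / delta)

def reconstruct(q, z):
    """Decoder: subtract the shared dither."""
    return q - z

random.seed(0)
delta = 1.0
z = random.uniform(-delta / 2, delta / 2)   # dither shared by both sides
x = 0.3
x_hat = reconstruct(dithered_quantize(x, delta, z), z)
print(x, x_hat)   # error is at most delta / 2 in magnitude
```

In the full scheme, the quantizer index stream (not shown) is then fed to a universal entropy coder, whose rate is what the redundancy bounds above control.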