Search results - Inner Mongolia University Library

On the implementation of minimum redundancy prefix codes

IEEE TRANSACTIONS ON COMMUNICATIONS, 1997, vol. 45, no. 10, pp. 1200-1207

Authors: Moffat, A; Turpin, A. Department of Computer Science, University of Melbourne, Parkville, VIC, Australia

Minimum redundancy coding (also known as Huffman coding) is one of the enduring techniques of data compression. Many efforts have been made to improve the efficiency of minimum redundancy coding, the majority based on the use of improved representations for explicit Huffman trees. In this paper, we examine how minimum redundancy coding can be implemented efficiently by divorcing coding from a code tree, with emphasis on the situation when n is large, perhaps on the order of 10^6. We review techniques for devising minimum redundancy codes, and consider in detail how encoding and decoding should be accomplished. In particular, we describe a modified decoding method that allows improved decoding speed, requiring just a few machine operations per output symbol (rather than for each decoded bit), and uses just a few hundred bytes of memory above and beyond the space required to store an enumeration of the source alphabet.

Keywords: canonical code; Huffman code; length-limited code; minimum redundancy code; prefix code; text compression

Decoding prefix codes

SOFTWARE-PRACTICE & EXPERIENCE, 2006, vol. 36, no. 15, pp. 1687-1710

Authors: Liddell, Mike; Moffat, Alistair. Univ Melbourne, Dept Comp Sci & Software Engn, Melbourne, Vic 3010, Australia

Minimum-redundancy prefix codes have been a mainstay of research and commercial compression systems since their discovery by David Huffman more than 50 years ago. In this experimental evaluation we compare techniques for decoding minimum-redundancy codes, and quantify the relative benefits of recently developed restricted codes that are designed to accelerate the decoding process. We find that table-based decoding techniques offer fast operation, provided that the size of the table is kept relatively small, and that approximate coding techniques can offer higher decoding rates than Huffman codes with varying degrees of loss of compression effectiveness. Copyright (c) 2006 John Wiley & Sons, Ltd.

Keywords: Huffman code; minimum-redundancy code; prefix code; canonical code; table-based decoding
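
The table-based decoding mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the simplest variant, a single lookup table indexed by the next `max_len` bits, and the names `build_table` and `decode` are hypothetical. The table has 2^max_len entries, which is why the abstract's caveat about keeping the table small matters in practice.

```python
def build_table(codes):
    """Build a full lookup table from a prefix code.

    codes: dict mapping symbol -> codeword as a string of '0'/'1'.
    Returns (table, max_len), where table[i] = (symbol, codeword_length)
    for every max_len-bit window i that starts with that codeword.
    """
    max_len = max(len(c) for c in codes.values())
    table = [None] * (1 << max_len)
    for symbol, code in codes.items():
        pad = max_len - len(code)
        start = int(code, 2) << pad
        # Every window whose leading bits equal `code` maps to this symbol.
        for i in range(start, start + (1 << pad)):
            table[i] = (symbol, len(code))
    return table, max_len


def decode(bits, table, max_len):
    """Decode a '0'/'1' string with one table lookup per output symbol."""
    out = []
    pos = 0
    while pos < len(bits):
        # Pad the final window with zeros so it is always max_len bits wide.
        window = bits[pos:pos + max_len].ljust(max_len, "0")
        symbol, length = table[int(window, 2)]
        out.append(symbol)
        pos += length
    return out


codes = {"a": "0", "b": "10", "c": "110", "d": "111"}
table, max_len = build_table(codes)
print(decode("010110111", table, max_len))
```

Each loop iteration emits one symbol regardless of codeword length, at the cost of a table whose size grows exponentially with the longest codeword.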

Housekeeping for prefix coding

IEEE TRANSACTIONS ON COMMUNICATIONS, 2000, vol. 48, no. 4, pp. 622-628

Authors: Turpin, A; Moffat, A. Univ Melbourne, Dept Comp Sci & Software Engn, Parkville, Vic 3052, Australia; Univ Melbourne, Dept Comp Sci & Software Engn, Parkville, Vic 3010, Australia

We consider the problem of constructing and transmitting the prelude for Huffman coding. With careful organization of the required operations and an appropriate representation for the prelude, it is possible to make semistatic coding efficient even when S, the size of the source alphabet, is of the same magnitude as m, the length of the message being coded. The proposed structures are of direct relevance in applications that mimic one-pass operation through the use of semistatic compression on a block-by-block basis.

Keywords: adaptive coding; canonical code; Huffman code; minimum-redundancy code; prefix code

Open Babel: An open chemical toolbox

JOURNAL OF CHEMINFORMATICS, 2011, vol. 3, no. 1, pp. 1-14

Authors: O'Boyle, Noel M.; Banck, Michael; James, Craig A.; Morley, Chris; Vandermeersch, Tim; Hutchison, Geoffrey R. Univ Pittsburgh, Dept Chem, Pittsburgh, PA 15217, USA; Univ Coll Cork, Analyt & Biol Chem Res Facil, Cork, Ireland; Tech Univ Munich, Dept Chem, D-85747 Garching, Germany; eMolecules Inc, Solana Beach, CA 92075, USA

Background: A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software along with a lack of vendor-neutral formats. Results: We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license from http://***.

Keywords: File Format; Atom Type; Chemical Markup Language; canonical code; Bond Connectivity

Efficient Algorithms for Association Finding and Frequent Association Pattern Mining

15th International Semantic Web Conference (ISWC)

Authors: Cheng, Gong; Liu, Daxin; Qu, Yuzhong. Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China

ISBN: (print) 9783319465234; 9783319465227

Finding associations between entities is a common information need in many areas. It has been facilitated by the increasing amount of graph-structured data on the Web describing relations between entities. In this paper, we define an association connecting multiple entities in a graph as a minimal connected subgraph containing all of them. We propose an efficient graph search algorithm for finding associations, which prunes the search space by exploiting distances between entities computed based on a distance oracle. Having found a possibly large group of associations, we propose to mine frequent association patterns as a conceptual abstract summarizing notable subgroups to be explored, and present an efficient mining algorithm based on canonical codes and partitions. Extensive experiments on large, real RDF datasets demonstrate the efficiency of the proposed algorithms.

Keywords: Association finding; canonical code; Distance oracle; Frequent association pattern mining; Graph search

Space-efficient Huffman codes revisited

INFORMATION PROCESSING LETTERS, 2023, vol. 179

Authors: Grabowski, Szymon; Koppl, Dominik. Lodz Univ Technol, Inst Appl Comp Sci, Al Politech 11, PL-90924 Lodz, Poland; Tokyo Med & Dent Univ, M&D Data Sci Ctr, Tokyo 1138510, Japan

A canonical Huffman code is an optimal prefix-free compression code whose codewords enumerated in the lexicographical order form a list of binary words in non-decreasing lengths. Gagie et al. (2015) gave a representation of this coding capable of encoding and decoding a symbol in constant worst-case time. It uses $\sigma \lg \ell_{\max} + o(\sigma) + O(\ell_{\max}^2)$ bits of space, where $\sigma$ and $\ell_{\max}$ are the alphabet size and maximum codeword length, respectively. We refine their representation to reduce the space complexity to $\sigma \lg \ell_{\max} (1 + o(1))$ bits while preserving the constant encode and decode times. Our algorithmic idea can be applied to any canonical code. (c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the

Keywords: Data structures; Data compression; Huffman code; canonical code; Compact representation
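
The definition of a canonical code given in the abstract above can be made concrete with a short Python sketch. It illustrates the textbook assignment of canonical codewords from codeword lengths, not the compact representation the paper proposes. The function name `canonical_codes` is hypothetical, and the input lengths are assumed to satisfy the Kraft inequality.

```python
def canonical_codes(lengths):
    """Assign canonical codewords from codeword lengths.

    lengths: dict mapping symbol -> codeword length (assumed to satisfy
    the Kraft inequality). Returns dict symbol -> '0'/'1' string.
    """
    if not lengths:
        return {}
    # Order symbols by length, breaking ties by symbol, so that codewords
    # listed in lexicographic order have non-decreasing lengths.
    ordered = sorted(lengths.items(), key=lambda kv: (kv[1], kv[0]))
    codes = {}
    code = 0
    prev_len = ordered[0][1]
    for symbol, length in ordered:
        # Append zeros whenever the codeword length increases.
        code <<= (length - prev_len)
        codes[symbol] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


print(canonical_codes({"a": 1, "b": 2, "c": 3, "d": 3}))
```

Because the codewords follow from the lengths alone, only the length of each symbol's codeword needs to be stored or transmitted, which is what makes compact representations of this kind possible.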

Computing Ka and Ks with a consideration of unequal transitional substitutions

BMC EVOLUTIONARY BIOLOGY, 2006, vol. 6, no. 1, p. 44

Authors: Zhang, Zhang; Li, Jun; Yu, Jun. Chinese Acad Sci, Inst Comp Technol, Beijing 100080, Peoples R China; Chinese Acad Sci, Beijing Genom Inst, Beijing 101300, Peoples R China; Chinese Acad Sci, Grad Sch, Beijing 100039, Peoples R China; Zhejiang Univ, James D Watson Inst Genome Sci, Hangzhou Genom Inst, Key Lab Genom Bioinformat Zhejiang Prov, Hangzhou 310027, Peoples R China

Background: Approximate methods for estimating nonsynonymous and synonymous substitution rates (Ka and Ks) among protein-coding sequences have adopted different mutation (substitution) models. In the past two decades, several methods have been proposed but they have not considered unequal transitional substitutions (between the two purines, A and G, or the two pyrimidines, T and C) that become apparent when the sequence data to be compared are vast and significantly diverged. Results: We propose a new method (MYN), a modified version of the Yang-Nielsen algorithm (YN), for evolutionary analysis of protein-coding sequences in general. MYN adopts the Tamura-Nei Model that considers the difference among rates of transitional and transversional substitutions as well as factors in codon frequency bias. We evaluate the performance of MYN by comparing it to other methods, especially to YN, and show that MYN has minimal deviations when parameters vary within normal ranges defined by empirical data. Conclusion: Our comparative results deriving from consistency analysis, computer simulations and authentic datasets indicate that ignoring unequal transitional rates may lead to serious biases and that MYN performs well in most of the tested cases. These results also suggest that acquisitions of reliable synonymous and nonsynonymous substitution rates primarily depend on less biased estimates of transition/transversion rate ratio.

Keywords: Codon Position; Transition Probability Matrix; Codon Frequency; Transitional Rate; canonical code
