A method is proposed for constructing the compact code for extended binary memoryless sources with probabilistic deviations in symbol appearance. An n-th order extension of a binary memoryless source generates many messages with identical appearance probabilities. If these messages are treated in groups, the degenerate merging operations of the Huffman algorithm produce no rearrangement of messages across the groups for low-entropy sources, that is, sources in which the appearance probability of the more probable symbol approaches arbitrarily close to 1. This situation is exploited in the paper. It is reported that the codeword lengths, the sets of codewords, the average code length, and the maximum code length of the compact code can be derived from simple equations without building a code tree. (C) 1999 Scripta Technica.
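The grouping idea can be sketched with a plain Huffman construction: in the n-th order extension, all messages with the same number of 1s share one probability, and for a strongly skewed source the code lengths fall out of the ordinary merge process. A minimal sketch (the paper's closed-form equations are not reproduced here; p = 0.95 and n = 4 are illustrative choices):

```python
import heapq
from itertools import product

def huffman_lengths(probs):
    """Standard binary Huffman: return the codeword length of each symbol."""
    # heap items: (probability, tiebreak counter, symbol indices in this subtree)
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tiebreak = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:           # every leaf in the merged subtree gets deeper
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

def extended_source(p, n):
    """Message probabilities of the n-th order extension of a binary
    memoryless source with P(0) = p; messages with the same number of
    0s share the same probability."""
    return [(p ** msg.count('0')) * ((1 - p) ** msg.count('1'))
            for msg in (''.join(bits) for bits in product('01', repeat=n))]

probs = extended_source(0.95, 4)     # low-entropy source, 4th-order extension
lengths = huffman_lengths(probs)
avg = sum(p * l for p, l in zip(probs, lengths))
print(avg, max(lengths))
```

Since the dominant message '0000' carries more than half the probability mass, it always ends up with a length-1 codeword, which is the kind of degenerate tree shape the paper exploits.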
ISBN (print): 9781424427505
An instantaneous D-ary code which minimizes the average codeword length for an information source is called a compact code. It is known that for a D-ary compact code with codeword lengths l_1, l_2, ..., l_n (where n is of the form n = k(D - 1) + 1 for some positive integer k), the Kraft equality Sigma_{i=1}^{n} D^(-l_i) = 1 holds. Since constructing n D-ary codewords from given codeword lengths l_1, l_2, ..., l_n is a straightforward task, it suffices to generate all possible codeword-length sequences. In this paper, we present an algorithm which takes all compact codes with k(D-1)+1 codewords and generates all compact codes with (k+1)(D-1)+1 codewords. Based on this algorithm, ST_n(D), the Supertree of all D-ary compact codes, is introduced. Any node in the m-th level of ST_n(D) is associated with a unique compact code with 2D - 1 + m(D-1) codewords. Following the proposed approach, any D-ary compact code with n codewords can be represented by [(n - 1)/(D - 1)] - 2 bits.
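The Kraft equality and the "codewords from lengths" step the abstract calls straightforward can be sketched as follows; this is the usual canonical assignment, not the paper's Supertree enumeration itself:

```python
from fractions import Fraction

def kraft_sum(lengths, D=2):
    """Exact Kraft sum; it equals 1 for a compact (complete) D-ary code."""
    return sum(Fraction(1, D ** l) for l in lengths)

def canonical_codewords(lengths, D=2):
    """Assign D-ary codewords to non-decreasing lengths by counting upward:
    each codeword is the previous one plus 1, extended to the new length."""
    assert lengths == sorted(lengths), "lengths must be non-decreasing"
    codes, code, prev = [], 0, lengths[0]
    for l in lengths:
        code *= D ** (l - prev)          # append zero digits up to length l
        digits, x = [], code
        for _ in range(l):               # render as exactly l base-D digits
            x, d = divmod(x, D)
            digits.append(str(d))
        codes.append(''.join(reversed(digits)))
        code += 1
        prev = l
    return codes

lengths = [1, 2, 3, 3]                   # binary: 1/2 + 1/4 + 1/8 + 1/8 = 1
assert kraft_sum(lengths) == 1
print(canonical_codewords(lengths))      # ['0', '10', '110', '111']
```

The same routine works for the ternary case n = k(D-1)+1, e.g. five codewords with lengths [1, 1, 2, 2, 2] for D = 3.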
The sparse coding algorithm has served as a model for early processing in mammalian vision. It has been assumed that the brain uses sparse coding to exploit statistical properties of the sensory stream. We hypothesize that sparse coding discovers patterns from the data set, which can be used to estimate a set of stimulus parameters by simple readout. In this study, we chose a model of stereo vision to test our hypothesis. We used the Locally Competitive Algorithm (LCA), followed by a naive Bayes classifier, to infer stereo disparity. From the results we report three observations. First, disparity inference was successful with this naturalistic processing pipeline. Second, an expanded, highly redundant representation is required to robustly identify the input patterns. Third, the inference error can be predicted from the number of active coefficients in the LCA representation. We conclude that sparse coding can generate a suitable general representation for subsequent inference tasks. (c) 2020 Published by Elsevier Ltd.
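A toy version of the pipeline's first stage can be sketched as follows. This is a generic LCA with soft thresholding; the overcomplete dictionary, lambda, and tau below are illustrative choices, not the study's stereo setup:

```python
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def soft_threshold(u, lam):
    """LCA activation: zero below threshold, shrunk toward zero above it."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0

def lca(x, Phi, lam=0.1, tau=10.0, steps=500):
    """Locally Competitive Algorithm: membrane potentials u relax toward the
    feedforward drive while active units inhibit overlapping units via the
    Gram matrix, yielding a sparse coefficient vector a."""
    n = len(Phi)
    b = [dot(phi, x) for phi in Phi]                  # feedforward drive Phi^T x
    G = [[dot(Phi[i], Phi[j]) for j in range(n)] for i in range(n)]
    u, a = [0.0] * n, [0.0] * n
    for _ in range(steps):
        u = [u[i] + (b[i] - u[i]
                     - sum(G[i][j] * a[j] for j in range(n) if j != i)) / tau
             for i in range(n)]
        a = [soft_threshold(ui, lam) for ui in u]
    return a

# Expanded, redundant dictionary: two axis atoms plus a diagonal atom.
Phi = [[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]]
a = lca([1.0, 0.0], Phi)
print(a)          # sparse: essentially a single active coefficient
```

The number of nonzero entries in `a` is the "active coefficients" count that the study relates to inference error; a subsequent classifier would read disparity off such codes.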
With the explosive growth of data volume and the ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact code learning solution for efficient cross-modal similarity search. Many recent studies have shown that quantization-based approaches generally perform better than hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, which is among the early attempts to leverage deep neural networks for quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across different modalities and two private subspaces for the individual modalities. Representations in the shared and private subspaces are learned simultaneously by embedding them into a reproducing kernel Hilbert space, where the mean embeddings of different modality distributions can be explicitly compared. In addition, in the shared subspace a quantizer is learned to produce semantics-preserving compact codes with the help of label alignment. Thanks to this novel network architecture, in cooperation with supervised quantization training, SPDQ can preserve intramodal and intermodal similarities as much as possible and greatly reduce the quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
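The mean-embedding comparison mentioned above can be sketched with a maximum mean discrepancy (MMD) computation: two samples have similar RKHS mean embeddings exactly when their MMD is small. The Gaussian kernel and the toy "modality" features below are illustrative, not the SPDQ network:

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(X, Y, gamma=1.0):
    """Squared MMD: squared distance between the RKHS mean embeddings of
    two samples (biased estimator)."""
    k = lambda a, b: gaussian_kernel(a, b, gamma)
    m, n = len(X), len(Y)
    xx = sum(k(a, b) for a in X for b in X) / (m * m)
    yy = sum(k(a, b) for a in Y for b in Y) / (n * n)
    xy = sum(k(a, b) for a in X for b in Y) / (m * n)
    return xx + yy - 2 * xy

img = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]   # toy "image modality" features
txt = [[0.1, 0.0], [0.0, 0.2], [0.1, 0.2]]   # toy "text modality" features
far = [[5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]   # a clearly mismatched sample
print(mmd2(img, txt), mmd2(img, far))
```

Minimizing such a discrepancy is one standard way to align the distributions of two modalities in a shared subspace.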
Approximate K-nearest neighbor search is a fundamental problem in computer science. The problem is especially important for high-dimensional and large-scale data. Recently, many techniques encoding high-dimensional data to compact codes have been proposed. The product quantization and its variations that encode the cluster index in each subspace have been shown to provide impressive accuracy. In this paper, we explore a simple question: is it best to use all the bit-budget for encoding a cluster index? We have found that as data points are located farther away from the cluster centers, the error of estimated distance becomes larger. To address this issue, we propose a novel compact code representation that encodes both the cluster index and quantized distance between a point and its cluster center in each subspace by distributing the bit-budget. We also propose two distance estimators tailored to our representation. We further extend our method to encode global residual distances in the original space. We have evaluated our proposed methods on benchmarks consisting of GIST, VLAD, and CNN features. Our extensive experiments show that the proposed methods significantly and consistently improve the search accuracy over other tested techniques. This result is achieved mainly because our methods accurately estimate distances.
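The core idea, spending part of the bit budget on a quantized point-to-center distance, can be sketched as follows. The bit split, bin layout, and the orthogonality assumption in the estimator are illustrative simplifications, not the paper's exact estimators:

```python
import math

def encode(x, centers, n_dist_bits=2, r_max=2.0):
    """Encode a point as (nearest-center index, quantized distance to it)."""
    dists = [math.dist(x, c) for c in centers]
    idx = dists.index(min(dists))
    levels = 2 ** n_dist_bits
    q = min(levels - 1, int(dists[idx] / r_max * levels))  # uniform bins
    return idx, q

def decode_radius(q, n_dist_bits=2, r_max=2.0):
    """Reconstruct the radius as the midpoint of the quantization bin."""
    levels = 2 ** n_dist_bits
    return (q + 0.5) * r_max / levels

def estimated_distance(query, code, centers, n_dist_bits=2, r_max=2.0):
    """Combine the query-to-center distance with the stored radius, assuming
    the residual is roughly orthogonal to (query - center)."""
    idx, q = code
    r = decode_radius(q, n_dist_bits, r_max)
    dc = math.dist(query, centers[idx])
    return math.sqrt(dc * dc + r * r)

centers = [(0.0, 0.0), (10.0, 10.0)]
code = encode((1.0, 0.0), centers)       # index 0, radius bin 2
print(code, estimated_distance((0.0, 0.0), code, centers))
```

A center-only code would estimate the distance from (0, 0) to the encoded point as 0; the stored radius pulls the estimate toward the true value of 1, which is the effect the paper's experiments attribute their accuracy gains to.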
A number of recent attempts have been made to describe early sensory coding in terms of a general information processing strategy. In this paper, two strategies are contrasted. Both strategies take advantage of the redundancy in the environment to produce more effective representations. The first is described as a ''compact'' coding scheme. A compact code performs a transform that allows the input to be represented with a reduced number of vectors (cells) with minimal RMS error. This approach has recently become popular in the neural network literature and is related to a process called Principal Components Analysis (PCA). A number of recent papers have suggested that the optimal ''compact'' code for representing natural scenes will have units with receptive field profiles much like those found in the retina and primary visual cortex. However, in this paper, it is proposed that compact coding schemes are insufficient to account for the receptive field properties of cells in the mammalian visual pathway. In contrast, it is proposed that the visual system is close to optimal in representing natural scenes only if optimality is defined in terms of ''sparse distributed'' coding. In a sparse distributed code, all cells in the code have an equal response probability across the class of images but a low response probability for any single image. In such a code, the dimensionality is not reduced. Rather, the redundancy of the input is transformed into redundancy in the firing pattern of the cells. It is proposed that the signature of a sparse code is found in the fourth moment of the response distribution (i.e., the kurtosis). In measurements with 55 calibrated natural scenes, the kurtosis was found to peak when the bandwidths of the visual code matched those of cells in the mammalian visual cortex. Codes resembling ''wavelet transforms'' are proposed to be effective because the response histograms of such codes are sparse (i.e., show high kurtosis) when presented with natural scenes.
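The kurtosis signature described above is easy to reproduce on synthetic response distributions; the sampling parameters below are illustrative, not the 55-scene measurement:

```python
import random
import statistics

def kurtosis(xs):
    """Fourth standardized moment: 3 for a Gaussian, much larger for sparse
    (mostly-zero, occasionally large) response distributions."""
    mu = statistics.fmean(xs)
    dev2 = [(x - mu) ** 2 for x in xs]
    var = statistics.fmean(dev2)
    m4 = statistics.fmean([d * d for d in dev2])
    return m4 / (var * var)

random.seed(0)
# Dense code: every "cell" responds with Gaussian amplitude to every input.
dense = [random.gauss(0.0, 1.0) for _ in range(100000)]
# Sparse code: a cell is silent for most inputs, occasionally responds strongly.
sparse = [random.gauss(0.0, 1.0) if random.random() < 0.1 else 0.0
          for _ in range(100000)]
print(kurtosis(dense), kurtosis(sparse))
```

The dense (Gaussian) responses sit near kurtosis 3, while the sparse responses show a far higher value, which is exactly the statistic the paper uses to identify sparse codes.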
This article explores the application of thermodynamic and statistical thermodynamic formalism to information theory problems. In particular, the applicability of the transformation theory of thermodynamics is investi...