Wyner-Ziv coding (WZC) has been recognized as the most popular method to date. In traditional WZC, side information is generated from intra-coded frames and used in the decoding of WZ frames. The unit of intra-coding is a frame, and the distance between key frames is kept constant. In this paper, the unit of intra-coding is a block, and the temporal distance between two consecutive key blocks can vary over time. Each block is assigned a mode (WZ or intra-coded), depending on the result of a spatio-temporal analysis, and encoded accordingly. This strategy improves the overall coding efficiency while maintaining low encoder complexity. The performance gain reaches up to 6 dB with respect to traditional pixel-domain WZC.
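The block-level mode decision described above can be sketched as follows; this is a minimal illustration, not the paper's method, and the SAD criterion and threshold are assumptions chosen for clarity:

```python
import numpy as np

def block_modes(prev_frame, cur_frame, block=8, thresh=200.0):
    """Hypothetical spatio-temporal mode decision: a block whose SAD
    against the co-located block in the previous frame exceeds the
    threshold is intra-coded (a key block); the rest are WZ-coded."""
    h, w = cur_frame.shape
    modes = {}
    for y in range(0, h, block):
        for x in range(0, w, block):
            sad = np.abs(cur_frame[y:y+block, x:x+block].astype(float)
                         - prev_frame[y:y+block, x:x+block]).sum()
            modes[(y, x)] = "intra" if sad > thresh else "WZ"
    return modes
```

Because a block is re-flagged as "intra" only when its temporal prediction fails, the distance between key blocks naturally varies over time, as the abstract describes.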
Summary form only given. This work uses the example quantization table described in the JPEG drafts as the basis for comparing the amount of compression attributable to each of several distinct lossless compression techniques. It takes the JPEG algorithm apart into its constituent pieces. The results show that most of the JPEG compression performance is achieved by coefficient quantization, which replaces coefficient values comprising many bits with values of few bits. Additional compression is achieved by a combination of run-length coding, predictive coding, and either Huffman or arithmetic coding.
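The two dominant pieces identified above, quantization and run-length coding, can be illustrated on a toy row of DCT coefficients. This is not the full JPEG pipeline; the coefficient values are made up, and the divisors are in the spirit of the example quantization table:

```python
import numpy as np

def quantize(coeffs, qtable):
    """Quantization: many-bit coefficient values become few-bit indices."""
    return np.round(coeffs / qtable).astype(int)

def runlength_zeros(flat):
    """Encode as (zero_run, value) pairs, exploiting the long zero runs
    that quantization produces (cf. JPEG's AC run-length symbols)."""
    out, run = [], 0
    for v in flat:
        if v == 0:
            run += 1
        else:
            out.append((run, int(v)))
            run = 0
    if run:
        out.append((run, 0))  # trailing zero run (cf. JPEG's EOB)
    return out
```

Quantizing a typical coefficient row zeroes most high-frequency entries, after which the run-length pass collapses them into a handful of symbols ready for Huffman or arithmetic coding.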
ISBN (digital): 9798350348552
ISBN (print): 9798350348569
This study investigates the emergence of compositionality and generalization within Emergent Communication (EmCom) systems, focusing on emergent language using the Metropolis-Hastings naming game (MHNG). Although the MHNG has been used in previous EmCom research, this is the first study to explore compositionality, the ability to construct complex expressions by combining simpler elements, and generalization, the ability to apply learned patterns to new situations, in emergent language within this language game. We introduce the novel Inter-VAE+VAE model, which equips each agent with dual variational autoencoders (VAEs), specifically designed for cognitive tasks that mirror human perception and language processing. This model, facilitating predictive coding, allows agents to refine their world models through collective experiences, a process rooted in the collective predictive coding hypothesis and differing from isolated learning approaches. Our model was evaluated against baseline models, including the $\beta$-VAE and $\beta$-TCVAE, and was further compared with implementations of the Lewis signaling game using the dSprites and 3Dshapes datasets. The results from these evaluations indicate an improvement in agent communication within the MH naming game and underscore the model's potential in replicating key aspects of human language, particularly compositionality and generalization, in artificial systems.
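The $\beta$-VAE baseline mentioned above is defined by its objective: a reconstruction term plus a $\beta$-weighted KL divergence. The following sketch states that objective for a diagonal Gaussian posterior; it is a generic formulation, not the paper's Inter-VAE+VAE model:

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """beta-VAE objective (sketch): squared reconstruction error plus
    beta * KL(N(mu, diag(exp(logvar))) || N(0, I)). A larger beta
    pressures the latent code toward disentangled representations,
    which is one route to compositional structure."""
    recon = np.sum((x - x_recon) ** 2)
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return recon + beta * kl
```

With `beta=1` this reduces to the standard VAE bound; $\beta$-TCVAE further decomposes the KL term to penalize total correlation specifically.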
Stochastic coders provide a way of encoding the excitation to the synthesis filter at bit rates of about 2 kbit/s, thus leading to the possibility of high-quality speech coding at 4.8 kbit/s. In these coders, the excitation is encoded as an index into a codebook of random excitation waveforms, and the coder transmits the parameters of a short-term filter (LPC all-pole predictor), the parameters of a long-term filter (pitch predictor), and the excitation gain to the receiver. Although the coders give excellent speech quality with unquantized parameters, the output degrades significantly when the filter parameters are coarsely quantized. For a 4.8 kbit/s coder, the short-term filter parameters have to be quantized at 1 kbit/s or less, and conventional scalar quantizers at this bit rate result in severe degradation of output speech. In this paper we describe the performance of stochastic coders when the short-term filter parameters are quantized using direct vector quantization and vector quantization with predictive coding and eigenvector rotation. Our results indicate that good performance can be achieved with relatively small codebooks for the quantizers and that predictive coding with eigenvector rotation gives a small but consistent improvement over direct vector quantization.
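The predictive VQ idea above, quantizing only the prediction residual of the filter-parameter vector, can be sketched as follows; the first-order predictor coefficient and the tiny codebook are illustrative assumptions, and the eigenvector rotation step is omitted:

```python
import numpy as np

def predictive_vq(frames, codebook, a=0.7):
    """Predictive VQ sketch: each parameter vector is predicted as
    a * (previous reconstruction); only the residual is matched against
    a small codebook, so the codebook need cover a narrower range."""
    recon_prev = np.zeros(frames.shape[1])
    indices, recons = [], []
    for f in frames:
        pred = a * recon_prev
        resid = f - pred
        i = int(np.argmin(((codebook - resid) ** 2).sum(axis=1)))
        recon_prev = pred + codebook[i]  # decoder tracks the same state
        indices.append(i)
        recons.append(recon_prev)
    return indices, np.array(recons)
```

Because consecutive LPC parameter vectors are highly correlated, the residuals cluster near zero, which is why relatively small codebooks suffice.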
A speech-coding algorithm based on the introduction of a vector quantizer into an ADPCM (adaptive differential pulse code modulation) configuration is presented. This vector ADPCM (VADPCM) algorithm is directed toward low-complexity, low-delay (0-5 ms), 16-kb/s applications. An analysis-by-synthesis configuration is used to allow the vector quantizer to operate with the usual scalar linear predictor. Performance/complexity tradeoffs are described. Methods for reducing the implementation complexity to the level of the standard 32-kb/s CCITT algorithm are indicated. Noise feedback, postfiltering, and gain-adaptive vector quantization are used to improve the performance while maintaining low complexity.
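The analysis-by-synthesis selection above can be sketched minimally: each candidate excitation vector is run through the synthesis filter, and the codevector whose synthesized output best matches the target is chosen. The first-order predictor coefficient here is an illustrative assumption, not the paper's configuration:

```python
import numpy as np

def abs_select(target, codebook, a=0.5):
    """Analysis-by-synthesis sketch: pass each candidate excitation
    through the synthesis filter y[n] = a*y[n-1] + e[n] and pick the
    codevector minimizing the synthesized-output error."""
    best_i, best_err = -1, np.inf
    for i, e in enumerate(codebook):
        y, prev = [], 0.0
        for en in e:
            prev = a * prev + en
            y.append(prev)
        err = float(((np.array(y) - target) ** 2).sum())
        if err < best_err:
            best_i, best_err = i, err
    return best_i
```

Searching in the synthesized domain rather than the excitation domain is what lets the vector quantizer cooperate with the scalar linear predictor.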
In this paper, we propose an efficient compression method to encode the geometry of 3D mesh sequences of objects sharing the same connectivity. Our approach is based on clustering the input mesh geometry into groups of vertices following the same affine motion. The proposed algorithm uses geometrically compensated, scan-based temporal wavelet filtering. The wavelet coefficients are encoded by an efficient coding scheme that includes a bit allocation process, whereas the displacement vectors are losslessly entropy encoded. Simulation results show good compression performance compared with state-of-the-art coders.
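One level of the temporal wavelet filtering can be sketched with a Haar transform over the frame axis; this is a simplification of the scheme above, with the affine geometric compensation deliberately omitted:

```python
import numpy as np

def temporal_haar(frames):
    """One level of temporal Haar wavelet filtering over a sequence of
    vertex positions (geometric compensation omitted in this sketch).
    Even/odd frame pairs yield low-pass averages and high-pass details."""
    frames = np.asarray(frames, dtype=float)
    even, odd = frames[0::2], frames[1::2]
    low = (even + odd) / 2.0   # coarse temporal approximation
    high = odd - even          # detail coefficients, small for slow motion
    return low, high
```

Compensating each cluster's affine motion before filtering, as the paper does, shrinks the high-pass coefficients further, which is what the bit allocation process then exploits.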
ISBN (print): 9781479983407
Distributed visual analysis applications, such as mobile visual search or Visual Sensor Networks (VSNs), require the transmission of visual content over a bandwidth-limited network, from a peripheral node to a processing unit. Traditionally, a "Compress-Then-Analyze" approach has been pursued, in which sensing nodes acquire and encode the pixel-level representation of the visual content, which is subsequently transmitted to a sink node for processing. This approach might not represent the most effective solution, since several analysis applications leverage a compact representation of the content, resulting in an inefficient use of network resources. Furthermore, coding artifacts might significantly impact the accuracy of the visual task at hand. To tackle these limitations, an orthogonal approach named "Analyze-Then-Compress" has been proposed [1]. According to this paradigm, sensing nodes are responsible for extracting visual features, which are encoded and transmitted to a sink node for further processing. Despite improved task efficiency, this paradigm implies that the central processing node cannot reconstruct a pixel-level representation of the visual content. In this paper we propose an effective compromise between the two paradigms, named "Hybrid-Analyze-Then-Compress" (HATC), which aims at jointly encoding visual content and local image features. Furthermore, we show how a target tradeoff between image quality and task accuracy can be achieved by carefully allocating the bitrate to either visual content or local features.
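The bitrate-allocation tradeoff above can be illustrated with a toy grid search over splits of a fixed bit budget; the quality and accuracy curves here are entirely made up for illustration and are not the paper's models:

```python
def allocate_rate(budget, step=0.1, w=0.5,
                  img_quality=lambda r: r ** 0.5,
                  task_accuracy=lambda r: 1 - 2.0 ** (-4 * r)):
    """HATC-style split sketch: scan candidate divisions of the bit
    budget between pixel content and local features, keeping the split
    that maximizes a weighted quality/accuracy objective. Both
    rate-utility curves are hypothetical placeholders."""
    best = max(
        (i * step for i in range(int(budget / step) + 1)),
        key=lambda r_img: w * img_quality(r_img)
                          + (1 - w) * task_accuracy(budget - r_img))
    return best, budget - best
```

Shifting the weight `w` toward image quality or task accuracy moves the chosen split accordingly, which mirrors the target tradeoff the abstract describes.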
ISBN (print): 9781424460793
In this paper, we introduce a new predictive image compression scheme that compresses an image by a set of parameters computed for individual blocks of different types. These parameters include the average and difference of the representative intensities of an image block, together with the index of a pattern associated with the block's visual activity. The block's representative gray values are computed through a histogram analysis of the block residuals, and a pattern-matching technique is employed to find the best match for the block bit-pattern from a pre-defined pattern book. To further reduce the bit rate, a predictive technique selectively predicts the parameters based on the corresponding values in the neighboring blocks. The simulation results confirm that the proposed technique can provide a high compression ratio with acceptable quality of the compressed images.
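The average/difference/bit-pattern representation above resembles block truncation coding, and can be sketched as follows. This sketch splits each block at its mean; the paper's histogram analysis, pattern book, and parameter prediction are not reproduced:

```python
import numpy as np

def encode_block(block):
    """BTC-style sketch: a block becomes two representative gray levels
    (stored as their average and difference) plus a binary bit-pattern."""
    block = np.asarray(block, dtype=float)
    pattern = block >= block.mean()
    hi = block[pattern].mean()
    lo = block[~pattern].mean() if (~pattern).any() else hi
    return (hi + lo) / 2.0, hi - lo, pattern

def decode_block(avg, diff, pattern):
    """Reconstruct: high level where the pattern is set, low elsewhere."""
    return np.where(pattern, avg + diff / 2.0, avg - diff / 2.0)
```

Replacing the explicit bit-pattern with an index into a shared pattern book, and predicting the average/difference from neighboring blocks, is what pushes the rate down further in the proposed scheme.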
A VQ/DCT (vector quantization/discrete cosine transform) coding scheme for color images is examined. In this scheme, VQ is first employed to code a 24-bit/pixel color image, composed of the three components R, G, and B, into an 8-bit/pixel index image. The DCT is then applied to the index image. One major problem with this scheme is that the codebook produced by the VQ process is randomly arranged; using such a codebook to VQ-code color images destroys the correlation present in the original images. Some techniques for producing a codebook that retains a reasonable degree of correlation are proposed.
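One simple way to impose such an ordering, given only as an illustration and not necessarily among the paper's proposed techniques, is to sort the codebook by luminance so that similar colors receive numerically close indices, making the index image smoother and more amenable to DCT coding:

```python
import numpy as np

def order_codebook(codebook):
    """Sort RGB codevectors by luminance (BT.601 weights) so that
    spatially correlated colors map to nearby indices. Returns the
    reordered codebook and an old-index -> new-index remapping."""
    luma = codebook @ np.array([0.299, 0.587, 0.114])
    order = np.argsort(luma)
    remap = np.empty_like(order)
    remap[order] = np.arange(len(order))
    return codebook[order], remap
```

After remapping, neighboring pixels of similar color differ by small index values, restoring some of the correlation that the DCT stage needs to compact energy.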