Large vocabulary continuous speech recognition can benefit from an efficient data structure for representing a large number of acoustic hypotheses compactly. Word graphs, or lattices, have been chosen as such an efficient interface between acoustic recognition engines and subsequent language processing modules. This paper first investigates the effect of pruning during acoustic decoding on the quality of word lattices and shows that by combining different pruning options (at the model level and the word level), we can obtain word lattices with accuracy comparable to the original lattices and a manageable size. In order to use the word lattices as the input for a post-processing language module, they should preserve the target hypotheses and their scores while being as small as possible. In this paper, we introduce a word graph compression algorithm that significantly reduces the number of words in the graphical representation without eliminating utterance hypotheses or distorting their acoustic scores. We compare this word graph compression algorithm with several other lattice size-reducing approaches and demonstrate the relative strength of the new word graph compression algorithm for decreasing the number of words in the representation. Experiments are conducted across corpora and vocabulary sizes to determine the consistency of the pruning and compression results. (C) 2003 Elsevier Science Ltd. All rights reserved.
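As an illustration of the kind of reduction such a compression step can perform, the sketch below merges lattice nodes whose outgoing (destination, word, score) edge sets are identical, which removes duplicate word arcs while leaving every complete hypothesis and its total score untouched. The edge-list representation and function name are illustrative assumptions, not the data structures or algorithm of the paper.

```python
# Hedged sketch: one hypothesis-preserving lattice compression step.
# Nodes with identical outgoing (destination, word, score) edge sets are
# merged, so duplicate word arcs disappear but every complete path and its
# total acoustic score survive unchanged.

def merge_equivalent_nodes(edges):
    """edges: list of (src, dst, word, score) tuples; returns a reduced list."""
    while True:
        outgoing = {}
        for src, dst, word, score in edges:
            outgoing.setdefault(src, set()).add((dst, word, score))
        seen, rename = {}, {}
        for node, out in outgoing.items():
            key = frozenset(out)
            if key in seen:
                rename[node] = seen[key]      # merge into an equivalent node
            else:
                seen[key] = node
        if not rename:
            return edges
        # Redirect all edges through the merge map and drop exact duplicates.
        edges = list({(rename.get(s, s), rename.get(d, d), w, sc)
                      for s, d, w, sc in edges})
```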
The paper addresses the problem of collaborative video over "heterogeneous" networks. Current standards for video compression are not designed to deal with this problem. We define an additional set of metrics (i.e., in addition to the standard rate versus distortion measure) to evaluate compression algorithms for this application. We also present an efficient algorithm and corresponding architectures for video compression in such an environment. The algorithm is a unique combination of the discrete wavelet transform and hierarchical vector quantization. It is unique in that both the encoder and the decoder are implemented with only table lookups. This makes both the software and hardware implementations very efficient and cheap.
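A minimal sketch of the table-lookup encoding idea follows: each stage of a hierarchical vector quantizer maps a pair of lower-stage indices to a single index through a precomputed table, so encoding needs no arithmetic at run time. The toy codebook sizes and the random stage-1 table are assumptions for illustration; the real system trains these tables offline and combines them with the wavelet transform.

```python
# Hedged sketch of table-lookup hierarchical vector quantization.
# Real systems build these tables offline from trained codebooks; here the
# stage-1 table is random and only demonstrates the lookup chain.
import numpy as np

rng = np.random.default_rng(0)

# Stage 0: 8-bit pixel value -> 4-bit index (a simple scalar quantizer).
stage0 = (np.arange(256) // 16).astype(np.uint8)

# Stage 1: a pair of stage-0 indices -> one of 64 joint codewords.
stage1_table = rng.integers(0, 64, size=(16, 16), dtype=np.uint8)

def encode_pair(a, b):
    """Encode two pixels (0-255) with pure table lookups, no arithmetic."""
    return stage1_table[stage0[a], stage0[b]]

print(encode_pair(12, 200))   # a single 6-bit index for the pixel pair
```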
Instruction and data address traces are widely used by computer designers for quantitative evaluations of new architectures and workload characterization, as well as by software developers for program optimization, performance tuning, and debugging. Such traces are typically very large and need to be compressed to reduce the storage, processing, and communication bandwidth requirements. However, preexisting general-purpose and trace-specific compression algorithms are designed for software implementation and are not suitable for runtime compression. Compressing program execution traces at runtime in hardware can deliver insights into the behavior of the system under test without any negative interference with normal program execution. Traditional debugging tools, on the other hand, have to stop the program frequently to examine the state of the processor. Moreover, software developers often do not have access to the entire history of computation that led to an erroneous state. In addition, stepping through a program is a tedious task and may interact with other system components in such a way that the original errors disappear, thus preventing any useful insight. The need for unobtrusive tracing is further underscored by the development of computer systems that feature multiple processing cores on a single chip. In this paper, we introduce a set of algorithms for compressing instruction and data address traces that can easily be implemented in an on-chip trace compression module and describe the corresponding hardware structures. The proposed algorithms are analytically and experimentally evaluated. Our results show that very small hardware structures suffice to achieve a compression ratio similar to that of a software implementation of gzip while being orders of magnitude faster. A hardware structure with slightly over 2 KB of state achieves a compression ratio of 125.9 for instruction address traces, whereas gzip achieves a compression ratio of 87.4. For data addr
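For intuition, the sketch below is a software model of one hardware-friendly ingredient commonly used in such on-chip compressors: a stride predictor guesses the next address, a hit is emitted as a single flag bit, and only mispredictions carry the full address. The single-entry predictor and the Python representation are assumptions; the paper's structures are small tables of such predictors realized in hardware.

```python
# Hedged software model of a hardware-friendly trace compressor ingredient:
# a single stride predictor.  Hits cost one flag bit; misses carry the full
# address and retrain the stride.  Not the paper's exact structures.

def compress_trace(addresses):
    out, last, stride = [], 0, 0
    for a in addresses:
        if a == last + stride:
            out.append(("hit",))          # would be a single bit in hardware
        else:
            out.append(("miss", a))       # emit raw address on mispredict
            stride = a - last             # retrain the predictor
        last = a
    return out

trace = [0x400000, 0x400004, 0x400008, 0x40000C, 0x400100]
print(compress_trace(trace))              # hits once the stride locks in
```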
Since the introduction of the Sanger sequencing technology in 1977 by Frederick Sanger and his colleagues, we have observed an explosion of sequence data. The cost of storing, processing, and analyzing the data is getting excessively high. As a result, it is extremely important that we develop efficient data compression and data reduction techniques. However, standard data compression tools are not well suited to biological data, which contain many repetitive regions and can exhibit high similarity across sequences. In this context, we need specialized algorithms to compress biological data effectively. In this paper, we propose novel algorithms for compressing FASTQ files. We have performed extensive and rigorous experiments which reveal that our proposed algorithm is indeed competitive and performs better than the best known algorithms for this problem.
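As a rough illustration of why FASTQ-specific compressors outperform general-purpose tools, the sketch below separates a FASTQ file into identifier, sequence, and quality streams and compresses each stream independently, since the three have very different statistics. The stream-splitting idea and the zlib back end are assumptions used for illustration; they are not the algorithms proposed in the paper.

```python
# Hedged sketch of a common FASTQ-compression idea: split records into
# identifier, sequence, and quality streams and compress each separately.
# zlib is only a stand-in back end; specialized coders do much better.
import zlib

def compress_fastq(path):
    ids, seqs, quals = [], [], []
    with open(path) as f:
        while True:
            header = f.readline().rstrip("\n")
            if not header:
                break                      # end of file
            ids.append(header)             # '@read_id ...'
            seqs.append(f.readline().rstrip("\n"))
            f.readline()                   # '+' separator line
            quals.append(f.readline().rstrip("\n"))
    return [zlib.compress("\n".join(s).encode()) for s in (ids, seqs, quals)]
```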
ISBN (print): 0769512062
Due to the high bandwidth requirements of up to 2 Mbit/s in 3rd-generation mobile communication systems, efficient data compression approaches are necessary to reduce communication and storage costs. The status of recent VLSI technologies promises complete system-on-a-chip (SoC) solutions for both mobile and network-based communication systems, including new compression algorithms based on the Burrows-Wheeler transform (BWT). The most complex task of the BWT algorithm is the lexicographic sorting of the n cyclic rotations of a given string of n characters. This paper discusses the feasibility and VLSI implementation of a scalable BWT architecture, simulating and prototyping its systolic, highly utilized hardware structure with Virtex FPGAs.
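For reference, the transform that the systolic array has to accelerate can be stated in a few lines of software: sort all n cyclic rotations and emit the last column. The sketch below is the textbook definition, not the paper's architecture, which replaces the sort with a scalable network of comparators.

```python
# Reference model of the Burrows-Wheeler transform: sort the n cyclic
# rotations of the input and output the last column.  The VLSI design above
# accelerates exactly this sorting step with a systolic structure.

def bwt(s, sentinel="\0"):
    s += sentinel                          # unique end-of-string marker
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

print(repr(bwt("banana")))   # feeds move-to-front and entropy coding stages
```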
Many compression algorithms consist of quantizing the coefficients of an image in a linear basis. This introduces compression noise that often looks like ringing. Recently, some authors have proposed variational methods to reduce those artifacts. These methods consist of minimizing a regularizing functional over the set of antecedents of the compressed image. In this paper, we propose a fast algorithm to solve that problem. Our experiments lead us to the conclusion that these algorithms effectively reduce oscillations but also reduce contrast locally. To handle that problem, we propose a fast contrast enhancement procedure. Experiments on a large dataset suggest that this procedure effectively improves image quality at low bitrates.
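As a concrete statement of the constrained minimization described above, with total variation taken as a representative regularizer (an assumption; the paper's functional may differ), the reconstruction is the smoothest image among all antecedents of the compressed image, i.e. all images whose quantized coefficients match the received ones:

```latex
% Hedged formulation; TV is a stand-in regularizer, Q is the coefficient
% quantizer, v the decoded image, and \psi_i the basis functions.
\[
  \hat{u} \;=\; \arg\min_{u \in \mathcal{C}} \int_\Omega \lvert \nabla u \rvert \, dx ,
  \qquad
  \mathcal{C} \;=\; \bigl\{\, u \;:\; Q\bigl(\langle u, \psi_i \rangle\bigr)
      = Q\bigl(\langle v, \psi_i \rangle\bigr) \ \ \forall i \,\bigr\}.
\]
```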
This paper describes a hybrid technique based on the combination of wavelet transform and linear prediction to achieve very effective electrocardiogram (ECG) data compression. First, the ECG signal is wavelet transformed using four different discrete wavelet transforms (Daubechies, Coiflet, Biorthogonal and Symmlet). All the wavelet transforms are based on dyadic scales and decompose the ECG signals into five detail levels and one approximation. Then, the wavelet coefficients are linearly predicted, where the error corresponding to the difference between these coefficients and the predicted ones is minimized in order to obtain the best predictor. In particular, the residuals of the wavelet coefficients are uncorrelated and hence can be represented with fewer bits than the original signal. To further increase the compression rate, the residual sequence obtained after linear prediction is coded using a newly developed coding technique. As a result, a compression ratio (CR) of 20 to 1 is achieved with a percentage root-mean-square difference (PRD) of less than 4%. The algorithm is compared to an alternative compression algorithm based on the direct use of wavelet transforms. Experiments on selected records from the MIT-BIH arrhythmia database reveal that the proposed method is significantly more efficient in compression. The proposed compression scheme may find applications in digital Holter recording, in ECG signal archiving and in ECG data transmission through communication channels. (C) 2001 IPEM. Published by Elsevier Science Ltd. All rights reserved.
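A minimal sketch of the two-stage idea, assuming PyWavelets for the transform and a fixed 2-tap predictor in place of the paper's optimized predictor and newly developed residual coder, looks like this:

```python
# Hedged sketch of the two-stage pipeline: wavelet-transform the ECG, then
# linearly predict the coefficients and keep only the (small) residuals.
# 'db4' with 5 levels and the fixed predictor are illustrative assumptions.

import numpy as np
import pywt

def wavelet_lp_residual(ecg, order=2):
    coeffs = pywt.wavedec(ecg, "db4", level=5)   # approximation + 5 details
    residuals = []
    for band in coeffs:
        pred = np.zeros_like(band)
        for k in range(order, len(band)):
            # fixed linear predictor; the paper instead fits coefficients
            # that minimize the prediction error per band
            pred[k] = 2 * band[k - 1] - band[k - 2]
        residuals.append(band - pred)
    return residuals                              # entropy-code these

ecg = np.sin(np.linspace(0, 20 * np.pi, 2000))    # toy signal
print([round(float(r.std()), 4) for r in wavelet_lp_residual(ecg)])
```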
Coding algorithms are usually designed to faithfully reconstruct images, which limits the expected gains in compression. A new approach based on generative models allows for new compression algorithms that can reach drastically lower bitrates. Instead of pixel fidelity, these algorithms aim at faithfully generating images that have the same high-level interpretation as their inputs. In that context, the challenge becomes choosing a good representation of the semantics of an image. While text or segmentation maps have been investigated and have shown their limitations, in this paper we ask the following question: do powerful foundation models such as CLIP provide a semantic description suited for compression? By suited for compression, we mean that this description is robust to traditional compression tools and, in particular, to quantization. We show that CLIP fulfills these semantic robustness properties, which makes it an interesting basis for generative compression. To make that intuition concrete, we propose a proof-of-concept generative codec based on CLIP. Results demonstrate that our CLIP-based coder beats state-of-the-art compression pipelines at extremely low bitrates (0.0012 BPP), both in terms of image quality (a MUSIQ score of 65.3) and semantic preservation (a CLIP score of 0.86).
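A minimal sketch of the robustness property in question is given below: the CLIP image embedding is coarsely quantized and compared with the original by cosine similarity. The clip_encode callable, the uniform quantizer, the 4-bit budget, and the toy stand-in encoder are illustrative assumptions, not the paper's codec.

```python
# Hedged check of "semantic robustness to quantization": quantize a CLIP-style
# embedding and measure how much of its direction survives.
import numpy as np

def quantize(v, bits=4):
    levels = 2 ** bits
    lo, hi = v.min(), v.max()
    q = np.round((v - lo) / (hi - lo) * (levels - 1))
    return q / (levels - 1) * (hi - lo) + lo          # dequantized embedding

def semantic_robustness(image, clip_encode, bits=4):
    z = clip_encode(image)                            # e.g. a 512-d CLIP vector
    z_hat = quantize(z, bits)
    return float(np.dot(z, z_hat) /
                 (np.linalg.norm(z) * np.linalg.norm(z_hat)))  # ~1.0 => robust

# Toy stand-in encoder for demonstration only; a real test would use CLIP.
fake_clip = lambda img: np.tanh(np.linspace(-3, 3, 512) + img.mean())
print(semantic_robustness(np.zeros((224, 224, 3)), fake_clip))
```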
The intelligent vehicle terminal (T-BOX) of intelligent connected vehicles (ICVs) is directly connected to the vehicle interface and constantly exchanges real-time data inside and outside the vehicle, which significantly increases the communication pressure on the in-vehicle network. Excessive data will increase the load on the channel due to the limited bandwidth of the vehicle. Although the existing data compression algorithm can effectively reduce the communication pressure, it requires large amounts of driving data to count the number of signal changes, which increases the preliminary labor cost. Meanwhile, a static configuration makes it difficult to maintain a stable compression rate when a large amount of CAN data is transmitted with fluctuating signal changes; this fails to meet the requirement of reducing the bus load rate and makes it challenging to satisfy the real-time requirements of ICVs. Thus, we propose a real-time dynamic data compression algorithm for vehicles to satisfy such requirements. The experiments show that for frequently changing signals, the proposed algorithm can improve the compression rate by about 10% compared with the existing compression algorithm. The execution time of the compression process is 4.55 μs at a clock frequency of 400 MHz, and the decompression process execution time is 86.48 μs. Compared with the bus load when the original messages are transmitted, the proposed algorithm can reduce the bus load by 19% on average. In the future, a stable compression rate will be crucial when loading authentication codes onto the CAN bus to ensure its security.
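To make the flavor of change-based CAN compression concrete, the sketch below transmits a one-byte change mask plus only the payload bytes that differ from the previous frame, so slowly varying signals cost almost nothing. This byte-level, static variant is an assumption for illustration; it is not the dynamic, signal-level algorithm proposed in the paper.

```python
# Hedged sketch of change-mask CAN payload compression (illustrative only).

def compress_frame(prev, cur):
    """prev, cur: 8-byte CAN payloads. Returns 1-byte mask + changed bytes."""
    mask, changed = 0, bytearray()
    for i, (p, c) in enumerate(zip(prev, cur)):
        if p != c:
            mask |= 1 << i
            changed.append(c)
    return bytes([mask]) + bytes(changed)

def decompress_frame(prev, packet):
    mask, payload = packet[0], packet[1:]
    out, j = bytearray(prev), 0
    for i in range(8):
        if mask & (1 << i):
            out[i] = payload[j]
            j += 1
    return bytes(out)

prev = bytes([0x10, 0x20, 0x30, 0x40, 0, 0, 0, 0])
cur  = bytes([0x10, 0x21, 0x30, 0x40, 0, 0, 0, 5])
pkt = compress_frame(prev, cur)               # 3 bytes instead of 8
assert decompress_frame(prev, pkt) == cur
```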
In the burgeoning realm of Internet of Things (IoT) applications on edge devices, data stream compression has become increasingly pertinent. The combination of added compression overhead and limited hardware resources on these devices calls for a nuanced software-hardware co-design. This paper introduces CStream, a pioneering framework crafted for parallelizing stream compression on multicore edge devices. CStream grapples with the distinct challenges of delivering a high compression ratio, high throughput, low latency, and low energy consumption. Notably, CStream distinguishes itself by accommodating an array of stream compression algorithms, a variety of hardware architectures and configurations, and an innovative set of parallelization strategies, some of which are proposed herein for the first time. Our evaluation showcases the efficacy of a thoughtful co-design involving a lossy compression algorithm, asymmetric multicore processors, and our novel, hardware-conscious parallelization strategies. Compared to designs lacking such strategic integration, this approach achieves a 2.8x compression ratio with only marginal information loss, 4.3x throughput, a 65% latency reduction, and an 89% energy consumption reduction.
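As a point of reference for what parallelizing stream compression means in software, the sketch below applies chunk-level data parallelism with a thread pool and zlib as a stand-in compressor. This is an assumed baseline strategy, far simpler than CStream's hardware-conscious strategies and lossy compressors.

```python
# Hedged baseline: chunk-level data-parallel stream compression.
# zlib stands in for the stream compressor; real edge deployments would pin
# workers to specific (possibly asymmetric) cores.

import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_stream(chunks, workers=4):
    # zlib can release the GIL while compressing large buffers, so threads
    # give genuine overlap on multicore devices.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda c: zlib.compress(c, 6), chunks))

data = [bytes(range(256)) * 64 for _ in range(16)]    # toy input chunks
compressed = compress_stream(data)
print(sum(map(len, compressed)), "bytes after compression")
```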