The identity test is a hypothesis test defined over the class of stationary and ergodic sources, used to decide whether a sequence of random variables originated from a known source or from an unknown one. For an identity test proposed by Ryabko and Astola in 2005, which makes use of an arbitrary pointwise universal compression algorithm and the null distribution π to define the critical region, we study the rate at which the type-2 error goes to zero as the sample size goes to infinity. By an application of the method of types, a formal link is established between this rate and the redundancy rate of the compression algorithm in use, for the class of Markov processes.
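The shape of such a test is easy to state: compare the ideal codelength -log2 π(x1...xn) under the null distribution with the length a universal compressor actually achieves on the sequence, and reject the null when the compressor does substantially better. The sketch below illustrates this shape, with zlib standing in for a pointwise universal code, a fair-coin i.i.d. null, and an arbitrary threshold; all three are illustrative assumptions, not the construction in the paper.

import math
import random
import zlib

def null_codelength_bits(bits, p=0.5):
    # Ideal codelength -log2 pi(x) for an i.i.d. Bernoulli(p) null.
    ones = sum(bits)
    zeros = len(bits) - ones
    return -(ones * math.log2(p) + zeros * math.log2(1 - p))

def universal_codelength_bits(bits):
    # Codelength (in bits) achieved by a general-purpose compressor,
    # standing in here for a pointwise universal code.
    return 8 * len(zlib.compress(bytes(bits), 9))

def identity_test(bits, p=0.5, threshold_bits=64):
    # Reject H0 ("x was drawn from pi") if the compressor beats pi's
    # codelength by more than threshold_bits (threshold is an assumption).
    gap = null_codelength_bits(bits, p) - universal_codelength_bits(bits)
    return gap > threshold_bits

if __name__ == "__main__":
    random.seed(0)
    fair = [random.getrandbits(1) for _ in range(4000)]  # consistent with the null
    patterned = [0, 1, 1, 0] * 1000                      # highly structured source
    print("fair      -> reject H0:", identity_test(fair))       # expect False
    print("patterned -> reject H0:", identity_test(patterned))  # expect True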
In this paper, a new algorithm is proposed to improve the performance of the existing JPEG image compression algorithm. The algorithm uses a recursive pruned discrete cosine transform (DCT) to reduce the bit rate, together with two methods to reduce the image artifacts that arise at these lower bit rates. The recursive algorithm computes the pruned DCT using a structure that allows a higher-order pruned DCT to be generated from two identical lower-order pruned DCTs. A variance-dependent pruning method determines the degree of pruning from the statistical properties of each image block, and a linear filtering method applies low-pass filtering at block boundaries to smooth out the discontinuities.
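As a concrete illustration of the pruning idea (though not of the paper's recursive structure for building higher-order transforms from lower-order ones), the sketch below computes the 8x8 block DCT, keeps only a low-frequency k x k corner of coefficients, and picks k from the block variance; the variance thresholds and k values are illustrative assumptions.

import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D type-II DCT with orthonormal scaling.
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs):
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

def choose_k(block, thresholds=((2, 25.0), (3, 100.0), (4, 400.0))):
    # Variance-dependent pruning: flat blocks keep fewer coefficients.
    # Threshold/k pairs are illustrative assumptions, not from the paper.
    v = block.var()
    for k, t in thresholds:
        if v < t:
            return k
    return 6

def pruned_dct_block(block):
    # Keep only the low-frequency k x k corner of the 8x8 DCT.
    k = choose_k(block)
    coeffs = dct2(block)
    pruned = np.zeros_like(coeffs)
    pruned[:k, :k] = coeffs[:k, :k]
    return pruned

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.normal(128.0, 20.0, size=(8, 8))
    recon = idct2(pruned_dct_block(block))
    print("reconstruction MSE:", float(np.mean((block - recon) ** 2)))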
Here we propose GP-zip3, a system which uses Genetic Programming to find optimal ways to combine standard compression algorithms for the purpose of compressing files and archives. GP-zip3 evolves programs with multiple components. One component analyses statistical features extracted from the raw data to be compressed (seen as a sequence of 8-bit integers) to divide the data into blocks. These blocks are then projected onto a two-dimensional Euclidean space via two further (evolved) program components. K-means clustering is applied to group similar data blocks. Each cluster is then labelled with the optimal compression algorithm for its member blocks. Once a program that achieves good compression is evolved, it can be used on unseen data without the requirement for any further evolution. GP-zip3 is similar to its predecessor, GP-zip2. Both systems outperform a variety of standard compression algorithms and are faster than other evolutionary compression techniques. However, GP-zip2 was still substantially slower than off-the-shelf algorithms. GP-zip3 alleviates this problem by using a novel fitness evaluation strategy. More specifically, GP-zip3 evolves and then uses decision trees to predict the performance of GP individuals without requiring them to be used to compress the training data. As shown in a variety of experiments, this speeds up evolution in GP-zip3 considerably over GP-zip2 while achieving similar compression results, thereby significantly broadening the scope of application of the approach.
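To make the block-labelling stage concrete, the sketch below reproduces its overall shape with fixed, hand-written pieces: statistical features are extracted per block, K-means groups the blocks, and each cluster is labelled with whichever standard compressor shrinks its members the most. The block size, the two features, and the compressor set are illustrative assumptions; in GP-zip3 the feature-projection components are evolved rather than hand-written.

import bz2
import lzma
import zlib
import numpy as np
from sklearn.cluster import KMeans

COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}

def features(block):
    # Two hand-written statistics standing in for GP-zip3's evolved
    # feature-projection components (illustrative assumption).
    a = np.frombuffer(block, dtype=np.uint8).astype(float)
    return [a.mean(), a.std()]

def label_blocks(data, block_size=4096, n_clusters=3):
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    X = np.array([features(b) for b in blocks])
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    # Label each cluster with the compressor that shrinks its blocks the most.
    best = {}
    for c in set(labels):
        members = b"".join(b for b, l in zip(blocks, labels) if l == c)
        best[c] = min(COMPRESSORS, key=lambda n: len(COMPRESSORS[n](members)))
    return [(best[l], b) for l, b in zip(labels, blocks)]

if __name__ == "__main__":
    data = bytes(range(256)) * 200 + b"abcabcabc" * 5000
    for name, block in label_blocks(data)[:4]:
        print(name, len(block), "->", len(COMPRESSORS[name](block)))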
The authors use Kolmogorov complexity and compression algorithms to study DOS-DNA (DOS: defined ordered sequence). This approach gives quantitative and qualitative explanations of the regularities of apparently regular regions. The authors present the problem of coding approximate multiple tandem repeats in order to obtain compression, then describe an algorithm that finds approximate multiple tandem repeats efficiently. Finally, they briefly describe some of their results.
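Kolmogorov complexity itself is uncomputable, but the length of a compressed representation gives a computable upper bound, which is the quantitative handle used in this line of work: regular, repeat-rich regions compress far below the roughly two bits per base expected of random sequence. A minimal sketch of that proxy, with zlib as the compressor:

import random
import zlib

def complexity_bits_per_base(seq):
    # Computable upper bound on Kolmogorov complexity per symbol:
    # the per-base size of the zlib-compressed sequence.
    raw = seq.encode("ascii")
    return 8 * len(zlib.compress(raw, 9)) / len(raw)

if __name__ == "__main__":
    random.seed(1)
    repeats = "ACGTT" * 2000                                    # tandem-repeat-like
    rand = "".join(random.choice("ACGT") for _ in range(10000))
    print("repetitive:", round(complexity_bits_per_base(repeats), 3))  # small
    print("random    :", round(complexity_bits_per_base(rand), 3))     # much larger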
ISBN (print): 9781467368001
Nowadays, a large number of DNA sequences are being stored in online databases. To reduce this quantity of information, researchers have been trying to implement new DNA sequence compression techniques based on lossless algorithms. In this article we compare two such algorithms, both of which use the binary representation of DNA sequences and are characterized by their ease of implementation. The first transforms the DNA sequence into an extended-ASCII representation, while the second transforms it into a hexadecimal representation. Thereafter, we apply the run-length encoding (RLE) technique to further enhance the compression of entire genomes.
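A minimal sketch of the general pipeline follows: each base is packed into two bits and byte-level RLE is applied to the result. Note the 2-bit packing is just the simplest binary representation to show; the article's extended-ASCII and hexadecimal representations, packing order, and RLE format may differ.

CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack_2bit(seq):
    # Binary representation: four bases per byte (assumed packing order;
    # any trailing partial group of fewer than four bases is dropped).
    out = bytearray()
    for i in range(0, len(seq) - len(seq) % 4, 4):
        b = 0
        for ch in seq[i:i + 4]:
            b = (b << 2) | CODE[ch]
        out.append(b)
    return bytes(out)

def rle(data):
    # Byte-level run-length encoding: (value, run_length) pairs.
    runs, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        runs.append((data[i], j - i))
        i = j
    return runs

if __name__ == "__main__":
    genome = "ACGT" * 6 + "AAAA" * 32 + "TTTT" * 32
    packed = pack_2bit(genome)
    runs = rle(packed)
    print(len(genome), "bases ->", len(packed), "bytes ->", len(runs), "runs")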
Spatial data compression is particularly important for the development of Mobile GIS. This paper gives a review of vector data compression for Mobile GIS. First, the definition and classification of vector data compression algorithms are introduced. Second, traditional, new, and optimized algorithms are analyzed separately, and their principles, research status, strengths, and weaknesses are discussed in detail. Finally, the paper compares the Douglas-Peucker algorithm, the wavelet-based algorithm, and the dynamic-programming-based algorithm.
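Since the comparison centres on the Douglas-Peucker algorithm, a compact reference implementation may be useful: the polyline is recursively split at the point farthest from the chord between its endpoints whenever that distance exceeds a tolerance eps.

import math

def _perp_dist(p, a, b):
    # Perpendicular distance from point p to the line through a and b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * px - dx * py + bx * ay - by * ax) / math.hypot(dx, dy)

def douglas_peucker(points, eps):
    # Keep endpoints; recurse on the farthest point if it exceeds eps.
    if len(points) < 3:
        return list(points)
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_dist(points[i], points[0], points[-1])
        if d > dmax:
            idx, dmax = i, d
    if dmax <= eps:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:idx + 1], eps)
    right = douglas_peucker(points[idx:], eps)
    return left[:-1] + right  # avoid duplicating the split point

if __name__ == "__main__":
    line = [(x, math.sin(x / 5.0)) for x in range(100)]
    print(len(line), "->", len(douglas_peucker(line, 0.05)), "points")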
ISBN (print): 9781509067145
Sleep is associated with a variety of chronic diseases as well as most psychiatric, addiction, and mood disorders. To analyze sleep patterns in rodents, researchers examine polysomnogram data containing electroencephalograph (EEG) and electromyograph (EMG) signals. However, the analysis is performed manually by an expert human scorer, a slow, time-consuming, and expensive process that is also subject to known human error and inter-scorer inconsistency [1]. To address this, researchers have developed a variety of techniques to automatically classify rodent sleep states using features extracted from EEG and EMG signals [2]. In many approaches, researchers extract a variety of heuristic features from explicitly chosen spectral bands of the EEG and EMG signals [3]. However, human-designed heuristic features often fail to capture all of the salient sleep-state information, which leads to inferior classification performance.
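The heuristic pipeline the passage refers to typically looks like the sketch below: per-epoch band powers computed from explicitly chosen spectral bands via Welch's method. The sampling rate, epoch length, and band edges here are illustrative assumptions, not values from the cited work.

import numpy as np
from scipy.signal import welch

FS = 256  # sampling rate in Hz (illustrative assumption)
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "sigma": (10.0, 14.0)}

def band_powers(epoch, fs=FS):
    # Heuristic spectral features for one EEG epoch: (unnormalized)
    # power summed over each chosen band of the Welch PSD estimate.
    f, psd = welch(epoch, fs=fs, nperseg=fs * 2)
    return {name: float(psd[(f >= lo) & (f < hi)].sum())
            for name, (lo, hi) in BANDS.items()}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(10 * FS) / FS                              # one 10-s epoch
    eeg = np.sin(2 * np.pi * 2.0 * t) + 0.5 * rng.standard_normal(t.size)
    print({k: round(v, 3) for k, v in band_powers(eeg).items()})  # delta dominates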
Recent processors use a variety of parallel processing technologies to boost their performance, so multimedia applications need to be efficiently parallelized and easily implemented on processors with such features. We implemented parallel algorithms for VQ compression in a shared-memory parallel environment and evaluated their effectiveness. For the codebook generation step of VQ compression we evaluate two parallel algorithms, parallel LBG and parallel tPNN, and find that parallel tPNN is superior in terms of space complexity, whereas parallel LBG is superior in terms of time complexity and parallelism. For the codeword search, the p-dist approach and the c-dist approach with aggregation of synchronizations are suitable for a small codebook, while the c-dist approach and the p-dist approach with the ADM or the strip-mining method are suitable for a large codebook. However, since the aggregation of synchronizations and the strip-mining method increase the space complexity of the algorithm, the p-dist approach is more suitable for a small codebook and the c-dist approach for a large codebook.
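For reference, a serial sketch of LBG codebook generation follows (codeword splitting plus nearest-neighbour refinement). The shared-memory parallelizations studied in the paper (p-dist, c-dist, ADM, strip-mining) distribute the distance computations of the assignment step, marked in the inner loop below; they are not reproduced here.

import numpy as np

def lbg(vectors, codebook_size, eps=1e-3, perturb=0.01):
    # Generalized Lloyd / LBG: start from the global centroid, repeatedly
    # split every codeword, then refine until distortion stops improving.
    # Splitting doubles the codebook, so codebook_size should be a power of 2.
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < codebook_size:
        codebook = np.vstack([codebook + perturb, codebook - perturb])
        prev = np.inf
        while True:
            # Assignment step: nearest codeword for every vector. This is
            # the distance computation that p-dist/c-dist parallelize.
            d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            nearest = d.argmin(axis=1)
            distortion = d[np.arange(len(vectors)), nearest].mean()
            if prev - distortion < eps * max(prev, 1e-12):
                break
            prev = distortion
            for k in range(len(codebook)):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(2000, 4))
    print("codebook shape:", lbg(data, codebook_size=8).shape)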
In this article, we rigorously compare compressive sampling (CS) to four state-of-the-art, on-mote, lossy compression algorithms: K-run-length encoding (KRLE), lightweight temporal compression (LTC), wavelet quantization thresholding and run-length encoding (WQTR), and a low-pass filtered fast Fourier transform (FFT). Specifically, we first simulate lossy compression on two real-world seismic data sets, and we then evaluate algorithm performance using implementations on real hardware. In terms of compression rates, recovered signal error, power consumption, and classification accuracy on a seismic event detection task (run on the decompressed signals), results show that CS performs comparably to, and in many cases better than, the other algorithms evaluated. The main benefit to users is that CS, a lightweight and non-adaptive compression technique, can guarantee a desired level of compression performance (and thus radio usage and power consumption) without sacrificing recovered signal quality. Our contribution is a novel and rigorous comparison of five state-of-the-art, on-mote, lossy compression algorithms, both in simulation on real-world data sets and in implementations on hardware.
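Of the four baselines, K-run-length encoding is the simplest to state: ordinary run-length encoding, except that a sample extends the current run whenever it lies within +/-K of the value that opened the run, which makes the scheme lossy for K > 0. The sketch below reflects this reading of KRLE; the published algorithm may differ in details such as how the run value is chosen.

def krle(samples, K):
    # K-run-length encoding: merge consecutive samples into one run as long
    # as they stay within +/-K of the value that opened the run.
    runs = []
    start_val, count = samples[0], 1
    for s in samples[1:]:
        if abs(s - start_val) <= K:
            count += 1
        else:
            runs.append((start_val, count))
            start_val, count = s, 1
    runs.append((start_val, count))
    return runs

def decode(runs):
    # Reconstruct: every sample in a run takes the run's opening value,
    # so the error is bounded by K everywhere.
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out

if __name__ == "__main__":
    signal = [10, 11, 10, 12, 30, 31, 29, 30, 10, 10]
    runs = krle(signal, K=2)
    print(runs)          # [(10, 4), (30, 4), (10, 2)]
    print(decode(runs))  # within +/-2 of the original everywhere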
In the last few years, the classification of IP traffic flows according to application-level protocols has emerged as a key issue in the design and development of multiservice IP networks. Several proposals have been made to address this problem and to enhance such systems, but to the best of our knowledge none of them can reliably identify the application that generated the IP traffic. This paper introduces a new approach, based on the use of compression algorithms, that can classify applications running over TCP. Three different compression algorithms are considered and applied to real traffic traces to demonstrate the effectiveness of the proposed method and to evaluate its performance.
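The abstract does not spell out how the compressors are turned into a classifier, but a standard way to do so is the normalized compression distance (NCD): assign a flow to the application whose reference trace it is closest to under the compressor. The sketch below shows that measure with zlib and clearly synthetic toy traces; it illustrates the general technique, not necessarily the paper's exact procedure.

import zlib

def C(x):
    # Compressed size in bytes, the compressor's complexity estimate.
    return len(zlib.compress(x, 9))

def ncd(x, y):
    # Normalized compression distance, a standard compression-based
    # similarity measure: small when x and y share structure.
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(flow, references):
    # Assign the flow to the application with the nearest reference trace.
    return min(references, key=lambda app: ncd(flow, references[app]))

if __name__ == "__main__":
    # Toy stand-ins for per-application training traces.
    refs = {
        "http": b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n" * 50,
        "smtp": b"EHLO example.com\r\nMAIL FROM:<a@b>\r\n" * 50,
    }
    flow = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n" * 40
    print(classify(flow, refs))  # expected: http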