ISBN: (print) 0780376633
The three-step search algorithm has been widely used in block matching motion estimation due to its simplicity and effectiveness. The sparsely distributed checking-point pattern in the first step is well suited to searching for large motion. However, for quasi-stationary blocks it can easily cause the search to become trapped in a local minimum. In this paper we propose a modification of the three-step search algorithm that employs a small diamond pattern in the first step and uses an unrestricted search step to search the center area. Experimental results show that the proposed algorithm performs better than the new three-step search in terms of MSE and reduces computation by up to 15% on average.
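For orientation, the baseline that the abstract modifies can be sketched as follows. This is the classic three-step search only, not the proposed variant: the small diamond pattern and the unrestricted center search are not specified in the abstract and are not reproduced here. The function names, the SAD cost, the block size `n`, and the initial step of 4 are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref; infinity if out of bounds."""
    h, w = len(ref), len(ref[0])
    x0, y0 = bx + dx, by + dy
    if x0 < 0 or y0 < 0 or x0 + n > w or y0 + n > h:
        return float("inf")
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
    return total


def three_step_search(cur, ref, bx, by, n=4, step=4):
    """Classic three-step search (assumed baseline, not the paper's variant).

    At each step, test the 3 x 3 grid of points spaced `step` apart around the
    current best displacement, move to the cheapest one, then halve the step.
    """
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # center stays fixed while this step's grid is tested
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                cost = sad(cur, ref, bx, by, cx + dx, cy + dy, n)
                if cost < best_cost:
                    best_cost, best = cost, (cx + dx, cy + dy)
        step //= 2
    return best, best_cost
```

Because the first step samples points 4 pixels apart, a block whose true motion is near zero can be pulled toward a distant point with a slightly lower cost, which is the local-minimum problem the abstract describes for quasi-stationary blocks.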
In this paper, an improved lattice filter structure is presented to model a two-dimensional (2-D) signal such as an image. The proposed structure generates one forward and one backward prediction error field at each stage of the lattice structure, unlike other lattice structures [S.R. Parker et al., 1984][N. Tulu Onuk et al., 1994], in which three or more backward prediction error fields are generated at each stage. This method is computationally efficient and retains all the advantages of the lattice algorithm. Simulation results show that the proposed lattice method achieves better compression at lower computational cost than other lattice methods in the literature [H.K. Kwan et al., 2001].
In predictive image coding, the least squares (LS)-based adaptive predictor is known as an efficient method for improving prediction results around edges. However, pixel-by-pixel optimization of the predictor coefficients leads to high coding complexity. To reduce computational complexity, we activate the LS optimization process only when the coding pixel is around an edge or when the prediction error is large. We propose a simple yet effective edge detector that uses only causal pixels. The system can look ahead to determine whether the coding pixel is around an edge and initiate the LS adaptation to prevent the occurrence of a large prediction error. Our experiments show that the proposed approach achieves a noticeable reduction in complexity with only a minor degradation in the prediction results.
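The gating idea in this abstract can be illustrated with a small sketch. Everything specific here is an assumption: the two-tap predictor (west and north neighbors), the edge test based on differences against the north-west pixel, the threshold of 16, the training window radius, and the function names are hypothetical, since the abstract does not give the actual detector or predictor order. The second trigger mentioned in the abstract (a large previous prediction error) is omitted.

```python
def ls_coefficients(img, x, y, radius=3):
    """Fit a, b minimizing sum (p - a*W - b*N)^2 over causal pixels near (x, y).

    Only pixels that precede (x, y) in raster order are used for training, so
    a decoder could repeat the same fit without side information.
    """
    sww = swn = snn = swp = snp = 0.0
    width = len(img[0])
    for j in range(max(1, y - radius), y + 1):
        for i in range(max(1, x - radius), min(width, x + radius + 1)):
            if j == y and i >= x:
                break  # pixels from (x, y) onward are not yet coded
            w, n, p = img[j][i - 1], img[j - 1][i], img[j][i]
            sww += w * w
            swn += w * n
            snn += n * n
            swp += w * p
            snp += n * p
    det = sww * snn - swn * swn
    if abs(det) < 1e-9:
        return 0.5, 0.5  # degenerate window: fall back to the fixed predictor
    # Closed-form solution of the 2 x 2 normal equations.
    a = (swp * snn - snp * swn) / det
    b = (snp * sww - swp * swn) / det
    return a, b


def predict(img, x, y, threshold=16):
    """Predict img[y][x]; run the LS fit only when a causal edge test fires."""
    w, n, nw = img[y][x - 1], img[y - 1][x], img[y - 1][x - 1]
    edge = max(abs(w - nw), abs(n - nw)) > threshold  # hypothetical detector
    a, b = ls_coefficients(img, x, y) if edge else (0.5, 0.5)
    return a * w + b * n, edge
```

On a vertical step edge, the fixed averaging predictor lands halfway between the two sides, while the LS fit learns to follow the north neighbor. In flat regions the edge test does not fire and the costly fit is skipped, which is where the complexity saving would come from.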
The hexagon-based search (HEXBS) algorithm requires fewer search points for motion estimation than square-shaped and diamond-shaped patterns. In this paper, we propose a fast motion estimation algorithm that further reduces the number of search points required by the HEXBS algorithm. By exploiting the statistical properties of the motion vectors of neighboring blocks, the algorithm selects fewer candidate points on the hexagon endpoints than the original HEXBS algorithm, which improves motion estimation efficiency. Experimental results show that the proposed algorithm reduces the average number of search points by 57.62% compared to the HEXBS algorithm, with only slight quality degradation.
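As a point of reference, the baseline HEXBS procedure is commonly described as a six-point large hexagon that moves until its center is the best point, followed by a four-point small pattern for refinement. The sketch below follows that description and does not implement the candidate-reduction step proposed in the abstract, whose details are not given. The `cost` callable, the `max_moves` cap, and the function name are assumptions for illustration.

```python
# Offsets of the six large-hexagon endpoints and the four refinement points.
LARGE_HEXAGON = [(2, 0), (-2, 0), (1, 2), (-1, 2), (1, -2), (-1, -2)]
SMALL_PATTERN = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def hexbs(cost, max_moves=16):
    """Baseline hexagon-based search over a block-matching cost function.

    `cost(dx, dy)` is assumed to return the matching error (for example SAD)
    of the candidate displacement (dx, dy).
    """
    center = (0, 0)
    best_cost = cost(*center)
    for _ in range(max_moves):
        cx, cy = center  # hexagon is placed around the center of this round
        moved = False
        for dx, dy in LARGE_HEXAGON:
            c = cost(cx + dx, cy + dy)
            if c < best_cost:
                best_cost, center, moved = c, (cx + dx, cy + dy), True
        if not moved:
            break  # the center is the best point: switch to the small pattern
    cx, cy = center
    for dx, dy in SMALL_PATTERN:
        c = cost(cx + dx, cy + dy)
        if c < best_cost:
            best_cost, center = c, (cx + dx, cy + dy)
    return center, best_cost
```

For simplicity this version re-evaluates all six endpoints after every move; practical implementations skip the endpoints that overlap the previous hexagon, and the abstract's proposal reduces the candidate set further using neighboring motion vectors.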
This paper describes a family of low rate, low complexity speech coding algorithms known as time domain voicing cutoff (TDVC). TDVC is a predictive coding algorithm that employs a single transition frequency dividing voiced and unvoiced excitation. It provides the voicing flexibility of a frequency domain algorithm with lower complexity and rate overhead. TDVC has been previously subjected to a DAM test and received scores of 63.4 and 60.1 at 2.0 and 1.5 kb/sec, respectively.
This paper presents a new generalized particle model (GPM) for generating the predictive coding used in lossless data compression. The local rules for particle movement in the GPM, a parallel algorithm, and its implementation structure for generating the desired predictive coding are discussed. The proposed GPM approach has advantages over other sequential lossless compression methods in terms of encoding speed, parallelism, scalability, simplicity, and ease of hardware implementation.
This work presents the design of a computational charge-based circuit intended to be part of a focal plane compression chip. The image compression scheme pursued is predictive coding. The proposed circuit computes the prediction error at every pixel by integrating the photocurrents of the pixels in a small neighborhood. The prediction weights for every pixel can be changed by changing the switching timing of the circuit, which makes it possible to use adaptive prediction algorithms. The circuit is compact and can be integrated at the pixel level.
This paper discusses a matrix quantizer design algorithm for image encoding problems. The design algorithm is aimed at producing a codebook of matrices which are, at least, locally optimum with respect to a distortion measure. We have considered the squared error distortion measure in this work and generated codebooks based on a training sequence consisting of a number of pictures of different bit rates. The preliminary results show promise for further work in this direction.
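The abstract does not name its design algorithm, but a codebook that is locally optimal for squared error on a training sequence is what a generalized Lloyd iteration produces, so the sketch below uses that as a stand-in. Matrices are represented as flattened row-major sequences; the function names, the fixed iteration count, and the handling of empty cells are assumptions for illustration.

```python
def sq_err(a, b):
    """Squared error distortion between two flattened matrices."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def lloyd_codebook(training, codebook, iterations=20):
    """Generalized Lloyd iteration (assumed stand-in for the paper's design).

    Alternates nearest-codeword partitioning of the training matrices with
    centroid updates; each pass cannot increase the total squared error, so
    the result is a locally optimal codebook for this training set.
    """
    codebook = [list(c) for c in codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for block in training:
            k = min(range(len(codebook)),
                    key=lambda i: sq_err(block, codebook[i]))
            cells[k].append(block)
        for k, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook
```

The outcome depends on the initial codebook, which is why the abstract's "at least locally optimum" qualifier matters: different starting codebooks can converge to different codebooks with different distortion.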
Our previous research showed promising results when transferring features learned from speech to train emotion recognition models for music. In this context, we implemented a denoising autoencoder as a pretraining approach to extract features from speech in two languages (English and Mandarin). From that, we performed transfer and multi-task learning to predict classes from the arousal-valence space of music emotion. We tested and analyzed intra-linguistic and cross-linguistic settings, depending on the language of the speech and of the lyrics of the music. This paper presents additional investigation of our approach, which reveals that: (1) pretraining with speech in a mixture of languages yields results similar to those for specific languages, so the pretraining phase appears not to exploit language-specific features; (2) the Mandarin music dataset consistently results in poor classification performance, and we found low agreement in its annotations; and (3) novel methodologies for representation learning (contrastive predictive coding) may exploit features from both languages (i.e., pretraining on a mixture of languages) and improve classification of music emotions in both languages. From this study we conclude that more research is still needed to understand what is actually being transferred in these types of contexts.
A method for coding video sequences based on semantic decomposition into motion-homogeneous regions is presented. The set of regions (the spatio-temporal segmentation) is initialized for the first couple of frames of the sequence and then tracked along the time axis, which maintains its stability. Based on the spatio-temporal segmentation, a predictive coding scheme is developed. A motion parameter vector and a boundary description are encoded for each spatio-temporal region. The bit rate obtained for the motion parameters is very low (<0.01 bits/pixel). The bit rate for the boundary description depends strongly on the stability of the segmentation and varies around 0.1 bits/pixel. A costless transmission mode is chosen to encode the boundary component. The high visual quality of the predicted frames, with an almost complete absence of artifacts, and the specific structure of the prediction error allow selective coding of the error signal.