In this work, a new method for estimating the time-varying AR model of speech is presented. Here, the time-varying parameters are modeled as stationary processes. Both the time-varying parameters and their corresponding stationary processes are modeled through a common Gauss-Markov model whose state vector can be estimated with the extended Kalman filter (EKF) algorithm. The proposed algorithm differs from earlier methods that use the EKF algorithm. Simulation studies are carried out for both voiced and unvoiced speech. It is shown that the proposed method achieves a lower mean-square prediction error than the LPC method.
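The LPC method that this abstract uses as its reference point can be sketched as follows. This is a minimal illustration of the conventional fixed-coefficient baseline (autocorrelation method with the Levinson-Durbin recursion), not the proposed EKF estimator; the function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation sums r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err) where x[n] is predicted as sum(a[k] * x[n-1-k])
    and err is the residual energy after the final recursion step.
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def prediction_mse(x, a):
    """Mean-square error of the fixed predictor a over the sequence x."""
    order = len(a)
    errs = [
        x[n] - sum(a[k] * x[n - 1 - k] for k in range(order))
        for n in range(order, len(x))
    ]
    return sum(e * e for e in errs) / len(errs)
```

Calling `levinson_durbin(autocorr(frame, p), p)` on a speech frame gives one coefficient set per frame; a time-varying method would be compared against the `prediction_mse` of this baseline.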
Many coding methods are more efficient with certain types of images than others. In particular, run-length coding is very useful for coding areas with little change. Adaptive predictive coding achieves high coding efficiency for fast-changing areas like edges. In this paper, we propose a switching coding scheme that combines the advantages of both run-length and adaptive linear predictive coding (RALP) for lossless compression of images. For pixels in slowly varying areas, run-length coding is used; otherwise least-squares (LS) adapted predictive coding is used. Instead of performing LS adaptation in a pixel-by-pixel manner, we adapt the predictor coefficients only when an edge is detected, so that the computational complexity can be significantly reduced. For this, we propose an edge detector using only causal pixels. This way, the predictor can look ahead if the pixel being coded is around an edge and initiate the LS adaptation in advance to prevent the occurrence of a large prediction error. With the proposed switching structure, very good prediction results can be obtained both in slowly varying areas and for pixels around boundaries, as the experiments show.
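The switching idea can be illustrated with a deliberately simplified one-dimensional sketch. It assumes a single image row, a "flat area" test that compares the two previous (causal) pixels, and a previous-pixel predictor standing in for the LS-adapted predictor and edge detector of the paper; all names and the token format are hypothetical.

```python
def encode_row(row):
    """Switch between run-length tokens and prediction residuals."""
    tokens = []
    i = 0
    while i < len(row):
        # Causal flatness test: the decoder can repeat it exactly.
        flat = i >= 2 and row[i - 1] == row[i - 2]
        if flat:
            run = 0
            while i + run < len(row) and row[i + run] == row[i - 1]:
                run += 1
            tokens.append(("run", run))
            i += run
            if i == len(row):
                break
        # Predictive mode: residual against the previous pixel.
        pred = row[i - 1] if i > 0 else 0
        tokens.append(("res", row[i] - pred))
        i += 1
    return tokens


def decode_row(tokens, length):
    """Invert encode_row by mirroring the same causal decisions."""
    row = []
    it = iter(tokens)
    while len(row) < length:
        flat = len(row) >= 2 and row[-1] == row[-2]
        if flat:
            _, run = next(it)
            row.extend([row[-1]] * run)
            if len(row) == length:
                break
        _, res = next(it)
        pred = row[-1] if row else 0
        row.append(res + pred)
    return row
```

Because the mode decision depends only on already decoded pixels, no side information about the switch is needed, which is the property that makes a causal detector attractive for lossless coding.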
We describe in this paper a code-excited linear predictive coder in which the optimum innovation sequence is selected from a code book of stored sequences to optimize a given fidelity criterion. Each sample of the innovation sequence is filtered sequentially through two time-varying linear recursive filters, one with a long-delay (related to pitch period) predictor in the feedback loop and the other with a short-delay predictor (related to spectral envelope) in the feedback loop. We code speech, sampled at 8 kHz, in blocks of 5-ms duration. Each block consisting of 40 samples is produced from one of 1024 possible innovation sequences. The bit rate for the innovation sequence is thus 1/4 bit per sample. We compare in this paper several different random and deterministic code books for their effectiveness in providing the optimum innovation sequence in each block. Our results indicate that a random code book has a slight speech quality advantage at low bit rates. Examples of speech produced by the above method will be played at the conference.
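The 1/4 bit per sample figure follows from log2(1024) = 10 bits spread over 40 samples, or 2000 bit/s at 8 kHz. The sketch below reproduces that arithmetic and shows an exhaustive code book search under simplifying assumptions: a single short-delay all-pole filter (the long-delay pitch filter is omitted), a plain squared-error criterion, and hypothetical function names.

```python
import math


def bits_per_sample(codebook_size, block_len):
    """Index bits spent per speech sample for one code book entry per block."""
    return math.log2(codebook_size) / block_len


def synthesize(excitation, a):
    """All-pole recursion y[n] = e[n] + sum(a[k] * y[n-1-k])."""
    y = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * y[n - 1 - k]
        y.append(acc)
    return y


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry that best matches target.

    For each entry the optimal gain is corr / energy, which leaves a
    squared error of |target|^2 - corr^2 / energy.
    """
    target_energy = sum(t * t for t in target)
    best = (None, 0.0, float("inf"))
    for idx, code in enumerate(codebook):
        y = synthesize(code, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        corr = sum(t * v for t, v in zip(target, y))
        err = target_energy - corr * corr / energy
        if err < best[2]:
            best = (idx, corr / energy, err)
    return best
```

With a random code book of 1024 Gaussian sequences of length 40, `search_codebook` would be called once per 5-ms block; practical coders reduce the cost of this exhaustive loop, which the sketch does not attempt.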
ISBN: 0780331923 (print)
This paper describes a hybrid input/output block adaptation scheme for the quantization of spectral parameters for low-delay CELP coders. In this scheme, line spectrum pair (LSP) coefficients are used as parameters for spectral adaptation and they are partitioned into two segments: LSPs in the first segment, which correspond to low-frequency regions, are backward adapted from the reproduced speech, while LSPs in the second segment are adapted from the input speech and quantized as side information to the decoder. The rationale behind this scheme is to compensate for the spectrum distortion introduced by conventional backward adaptation schemes operating at low bit rates. Simulation results show that the hybrid LSP adaptation scheme achieves transparent quantization at 12 bits for each analysis frame of 5 ms.
In this paper, a new network echo canceller based on a practical adaptive filter is proposed. The proposed adaptive filter is a practical modification of the lattice transversal joint (LTJ) adaptive filter. It takes advantage of information in the speech decoder, and the coefficients of the transversal filter part of the LTJ adaptive filter are updated every other sample instead of every sample. The total complexity of the proposed adaptive filter is lower than that of the transversal filter. The residual echo signal is further reduced by residual echo cancellation using a lattice predictor whose order is less than 10. The computational complexity of the proposed echo canceller is lower than that of the transversal filter, while its convergence is faster. The performance of the proposed network echo canceller was verified by experiments using real speech signals.
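For orientation, the transversal filter that serves as the comparison baseline can be sketched with a normalized LMS (NLMS) update. This is an assumption made for illustration: the abstract does not state which adaptation rule the baseline uses, and the sketch does not implement the LTJ structure or the lattice predictor. The function name and parameters are hypothetical.

```python
def nlms_echo_canceller(far_end, mic, taps, mu=0.5, eps=1e-8):
    """Adapt a transversal filter so that its output tracks the echo in mic.

    far_end: reference signal sent toward the echo path
    mic:     signal containing the echo of far_end
    Returns (weights, residual), where residual is mic minus the echo estimate.
    """
    w = [0.0] * taps
    buf = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # echo estimate
        e = d - y
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual
```

Updating the weights only on every other sample, as the abstract describes for the transversal part, would roughly halve the cost of the weight-update line at some cost in tracking speed.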
We present a low bit rate speech coder based on a long-term model (LTM) for voiced speech and on the waveform interpolation (WI) coder. In the LTM, a periodic input signal undergoes a time-varying spectral shaping representing the evolution of the pitch-cycle waveform. The resulting signal, which has a fixed pitch period but a time-varying pitch-cycle waveform, is multiplied by a time-varying gain function that represents the variation in speech loudness. The resulting signal then undergoes a time-axis warping, which represents the evolution of the pitch period, yielding the output speech signal. The spectral shaping in the proposed coder is based on WI. In WI, speech (or the LPC residual) is viewed as a continuously evolving sequence of pitch-cycle waveforms. A subset of these waveforms is extracted and coded. In the decoder, after inverse quantization, missing waveforms are synthesized by interpolation. The extracted waveforms are normalized to a fixed length and sequentially aligned using a cyclic shift. Then, a two-dimensional surface, called the prototype waveform surface or characteristic waveform (CW), is produced from these waveforms.
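The normalization and alignment steps can be sketched as below. The sketch assumes periodic linear interpolation for length normalization and a maximum-correlation criterion for choosing the cyclic shift; the abstract does not specify either choice, and the function names are hypothetical.

```python
def resample_cycle(cycle, length):
    """Stretch or shrink one pitch cycle to a fixed length.

    The cycle is treated as periodic, so interpolation wraps around.
    """
    n = len(cycle)
    out = []
    for i in range(length):
        pos = i * n / length
        j = int(pos)
        frac = pos - j
        out.append((1 - frac) * cycle[j % n] + frac * cycle[(j + 1) % n])
    return out


def best_cyclic_shift(ref, wave):
    """Shift s that maximizes the correlation of ref[i] with wave[i + s]."""
    n = len(ref)

    def corr(s):
        return sum(ref[i] * wave[(i + s) % n] for i in range(n))

    return max(range(n), key=corr)


def align(ref, wave):
    """Rotate wave so that it lines up with the previous waveform ref."""
    s = best_cyclic_shift(ref, wave)
    return wave[s:] + wave[:s]
```

Applying `resample_cycle` and then `align` to each extracted cycle, using the previously aligned cycle as `ref`, yields rows of equal length that can be stacked over time to form the two-dimensional surface.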
In many vocoders, LSFs (line spectrum frequencies) are used to encode the linear predictive coding (LPC) parameters. An interframe differential coding scheme is presented for LSFs. The LSFs of the current speech frame are predicted by using both the LSFs of the previous frame and some of the LSFs of the current frame. Then the difference vector resulting from prediction is vector quantized. The proposed scheme is computationally efficient and easy to implement, and can be used in low-bit-rate vocoders.
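A reduced form of interframe differential coding can be sketched as follows. It predicts each LSF from the previous decoded frame only and omits the prediction from LSFs of the current frame, because the abstract does not say which ones are used or how; the code book, function names, and nearest-neighbour search are hypothetical stand-ins.

```python
def nearest_codeword(vec, codebook):
    """Index of the code book entry with the smallest squared distance to vec."""

    def dist(entry):
        return sum((v - c) ** 2 for v, c in zip(vec, entry))

    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def encode_frame(lsf, prev_decoded, codebook):
    """Vector quantize the difference between lsf and the previous decoded frame.

    Returns (index, decoded), where decoded is what the decoder reconstructs.
    """
    diff = [x - p for x, p in zip(lsf, prev_decoded)]
    index = nearest_codeword(diff, codebook)
    decoded = decode_frame(index, prev_decoded, codebook)
    return index, decoded


def decode_frame(index, prev_decoded, codebook):
    """Add the selected difference vector back onto the previous decoded frame."""
    return [p + c for p, c in zip(prev_decoded, codebook[index])]
```

Predicting from the decoded rather than the original previous frame keeps encoder and decoder in step, so quantization error does not accumulate differently on the two sides.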
This paper describes the design of a toll-quality 4-kbit/s speech coder based on phase-adaptive PSI-CELP. This adaptation method not only gives pitch periodicity to the random excitation but also synchronizes the basic point of the stored random vector with the pitch phase. We further improve the proposed coder by introducing a backward gain prediction scheme. In subjective evaluation experiments, there is no significant difference between the quality of the ITU-T G.726 32-kbit/s coder and that of the proposed 4-kbit/s coder under normal and low input levels and under tandem connection for clean speech. In noisy environments, the MOS results of the ACR test likewise show no significant difference between the G.726 coder and the 4-kbit/s coder.
In this paper we report on a study of a technique for 32-band subband/transform coding at 16 kb/s. This approach occupies the middle range of algorithm complexity and frequency resolution between Sub-Band Coding (SBC) and Adaptive Transform Coding (ATC). Two designs for 16 kb/s 32-band coders have been simulated on a laboratory computer. The results of informal listening tests indicate that the new designs offer performance comparable to existing ATC techniques while having complexities roughly three times those of existing 4- and 5-band sub-band coders.