In this work, fine granularity scalability (FGS) is introduced into MP-CELP speech coding with the HPDR technique by adjusting the amount of transmitted fixed excitation information. The FGS feature aims to change the bit rate of the conventional coder more finely and more smoothly. Through performance analysis and computer simulation, the scalability quality of the MP-CELP coder is shown to improve on conventional scalable MP-CELP. The HPDR technique is also applied to MP-CELP for use with tonal languages, and the coder supports core coding rates of 4.2, 5.5, and 7.5 kbps plus additional scaled bit rates.
The use of a linear periodic controller (LPC) has been proposed as a new approach in the field of model reference adaptive control. The resulting controller can handle rapid changes in plant parameters, and it provides smooth transient behavior for a closed-loop system. Moreover, the LPC generates control signals that are modest in size when measured using the infinity norm. Although the LPC has these advantages, it suffers from poor noise tolerance: the smaller the sampling time, the less noise tolerant the controller is. In this work, to alleviate this drawback, we apply a larger probing signal whose size is inversely proportional to the sampling time. The proposed method has significantly better noise rejection but a larger control signal.
A new version of the Residual Excited Linear Predictive (RELP) vocoder has been simulated. The objective has been to reduce the data rate required for good quality speech to 4.8 kbps. Results have indicated that it is possible to remove the hoarseness currently associated with low data rate RELP speech. Development of a pitch predictive ADPCM residual encoder and preliminary results on new harmonic generation techniques are discussed. Taped demonstrations will be played at the conference.
The LD-CELP (code excited linear prediction) algorithm was adopted by the CCITT as Recommendation G.728 for the coding of speech with toll quality at 16 kbit/s. The operation of the LD-CELP algorithm at 12.8 kbit/s is described, and its performance is assessed with both voice and nonvoice signals in single and interconnected network configurations. The 12.8 kbit/s LD-CELP codec is found to perform equivalently to 24 kbit/s ADPCM (adaptive differential pulse code modulation). It is also shown that 12.8 kbit/s LD-CELP is acceptably transparent to network signaling. As a result, it can be concluded that the operation of the LD-CELP algorithm at 12.8 kbit/s presents a viable option for inclusion within 16-kbit/s-based DCME (digital circuit multiplication equipment) or PCME (packet circuit multiplication equipment) overload strategies.
Instead of using fuzzy membership inputs with class membership desired outputs during training, as proposed by several researchers, we use fuzzy membership inputs with conventional binary desired outputs. This reduces mistaken training, decreases the training time, and improves recognition ability. The system was tested on the recognition of the ten Thai numerals from zero to nine. The error rate in speaker-independent tests was 9.2%, compared with 14% for conventional neural network systems, while the error rate of the system using class membership desired outputs is somewhat higher because of mistaken training.
In this work, a new method for estimating the time-varying AR model of speech is presented. Here, the time-varying parameters are modeled as stationary processes. Both the time-varying parameters and their corresponding stationary process are modeled through a common Gauss-Markov model whose state-vector can be estimated through the extended Kalman Filter (EKF) algorithm. The proposed algorithm is different from the earlier methods which use the EKF algorithm. Simulation studies are carried out for both voiced and unvoiced speech. It is shown that the proposed method has less mean-square prediction error than that obtained through the LPC method.
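The idea of tracking AR coefficients as a state vector can be illustrated with a much simpler setup than the one the abstract describes. The sketch below is not the paper's method: it assumes a first-order AR model whose single coefficient follows a random walk, which makes the problem linear, so an ordinary (non-extended) Kalman filter suffices. The function names `track_ar1` and `synth_ar1` and the noise settings `q` and `r` are hypothetical choices for illustration.

```python
import random


def track_ar1(signal, q=1e-6, r=1.0):
    """Track the coefficient a_t in y_t = a_t * y_{t-1} + e_t.

    Assumption (not from the abstract): a_t follows a random walk with
    variance q, and the excitation e_t has variance r.
    Returns one coefficient estimate per sample, starting at t = 1.
    """
    a, p = 0.0, 1.0  # coefficient estimate and its error variance
    estimates = []
    for t in range(1, len(signal)):
        h = signal[t - 1]              # regressor: the previous sample
        p += q                         # predict step (random-walk state)
        k = p * h / (h * p * h + r)    # Kalman gain
        a += k * (signal[t] - h * a)   # correct with the prediction error
        p *= 1.0 - k * h               # update the error variance
        estimates.append(a)
    return estimates


def synth_ar1(n, coeff, seed=0):
    """Generate n samples of a fixed-coefficient AR(1) test signal."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(coeff * y[-1] + rng.gauss(0.0, 1.0))
    return y


estimates = track_ar1(synth_ar1(3000, 0.9))
```

On the synthetic signal the final estimate should settle close to the true coefficient of 0.9. Raising `q` lets the filter follow faster parameter changes at the cost of a noisier estimate.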
ISBN: (print) 0780374886
A new method for recognizing the start and the end of each word in a continuous Chinese sentence is discussed. We define a new recognition characteristic called periodic gradual change (PGC). A continuous speech sentence can be separated into many single words by combining the new PGC method with other characteristics such as zero crossing rate (ZCR), instantaneous swing (E characteristic), and linear predictive coding (LPC) parameters. The new method improves the recognition rate for continuous speech segmentation.
We propose a hyperspectral image compressor called BH which considers its input image as being partitioned into square blocks, each lying entirely within a particular band, and compresses one such block at a time by using the following steps: first predict the block from the corresponding block in the previous band, then select a predesigned code based on the prediction errors, and finally encode the predictor coefficient and errors. Apart from giving good compression rates and being fast, BH can provide random access to spatial locations in the image. We hypothesize that BH works well because it accommodates the rapidly changing image brightness that often occurs in hyperspectral images. We also propose an intra-band compressor called LM which is worse than BH, but whose performance helps explain BH's performance.
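The first step, predicting a block from the co-located block in the previous band, can be sketched as follows. This is a minimal illustration under assumptions the abstract does not confirm: the predictor is a single least-squares scale factor per block with no offset, blocks are flattened into lists of integers, and the coefficient is sent at full precision. The names `predict_block` and `reconstruct_block` are hypothetical, and the code-selection and entropy-coding steps are not shown.

```python
def predict_block(prev_block, cur_block):
    """Fit cur ~ a * prev by least squares and return (a, residuals).

    Assumption: one scale coefficient per block, no offset term.
    """
    num = sum(p * c for p, c in zip(prev_block, cur_block))
    den = sum(p * p for p in prev_block)
    a = num / den if den else 0.0
    # Round the prediction so that the residuals stay integers.
    residuals = [c - round(a * p) for p, c in zip(prev_block, cur_block)]
    return a, residuals


def reconstruct_block(prev_block, a, residuals):
    """Invert predict_block given the same coefficient and residuals."""
    return [round(a * p) + r for p, r in zip(prev_block, residuals)]


prev = [10, 20, 30, 40]   # block from the previous band
cur = [21, 39, 61, 80]    # co-located block in the current band
a, res = predict_block(prev, cur)
restored = reconstruct_block(prev, a, res)
```

In this example the current block is roughly twice as bright as the previous one, so the fitted coefficient absorbs the brightness change and the residuals stay small. A practical coder would also quantize the coefficient before computing residuals so that encoder and decoder use the identical value.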
The ITU-T issued the new Recommendation G.729 in 1996 to realize a high-quality, low-delay speech coder at 8 kb/s. In this paper, the algorithm for conjugate-structure algebraic code-excited linear prediction (CS-ACELP) is discussed, and its central aspects are analyzed in detail. Topics covered include the special codebook structure, efficient codebook search strategies, and speech improvement approaches.
Vocoders compress speech by estimating model parameters at a given transmission rate over an analysis window, assuming that speech is stationary within this window. In this paper, the limits of this assumption are explored with regard to the spectral envelope parameters in the form of line spectral frequency (LSF) parameters. It is shown that all LSF parameters vary considerably over time, regardless of LSF vector extraction and transmission rates. LSF track variations are investigated through oversampling and are shown to contain high-frequency variations above the frequency corresponding to the LSF vector transmission rate. An anti-aliasing filter with a cut-off frequency suited to the chosen LSF vector transmission rate is proposed to alleviate possible spectral overlap in the LSF parameter spectra. Experiments confirm that the proposed method offers an advantage over the classic LSF extraction method with respect to quantisation, with bit savings of typically 10 to 15%.
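The filter-then-decimate idea can be sketched for a single oversampled LSF track. The abstract does not specify the filter design, so everything here is an assumption for illustration: a Hamming-windowed sinc FIR low-pass, a cut-off placed at the Nyquist frequency of the reduced rate, end values held at the track edges, and the hypothetical names `lowpass_fir` and `filter_and_decimate`.

```python
import math


def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass with unity gain at DC.

    cutoff is in cycles per sample (0 < cutoff < 0.5); num_taps must be odd.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]


def filter_and_decimate(track, factor, num_taps=31):
    """Low-pass an oversampled LSF track, then keep every factor-th value."""
    taps = lowpass_fir(num_taps, 0.5 / factor)
    half = num_taps // 2
    out = []
    for i in range(0, len(track), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            # Hold the first or last value when the filter runs off an edge.
            j = min(max(i + k - half, 0), len(track) - 1)
            acc += t * track[j]
        out.append(acc)
    return out


# A slow LSF value of 0.3 with a fast alternating ripple superimposed.
track = [0.3 + 0.05 * (-1) ** n for n in range(80)]
smoothed = filter_and_decimate(track, 4)
```

Away from the edges, the decimated values stay close to 0.3 because the ripple lies well above the cut-off and is removed before it can alias. Plain decimation without the filter would instead pick up the ripple at full amplitude.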