In this paper, circular LPC analysis, a windowless signal modeling method for periodic signals, is revisited as a high-resolution pitch-synchronous speech spectrum analysis tool for high-quality speech coding, and the constant pitch transform is reintroduced as an alternative representation for the (circular) residual signal. This combination results in a parametric speech representation that is well suited to many speech coding applications. It is shown that circular LPC requires very accurate pitch period estimation. Thus, both techniques are formulated using an oversampled version of the speech signal for analysis. Experiments show that artifact-free speech can be synthesized using the proposed method.
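As an illustration of the windowless idea, the following is a minimal sketch of LPC on a single pitch cycle using circular (wrap-around) autocorrelation and the Levinson-Durbin recursion. It is not the paper's formulation: the function names, the integer-length pitch cycle, and the toy sinusoidal input are assumptions, and the oversampling and pitch estimation steps described in the abstract are not shown.

```python
import math

def circular_autocorr(cycle, max_lag):
    """Autocorrelation of one period with wrap-around indexing (no window)."""
    n = len(cycle)
    return [sum(cycle[i] * cycle[(i + k) % n] for i in range(n))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations; returns (a, err) with a[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 1e-12 * r[0]:
            break  # signal already predicted (numerically) exactly
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def circular_residual(cycle, a):
    """Prediction error e[i] = sum_j a[j] * x[(i - j) mod N]."""
    n = len(cycle)
    return [sum(a[j] * cycle[(i - j) % n] for j in range(len(a)))
            for i in range(n)]

# Toy input (assumption): exactly one period of a cosine, N = 16 samples.
N = 16
cycle = [math.cos(2.0 * math.pi * i / N) for i in range(N)]
r = circular_autocorr(cycle, 2)
a, err = levinson_durbin(r, 2)
residual = circular_residual(cycle, a)
```

For this toy cycle the order-2 predictor should recover the recurrence x[n] = 2cos(2π/N)·x[n-1] - x[n-2], so the circular residual is numerically zero. The sketch assumes the cycle length equals the pitch period exactly, which is consistent with the abstract's remark that accurate pitch estimation is required.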
A pitch-synchronous split band LPC (PS-SBLPC) speech coder is proposed. In this new paradigm, harmonic analysis is carried out on individual pitch cycle waveforms (PCWs) rather than using a large window. PCWs are identified using a trapezoidal window search performed on a modified time envelope signal. To achieve a fixed-rate coder, the PCW parameters are jointly quantised using a combined interpolation and quantisation routine. Combining interpolation and quantisation allows the high correlation between successive PCWs to be exploited without subjecting rapid transitions to time smoothing. During speech synthesis, no interpolation is applied, as parameter smoothing is provided during quantisation. Simulation results show that the PS-SBLPC model yields significantly better speech quality than the split band LPC (SB-LPC) model. Initial results have shown that quantisation optimisation leads to vast improvements in speech quality during speech transitions.
The paper considers a robust recursive procedure for identifying a nonstationary AR speech model based on a quadratic classifier with a heuristic decision threshold. Two versions of the robust procedure with heuristic decision threshold, based on a frame-based quadratic classifier and a quadratic classifier with a sliding training data set, are evaluated and compared through analyzing natural speech signals with voiced and mixed excitation segments. The results obtained show that the considered robust procedure with the quadratic classifier with sliding training data set and heuristic decision threshold achieves more accurate AR speech parameter estimation, provides improved tracking performance, and achieves better discrimination capabilities for possible application in some vowel recognition systems.
The bottleneck of GIS-T data gathering is the heterogeneity between the information sources. The main purpose of this paper is to clarify the idea of semantic matching and to present a prototype implementation framework. After a short presentation of GIS-T characteristics and approaches to information interoperability, we focus on the establishment of the semantic matching, one of the most critical issues in the fusion. The matching problem is addressed in detail at three levels: the classes, the roles and the entities. We then describe a proof-of-concept framework system that we are developing.
Heart sound is one of the oldest means for assessing the function of heart valves. Together with echocardiograms and electrocardiographs, it helps to give a clear and proper diagnosis of several diseases. Artificial neural networks are used to classify several valve-related heart disorders. A library of heart sound files, recorded via the traditional stethoscope, is used to extract relevant features using several signal processing tools, e.g., discrete wavelet transform (DWT), fast Fourier transform (FFT) and linear predictive coding (LPC). The achieved recognition rates were around 95.7%.
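To make the DWT feature extraction step concrete, here is a minimal sketch that computes subband energies from a multi-level Haar DWT. The choice of the Haar wavelet, the use of subband energies as features, the function names, and the toy signal are all assumptions for illustration; the abstract does not specify which wavelet or which features were used.

```python
import math

def haar_dwt_level(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def subband_energies(signal, levels):
    """Energy of each detail band, then of the final approximation band."""
    feats = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt_level(current)
        feats.append(sum(d * d for d in detail))
    feats.append(sum(c * c for c in current))
    return feats

# Toy signal (assumption); a real input would be a segmented heart sound.
toy_signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
features = subband_energies(toy_signal, 3)
```

Because the Haar transform is orthonormal, the subband energies sum to the energy of the input when its length is a multiple of 2 raised to the number of levels, which gives a simple sanity check on the feature vector before it is passed to a classifier.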
The design of a rejection-based classifier can follow two well-identified strategies, each operating in two sequential steps: the accept-first strategy and the reject-first strategy. The former is the more usual. Recently, we proposed a general class of reject-first classifiers using fuzzy XOR operators based on dual triples (t-norm, t-conorm, complement) (2001). In this paper, we investigate a new approach. It first tests for ambiguity rejection and then, if needed, tests for either exclusive classification or distance rejection. For this purpose, we define a new operator, called the fuzzy OR-2, which allows us to propose a new class of classifiers.
Speech recognition technology is now mature; however, compared with other applied technologies, it still has limited everyday applications. One reason is our lack of knowledge regarding speech features. We performed word recognition experiments using features extracted by several different methods, which we call multi-domain features. The results of the recognition experiments show that the multi-domain parameters representing phonemic features are very effective for word recognition and improve recognition performance in combination with other conditions such as the distance measure.
ISBN: 0780374908 (print)
We propose a method that uses voiced/unvoiced detection to solve a problem of pitch alteration with the PSOLA method: the output is not equal to an integer multiple of the interval. The proposed method applies the existing PSOLA method in the voiced region, as identified by voiced/unvoiced detection, and in the unvoiced region deletes or interpolates the samples that are not equal to an integer multiple of the interval. We used the SNR as an objective measure and compared the samples produced by the existing and proposed methods to evaluate the result. The proposed method achieves pitch-altered speech that is equal to an integer multiple of the interval, with an SNR almost equal to that of the existing method.
We consider a location management scheme called limited pointer forwarding from commonly visited sites (LPC) in a PCS network. LPC aims to reduce significantly the cost of updating databases in the PCS network. The strategy exploits the fact that it is highly likely for mobile users to move within commonly visited sites (CVSs) (see Makki, S., Computer Commun., vol.23, p.975-9, 2000). We employ analytical models to compare the location management cost of the LPC scheme with existing schemes such as the basic HLR/VLR (home location register, visitor location register) in IS-41. We discuss the conditions under which LPC outperforms the other schemes.
ISBN: 9810475241 (print)
This paper describes the use of neural networks in recognizing some isolated Malay sounds spoken by Malay children in a speaker-independent manner. The isolated sounds are the Malay plosives /b/, /d/, /g/, /p/, /t/ and /k/. A three-layer Multi-layer Perceptron (MLP) is used to train and recognize the speech sounds. The MLP has an output layer of 6 neurons, which correspond to the 6 isolated plosive sounds. Network parameters, such as the number of hidden neurons and the error function, were investigated to achieve the optimal performance of the MLP. The proposed system achieved a highest accuracy of 84.67%.
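The network shape described above can be sketched as a forward pass through a three-layer MLP with 6 output neurons, one per plosive. This is only a sketch under stated assumptions: the feature dimension, hidden layer size, sigmoid hidden units, softmax outputs, random untrained weights, and all names are hypothetical and are not taken from the paper.

```python
import math
import random

PLOSIVES = ["b", "d", "g", "p", "t", "k"]  # one output neuron per sound

def init_layer(n_in, n_out, rng):
    """Random weights; the last column of each row is the bias (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def forward(x, w_hidden, w_out):
    """Input -> sigmoid hidden layer -> softmax over the 6 output neurons."""
    xb = list(x) + [1.0]
    hidden = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, xb))))
              for row in w_hidden]
    hb = hidden + [1.0]
    logits = [sum(w * v for w, v in zip(row, hb)) for row in w_out]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
n_features, n_hidden = 12, 8  # hypothetical sizes
w_hidden = init_layer(n_features, n_hidden, rng)
w_out = init_layer(n_hidden, len(PLOSIVES), rng)

# Placeholder feature vector; a real one would come from the speech front end.
features = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
probs = forward(features, w_hidden, w_out)
predicted = PLOSIVES[probs.index(max(probs))]
```

With untrained weights the prediction is arbitrary; the point is only the mapping from a feature vector to one of the 6 plosive classes. Varying `n_hidden` corresponds to the hidden neuron number that the abstract reports investigating.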