This paper describes a new artificial speech signal (ASVQ: Artificial Speech by Vector Quantization technique) which reflects the average characteristics of the human voice. The ASVQ is intended for use as a test signal in the objective evaluation of speech coding system quality. To obtain the average characteristics, a very large speech database is analyzed. An ASVQ generation method that reflects the extracted average characteristics of the human voice is formulated; it applies vector quantization analysis to the speech database. An LPC speech synthesis circuit is used to reproduce the average characteristics. Finally, the new artificial speech signal is compared with a human voice, and its accuracy in estimating the subjective quality of speech coding systems and of nonlinear distortions is evaluated.
This correspondence presents a quantization scheme for encoding line spectral parameters used in linear predictive coding (LPC) of speech. The scheme is based on low-dimensionality regular-point lattices. The algebraic codebook need not be stored, and the optimum codevector is found through simple rounding of the input vector. Thus, the scheme results in significant savings of memory and reduced computational complexity when compared to traditional vector-quantizer solutions. The quantizer achieves an average spectral distortion of about 1 dB at 28 b/frame for the telephone bandwidth.
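The idea of finding a codevector by rounding, with no stored codebook, can be illustrated with a minimal sketch. The abstract does not name the lattice, so the D_n lattice (integer vectors whose coordinates sum to an even number) is an assumption chosen for illustration, as are the function names and the `step` scaling parameter. The nearest-point rule used here is the well-known round-then-fix-parity procedure for D_n, not necessarily the scheme of the correspondence.

```python
import math


def round_half_away(x):
    """Nearest integer, ties rounded away from zero (avoids banker's rounding)."""
    if x >= 0:
        return int(math.floor(x + 0.5))
    return -int(math.floor(-x + 0.5))


def nearest_dn_point(x):
    """Nearest point of the D_n lattice (assumed lattice, for illustration)."""
    rounded = [round_half_away(v) for v in x]
    if sum(rounded) % 2 == 0:
        return rounded
    # Parity is odd: re-round the coordinate with the largest rounding
    # error in the opposite direction, which restores an even sum.
    errors = [v - r for v, r in zip(x, rounded)]
    k = max(range(len(x)), key=lambda i: abs(errors[i]))
    rounded[k] += 1 if errors[k] > 0 else -1
    return rounded


def quantize(x, step):
    """Quantize x to a scaled D_n lattice; `step` is a hypothetical scale."""
    scaled = [v / step for v in x]
    return [step * p for p in nearest_dn_point(scaled)]


if __name__ == "__main__":
    print(nearest_dn_point([0.6, 0.2, -0.1, 1.4]))
    print(quantize([1.2, 0.4], 0.5))
```

Because the codevector is computed directly from the input, memory use does not grow with the bit rate; only the lattice definition and the scale need to be known to both encoder and decoder.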
It is shown that postfiltering circuits based on higher-order LPC (linear predictive coding) models can provide very low distortion in terms of spectral tilt. Thus, they can provide better speech enhancement than circuits based on the backward-adaptive pole-zero predictor in ADPCM (adaptive differential pulse code modulation). Quantitative criteria for designing postfiltering circuits based on higher-order LPC models are discussed. These postfilters are particularly attractive for systems where high-order LPC analysis is an integral part of the coding algorithm. In a subjective test that used a computer-simulated version of these circuits, enhanced ADPCM obtained a mean opinion score of 3.6 at 16 kb/s.
Linear predictive coding (LPC) analysis was used to create morphed natural tokens of English voiced stop consonants ranging from /b/ to /d/ and /d/ to /g/ in four vowel contexts (/i/, /ae/, /a/, /u/). Both vowel-consonant-vowel (VCV) and consonant-vowel (CV) stimuli were created. A total of 320 natural-sounding acoustic speech stimuli were created, comprising 16 stimulus series. A behavioral experiment demonstrated that the stimuli varied perceptually from /b/ to /d/ to /g/, and provided useful reference data for the ambiguity of each token. Acoustic analyses indicated that the stimuli compared favorably to standard characteristics of naturally produced consonants, and that the LPC morphing procedure successfully modulated multiple acoustic parameters associated with place of articulation. The entire set of stimuli is freely available on the Internet (http://***/~lholt/php/***) for use in research applications. © 2011 Elsevier B.V. All rights reserved.
This paper presents a novel circuit technique to improve the testability of NORA (NO RAce) CMOS circuits. It is based on the structure, properties and operations of NORA CMOS. The precharge and evaluation properties of NORA CMOS enable one to design simple testing circuits for output stuck-at-zero, stuck-at-one, stuck-open and stuck-on faults. Area and time considerations, as well as the applications of this testability enhancement technique, are also discussed.
Adaptive predictive coding of digitized images using multiplicative autoregressive (MAR) models is discussed. Three MAR models, designated as nonsymmetric half plane (NSHP) (3×3), quarter plane (QP) (2×3), and NSHP (2×3), are studied in detail. Results demonstrate that both NSHP (3×3) and QP (2×3) are very effective for coding and transmission of such images at bit rates less than one bit per pixel. Comparison with a 2-D model that has a quarter-plane 2×2 region of support indicates that the performance of NSHP (3×3) and QP (2×3) either exceeds or matches that of the former. The proposed scheme has the following advantages. First, the signal-to-noise ratio and the bit rate attainable with this method are comparable to those of two-dimensional (2-D) predictive techniques. Second, unlike the 2-D schemes, the stability of the predictive coder is easily guaranteed.
Synthetic speech can be generated with an unrestricted vocabulary by concatenating stored units such as diphone elements. When joining speech segments that were not adjacent in the original context they were taken from, discontinuities in the spectral envelope may arise that impair intelligibility. The method proposed here attempts to find optimum diphone boundaries in order to minimize these discontinuities. Steady-state zones of all phones carrying a diphone boundary are specified by means of a centroid vector. Based on the centroids and on an objective distance measure, hypothetical boundary cost functions are defined. Their minimization, together with the evaluation of a set of additional rules, determines the boundary locations. A rhyme test carried out with speech generated by concatenating diphone elements extracted according to this method yielded an intelligibility score of 96.7 percent for isolated words.
Author:
Pham, Tuan D.
Univ Aizu, Res Ctr Adv Informat Sci & Technol, Aizu Res Cluster Med Engn & Informat, Aizu Wakamatsu, Fukushima 9658580, Japan
The notion of using computational methods for evaluating cognitive stimulation therapy (CST) based on the synchronized recording of photoplethysmographic (PPG) signals of caregivers and participants offers an objective and cost-effective analysis in health care to improve the patient's quality of life. While computer models are promising as a useful tool for such a purpose, a question of interest is how the model reliability, which is the degree to which an assessment tool produces stable and consistent results, can be established. This paper addresses this issue with the application of dynamic time warping and resampling to measure the performance of two PPG features known as the largest Lyapunov exponent and linear predictive coding, which have been applied for studying the efficacy of CST. The potential success of this computerized evaluation can be a precursor to the development of a personalized e-therapy system that operates on mobile devices.
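Dynamic time warping, one of the two tools named in this abstract, can be sketched as follows. This is a minimal illustration of the textbook algorithm on two 1-D sequences with an absolute-difference local cost; the function name, the cost choice, and the idea of feeding it per-window feature values are assumptions, and the resampling step and the paper's actual reliability procedure are not reproduced.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    Uses |a[i] - b[j]| as the local cost and allows match, insertion,
    and deletion steps. Runs in O(len(a) * len(b)) time and memory.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical feature sequences from two synchronized recordings.
    caregiver = [1.0, 2.0, 3.0]
    participant = [1.0, 2.0, 2.0, 3.0]
    print(dtw_distance(caregiver, participant))
```

A small distance indicates that the two sequences follow the same shape even when one is locally stretched in time, which is why the measure suits signals recorded from different people.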
The algebraic code excited linear prediction (ACELP) algorithm, because of low complexity and high quality in its analysis-by-synthesis optimisation, has been adopted by many speech coding standards. This study proposes the unified generalised pulse replacement (UPR) search algorithm for the stochastic codebook of ACELP speech coders. The study examines the search breadth, the order of the search direction and the update frequency of the UPR algorithm, all based on the pulse replacement method. In addition, many derivative types of the UPR algorithm are discussed. The proposed approaches can achieve the lowest computational complexity with imperceptible degradation of the speech quality. Furthermore, a normalised degradation ratio based on the standard subjective quality measurement is proposed to compare performance fairly. Experimental results are presented to verify these claims.
An LPC (linear predictive coding) cepstrum distance measure (CD) is introduced as an objective measure for estimating the subjective quality of speech signals. Good correspondence between LPC CD and the subjective quality, expressed in terms of both opinion equivalent Q and mean opinion score, is shown. Good repeatability of objective quality evaluation using LPC CD is also shown. A method for generating an artificial voice signal that reflects the characteristics of real speech signals is described. The LPC CD values calculated using this artificial voice are almost the same as those calculated using real speech signals. The speaker-dependency of the coded-speech quality is shown to be an important factor in low-bit-rate speech coding. Even taking this factor into consideration, LPC CD is shown to be effective for estimating the subjective quality.
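A cepstrum distance of this kind can be sketched in a few lines. The abstract gives no formula, so the code below assumes a commonly used form, CD = (10 / ln 10) · sqrt(2 · Σ (c_i − c'_i)²) in dB, together with the standard recursion that converts predictor coefficients to cepstral coefficients for the all-pole model H(z) = 1 / (1 − Σ a_k z^(−k)). The function names and the sign convention are assumptions; other texts define A(z) with the opposite sign.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of H(z) = 1 / (1 - sum_k a_k z^-k).

    a[k-1] holds predictor coefficient a_k (assumed sign convention).
    The gain term c_0 is not computed.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # index 0 unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if 1 <= n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def cepstrum_distance_db(c_ref, c_test):
    """Cepstrum distance in dB between two equal-length cepstral vectors."""
    squared = sum((x - y) ** 2 for x, y in zip(c_ref, c_test))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared)


if __name__ == "__main__":
    # Hypothetical predictor coefficients for one frame of original
    # and coded speech.
    original = lpc_to_cepstrum([0.9, -0.2], 10)
    coded = lpc_to_cepstrum([0.85, -0.18], 10)
    print(cepstrum_distance_db(original, coded))
```

In practice the per-frame distances would be averaged over many frames of original and coded speech before being compared with subjective scores.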