A systematic method to design components of a 7-16 connector type precision short-open-load-thru (SOLT) calibration kit for the wireless industry using the High Frequency Structure Simulator (HFSS) at microwave frequencies is presented. To achieve a better than 60 dB accuracy in the nominal optimized value of each individual feature, only one of the 5° segments of the coaxial structure was used to model the equivalent 2D longitudinal cross-section. The validation process for this simplification, together with the test results on the components using a reference TRL/LRL calibration kit for the microwave measurements, is demonstrated. A better than 1.025 VSWR for the 7 mm to 7-16 adapters up to 7.5 GHz is reported.
A content-based video indexing method is presented that aims at temporally indexing a video sequence according to the actual speaker. This is achieved by integrating audio and visual information. Audio analysis yields a diagram of speaker identity label versus time. Visual analysis includes scene cut detection, face shot determination, mouth region extraction and tracking, and finally talking face shot determination. Results from both sources are combined to improve speaker-dependent video indexing. Such a task enables flexible video retrieval or browsing in cases where queries according to speaker identities are imposed. Speaker recognition errors are reduced to 2%.
A method for the stabilization of stationary and time-varying autoregressive models is presented. The method is based on a hyperstability-constrained least-squares (LS) problem with nonlinear constraints. The problems are solved iteratively with a Gauss-Newton type algorithm that sequentially linearizes the constraints. The proposed method is applied to simulated data in the stationary case and to real EEG data in the time-varying case.
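As a rough illustration of what "stable" means for an autoregressive model, the sketch below tests a coefficient set with the step-down (backward Levinson) recursion: the model is stable exactly when every reflection coefficient has magnitude below one. This is a plain stability check, not the hyperstability-constrained Gauss-Newton procedure of the abstract. The function name `is_stable_ar` and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are assumptions made for the example.

```python
def is_stable_ar(a):
    """Check stability of an AR model (hypothetical helper, not from the paper).

    `a` holds a1..ap of A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    Returns True when all roots of A(z) lie strictly inside the unit circle.
    """
    cur = list(a)
    while cur:
        k = cur[-1]                      # reflection coefficient of the current order
        if abs(k) >= 1.0:
            return False                 # a root on or outside the unit circle
        m = len(cur)
        denom = 1.0 - k * k
        # Step down to the predictor of order m - 1.
        cur = [(cur[j] - k * cur[m - 2 - j]) / denom for j in range(m - 1)]
    return True
```

For example, `is_stable_ar([-1.5, 0.7])` accepts a model with complex poles of radius sqrt(0.7), while `is_stable_ar([-2.5, 0.9])` rejects a model that has a real pole outside the unit circle.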
A VLSI implementation of a low-power DSP is described, which is dedicated to the G.723.1 low bitrate speech codec. A number of sophisticated DSP microarchitectures are devised, centered mainly on dual multiply accumulators, rounding and saturation mechanisms, and two-banked on-chip memory. The proposed DSP architecture has been integrated in a total area of 7.75 mm² using a 0.35 µm CMOS technology, and can operate at 10 MHz while dissipating 45 mW from a single 3 V supply.
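To make the rounding and saturation mechanisms concrete, here is a minimal software sketch of a single multiply-accumulate step in Q15 fixed point. The Q15 format, the round-half-up rule, the 16-bit accumulator width and the name `mac_q15` are all assumptions chosen for illustration; the abstract does not describe the chip's actual datapath.

```python
INT16_MAX = 32767
INT16_MIN = -32768

def saturate16(v):
    """Clamp an integer to the signed 16-bit range instead of letting it wrap."""
    return max(INT16_MIN, min(INT16_MAX, v))

def mac_q15(acc, a, b):
    """Return acc + a*b for Q15 operands, with rounding and saturation.

    Hypothetical helper: a and b are Q15 integers, their product is Q30,
    and adding 2**14 before the shift rounds it to the nearest Q15 value.
    """
    product = a * b                           # Q30 intermediate
    rounded = (product + (1 << 14)) >> 15     # back to Q15, round half up
    return saturate16(acc + rounded)
```

With 16384 representing 0.5, `mac_q15(0, 16384, 16384)` gives 8192 (0.25), and a result that would overflow the 16-bit range is held at the limit rather than wrapping around.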
In this paper a new method for formant frequency estimation of noisy speech is proposed based on linear prediction analysis. Algorithms based on linear prediction analysis can usually extract the formant frequencies effectively from clean speech. When speech is corrupted by noise, however, their performance degrades seriously. It is well known that the autocorrelation function concentrates the energy of white noise in the vicinity of the zero lag. Utilizing this property, the proposed method extracts the formant frequencies from the autocorrelation function of the speech instead of from the speech itself. The experimental results show that the proposed method is much more robust to noise than the conventional linear prediction based algorithms.
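The idea can be sketched in a few lines under simplifying assumptions: a single resonance, a second-order predictor, and the autocorrelation sequence with lag 0 dropped used as the analysis signal. The function names and the choice to skip only lag 0 are hypothetical; the paper's actual procedure may differ.

```python
import math

def autocorr(x, max_lag):
    """Biased (unnormalized) autocorrelation r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns ([1, a1..ap], prediction error)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def resonance_hz(seq, fs):
    """Pole angle of a 2nd-order LPC fit to `seq`, converted to Hz."""
    a, _ = levinson(autocorr(seq, 2), 2)
    if a[2] <= 0.0:
        raise ValueError("no complex pole pair, so no resonance to report")
    c = max(-1.0, min(1.0, -a[1] / (2.0 * math.sqrt(a[2]))))
    return math.acos(c) * fs / (2.0 * math.pi)

fs = 8000
x = [math.cos(math.pi * n / 2) for n in range(400)]   # 2000 Hz test tone
direct = resonance_hz(x, fs)                  # conventional: LPC on the signal
via_acf = resonance_hz(autocorr(x, 64)[1:], fs)   # LPC on the ACF, lag 0 skipped
```

Because the autocorrelation of a resonant signal oscillates at the same frequency as the signal, both estimates agree for this clean tone. With added white noise, most of the noise energy would fall into the discarded lag 0 term, which is the property the abstract relies on.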
This paper presents a variable bit rate ADP-CELP (adaptive density pulse code excited linear prediction) coder that selects one of four kinds of coding structure in each frame based on short-time speech characteristics. To improve speech quality and reduce the average bit rate, we have developed a speech/non-speech classification method using spectrum envelope variation, which is robust to background noise. In addition, we propose an efficient pitch lag coding technique. The technique interpolates the pitch lags of consecutive frames and quantizes a vector of relative pitch lags, each being the difference between an estimated pitch lag and a target pitch lag, over several subframes. The average bit rate of the proposed coder was approximately 2.4 kbps for speech sources with an activity factor of 60%. Our subjective testing indicates that the quality of the proposed coder exceeds that of the Japanese digital cellular standard with a rate of 3.45 kbps.
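The pitch lag coding step can be pictured with a small sketch. It assumes linear interpolation between the previous and current frame lags and integer target lags per subframe; both choices, and the function names, are illustrative assumptions rather than details taken from the paper. The resulting vector of small differences is what a vector quantizer would then encode.

```python
def interpolate_lags(prev_lag, cur_lag, n_sub):
    """Estimate one pitch lag per subframe by linear interpolation (assumed)."""
    step = (cur_lag - prev_lag) / n_sub
    return [prev_lag + step * (i + 1) for i in range(n_sub)]

def relative_lags(estimated, target):
    """Difference between each target lag and the rounded estimated lag."""
    return [t - round(e) for e, t in zip(estimated, target)]

estimated = interpolate_lags(40, 48, 4)           # lags for 4 subframes
residual = relative_lags(estimated, [43, 44, 45, 48])
```

Here the estimated lags are 42, 44, 46 and 48 samples, so the vector to be quantized is `[1, 0, -1, 0]`, which spans a much smaller range than the absolute lags.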
Speech polarity is crucial in many speech processing fields. We present a novel method to determine the polarity of speech signals from the gradient of spurious glottal waveforms. We use iterative adaptive LPC inverse filtering to cancel the effect of the vocal tract transfer function while maintaining most of the properties of the source excitation. Then we take the first derivative (gradient component) of the spurious glottal waveforms to capture the sharp gradient near the glottal closure instant. Using these gradient components, we detect speech polarity, i.e., the polarity of the glottal waveforms, by finding whether the glottal closure instants are located above or below the zero line. Furthermore, a frame-based decision technique is applied to get robust results. Experimental results with a wide variety of speech utterances show high performance, and the computational complexity is much lower than that of a previously proposed method.
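The decision stage can be sketched as follows, assuming the glottal waveform estimate is already available (the inverse filtering step is omitted). The sketch takes the first difference in each frame, checks whether the sharpest excursion lies below or above zero, and lets the frames vote. The sign convention (a sharp negative gradient means positive polarity), the tie handling and the function names are assumptions made for this example.

```python
def frame_polarity(frame):
    """Vote of one frame: +1, -1, or 0 when undecided (hypothetical rule)."""
    grad = [frame[i + 1] - frame[i] for i in range(len(frame) - 1)]
    if not grad:
        return 0
    sharpest_neg = -min(grad)
    sharpest_pos = max(grad)
    if sharpest_neg > sharpest_pos:
        return 1       # sharpest gradient lies below the zero line
    if sharpest_pos > sharpest_neg:
        return -1      # sharpest gradient lies above the zero line
    return 0

def speech_polarity(glottal, frame_len):
    """Majority vote over non-overlapping frames."""
    votes = sum(frame_polarity(glottal[s:s + frame_len])
                for s in range(0, len(glottal), frame_len))
    return 1 if votes > 0 else -1 if votes < 0 else 0

# Synthetic pulse train: slow rise over 70 samples, sharp fall over 10.
period = [i / 70 for i in range(70)] + [1 - (i - 70) / 10 for i in range(70, 80)]
wave = period * 10
```

For this synthetic pulse train `speech_polarity(wave, 160)` returns 1, and negating every sample flips the result to -1. Voting per frame keeps a few misleading frames from deciding the outcome.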
This paper presents a new method to apply variable bit-rate predictive quantization of the variable model order LPC parameters. In addition, the method is employed to interpolate the parameters within the analysis frame. The LPC model order selection algorithm is based on the characteristics of the input signal and on the performance of the LPC model. Hence, the variable bit-rate LPC quantization is source controlled. The number of quantized parameters needs to be identical in successive frames to be able to apply the predictive quantization and to interpolate parameters inside the frame. Therefore, the order of the LPC model of the previous frame needs to be expanded or reduced to be the same as the current frame LPC model. The advantage of variable model order LPC quantization is the lowered average bit-rate compared to a fixed rate while the speech quality remains the same.
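One simple way to bring the previous frame's model to the current order is to work with reflection coefficients, assumed here purely for illustration since the abstract does not name the parameter domain. Appending zero reflection coefficients raises the order without changing the filter, and truncating them lowers it. The helper names are hypothetical.

```python
def match_order(k_prev, order):
    """Expand (zero-pad) or reduce (truncate) reflection coefficients."""
    if len(k_prev) >= order:
        return list(k_prev[:order])
    return list(k_prev) + [0.0] * (order - len(k_prev))

def step_up(k):
    """Convert reflection coefficients to predictor coefficients [1, a1..ap]."""
    a = [1.0]
    for ki in k:
        m = len(a)
        a = [1.0] + [a[j] + ki * a[m - j] for j in range(1, m)] + [ki]
    return a

low = step_up([0.5, -0.3])                       # order-2 predictor
padded = step_up(match_order([0.5, -0.3], 4))    # same filter written at order 4
```

The padded predictor equals the original one followed by zeros, so expansion leaves the spectrum untouched. Truncation, by contrast, yields a genuinely lower-order model, so reduction is not lossless.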
This paper describes methods that can enhance the quality of speech signals that are severely band limited during regular telephone speech transmission. We have already proposed a spectrum widening method that utilizes aliasing in sampling rate conversion and digital filtering for spectrum shaping. This paper discusses the method using linear prediction. Speech components in the outbands of the received signal are basically generated by LPC (linear predictive coding) synthesis by analysis. Furthermore, we discuss a new spectrum widening method using a multilayer backpropagation neural network. It is shown that the proposed method performs well in recovering the wideband speech.
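One common way aliasing arises in sampling rate conversion is zero-insertion upsampling without an anti-imaging filter, which mirrors the narrowband spectrum into the new upper band. Whether the earlier method works exactly this way is an assumption, since the abstract gives no details. The sketch below doubles the rate of an 8 kHz signal and inspects the resulting spectrum with a direct DFT; the function names are hypothetical.

```python
import cmath
import math

def upsample_zero_insert(x):
    """Double the sampling rate by inserting a zero after every sample."""
    y = []
    for v in x:
        y.extend([v, 0.0])
    return y

def dft_mag(x):
    """Magnitude of the DFT, computed directly (fine for short sequences)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n)]

# 1000 Hz tone sampled at 8000 Hz, 32 samples (exactly 4 cycles).
x = [math.cos(2 * math.pi * 4 * n / 32) for n in range(32)]
mags = dft_mag(upsample_zero_insert(x))   # 64 bins, 250 Hz apart at 16 kHz
```

After upsampling, the tone appears at 1000 Hz (bin 4) and its mirror image appears at 7000 Hz (bin 28), above the original 4 kHz band edge. A shaping filter would then be needed to give the mirrored components a plausible spectral envelope.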