The fixed-lag smoothing problem with a partial lag is the problem in which a smoothing lag is allowed in only some of the estimation channels. This paper studies the effect of a partial smoothing lag on the achievable H∞ performance in the continuous-time case. In particular, the limit of the achievable performance is established, and the saturation of the achievable performance for a finite smoothing lag is analyzed.
The non-linear nature of low-rate parametric speech coding has made it necessary to resort to formal subjective assessments for quantifying end-to-end voice quality of interconnected networks. At the same time, the rapid growth of cellular communications has highlighted the need to characterize transmission quality when cellular terminals are attached at the access or termination nodes of switched networks. In this paper, the voice quality of interconnected North American and Japanese digital cellular systems over public transmission facilities is quantified. From these assessments it is concluded that cellular networks using 8 kbit/s or 6.4 kbit/s VSELP may meet end-to-end quantization distortion criteria when interconnected with the switched network.
In this paper we introduce an efficient voice activity detection (VAD) algorithm based on a probabilistic neural network (PNN) model. The inputs to the PNN are code-excited linear prediction coder parameters, which are stable under background noise. The PNN output is 1 or 0, indicating the nature of the period (speech or nonspeech). Experimental results show that the proposed VAD algorithm achieves better performance than G.729 Annex B at any noise level. The performance compares very favorably with Adaptive Multi-Rate VAD, phase 2 (AMR2).
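The speech/nonspeech decision described above can be illustrated with a minimal sketch of a generic two-class PNN, which is a Parzen-window classifier with Gaussian kernels. This is not the authors' network: the function name `pnn_classify`, the smoothing parameter `sigma`, and the two-dimensional feature vectors are assumptions made for illustration, and the real inputs would be CELP coder parameters.

```python
import math


def pnn_classify(x, speech_examples, nonspeech_examples, sigma=0.5):
    """Return 1 (speech) or 0 (nonspeech) for feature vector x.

    Hypothetical helper: each class score is the average Gaussian kernel
    response between x and that class's stored training examples.
    """
    def class_score(examples):
        total = 0.0
        for example in examples:
            # Squared Euclidean distance between x and one stored pattern.
            dist_sq = sum((a - b) ** 2 for a, b in zip(x, example))
            total += math.exp(-dist_sq / (2.0 * sigma ** 2))
        return total / len(examples)

    speech_score = class_score(speech_examples)
    nonspeech_score = class_score(nonspeech_examples)
    return 1 if speech_score > nonspeech_score else 0


# Toy, made-up feature vectors standing in for coder parameters.
speech = [(1.0, 1.0), (1.2, 0.9)]
nonspeech = [(0.0, 0.0), (0.1, -0.1)]
decision = pnn_classify((1.1, 1.0), speech, nonspeech)
```

A PNN needs no iterative training: the stored examples act as the pattern layer, so the only quantity to tune in this sketch is `sigma`.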
ISBN: (print) 0780386477
We present a novel speech encryption algorithm based on blind source separation (BSS). Our approach integrates a modified time-domain scrambling scheme with an amplitude scrambling method that masks the speech signal with random noise by specific mixing. The resulting system can securely encrypt speech files for the purpose of storing speech messages and transmitting them over the Internet. There are two major advantages associated with this system. The first is that it makes the encrypted speech sound like white noise. The second is that it does not impose any restriction on the key space. Our system is systematically evaluated, and it shows a high level of security with excellent audio quality.
The paper presents a new depth-based view blending technique that avoids the problem of different fields of view corresponding to the input views that are used for synthesis of a virtual view. The idea is to use the depth associated with the input views to increase the quality of the final blended view. The experiments, performed on high-quality multi-view test sequences, show that the proposed method significantly improves the quality of synthesized views in systems with non-linearly arranged cameras.
Speech coding is the representation of a digitized speech signal using as few bits as possible while maintaining a reasonable level of speech quality. Due to the growing need for bandwidth conservation in wireless communication, research in speech coding has increased. Compressive Sensing (CS) is attracting great interest because of its ability to recover original signals from only a few measurements. CS is a new approach that differs from common data acquisition methods. In this research, a new speech encoding system is developed using compressive sensing. Since CS performs well on sparse signals, different sparsifying transforms are analyzed and compared using the Gini coefficient. The quality of the speech coder is evaluated using Perceptual Evaluation of Speech Quality (PESQ), Signal-to-Noise Ratio (SNR), and subjective listening tests. Results show that the speech coder achieves a PESQ score of 3.16 at 4 kbps, which is good quality, as listening tests confirm. The coder is also compared with a Code Excited Linear Prediction (CELP) coder.
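To make the sparsity comparison concrete, here is a minimal sketch of a Gini-type sparsity measure applied to a vector of transform coefficients. The abstract does not state which formula the authors used; this sketch assumes the common definition over sorted coefficient magnitudes, and the function name `gini_sparsity` is hypothetical. A value near 0 means energy is spread evenly across coefficients, and a value near 1 means a few coefficients dominate.

```python
def gini_sparsity(coeffs):
    """Gini-type sparsity of a coefficient vector (assumed definition).

    G = 1 - 2 * sum_k (|c_(k)| / ||c||_1) * ((N - k + 0.5) / N),
    where |c_(1)| <= ... <= |c_(N)| are the sorted magnitudes.
    """
    mags = sorted(abs(c) for c in coeffs)
    n = len(mags)
    total = sum(mags)
    if n == 0 or total == 0:
        raise ValueError("need at least one nonzero coefficient")
    weighted = sum(
        (m / total) * ((n - k + 0.5) / n)
        for k, m in enumerate(mags, start=1)
    )
    return 1.0 - 2.0 * weighted


# Toy vectors: evenly spread energy versus a single dominant coefficient.
dense_score = gini_sparsity([1, 1, 1, 1])
sparse_score = gini_sparsity([0, 0, 0, 5])
```

Under this definition, a transform whose coefficients score higher for the same speech frame gives a sparser representation, which is the property CS recovery relies on.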
A suitable metric to characterize subjective speech quality is the mean opinion score (MOS). For a given test condition, a subjective rating is obtained for every trial as a numeric value in the following manner: Unsatisfactory=1; Poor=2; Fair=3; Good=4; and Excellent=5. The arithmetic average of these ratings over all trials constitutes the MOS of the given test condition. The measurement of MOS and the subjective quality results obtained for a specific 12 kb/s subband coder are reported.
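The MOS calculation described above is a plain arithmetic mean over the five-point scale. A minimal sketch follows; the function name `mean_opinion_score` and the sample ratings are illustrative assumptions, not data from the paper.

```python
from statistics import mean

# Five-point scale as listed in the abstract.
RATING_SCALE = {
    "Unsatisfactory": 1,
    "Poor": 2,
    "Fair": 3,
    "Good": 4,
    "Excellent": 5,
}


def mean_opinion_score(ratings):
    """Average the numeric values of the ratings given in all trials."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(RATING_SCALE[r] for r in ratings)


# Made-up ratings for one test condition.
mos = mean_opinion_score(["Good", "Fair", "Excellent", "Good"])
```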
Cochlear implants are an effective way to enable people with severe or profound hearing loss to hear. They can help a person with profound hearing loss to function with people and in places where hearing may be required. Cochlear implants are a fine solution for severe to profound hearing loss, but problems may arise, and other solutions need to be considered before a person decides to get a cochlear implant for himself or herself or for a loved one.
We propose a method of embedding data in images for secure communication of covert or sensitive information. The method extends the recent technique of imperceptible embedding in audio signals by inserting tones at perceptually masked frequencies. Instead of detecting visually masked frequencies in two dimensions, a simpler approach was used: the image was converted to a one-dimensional signal. Using the well-established audio frequency masking procedure, audibly masked frequencies at a chosen sampling frequency were determined for each segment or block. Embedding of the given data was carried out by modifying the spectral power at a pair of commonly occurring masked frequencies. Preliminary results of embedding 1024 bits of random data in a 256 × 256 pixel black-and-white image show that the spectrum modification technique is viable and simple to process. The technique is useful for hiding a small amount of information on identification card images, credit card logos, etc. Payload can be increased at the cost of an acceptable level of image degradation.
Many speech coders are based on linear prediction coding (LPC); however, LPC cannot model the nonlinearities present in the speech signal. Because of this, there is growing interest in nonlinear techniques. In this paper we discuss ADPCM schemes with a nonlinear predictor based on neural nets, which yield an increase of 1-2.5 dB in SEGSNR over classical methods. The paper discusses block-adaptive and sample-adaptive prediction.
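For readers unfamiliar with the SEGSNR figure quoted above, the following is a minimal sketch of segmental SNR: the SNR in dB is computed per frame and then averaged across frames. The frame length of 160 samples, the choice to skip frames with zero signal or zero error energy, and the function name `segmental_snr_db` are assumptions; the paper's exact settings are not given in the abstract.

```python
import math


def segmental_snr_db(reference, decoded, frame_len=160):
    """Average per-frame SNR in dB between a reference and decoded signal.

    Assumed conventions: non-overlapping frames, trailing partial frame
    dropped, frames with zero signal or zero error energy skipped.
    """
    if len(reference) != len(decoded):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy == 0 or error_energy == 0:
            continue  # the log ratio is undefined for this frame
        frame_snrs.append(10.0 * math.log10(signal_energy / error_energy))
    if not frame_snrs:
        raise ValueError("no frame with a defined SNR")
    return sum(frame_snrs) / len(frame_snrs)


# Toy signals: every decoded sample is 10% below the reference amplitude.
ref_signal = [1.0] * 4 + [2.0] * 4
dec_signal = [0.9] * 4 + [1.8] * 4
segsnr = segmental_snr_db(ref_signal, dec_signal, frame_len=4)
```

Because each frame contributes equally to the average, quiet frames weigh as much as loud ones, which is why SEGSNR is often preferred over a single whole-signal SNR for speech.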