This paper presents the problem of using neural networks to classify military vehicles on the basis of ground vibration. One of the main elements of the system is a unit called a geophone, which measures ground vibrations in each direction over a certain period of time. The amplitude values are used to determine the LPC values for each vehicle. Because a multilayer perceptron is used, a learning set has to be prepared. The paper reports the results of using the neural network, including examples of the learning, validation and test sets, the structure of the networks and the learning algorithm, and the learning and testing results.
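As an illustration of how LPC values can be derived from a frame of amplitude samples, the following minimal sketch uses the autocorrelation method with the Levinson-Durbin recursion. The abstract does not say which LPC estimation method the authors used, so the method, the function names, and the model order are all assumptions made for this example.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err) where a[0] == 1 and A(z) = 1 + sum_j a[j] z^-j is the
    prediction-error filter, and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("frame has no energy left to predict")
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


def lpc_features(frame, order):
    """Hypothetical feature extractor: the LPC coefficients a[1..order]."""
    a, _ = levinson_durbin(autocorrelation(frame, order), order)
    return a[1:]


# Toy frame: a decaying exponential, which a first-order predictor fits well.
frame = [0.9 ** n for n in range(200)]
features = lpc_features(frame, 2)
```

A vector such as `features` could then serve as one input pattern for the multilayer perceptron; the actual frame length and LPC order used in the paper are not stated in the abstract.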
We explore the use of the multi-frame GMM-based block quantiser for quantising line spectral frequencies for wideband speech coding. Its main advantages over vector quantisers are bitrate scalability and bitrate independent complexity. By concatenating multiple frames together, interframe correlation can be exploited by the KLT (Karhunen-Loeve transform), leading to better quantisation. A saving of up to 3 bits/frame can be achieved by switching the quantiser from memoryless mode to jointly quantising two frames, with only a moderate increase in complexity. This quantisation scheme achieves lower spectral distortion than the split-multistage vector quantiser in the AMR-WB speech codec, with transparent coding at 37 bits/frame.
The standard IEC 61024 provides information for the design of lightning protection of structures (LPS). Two methods are generally suggested for determining the position of the air-termination system: the protective angle method and the rolling sphere method. The paper discusses some additional criteria for the LPS with reference to a case study of a complex of strategic buildings made up of existing and historical structures. The paper proposes a mark map that allows the interactions between the structures of the complex to be determined globally. It is a graphical method performed by rolling a sphere with a variable radius related to the protection level and to the height of the structure.
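The geometry behind the rolling sphere method can be sketched for the simplest case, a single vertical rod on flat ground. This is not the paper's mark map; it only shows how the sphere radius and the rod height bound the protected zone. The function name is hypothetical, and the radii per protection level are commonly cited values that should be checked against the standard itself.

```python
import math

# Commonly cited rolling sphere radii in metres per protection level.
# Treat these as an assumption and verify them against the standard.
SPHERE_RADIUS_M = {"I": 20.0, "II": 30.0, "III": 45.0, "IV": 60.0}


def protected_radius(rod_height, sphere_radius, level_height=0.0):
    """Horizontal distance from a vertical rod that is protected at level_height.

    A sphere of radius R resting on flat ground and touching the tip of a rod
    of height h has its centre at horizontal distance sqrt(h * (2R - h)) from
    the rod. At height z its surface lies sqrt(z * (2R - z)) closer to the rod.
    Only the case h <= R is handled in this sketch.
    """
    if not 0.0 < rod_height <= sphere_radius:
        raise ValueError("this sketch only covers 0 < rod_height <= sphere_radius")
    if not 0.0 <= level_height <= rod_height:
        raise ValueError("level_height must lie between the ground and the rod tip")

    def reach(z):
        return math.sqrt(z * (2.0 * sphere_radius - z))

    return reach(rod_height) - reach(level_height)


# Example: a 10 m rod under the level I sphere, evaluated at ground level.
ground_radius = protected_radius(10.0, SPHERE_RADIUS_M["I"])
```

The protected distance shrinks as `level_height` approaches the rod tip and reaches zero at the tip, which is why the height of the structure being protected matters as much as the protection level.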
It is well-known that wideband speech (0-7 kHz) provides better quality and intelligibility than narrowband speech (300-3400 Hz), but typically only narrowband speech information is available in current wireless communication systems. Narrowband to wideband extension technology has been recently investigated to artificially generate wideband speech from narrowband speech for better speech quality and intelligibility. This paper presents a robust split-band narrowband to wideband extension system based on algorithmic enhancements to the codebook mapping technique for high-band parameter estimation. Numerical measurements confirm the performance improvements of the codebook mapping process, and informal listening evaluations show the potential of the system and its robustness to input distortions and non-speech input signals.
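The basic codebook mapping idea that such systems build on can be sketched as a nearest-neighbour lookup: a narrowband feature vector is matched to the closest entry in a narrowband codebook, and the paired entry of a high-band codebook is returned as the estimate. The sketch below shows only this baseline, not the algorithmic enhancements the paper describes; the function name and the toy codebooks are hypothetical.

```python
def map_to_highband(nb_features, nb_codebook, hb_codebook):
    """Return the high-band codeword paired with the nearest narrowband codeword."""
    if len(nb_codebook) != len(hb_codebook):
        raise ValueError("the two codebooks must have the same number of entries")

    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    best = min(range(len(nb_codebook)),
               key=lambda i: squared_distance(nb_features, nb_codebook[i]))
    return hb_codebook[best]


# Toy paired codebooks (made-up numbers, for illustration only).
nb_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
hb_codebook = [[0.1], [0.7], [0.3]]

estimate = map_to_highband([0.9, 1.2], nb_codebook, hb_codebook)
```

In a real system the two codebooks would be trained jointly on wideband speech so that entry `i` of each describes the same frames.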
We present a novel approach to estimating the first two formants (F1 and F2) of a speech signal using graphical models. Using a graph that takes advantage of less commonly used features of Bayesian networks, both v-structures and soft evidence, the model presented here shows that it can learn to perform reasonably without large amounts of training data, even with minimal processing on the initial signal. It far outperforms a factorial HMM using the same assumptions and suggests that with further refinement the model may produce high quality formant tracks.
This paper proposes a fast algorithm for computing the line spectrum pair (LSP) parameters that are widely used in speech coding systems. The first step of the proposed algorithm is to derive a quartic equation from the first derivative of the given fifth-degree LSP polynomial. The approximate extrema of the fifth-degree LSP polynomial can then be found by applying the proposed modified complex-free Ferrari's formula to this quartic equation. Using these approximate extrema as the initial approximations, one can easily solve for the roots of the fifth-degree LSP polynomial via Newton's method and obtain accurate LSP parameters. One of the main advantages of the proposed algorithm is that the modified complex-free Ferrari's formula can rapidly determine the roots of a quartic equation, resulting in considerable computational savings. In comparison with other methods, the proposed algorithm can determine precise LSP parameters with the lowest computational complexity.
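The Newton refinement stage can be sketched as follows for a generic fifth-degree polynomial with real roots. The sketch does not implement the modified complex-free Ferrari's formula; the starting values are hand-picked placeholders standing in for the initial approximations the paper derives, and the polynomial is a made-up example rather than an actual LSP polynomial.

```python
def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with these roots."""
    coeffs = [1.0]
    for r in roots:
        coeffs = [a - r * b for a, b in zip(coeffs + [0.0], [0.0] + coeffs)]
    return coeffs


def horner(coeffs, x):
    """Evaluate the polynomial and its first derivative at x in one pass."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp


def newton_root(coeffs, x0, tol=1e-12, max_iter=50):
    """Refine an initial approximation x0 to a root by Newton's method."""
    x = x0
    for _ in range(max_iter):
        p, dp = horner(coeffs, x)
        if dp == 0.0:
            raise ZeroDivisionError("derivative vanished; choose another start")
        step = p / dp
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


# Made-up fifth-degree polynomial with five real roots in (-1, 1).
true_roots = [-0.9, -0.5, 0.0, 0.4, 0.8]
coeffs = poly_from_roots(true_roots)

# Placeholder initial approximations (assumed, not computed as in the paper).
starts = [-0.95, -0.45, 0.05, 0.35, 0.85]
refined = [newton_root(coeffs, x0) for x0 in starts]
```

If the polynomial is expressed in x = cos(ω), as is common for LSP polynomials, each refined root would still have to be converted to a frequency with an arccosine; that step is left out here.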
This paper describes continuous speech recognition experiments for the Romanian language using HMM modeling. The following questions are discussed: the realization of a new front-end that reconsiders linear prediction; the enhancement of recognition rates by context-dependent modeling; and the evaluation of training strategies that ensure speaker independence of the recognition process through speaker selection for training, without speaker adaptation procedures. The experiments led to a development of the initial system with a promising front-end based on PLP coefficients. In terms of recognition performance it ranks second, close to the first-ranked front-end based on mel-frequency cepstral coefficients (MFCC) and far better than the last-ranked one, based on simple linear prediction. The implemented algorithm for context-dependent modeling yields enhanced recognition rates in all situations. The experiments with gender-based speaker selection enhanced the recognition rate under certain conditions, showing good generalization properties, especially when training with the male speakers database.
In this paper, a mixed-split scheme is proposed in the context of a 2D DPCM-based LSF quantization scheme employing a split vector product VQ mechanism. Experimental evaluation shows that, by efficiently exploiting LSF trajectory behavior across consecutive speech frames, the new scheme achieves better distortion performance than the existing safety-net scheme for noisy channels, even at considerably lower search complexity.
In order to deliver real-time, high-quality voice services, VoIP system designers must tackle the packet-loss problems that are inherent in packet-based networks. To combat the inevitable deterioration in speech quality resulting from the loss of transmitted packets of speech information, techniques that estimate the lost information needed by the speech recovery process are of considerable interest. Furthermore, in future VoIP systems employing LPC-based speech coders, a significant percentage of the coded speech information will represent the values of LPC coefficients, and thus a new probabilistic approach for estimating missing LPC filter coefficients is presented in this paper. This approach employs a new formulation of the LSP recovery system architecture in which dependent-multiple hidden Markov models with discrete densities (DM-HMM-D) operate in parallel. Each HMM processes sequences of received quantized vectors of LSP coefficients and, while allowing for the modeling of the inter-dependencies that exist between LPC coefficients, the resulting maximum likelihood observation probabilities are used to provide the required estimates of missing LSPs. The proposed missing-parameter estimation technique is generic, and initial experimental results demonstrate its considerable potential for improving the quality of LPC-based decoded speech in VoIP applications.
The purpose of this work is to show the importance of adequately generating the excitation signal for the performance of bandwidth extension algorithms for speech signals. Two previously proposed methods of obtaining the excitation signal are analyzed and, based on this analysis, a new method is proposed. The influence of each method on the quality of the reconstructed wideband speech signal is evaluated using quantitative speech quality measures.