This paper presents a comparison of three techniques for dimensionality reduction in feature analysis for automatic speech recognition (ASR). All three approaches estimate a linear transformation that is applied to con...
This tutorial paper describes the methods for constructing fast algorithms for the computation of the discrete Fourier transform (DFT) of a real-valued series. The application of these ideas to all the major fast Fourier transform (FFT) algorithms is discussed, and the various algorithms are compared. We present a new implementation of the real-valued split-radix FFT, an algorithm that uses fewer operations than any other real-valued power-of-2-length FFT. We also compare the performance of inherently real-valued transform algorithms such as the fast Hartley transform (FHT) and the fast cosine transform (FCT) to real-valued FFT algorithms for the computation of power spectra and cyclic convolutions. Comparisons of these techniques reveal that the alternative techniques always require more additions than a method based on a real-valued FFT algorithm and result in computer code of equal or greater length and complexity.
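The conjugate-symmetry property that real-valued FFT algorithms exploit can be sketched with NumPy's `rfft`, used here as an illustrative stand-in for the split-radix implementations compared in the paper: for a real series of length N, only N/2 + 1 spectrum bins carry information, and the power spectrum follows directly from them.

```python
import numpy as np

# For a real-valued series x of length N, the DFT is conjugate-symmetric:
# X[N-k] = conj(X[k]), so only N//2 + 1 bins carry information.
# np.fft.rfft exploits this, roughly halving the work of a complex FFT.
rng = np.random.default_rng(0)
N = 1024                      # power-of-2 length, as in split-radix FFTs
x = rng.standard_normal(N)

X_half = np.fft.rfft(x)       # N//2 + 1 complex bins
X_full = np.fft.fft(x)        # N complex bins (redundant for real input)

# The half-spectrum matches the first half of the full spectrum ...
assert np.allclose(X_half, X_full[: N // 2 + 1])
# ... and the remainder is recoverable by conjugate symmetry.
assert np.allclose(X_full[N // 2 + 1:], np.conj(X_half[1: N // 2][::-1]))

# Power spectrum computed from the real-valued transform alone:
power = np.abs(X_half) ** 2
```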
Hybrid Hidden Markov Model (HMM) and Multi-Layer Perceptron (MLP) neural networks have been applied with great success to speech recognition problems. The hybrid system can be applied to sequence classification problems, where multiple looks at an object are used to determine class membership, and it provides a means to perform feature-level fusion in such problems. A new gradient descent algorithm is employed to find optimal parameters within the HMM/MLP model. The scheme has been applied to a data set of sonar backscattered signals from four underwater objects, which are classified as mine-like or non-mine-like.
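The core hybrid mechanism can be sketched as follows; the one-layer network, uniform state priors, and left-to-right transition matrix are illustrative assumptions, not the paper's actual model. An MLP emits state posteriors P(q | x_t); dividing by the priors P(q) yields scaled likelihoods that replace Gaussian emission densities in the HMM forward recursion.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
T, D, S = 6, 4, 3             # frames, feature dim, HMM states
X = rng.standard_normal((T, D))

# Toy one-layer "MLP" standing in for a trained network.
W, b = rng.standard_normal((D, S)), np.zeros(S)
posteriors = softmax(X @ W + b)            # shape (T, S), rows sum to 1

priors = np.full(S, 1.0 / S)               # assumed uniform state priors
scaled_lik = posteriors / priors           # emission scores for the HMM

# Forward algorithm over a left-to-right HMM with these emissions.
A = np.array([[0.8, 0.2, 0.0],
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]])
alpha = np.zeros((T, S))
alpha[0] = np.array([1.0, 0.0, 0.0]) * scaled_lik[0]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * scaled_lik[t]
seq_score = alpha[-1].sum()                # sequence-level class score
```

In a full system, one such scoring pass per class model would determine the sequence's class membership.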
A new method for the reduction of the number of colors in a digital image is proposed. The new method is based on the development of a new neural network classifier that combines the advantages of the Growing Neural Gas...
In this paper, the authors exploit a multispectral image representation to perform more accurate document image binarisation compared to previous color representations. In the first stage, image fusion is employed to ...
ISBN (print): 9780889868243
In order to successfully locate and retrieve document images such as technical articles and newspapers, a text localization technique must be employed. The proposed method detects and extracts homogeneous text areas in document images, irrespective of font type and size, by using connected component analysis to detect blocks of foreground objects. Next, a descriptor consisting of a set of structural features is extracted from the merged blocks and used as input to a trained Support Vector Machine (SVM). Finally, the SVM output classifies each block as text or non-text.
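The pipeline above can be sketched with `scipy.ndimage` and scikit-learn; the four structural features (height, width, aspect ratio, fill density) and the tiny training set are illustrative stand-ins for the paper's actual descriptor and training data.

```python
import numpy as np
from scipy.ndimage import label, find_objects
from sklearn.svm import SVC

def block_features(img, sl):
    """Height, width, aspect ratio, and fill density of one block."""
    h = sl[0].stop - sl[0].start
    w = sl[1].stop - sl[1].start
    density = img[sl].mean()
    return [h, w, w / h, density]

# Binary document image: 1 = foreground (ink), 0 = background.
page = np.zeros((20, 40), dtype=int)
page[2:5, 2:30] = 1           # a wide, dense "text line"
page[10:18, 5:12] = 1         # a tall block (e.g. a graphic)

# Connected component analysis to detect blocks of foreground objects.
labels, n = label(page)
feats = np.array([block_features(page, sl) for sl in find_objects(labels)])

# A trained SVM would come from labelled examples; here a tiny toy set.
X_train = np.array([[3, 28, 9.3, 0.90],   # text-like: wide, dense
                    [3, 25, 8.0, 0.85],
                    [8,  7, 0.9, 0.80],   # non-text: tall, blocky
                    [9,  6, 0.7, 0.90]])
y_train = np.array([1, 1, 0, 0])          # 1 = text, 0 = non-text
clf = SVC(kernel="linear").fit(X_train, y_train)
pred = clf.predict(feats)                 # one label per detected block
```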
ISBN (print): 9780889867192
Most existing document-binarization techniques involve many parameters whose values must be set a priori. Because the ground-truth images are unknown, the evaluation of document binarization techniques is subjective and relies on human observers to estimate appropriate parameter values. The selection of appropriate values for these parameters is crucial and influences the final binarization. However, there is no predetermined set of parameters that guarantees optimal binarization for all document images. This paper proposes a new technique for estimating proper parameter values for each document binarization technique. The proposed approach is based on a statistical performance analysis of a set of binarization results, obtained by applying various binarization techniques with different parameter values. The proposed statistical performance analysis can also identify the best document binarization result obtained by a set of document binarization techniques.
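The parameter-sweep idea can be sketched with a single-parameter binarizer, using a global threshold as a stand-in for the parameterized techniques; the consensus-agreement criterion here is one plausible statistic, not necessarily the paper's exact analysis. Each parameter setting produces a binarization, and results are ranked by agreement with the ensemble's majority vote.

```python
import numpy as np

# Synthetic grayscale "document": dark text strokes on a bright page.
img = np.full((30, 30), 200.0)
img[5:8, 3:27] = 40.0         # a clearly dark stroke
img[15:18, 3:27] = 110.0      # a fainter stroke, sensitive to threshold

# Sweep the binarizer's parameter over a grid of values.
thresholds = [80, 100, 120, 140, 160]
results = [(img < t).astype(int) for t in thresholds]   # 1 = foreground

# Majority-vote consensus across all parameter settings.
consensus = (np.mean(results, axis=0) > 0.5).astype(int)

# Rank each result by its pixel-wise agreement with the consensus.
agreement = [np.mean(r == consensus) for r in results]
best = thresholds[int(np.argmax(agreement))]
```

Here the fainter stroke is captured only by the higher thresholds, so the consensus includes it and the sweep selects the smallest threshold consistent with the majority.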
In this letter, we introduce a novel noise-robust modification method for Gaussian-based models to enhance the performance of radar high-resolution range profile (HRRP) recognition under the test condition of low sign...
This paper describes a unique cross-phoneme speaker identification experiment, using deliberately mismatched phoneme sets for training and testing. The underlying goal is to identify features that represent broad indi...
This article introduces a novel deep-learning based framework, Super-resolution/Denoising network (SDNet), for simultaneous denoising and super-resolution of swept-source optical coherence tomography (SS-OCT) images. The novelty of this work lies in the hybrid integration of data-driven deep-learning with a model-informed noise representation, specifically designed to address the very low signal-to-noise ratio (SNR) and low-resolution challenges in SS-OCT imaging. SDNet introduces a two-step training process, leveraging noise-free OCT references to simulate low-SNR conditions. In the first step, the network learns to enhance noisy images by combining denoising and super-resolution within the noise-corrupted reference domain. To refine its performance, the second step incorporates Principal Component Analysis (PCA) as a self-supervised denoising strategy, eliminating the need for ground-truth data for the noisy images. This unique approach enhances SDNet's adaptability and clinical relevance. A key advantage of SDNet is its ability to balance contrast and texture by adjusting the weights of the two training steps, offering clinicians flexibility for specific diagnostic needs. Experimental results across diverse datasets demonstrate that SDNet surpasses traditional model-based and data-driven methods in computational efficiency, noise reduction, and structural fidelity. The framework excels in improving both image quality and diagnostic accuracy. Additionally, SDNet shows promising adaptability for analyzing low-resolution, low-SNR OCT images, such as those from patients with diabetic macular edema (DME). This study establishes SDNet as a robust, efficient, and clinically adaptable solution for OCT image enhancement, addressing critical limitations in contemporary imaging workflows.