Real world sounds are ubiquitous and form an important part of the edifice of our cognitive abilities. Their perception combines signatures from spectral and temporal domains, among others, yet traditionally their ana...
Human identification using unobtrusive methods is a challenging problem that has many applications in surveillance tasks. In this work we propose a set of biometric features extracted from a footstep audio signal that...
This article introduces a novel deep-learning-based framework, Super-resolution/Denoising network (SDNet), for simultaneous denoising and super-resolution of swept-source optical coherence tomography (SS-OCT) images. The novelty of this work lies in the hybrid integration of data-driven deep learning with a model-informed noise representation, specifically designed to address the very low signal-to-noise ratio (SNR) and low-resolution challenges in SS-OCT imaging. SDNet introduces a two-step training process, leveraging noise-free OCT references to simulate low-SNR conditions. In the first step, the network learns to enhance noisy images by combining denoising and super-resolution within a noise-corrupted reference domain. To refine its performance, the second step incorporates Principal Component Analysis (PCA) as a self-supervised denoising strategy, eliminating the need for ground-truth noisy image data. This approach enhances SDNet's adaptability and clinical relevance. A key advantage of SDNet is its ability to balance contrast and texture by adjusting the weights of the two training steps, offering clinicians flexibility for specific diagnostic needs. Experimental results across diverse datasets demonstrate that SDNet surpasses traditional model-based and data-driven methods in computational efficiency, noise reduction, and structural fidelity. The framework improves both image quality and diagnostic accuracy. Additionally, SDNet shows promising adaptability for analyzing low-resolution, low-SNR OCT images, such as those from patients with diabetic macular edema (DME). This study establishes SDNet as a robust, efficient, and clinically adaptable solution for OCT image enhancement, addressing critical limitations in contemporary imaging workflows.
As we acquire more digitized copies of musical recordings, it becomes increasingly necessary to have the assistance of a computer in sorting through the information that it stores. In this paper, we propose a new syst...
This paper presents a new memory-efficient distributed arithmetic (DA) architecture for high-order FIR filters. The proposed architecture is based on a memory reduction technique for DA look-up tables (LUTs); it requir...
Given a baseline speech coder and speech with an available phonetic class segmentation, a number of potential enhancements to that coder become possible. While the quality of speech segmentation by phoneme and phonetic class is constantly improving, we use TIMIT to generate phonetic class segmentation as a basis for initial testing of these techniques. Using coders drawn from the MELP family, we explore specialized phonetic codebooks, phonetically-driven superframing, and improved modeling of specific phonetic classes and the transitions between them. We compare the reconstructed speech from these enhancements against the base coder using the metrics of computational cost, transmission cost, and the quality of the reconstructed speech. In most cases, we find that segmentation-based coders can produce speech with quality comparable to that of MELP, using fewer transmitted bits and at no additional computational cost. With phonetic codebooks and transition modeling, CCR tests show these segmentation-based coders produce speech of better quality than is produced by MELP.
Model-based speech coders such as the mixed-excitation linear prediction (MELP) coder encode parameters of the autoregressive model for short-duration frames of the speech signal. Typically, parameters extracted from successive frames by the MELP coder exhibit strong correlation. The transmitted data rate can be reduced if the encoders for these parameters effectively exploit this inter-frame correlation. In this paper, we apply a procedure called dynamic codebook re-ordering (DCR) to reduce the entropy in the distribution of the symbols generated by the vector quantization encoders used in coding the MELP parameters. The entropy reduction is achieved by exploiting the correlation between the vectors of MELP parameters derived from successive speech frames. The advantages of the DCR procedure over other techniques that exploit inter-frame correlation are that it significantly reduces the data rate without introducing any additional coding delay or increasing the distortion, and that it is simple and elegant.
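The abstract does not spell out the re-ordering rule, so the following is only a minimal sketch of one plausible reading: after each frame, both encoder and decoder re-sort the codebook by distance from the codevector just selected, and the transmitted symbol is the rank of the current codevector in that ordering. When successive frames are correlated, small ranks should dominate, which an entropy coder can then exploit. The function names (`nearest`, `rank_order`, `dcr_encode`, `dcr_decode`) and the toy one-dimensional codebook are assumptions made for illustration and are not taken from the paper.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(codebook, x):
    """Index of the codevector closest to x (ordinary VQ encoding)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))


def rank_order(codebook, ref_idx):
    """Codebook indices sorted by distance from codevector ref_idx.

    Ties are broken by index so that encoder and decoder agree.
    """
    ref = codebook[ref_idx]
    return sorted(range(len(codebook)),
                  key=lambda i: (sq_dist(codebook[i], ref), i))


def dcr_encode(codebook, frames):
    """Emit, for each frame, the rank of its codevector in the current order."""
    order = list(range(len(codebook)))   # initial order: identity
    symbols = []
    for x in frames:
        idx = nearest(codebook, x)
        symbols.append(order.index(idx))  # transmitted symbol
        order = rank_order(codebook, idx)  # re-order for the next frame
    return symbols


def dcr_decode(codebook, symbols):
    """Recover the codevector indices by mirroring the encoder's re-ordering."""
    order = list(range(len(codebook)))
    indices = []
    for s in symbols:
        idx = order[s]
        indices.append(idx)
        order = rank_order(codebook, idx)
    return indices


# Toy example: a slowly varying scalar parameter.
codebook = [(0.0,), (1.0,), (2.0,), (3.0,)]
frames = [(0.1,), (0.9,), (2.1,), (2.9,), (2.0,), (1.1,), (0.0,)]
symbols = dcr_encode(codebook, frames)
recovered = dcr_decode(codebook, symbols)
```

Because the decoder repeats the same deterministic re-ordering, the mapping is lossless and needs no look-ahead, which is consistent with the abstract's claim of no added delay or distortion.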
The theory of linguistics teaches us the existence of a hierarchical structure in linguistic expressions, from letter to word root, and on to words and sentences. By applying syntax and semantics beyond words, one can further recognize the grammatical relationships among words and the meaning of a sequence of words. This layered view of a spoken language is useful for effective analysis and automated processing. Thus, it is interesting to ask whether a similar hierarchy of representation exists for visual information. A class of techniques similar in nature to linguistic parsing is found in the Lempel-Ziv incremental parsing scheme. Based on a new class of multidimensional incremental parsing algorithms extended from Lempel-Ziv incremental parsing, a new framework for image retrieval, which takes advantage of the source characterization property of the incremental parsing algorithm, was proposed recently. With the incremental parsing technique, a given image is decomposed into a number of patches, called a parsed representation. This representation can be thought of as a morphological interface between elementary pixels and a higher-level representation. In this work, we examine the properties of the two-dimensional parsed representation in the context of imagery information retrieval and in contrast to vector quantization, i.e., fixed square-block representations and minimum average distortion criteria. We implemented four image retrieval systems for the comparative study: three, called IPSILON image retrieval systems, use parsed representations with different perceptual distortion thresholds, and one uses conventional vector quantization for visual pattern analysis. We observe that different perceptual distortion thresholds in visual pattern matching do not seriously affect the retrieval precision, although allowing looser perceptual thresholds in image compression results in poor reconstruction fidelity. We compare the effectiveness of the use of the pars
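
For orientation, the classic one-dimensional Lempel-Ziv incremental parsing (LZ78) that the multidimensional scheme extends can be sketched as follows. This is not the paper's two-dimensional algorithm and involves no perceptual distortion threshold; it only illustrates how a sequence is split into phrases, each being the longest previously seen phrase plus one new symbol. The function names `incremental_parse` and `reconstruct` are assumptions made for illustration.

```python
def incremental_parse(seq):
    """Split seq into LZ78 phrases.

    Each phrase is returned as (prefix_index, symbol), where prefix_index
    refers to an earlier phrase (0 denotes the empty phrase).
    """
    dictionary = {(): 0}      # phrase -> index; the empty phrase has index 0
    phrases = []
    current = ()
    for sym in seq:
        candidate = current + (sym,)
        if candidate in dictionary:
            current = candidate            # keep extending the match
        else:
            dictionary[candidate] = len(dictionary)
            phrases.append((dictionary[current], sym))
            current = ()                   # start a new phrase
    if current:
        # Input ended inside a known phrase: emit it as prefix + last symbol.
        phrases.append((dictionary[current[:-1]], current[-1]))
    return phrases


def reconstruct(phrases):
    """Rebuild the original sequence from (prefix_index, symbol) pairs."""
    table = [()]              # index 0 is the empty phrase
    out = []
    for prefix_idx, sym in phrases:
        phrase = table[prefix_idx] + (sym,)
        table.append(phrase)
        out.extend(phrase)
    return out


parsed = incremental_parse("aababcabcd")   # phrases: a | ab | abc | abcd
restored = "".join(reconstruct(parsed))
```

In the image setting described above, the phrases correspond to variable-sized patches rather than substrings, in contrast to the fixed square blocks of vector quantization.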
This paper discusses the application of hidden Markov models (HMMs) to solve the translational and rotational invariant automatic target recognition (TRIATR) problem associated with SAR imagery. This approach is based...
Adaptive detection has a rich history in the radar community, and a number of other areas have borrowed heavily from constructs developed in this field. The task of target detection in hyperspectral imaging (HSI) is o...