This paper presents a theoretical analysis of the structure of wide angle, ultra-wideband SAR images formed by a constant integration angle backprojection image former. It is shown that the effects of the image former...
Real world sounds are ubiquitous and form an important part of the edifice of our cognitive abilities. Their perception combines signatures from spectral and temporal domains, among others, yet traditionally their ana...
Human identification using unobtrusive methods is a challenging problem that has many applications in surveillance tasks. In this work we propose a set of biometric features extracted from a footstep audio signal that...
As we acquire more digitized copies of musical recordings, it becomes increasingly necessary to have the assistance of a computer in sorting through the information that it stores. In this paper, we propose a new syst...
This paper presents a new memory-efficient distributed arithmetic (DA) architecture for high-order FIR filters. The proposed architecture is based on a memory reduction technique for DA look-up tables (LUTs); it requir...
Given a baseline speech coder and speech with an available phonetic class segmentation, a number of potential enhancements to that coder become possible. While the quality of speech segmentation by phoneme and phonetic class is constantly improving, we use TIMIT to generate phonetic class segmentation as a basis for initial testing of these techniques. Using coders drawn from the MELP family, we explore specialized phonetic codebooks, phonetically-driven superframing, and improved modeling of specific phonetic classes and the transitions between them. We compare the reconstructed speech from these enhancements against the base coder using the metrics of computational cost, transmission cost, and the quality of the reconstructed speech. In most cases, we find that segmentation-based coders can produce speech with quality comparable to that of MELP, using fewer transmitted bits and at no additional computational cost. With phonetic codebooks and transition modeling, CCR tests show these segmentation-based coders produce speech of better quality than is produced by MELP.
Model-based speech coders such as the mixed-excitation linear prediction (MELP) coder encode parameters of the autoregressive model for short-duration frames of the speech signal. Typically, parameters extracted from successive frames by the MELP coder exhibit strong correlation. The transmitted data rate can be reduced if the encoders for these parameters effectively exploit this inter-frame correlation. In this paper, we apply a procedure called dynamic codebook re-ordering (DCR) to reduce the entropy of the distribution of the symbols generated by the vector quantization encoders used in coding the MELP parameters. The entropy reduction is achieved by exploiting the correlation between the vectors of MELP parameters derived from successive speech frames. Compared with other techniques that exploit inter-frame correlation, the DCR procedure significantly reduces the data rate without introducing any additional coding delay or increasing the distortion, and it is simple and elegant.
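The idea of re-ordering a codebook between frames can be illustrated with a small sketch. The abstract does not state the re-ordering rule, so the version below makes an assumption: after each frame, the codebook is re-ranked by squared Euclidean distance to the code vector just selected, and the transmitted symbol is the rank of the chosen code vector in the current ordering. The function names (`dcr_encode`, `dcr_decode`, `reorder`) and the toy scalar codebook are hypothetical and are not taken from the paper or from any MELP implementation.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_index(codebook, x):
    """Index of the code vector closest to x (plain vector quantization)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], x))


def reorder(codebook, ref_index):
    """ASSUMED rule: rank all code vectors by distance to the one just used.

    Ties are broken by the original index so that encoder and decoder
    always derive the same ordering.
    """
    ref = codebook[ref_index]
    return sorted(range(len(codebook)),
                  key=lambda i: (sq_dist(codebook[i], ref), i))


def dcr_encode(codebook, frames):
    """Emit, for each frame, the rank of its code vector in the current order."""
    order = list(range(len(codebook)))  # initial ordering is the identity
    symbols = []
    for x in frames:
        idx = nearest_index(codebook, x)
        symbols.append(order.index(idx))
        order = reorder(codebook, idx)   # depends only on already-sent data
    return symbols


def dcr_decode(codebook, symbols):
    """Recover the code vector indices by mirroring the encoder's re-ordering."""
    order = list(range(len(codebook)))
    indices = []
    for s in symbols:
        idx = order[s]
        indices.append(idx)
        order = reorder(codebook, idx)
    return indices
```

Because the ordering at each step depends only on symbols that were already transmitted, the decoder can reproduce it without side information or look-ahead, which is consistent with the abstract's claim of no added delay or distortion. When successive frames are correlated, small ranks should occur more often than they would as raw indices, and that skew is what an entropy coder could exploit.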
This paper discusses the application of hidden Markov models (HMMs) to solve the translational and rotational invariant automatic target recognition (TRIATR) problem associated with SAR imagery. This approach is based...
The theory of linguistics teaches us the existence of a hierarchical structure in linguistic expressions, from letter to word root, and on to word and sentence. By applying syntax and semantics beyond words, one can further recognize the grammatical relationships among words and the meaning of a sequence of words. This layered view of a spoken language is useful for effective analysis and automated processing. Thus, it is interesting to ask whether a similar hierarchy of representation exists for visual information. A class of techniques similar in nature to linguistic parsing is found in the Lempel-Ziv incremental parsing scheme. Based on a new class of multidimensional incremental parsing algorithms extended from Lempel-Ziv incremental parsing, a new framework for image retrieval, which takes advantage of the source characterization property of the incremental parsing algorithm, was proposed recently. With the incremental parsing technique, a given image is decomposed into a number of patches, called a parsed representation. This representation can be thought of as a morphological interface between elementary pixels and a higher-level representation. In this work, we examine the properties of the two-dimensional parsed representation in the context of imagery information retrieval and in contrast to vector quantization, i.e., fixed square-block representations and minimum average distortion criteria. We implemented four image retrieval systems for the comparative study: three, called IPSILON image retrieval systems, use the parsed representation with different perceptual distortion thresholds, and one uses conventional vector quantization for visual pattern analysis. We observe that different perceptual distortion thresholds in visual pattern matching do not seriously affect the retrieval precision, although allowing looser perceptual thresholds in image compression results in poor reconstruction fidelity. We compare the effectiveness of the use of the pars
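For readers unfamiliar with the incremental parsing that this abstract builds on, the following is a minimal sketch of the classic one-dimensional Lempel-Ziv (LZ78-style) incremental parsing of a symbol sequence. It is not the multidimensional algorithm or the IPSILON system described above, and it uses exact matching rather than a perceptual distortion threshold; the function name `incremental_parse` is hypothetical.

```python
def incremental_parse(seq):
    """Split seq into phrases, each being the shortest prefix not seen before.

    Every phrase is returned as (index_of_longest_known_prefix, new_symbol),
    where index 0 denotes the empty phrase. If the input ends in the middle
    of an already known phrase, the final pair carries None as its symbol.
    """
    dictionary = {(): 0}   # phrase (as a tuple of symbols) -> phrase index
    phrases = []
    current = ()           # longest dictionary match found so far
    for sym in seq:
        candidate = current + (sym,)
        if candidate in dictionary:
            current = candidate          # keep extending the match
        else:
            dictionary[candidate] = len(dictionary)
            phrases.append((dictionary[current], sym))
            current = ()                 # start a new phrase
    if current:
        phrases.append((dictionary[current], None))
    return phrases


print(incremental_parse("abababab"))
# [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a'), (2, None)]
```

Here the input is parsed into the phrases "a", "b", "ab", "aba", and a trailing "b". The dictionary grows with the patterns that actually occur in the source, which is the source characterization property the abstract refers to; the two-dimensional extension replaces symbol strings with image patches.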
Adaptive detection has a rich history in the radar community, and a number of other areas have borrowed heavily from constructs developed in this field. The task of target detection in hyperspectral imaging (HSI) is o...