The recent information explosion has led to massively increased demand for multimedia data storage and retrieval techniques. Content-based retrieval is an important alternative and complement to traditional keyword-based searching for multimedia data and can greatly enhance information management. For the last ten years, the Biomedical and Multimedia Information Technology (BMIT) group and, more recently, the Center for Multimedia Signal Processing (CMSP) have conducted systematic studies and research activities on this topic. Some of this work on content-based image/video retrieval and its applications is briefly presented in this paper.
Error diffusion halftoning is a popular method of producing frequency modulated (FM) halftones. In FM halftoning the dot size and shape are fixed (equal to one pixel) and the dot frequency is varied in accordance with the graylevel values of the underlying grayscale image. We generalize error diffusion to produce FM halftones with user-controlled dot size and shape using block quantization and a block filter in the feedback loop. We call this modified quantization and feedback process block error diffusion. The block filters are designed from well-known scalar error filter prototypes and retain their properties. Further, we show that choosing a structured block filter results in an efficient parallel implementation of block error diffusion.
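For orientation, the conventional scalar error diffusion that this abstract generalizes can be sketched as follows. This is a minimal illustration assuming the widely used Floyd-Steinberg error filter (weights 7/16, 3/16, 5/16, 1/16) as the scalar prototype; it is not the block error diffusion proposed in the paper, and the function name `floyd_steinberg` is chosen here for illustration only.

```python
def floyd_steinberg(image):
    """Halftone a grayscale image (rows of floats in [0, 1]) into 0/1 dots.

    Assumption: Floyd-Steinberg weights stand in for the "scalar error
    filter prototype"; the abstract does not say which prototype is used.
    """
    height, width = len(image), len(image[0])
    work = [list(row) for row in image]          # copy that accumulates diffused error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0         # one-pixel dot: on or off
            out[y][x] = new
            err = old - new                      # quantization error to diffuse
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Because the error is pushed onto unprocessed neighbours, the local density of dots tracks the local gray level, which is the frequency-modulation behaviour described above.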
We optimize the noise shaping behavior of color error diffusion by designing an optimized error filter based on a proposed noise shaping model for color error diffusion and a generalized linear spatially-invariant model of the human visual system. Our approach allows the error filter to have matrix-valued coefficients and diffuse quantization error across channels in an opponent color representation. Thus, the noise is shaped into frequency regions of reduced human color sensitivity. To obtain the optimal filter, we derive a matrix version of the Yule-Walker equations which we solve by using a gradient descent algorithm.
ISBN: 0780370449 (print)
This paper presents a new scheme for underwater target classification in a changing environment. An adaptive target classification system is developed that uses the decisions of multiple aspects of the objects. The system employs a decision feedback mechanism to map the changed feature vector to a new feature space familiar to the classifier. Results on an acoustic backscattered data set, namely the 40 kHz data collected at Coastal Systems Station (CSS), are presented. This data set contains returns from six different objects at 72 aspect angles with 5 degrees separation and with varying signal-to-reverberation ratio (SRR). The results are then benchmarked with those of a neural network-based multi-aspect fusion system.
This paper highlights the artificial neural network (ANN) approach to endpoint detection, which involves the segmentation of speech signals from non-speech signals. Two ANN models are proposed for endpoint detection of isolated digit utterances spoken in the Malay language: the multilayer perceptron (MLP) and the adaptive linear network (ADALINE). Results obtained from the ANN models are acoustically verified, visually checked and compared to the conventional method of endpoint detection. It was found that the endpoint detection accuracy using the MLP approach is very high and encouraging.
To generate high-quality speech using the linear predictive coding (LPC) technique, a method for detecting the pitch contour is critical, since the human ear is sensitive to small pitch variations in speech. The auto-correlation method, though simple to implement with digital signal processors (DSPs), can result in perceptible unnaturalness. This paper describes a cross-correlation technique that can be used to obtain the pitch information more accurately than the auto-correlation method for certain speech samples. Experimental results illustrate the pitch contour detected using both techniques. In general, the cross-correlation method produces less error than the auto-correlation method for pitch determination in an LPC scheme while having the advantage of requiring less computation.
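As a point of reference, the baseline auto-correlation method mentioned in this abstract can be sketched roughly as below. This is an illustrative sketch only: the function name `autocorr_pitch`, the 50 to 400 Hz search range, and the unnormalized correlation are assumptions made here, and the paper's cross-correlation variant is not reproduced because the abstract does not specify it.

```python
import math

def autocorr_pitch(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate pitch (Hz) of one frame from the peak of its auto-correlation.

    Assumptions (not from the source): search range f_min..f_max and an
    unnormalized correlation sum. Returns None when no positive peak exists.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself delayed by `lag` samples.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# Usage: a synthetic 100 Hz tone sampled at 8 kHz (0.1 s frame).
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(800)]
pitch = autocorr_pitch(tone, 8000)
```

Applying such an estimator frame by frame yields a pitch contour; the estimate is quantized to integer lags, which is one source of the small pitch errors the abstract is concerned with.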
This paper describes a neural network speech enhancement system using a noise cancellation technique. Clean speech signals of each uttered digit in the Malay language are sampled from a single speaker in an almost noise-free environment. Noisy speech signals are obtained by adding random noise to the clean signals. Noise cancellation is then performed on the noisy signals by using the ADALINE. The performance of the ADALINE is evaluated in two ways: by the signal-to-noise ratio (SNR) and by visual and audio checking. The effectiveness of the ADALINE in performing noise cancellation is also compared to that of the multilayer perceptron (MLP) network speech enhancement system.
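One common way to use an ADALINE for noise cancellation is the classic adaptive noise canceller trained with the LMS rule, sketched below. The abstract does not describe the network configuration, so everything here is an assumption for illustration: the availability of a separate noise reference input, the tap count, the step size `mu`, and the names `adaline_noise_cancel` and `demo`. A synthetic sine wave stands in for the speech signal.

```python
import math
import random

def adaline_noise_cancel(noisy, noise_ref, taps=4, mu=0.01):
    """Adaptive noise canceller: a single linear unit trained with LMS.

    Assumption: a reference signal correlated with the noise is available.
    The unit predicts the noise in `noisy` from `noise_ref`; the prediction
    error is returned as the cleaned signal.
    """
    weights = [0.0] * taps
    cleaned = []
    for n in range(len(noisy)):
        # Tapped delay line over the reference, zero-padded at the start.
        x = [noise_ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        noise_est = sum(w * xi for w, xi in zip(weights, x))
        err = noisy[n] - noise_est               # error doubles as the speech estimate
        cleaned.append(err)
        weights = [w + 2 * mu * err * xi for w, xi in zip(weights, x)]
    return cleaned

def demo(num_samples=2000, seed=0):
    """Return (error before, error after) measured on the second half."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * n / 40) for n in range(num_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    noisy = [c + 0.8 * v for c, v in zip(clean, noise)]
    cleaned = adaline_noise_cancel(noisy, noise)
    half = num_samples // 2

    def mse(signal):
        pairs = zip(signal[half:], clean[half:])
        return sum((s - c) ** 2 for s, c in pairs) / (num_samples - half)

    return mse(noisy), mse(cleaned)
```

Comparing the two mean squared errors returned by `demo()` is a simple stand-in for the SNR-based evaluation mentioned in the abstract.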
Functional imaging with dynamic positron emission tomography (PET) has been playing a crucial and expanding role in biomedical research and clinical diagnosis, providing image-wide quantitative and qualitative information on physiological functions in the human body, and supporting visualization of the distribution of these functions in relation to anatomical structures. A number of parametric imaging algorithms have been developed. We briefly review some existing techniques and our recently developed techniques for generating parametric images. An integrated system for functional image data processing and visualization, and a Web-based application are presented.
In parallel with rapid advances in computer technology, biomedical functional imaging is having an ever-increasing impact on healthcare. Functional imaging allows us to see dynamic processes quantitatively in the living human body. However, as we need to deal with four-dimensional time-varying images, space requirements and computational complexity are extremely high. This makes information management, processing, and communication difficult. Using the minimum amount of data to represent the required information, developing fast algorithms to process the data, organizing the data in such a way as to facilitate information management, and extracting the maximum amount of useful information from the recorded data have become important research tasks in biomedical information technology. For the last ten years, the Biomedical and Multimedia Information Technology (BMIT) group and, recently, the Center for Multimedia Signal Processing have conducted systematic studies on these topics. Some of the results relating to functional imaging data acquisition, compression, storage, management, processing, modeling, and simulation are briefly reported in this paper.
The emerging JBIG2 standard allows compliant encoders to achieve very high compression rates on bi-level images, especially when images are properly segmented into regions of line-art, halftones and text. We propose a fast method that is very effective at separating text from non-text regions, even when the regions are nonrectangular or have skew. Our method can also detect regions of reverse-coloured text. In most cases, our method increases the compression performance of the encoder. More importantly, our method can improve encoding speeds considerably, often by an order of magnitude.