A relative entropic thresholding approach was recently developed by Chang et al. (see Pattern Recognition, vol. 27, no. 9, pp. 1275-1289, 1994). This paper extends Chang et al.'s approach to two more relative entropy-based thresholding methods, called local relative entropy (LRE) thresholding and joint relative entropy (JRE) thresholding. Since relative entropy-based methods are sensitive to sparse image histograms, a histogram compression and translation step is suggested to compact the histogram. To achieve an objective assessment, uniformity and shape measures are introduced for performance evaluation. Experimental results show that when image histograms are sparse, JRE and LRE with the proposed histogram compression and translation generally perform better than Chang et al.'s approach.
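As a rough illustration of two ingredients named in this abstract, the sketch below computes the relative entropy (Kullback-Leibler divergence) between two discrete distributions and compacts a sparse histogram by dropping empty bins. The function names are hypothetical, and the compaction shown is only one plausible reading of "histogram compression and translation"; the abstract does not spell out the exact procedure.

```python
import math


def relative_entropy(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i) for two discrete distributions."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue            # terms with p_i = 0 contribute nothing
        if qi == 0:
            return math.inf     # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def compress_histogram(hist):
    """Drop empty bins and translate occupied gray levels to 0..K-1.

    Assumption: this is an illustrative compaction, not the paper's
    exact method. Returns (compact, levels), where compact[i] is the
    count of original gray level levels[i].
    """
    levels = [g for g, count in enumerate(hist) if count > 0]
    compact = [hist[g] for g in levels]
    return compact, levels


def expand_threshold(t_compact, levels):
    """Map a threshold chosen on the compact histogram back to a gray level."""
    return levels[t_compact]
```

A threshold selected on the compact histogram can then be mapped back to the original gray scale with `expand_threshold`.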
Thresholding video images is very challenging because the image background generally has low resolution and is more complicated and more heavily distorted than that of document images. As a result, thresholding methods that work well for document images may not work effectively for video images in some applications. This paper investigates the issue of thresholding video images for text detection and further develops a relative entropy-based thresholding approach that can effectively extract text from complicated video images. To demonstrate its performance, a comparative study is conducted between the proposed thresholding method and several thresholding techniques that are widely used for document and gray-scale images. The experimental results show that thresholding video images is far more difficult than thresholding document images and that simple histogram-based methods generally do not perform well.
This paper presents an operational rate-distortion (ORD) optimal approach for skeleton-based boundary encoding. The boundary information is first decomposed into skeleton and distance signals, which yields a more efficient representation of the original boundary. Curves of arbitrary order are utilized for approximating the skeleton and distance signals. For a given bit budget for a video frame, we solve the problem of choosing the number and location of the control points for all skeleton and distance signals and for all boundaries within a frame, so that the overall distortion is minimized. The problem is solved with the use of Lagrangian relaxation and a shortest path algorithm in a 4D directed acyclic graph (DAG) that we propose. By defining a path selection pattern, we reduce the computational complexity of the 4D DAG shortest path algorithm from O(N^5) to O(N^4), where N is the number of admissible control points for a skeleton. A suboptimal solution is also presented that further reduces the computational complexity of the algorithm to O(N^2). In experiments, the proposed algorithm outperforms competing algorithms.
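The combination of Lagrangian relaxation and a DAG shortest path can be sketched on a much simpler problem than the 4D graph described above. The example below assumes a single 1D signal, straight-line (first-order) segments between control points, squared-error distortion, and a fixed hypothetical cost of `bits_per_point` bits per control point; none of these choices come from the abstract. For a fixed multiplier `lam`, it finds the control points that minimize D + lam * R.

```python
def segment_distortion(y, i, j):
    """Squared error of replacing y[i..j] by the straight line from y[i] to y[j]."""
    err = 0.0
    for k in range(i + 1, j):
        interp = y[i] + (y[j] - y[i]) * (k - i) / (j - i)
        err += (y[k] - interp) ** 2
    return err


def best_control_points(y, lam, bits_per_point=8):
    """Minimize D + lam * R over control-point subsets via a DAG shortest path.

    Nodes are sample indices; an edge (i, j) with i < j means "j is the
    control point that follows i". The endpoints are always control points.
    """
    n = len(y)
    cost = [float("inf")] * n
    prev = [-1] * n
    cost[0] = lam * bits_per_point          # first control point is mandatory
    for j in range(1, n):
        for i in range(j):
            c = cost[i] + segment_distortion(y, i, j) + lam * bits_per_point
            if c < cost[j]:
                cost[j], prev[j] = c, i
    # Backtrack from the last sample to recover the chosen control points.
    path, k = [], n - 1
    while k != -1:
        path.append(k)
        k = prev[k]
    return path[::-1]
```

In a full rate-constrained scheme, an outer search over `lam` (for example by bisection) would be repeated until the resulting rate meets the bit budget; that outer loop is omitted here.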
Voice morphing is the process of gradually transforming the voice of a given speaker to that of another. The ability to change the speaker's individual characteristics and produce high-quality voices can be used in many applications. For example, in multimedia and video entertainment, voice morphing is just like its visual counterpart: while seeing a face gradually changing from one person's to another's, we can simultaneously hear the voice changing as well. Another application could be in forensic voice identification: creating a voice bank of different pitches, rates, and timbres to assist in recognition of the suspect's voice. In this study we present a new technique that enables the production of N intermediate voices that gradually change between the voices of two speakers, or of one voice signal that changes gradually. The technique is based on two components. The first is the creation of a 3D prototype waveform interpolation (PWI) surface from the residual error signal, which is estimated from LPC analysis, to produce a new intermediate excitation signal. The second is a representation of the vocal tract by a lossless tube area function, and interpolation of the two speakers' parameters.
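The idea of producing N intermediate voices by interpolating the two speakers' vocal tract parameters can be illustrated with a minimal sketch. It assumes plain linear interpolation between two tube area functions of equal length; the abstract does not state which interpolation rule is used, and the function name is hypothetical.

```python
def intermediate_area_functions(areas_a, areas_b, n):
    """Return n area functions evenly spaced between speaker A and speaker B.

    Assumption: linear interpolation, section by section. The endpoints
    (pure A and pure B) are not included in the result.
    """
    out = []
    for m in range(1, n + 1):
        t = m / (n + 1)                     # morphing factor in (0, 1)
        out.append([(1 - t) * a + t * b for a, b in zip(areas_a, areas_b)])
    return out
```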
Orthogonal subspace projection (OSP) and the generalized likelihood ratio test (GLRT) have shown success in hyperspectral image classification. The OSP is derived by maximizing the signal-to-noise ratio (SNR) resulting from a linear mixture model in which the noise is assumed to be white. The GLRT, on the other hand, is formulated from a signal detection model that can be described as a binary hypothesis testing problem. For the GLRT to have an analytical form, the noise in the signal detection model is generally assumed to be white Gaussian noise. However, Gaussianity generally does not hold in remotely sensed imagery. Interestingly, this assumption has not been investigated. This paper presents a comparative study of OSP and GLRT based on their assumptions. In particular, a detailed analysis of the assumptions made in these two approaches is conducted through a series of computer simulations. Experimental results show that the OSP does not depend on the noise being Gaussian. By contrast, the GLRT is affected by the Gaussian noise assumption: if the assumption is violated, its performance degrades.
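For readers unfamiliar with OSP, the following sketch shows the classical operator: project the pixel vector onto the orthogonal complement of the undesired signatures, then correlate with the desired signature (d^T P r). It uses Gram-Schmidt in pure Python instead of a matrix pseudo-inverse, and the signatures in the usage lines are made-up toy values, not data from the paper.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(v, undesired):
    """Return P v, with P the projector onto the orthogonal complement of span(undesired)."""
    basis = []
    for u in undesired:                     # Gram-Schmidt orthonormalization
        w = list(u)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:                    # skip linearly dependent signatures
            basis.append([wi / norm for wi in w])
    out = list(v)
    for q in basis:                         # remove each orthonormal component
        c = dot(out, q)
        out = [oi - c * qi for oi, qi in zip(out, q)]
    return out


def osp_detector(r, d, undesired):
    """OSP output d^T P r for pixel r and desired signature d."""
    return dot(d, project_out(r, undesired))


# Toy example: the pixel is 2*d + 3*u; the undesired part u is annihilated.
d = [1.0, 0.0, 1.0]
u = [0.0, 1.0, 0.0]
pixel = [2.0, 3.0, 2.0]
score = osp_detector(pixel, d, [u])
```

The detector output depends only on the abundance of the desired signature, whatever the amount of the undesired signature in the pixel.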
ISBN (print): 0780375890
Vertical Bell Laboratories Layered Space-Time (V-BLAST) is a promising system that realizes the enormous capacity of multiple-input multiple-output (MIMO) communications. We present an extension of V-BLAST and propose an effective transmit power allocation scheme for the extended system. The proposed scheme minimizes the bit error rate (BER) averaged over all detection stages and requires only a small feedback overhead from the receiver to the transmitter. Simulation results show that the extended V-BLAST system with the proposed transmit power allocation scheme provides a significant reduction in BER compared to the conventional V-BLAST system. When minimum mean square error (MMSE) nulling is adopted, the extended V-BLAST system is found to achieve BER performance comparable to that of maximum likelihood (ML) detection for the conventional V-BLAST architecture.
The majority of screening mammograms are normal. It would be beneficial to design a detection system that helps radiologists readily identify normal regions of mammograms. In this paper, we present a binary tree classifier based on global features extracted from different levels of a 2-D Quincunx wavelet decomposition of normal and abnormal regional images. This classifier is then used to classify whether an entire whole-field mammogram is normal. This approach is fundamentally different from approaches that identify a particular abnormality, in that it is independent of the particular type of abnormality.
ISBN (print): 076951695X
We propose a text scanner that detects wide text strings in a sequence of scene images. For scene text detection, we use a multiple-CAMShift algorithm on a text probability image produced by a multi-layer perceptron. To provide enhanced resolution of the extracted text images, we perform the text detection process after generating a mosaic image using a fast and robust image registration method.
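The core step of CAMShift is a mean-shift iteration that moves a search window toward the centroid of the probability mass it covers. The sketch below shows only that step on a small 2D probability image given as nested lists; it keeps the window size fixed, whereas CAMShift also adapts the window size and orientation from the image moments. The function name and the integer-pixel rounding are assumptions for illustration, and running several such windows in parallel (the "multiple" part) is not shown.

```python
def mean_shift(prob, cx, cy, half_w, half_h, max_iter=20):
    """Move a window centre (cx, cy) to the local centroid of probability mass."""
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0               # zeroth and first image moments
        for y in range(max(0, cy - half_h), min(rows, cy + half_h + 1)):
            for x in range(max(0, cx - half_w), min(cols, cx + half_w + 1)):
                p = prob[y][x]
                m00 += p
                m10 += x * p
                m01 += y * p
        if m00 == 0:
            break                           # no probability mass inside the window
        nx, ny = round(m10 / m00), round(m01 / m00)
        if (nx, ny) == (cx, cy):
            break                           # converged
        cx, cy = nx, ny
    return cx, cy
```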
ISBN (print): 0780370449
We compared the performance of two recently developed vector quantization algorithms that use different optimization criteria for clustering: adaptive fuzzy leader clustering, a neuro-fuzzy algorithm, and deterministic annealing, an unsupervised clustering algorithm based on probabilistic and statistical physics frameworks. The rate-distortion criterion serves as the performance measure. Such a comparison is useful for evaluating the efficiency of clustering algorithms for the purpose of image vector quantization, in place of the conventional misclassification evaluation. The method is extended from the analysis of image coding in the spatial domain to sample vectors in the wavelet domain with a predictable distribution. These sample vectors possess a multidimensional generalized Gaussian distribution through the new multi-scale feature extraction method. Our preliminary results show a marked improvement in reconstructed image quality over JPEG.
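To make the rate-distortion criterion concrete, here is a minimal sketch that scores a given codebook on a set of vectors. It assumes mean squared error per vector as the distortion and the empirical entropy of the codeword indices (bits per vector) as the rate; the abstract does not specify how either quantity is measured, and the clustering algorithms that would produce the codebook are not shown.

```python
import math
from collections import Counter


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors, codebook):
    """Assign each vector to the index of its nearest codeword."""
    return [min(range(len(codebook)), key=lambda k: sqdist(v, codebook[k]))
            for v in vectors]


def rate_and_distortion(vectors, codebook):
    """Return (rate in bits per vector, mean squared-error distortion)."""
    idx = quantize(vectors, codebook)
    n = len(vectors)
    distortion = sum(sqdist(v, codebook[k]) for v, k in zip(vectors, idx)) / n
    counts = Counter(idx)
    # Empirical entropy of the index stream, an assumed proxy for coding rate.
    rate = -sum(c / n * math.log2(c / n) for c in counts.values())
    return rate, distortion
```

Two codebooks produced by different clustering algorithms can then be compared by plotting their (rate, distortion) pairs for several codebook sizes.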
ISBN (print): 0780370449
A new image representation by support vector regression (SVR) is introduced. After a grey-level image is approximated as a continuous function using SVR, which maps a 2D pixel coordinate input to a 1D pixel grey-level output, the image can be expressed in terms of the extracted support vectors and their corresponding Lagrange multipliers. The image is reconstructed by a linear combination of kernels with weights equal to the values of the Lagrange multipliers. With the support vector representation, we observe that: 1) it is able to remove noise from the image; the denoising effect of the SVR representation is implicit during image encoding and can be controlled by the SVR training parameters; 2) if a Gaussian RBF kernel is used in the SVR representation, Gaussian smoothing can be easily implemented by increasing the variance of the kernel during image reconstruction, and sharpening can be done by reducing the variance.
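The reconstruction step described here, a weighted sum of Gaussian RBF kernels centred on the support vectors, can be sketched as follows. The support vectors, weights, and bias in the usage lines are made-up values rather than the output of an actual SVR training run, and `coeffs` stands in for the Lagrange-multiplier weights. Evaluating the same representation with a larger `sigma` illustrates the smoothing effect mentioned in the abstract.

```python
import math


def rbf(p, q, sigma):
    """Gaussian RBF kernel between two 2D pixel coordinates."""
    d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def reconstruct(support_vectors, coeffs, bias, width, height, sigma):
    """Evaluate f(x, y) = sum_i coeff_i * K((x, y), sv_i) + bias on a pixel grid."""
    return [[bias + sum(c * rbf((x, y), sv, sigma)
                        for sv, c in zip(support_vectors, coeffs))
             for x in range(width)]
            for y in range(height)]


# Hypothetical representation: one support vector at pixel (1, 1).
svs, weights, b = [(1, 1)], [100.0], 20.0
narrow = reconstruct(svs, weights, b, 3, 3, sigma=1.0)
wide = reconstruct(svs, weights, b, 3, 3, sigma=2.0)   # wider kernel spreads the peak
```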