ISBN (print): 9781467328210; 9781467328203
This paper presents an audio-visual (AV) person identification system based on a Linear Regression-based Classifier (LRC). Class-specific models are created by stacking q-dimensional speech and image vectors from the training data. The person identification task is treated as a linear regression problem, i.e., a test (speech or image) feature vector is expressed as a linear combination of the (speech or image) model of the class it belongs to. The Euclidean distances between a test feature vector and the estimated response vectors for all the class-specific models are used as matching scores. The matching scores from both modalities are normalized using the min-max score normalization technique and then combined using the sum rule of fusion. The system was tested on 88 subjects from the AusTalk AV database. Experimental results show that the identification accuracy after AV fusion is higher than that of either individual modality.
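The pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 3-dimensional data, the least-squares solution via the normal equations, and the decision rule (smallest fused distance wins) are all assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial
    pivoting. Assumes a is non-singular (hypothetical helper)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def lrc_distance(model, y):
    """Distance between test vector y and its least-squares reconstruction
    from one class model (a list of linearly independent training vectors)."""
    p = len(model)
    # Normal equations: (X^T X) beta = X^T y, with training vectors as columns of X.
    gram = [[sum(u * v for u, v in zip(model[i], model[j])) for j in range(p)]
            for i in range(p)]
    rhs = [sum(u * v for u, v in zip(model[i], y)) for i in range(p)]
    beta = solve(gram, rhs)
    y_hat = [sum(beta[i] * model[i][k] for i in range(p)) for k in range(len(y))]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)))

def min_max(scores):
    """Min-max normalization of a list of scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def identify(speech_models, image_models, speech_test, image_test):
    """Fuse normalized distances from both modalities with the sum rule.
    Assumption: the class with the smallest fused distance is the decision."""
    d_speech = min_max([lrc_distance(m, speech_test) for m in speech_models])
    d_image = min_max([lrc_distance(m, image_test) for m in image_models])
    fused = [a + b for a, b in zip(d_speech, d_image)]
    best = min(range(len(fused)), key=fused.__getitem__)
    return best, fused

# Toy data: two classes, two training vectors each, q = 3.
models = [[[1, 0, 0], [0, 1, 0]], [[0, 0, 1], [0, 1, 0]]]
best, fused = identify(models, models, [1, 0.5, 0], [0.9, 0.2, 0.1])
print(best, fused)
```

In this toy case both test vectors lie in or near the subspace spanned by the first class model, so the first class receives the smallest fused score.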
ISBN (print): 9781424471379
This paper investigates, experiments with, and validates a local adaptive threshold system with compound motion analysis. The motivation is to analyze moving objects in outdoor and indoor video frames with respect to movement detection, object segmentation, and feature extraction, in addition to DFT-based velocity computation. The underlying methodology relies on a single correlation of the behavioral-mathematical model of the examined image sequences, treating the images as time-varying functions that can be processed with the 2D Discrete Fourier Transform (DFT). The method is validated through output data, human visual inspection, and histogram analysis, which show appreciable accuracy, a lower level of noise, and shorter segmentation time in comparison with some available standard techniques. Potential applications of the presented method include security control, industrial and traffic control, surveillance, and general civil and military fields.
ISBN (print): 9781479948741
Exposure Fusion is a popular multi-exposure image fusion method which blends a set of differently exposed low dynamic range images of a scene to obtain another low dynamic range but contrast-rich image. This approach carries out the integration process by using three local quality measures, namely contrast, saturation, and well-exposedness. Our aim in this study is to extend the exposure fusion method by incorporating a novel visual saliency based quality measure. This new measure captures the parts of the scene that grab our attention and gives more prominence to these salient regions, which is otherwise not possible with the previous measures in use. Our experiments show that, compared to the exposure fusion method, our saliency-guided approach gives more vivid results and leads to sharp boundaries in the output images.
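As a rough illustration of how per-pixel quality measures can drive the blending, the sketch below computes a weight for each exposure at a single pixel from contrast, saturation, and well-exposedness, and multiplies in a saliency value as a fourth factor. The multiplicative combination of saliency, the function names, and the toy values are assumptions and not the authors' formulation. The contrast and saliency values are taken as precomputed inputs, and the naive per-pixel blend stands in for the multiresolution blending used in practice.

```python
import math

def saturation(rgb):
    """Standard deviation across the R, G, B channels of one pixel."""
    mean = sum(rgb) / 3.0
    return math.sqrt(sum((c - mean) ** 2 for c in rgb) / 3.0)

def well_exposedness(rgb, sigma=0.2):
    """Product over channels of a Gaussian centered at mid-intensity 0.5."""
    w = 1.0
    for c in rgb:
        w *= math.exp(-((c - 0.5) ** 2) / (2.0 * sigma ** 2))
    return w

def fuse_pixel(pixels, contrasts, saliencies, eps=1e-12):
    """Blend the same pixel location taken from several exposures.

    pixels:     list of (r, g, b) tuples in [0, 1], one per exposure
    contrasts:  precomputed local contrast per exposure (assumed given)
    saliencies: precomputed saliency per exposure (hypothetical extra measure)
    """
    weights = [
        c * saturation(p) * well_exposedness(p) * s + eps
        for p, c, s in zip(pixels, contrasts, saliencies)
    ]
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize across exposures
    fused = tuple(sum(w * p[k] for w, p in zip(weights, pixels)) for k in range(3))
    return fused, weights

# Toy example: under-exposed, well-exposed, and over-exposed versions of a pixel.
pixels = [(0.05, 0.04, 0.06), (0.5, 0.4, 0.6), (0.98, 0.97, 0.99)]
fused, weights = fuse_pixel(pixels, contrasts=[1.0, 1.0, 1.0], saliencies=[1.0, 1.0, 1.0])
print(fused, weights)
```

With equal contrast and saliency, the well-exposed pixel dominates the blend; raising the saliency value of another exposure shifts weight toward it.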
ISBN (print): 0819452114
Block-based texture synthesis algorithms have shown better results than others as they help to preserve the global structure. Previous research has proposed several approaches in the pixel domain, but little effort has been devoted to the synthesis of texture in a multiresolution domain. We propose a multiresolution framework in which coefficient-blocks of the spatio-frequency components of the input texture are efficiently stitched together to form the corresponding components of the output texture. We propose two algorithms to this effect. In the first, we use a constant block size throughout the algorithm. In the second, we adaptively split blocks so as to use the largest possible block size in order to preserve the global structure, while keeping the mismatch error of the overlapped boundaries below a certain error tolerance. Special consideration is given to minimizing the computational cost throughout the algorithm designs. We show that the adaptation of the multiresolution approach results in a fast, cost-effective, flexible texture synthesis algorithm that is capable of being used in modern, bandwidth-adaptive, real-time imaging applications. A collection of regular and stochastic test textures is used to demonstrate the effectiveness of the proposed algorithm.
ISBN (print): 9781728180687
Rapid 3D reconstruction of dynamic scenes is very useful in 3D object structure analysis, accident avoidance for UAVs, and other visual applications. For dynamic scenes, coded structured light methods have been proposed to obtain the depth information of an object in the 3D world, and most of them are based on spatial codification. In practice, two or more cameras and projectors at different viewpoints are needed to measure the dynamic scene simultaneously for rapid 3D reconstruction. However, when two traditional patterns, especially binary ones, overlap each other, the interference between them poses a new challenge to 3D reconstruction. Traditional patterns can hardly be separated from each other, which affects the quality of the 3D reconstruction. To eliminate the interference problem, we propose a scheme of orthogonal coded multi-view structured light systems, which can obtain accurate depth maps for a scene. We also test the stability of the orthogonal patterns by setting up three different scenes and comparing them with traditional patterns. In the experiments, our scheme achieves new state-of-the-art results.
ISBN (print): 9781479961399
Advances in image quality assessment have shown the potential added value of including visual attention aspects in objective quality metrics. Numerous models of visual saliency have been implemented and integrated into different quality metrics; however, their ability to improve a metric's performance in predicting perceived image quality has not been fully investigated. In this paper, we conduct an exhaustive comparison of 20 state-of-the-art saliency models in the context of image quality assessment. Experimental results show that adding computational saliency is beneficial to quality prediction in general terms. However, the amount of performance gain that can be obtained by adding saliency to quality metrics highly depends on the saliency model and on the metric.
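One common way to add saliency to a quality metric is to weight a local quality map by a saliency map before pooling it into a single score. The abstract does not state which integration scheme was used, so the sketch below is only an assumed example; the function names and toy maps are hypothetical.

```python
def mean_pooled_score(quality_map):
    """Plain average of a local quality map (list of rows)."""
    values = [q for row in quality_map for q in row]
    return sum(values) / len(values)

def saliency_weighted_score(quality_map, saliency_map, eps=1e-12):
    """Saliency-weighted average: sum(s * q) / sum(s).

    Both maps are lists of rows with identical shape; eps guards against
    an all-zero saliency map.
    """
    num = 0.0
    den = 0.0
    for q_row, s_row in zip(quality_map, saliency_map):
        for q, s in zip(q_row, s_row):
            num += s * q
            den += s
    return num / (den + eps)

# Toy 2x2 example: the only distorted location (quality 0.2) is also the most salient.
quality = [[1.0, 0.2], [1.0, 1.0]]
saliency = [[0.1, 0.7], [0.1, 0.1]]
print(mean_pooled_score(quality), saliency_weighted_score(quality, saliency))
```

Here the weighted score is lower than the plain average because the distortion falls in the salient region, which is the kind of effect saliency integration is meant to capture.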
ISBN (print): 9781728172064
In this study, a product suggestion engine has been developed for an e-commerce site portfolio focusing on garment items. Traditional collaborative filtering methods usually lack applicability due to high turnover rates in product lists. Thus, by focusing on visual similarity using a deep learning technique, successful results were obtained, and it has been concluded that applying this technique to a real, live e-commerce garment site would be suitable.
ISBN (print): 0819452114
We propose a non-iterative, globally optimal dense motion field estimation technique based on a multiresolutional probability model. We consider the field to be estimated in terms of its wavelet coefficients and carry out the estimation in the field's wavelet transform domain. Our approach models interscale dependencies of the wavelet coefficients and allows for smooth, edge, and occluded regions in the field. We obtain segmentations of the field, and our results show that the field estimates yield accurate depictions of scene motion. The globally optimal nature of our estimation framework makes it applicable to scenes exhibiting large motion and to settings of ill-posed motion. Hence, our algorithms can also be used to determine accurate initializations for optical flow type estimation techniques, which use more sophisticated models but can only obtain locally optimal solutions that are heavily dependent on initial conditions. The performance is illustrated on several examples.
ISBN (print): 9780769546001
A visual navigation system for a mobile patrol robot using image processing on an FPGA and real-time Linux is presented. The CMOS image sensor and the stepper motor driver ICs are connected to external I/O ports of the FPGA. The image processing and motor drive circuits are implemented in the reconfigurable device as original logic. The image capture circuit uses a state machine and a FIFO memory buffer to adjust the timing of pixel data transmission. The motor drive circuit generates clock signals for steps according to the value from the processor in the FPGA. A real-time device driver has been developed to link the flexible hardware circuits with real-time software applications for robot vision.
ISBN (print): 0819424358
Fractal image compression is computationally expensive. Therefore, speedup techniques are required to achieve time demands comparable to other compression techniques. In this paper we combine sequential and parallel techniques suitable for MIMD architectures, which move this compression scheme closer to real-time processing. The algorithms introduced are especially designed for memory-critical environments.